xds: implement :authority header rewriting in xds_cluster_impl LB policy (gRFC A81) #8779

Merged
Pranjali-2501 merged 9 commits into grpc:master from Pranjali-2501:clusterimpl
Jan 5, 2026

@Pranjali-2501
Contributor

@Pranjali-2501 Pranjali-2501 commented Dec 18, 2025

This PR implements the xDS :authority header rewriting feature as specified in gRFC A81.

Key Changes:

  • xds_cluster_impl LB Policy:

    • Updated the Picker to check for the auto_host_rewrite flag (passed via ConfigSelector).
    • If enabled, the picker retrieves the hostname attribute from the subchannel.
    • The picker populates the Metadata field in PickResult with the new :authority value.
  • Changes in stream.go:

    • Updated stream.go to inspect the PickResult metadata. If an :authority override is present and the user has not explicitly set an authority via a CallOption, callHdr.Authority is updated with the hostname.
  • PR relies on the following changes already merged:

RELEASE NOTES:

  • xDS: Added support for :authority rewriting (gRFC A81). When autoHostRewrite is enabled in the xDS RouteConfiguration, the client rewrites the HTTP/2 :authority header to the selected endpoint's hostname.
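
The precedence described above (an explicit CallOption wins, then the picker's :authority metadata) can be sketched roughly as follows. This is an illustrative sketch, not the code in stream.go: `resolveAuthority`, its parameters, and the fallback to a channel default are assumptions made for the example.

```go
package main

import "fmt"

// authorityKey is the metadata key the picker is described as populating.
const authorityKey = ":authority"

// resolveAuthority is a hypothetical helper (not part of grpc-go) that picks
// the authority for an RPC. Assumed precedence: user-supplied authority,
// then the picker's metadata override, then the channel default.
func resolveAuthority(userAuthority string, pickMD map[string][]string, channelDefault string) string {
	if userAuthority != "" {
		// The user set an authority explicitly; never overwrite it.
		return userAuthority
	}
	if vals := pickMD[authorityKey]; len(vals) > 0 && vals[0] != "" {
		// The picker requested a rewrite via pick result metadata.
		return vals[0]
	}
	return channelDefault
}

func main() {
	md := map[string][]string{authorityKey: {"rewritten.example.com"}}
	fmt.Println(resolveAuthority("", md, "default.example.com"))
	fmt.Println(resolveAuthority("user-override.com", md, "default.example.com"))
	fmt.Println(resolveAuthority("", nil, "default.example.com"))
}
```

Run as is, the three calls print the rewritten hostname, the user override, and the default, in that order.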

@Pranjali-2501 Pranjali-2501 added this to the 1.79 Release milestone Dec 18, 2025
@Pranjali-2501 Pranjali-2501 added the Type: Feature and Area: xDS labels Dec 18, 2025
@codecov

codecov bot commented Dec 18, 2025

Codecov Report

❌ Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.40%. Comparing base (6ed8acb) to head (0856922).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
internal/xds/balancer/clusterimpl/picker.go 80.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master    #8779    +/-   ##
========================================
  Coverage   83.39%   83.40%            
========================================
  Files         419      418     -1     
  Lines       32566    32910   +344     
========================================
+ Hits        27159    27449   +290     
- Misses       4023     4072    +49     
- Partials     1384     1389     +5     
Files with missing lines Coverage Δ
internal/testutils/xds/e2e/bootstrap.go 70.11% <100.00%> (+0.34%) ⬆️
internal/testutils/xds/e2e/clientresources.go 97.80% <100.00%> (+<0.01%) ⬆️
internal/transport/transport.go 88.70% <ø> (ø)
internal/xds/balancer/clusterimpl/clusterimpl.go 83.87% <100.00%> (+0.15%) ⬆️
internal/xds/resolver/serviceconfig.go 88.14% <100.00%> (+0.08%) ⬆️
...nternal/xds/xdsclient/xdsresource/unmarshal_eds.go 94.73% <100.00%> (+1.97%) ⬆️
stream.go 81.63% <100.00%> (+0.06%) ⬆️
internal/xds/balancer/clusterimpl/picker.go 95.12% <80.00%> (-2.29%) ⬇️

... and 32 files with indirect coverage changes

@easwars
Contributor

easwars commented Dec 19, 2025

Since this PR completes the implementation for the gRFC, this should probably contain a release note.

@easwars easwars assigned Pranjali-2501 and unassigned easwars Dec 19, 2025

// The authority specified via the `CallAuthority` CallOption takes the
// highest precedence when determining the `:authority` header.
userAuthorityOverride := "user-override.com"
Contributor

Make this const?

Contributor Author

Done.

defer server.Stop()

const xdsAuthorityOverride = "rewritten.example.com"

Contributor

Nit: nix newline

Contributor Author

Done.

tlsConfig := &tls.Config{
Certificates: []tls.Certificate{serverCert},
ClientCAs: roots,
InsecureSkipVerify: true,
Contributor

Why do you need to set this to true?

Contributor Author

Updated the test to use xdsCreds instead of tlsCreds.

Comment on lines 1472 to 1473
Certificates: []tls.Certificate{serverCert},
ClientCAs: roots,
Contributor

I don't understand this. Why is the client presenting the serverCert and using x509/client_ca_cert.pem as the root cert to validate the server?

Contributor Author

This was not correct.

I have updated the test to use xdsCreds instead of directly using tlsConfig.

Comment on lines 1497 to 1498
const xdsAuthorityOverride = "x.test.example.com"
configureXDSResources(ctx, t, mgmtServer, nodeID, f.Address, xdsAuthorityOverride)
Contributor

Can we also actually check if the RPC fails when:

  • the host name override is not specified (as the test does not specify a serverNameOverride in the TLS creds)
  • the host name override is invalid (i.e., is expected to be rejected by the TLS creds)

// Rewriting and TLS Secure Naming. It ensures that when the :authority header
// is rewritten by the clusterimpl picker, the new authority is correctly
// validated against the server's TLS certificate before the RPC proceeds.
func (s) TestAuthorityOverridingWithTLS(t *testing.T) {
Contributor

I think what you might need is the following:

  • Server TLS credentials, where the certificate is presented for a domain name like x.test.example.com
  • Client TLS credentials, where the roots are set to x509/server_ca_cert.pem and the serverNameOverride is set to something like *.test.example.com
  • Send an xDS resource where the authority override is set to x.test.example.com and verify the connection works
  • Send an xDS resource where the authority override is set to y.test.example.com and verify the connection fails. This failure should be because the TLS handshake fails.
  • Send an xDS resource where the authority override is set to xyz.example.com and verify the connection fails. This failure should be because the TLS creds reject the authority override.

Contributor Author

I have updated the test to check both cases: a valid authority and the invalid authority xyz.example.com.

Send an xDS resource where the authority override is set to y.test.example.com and verify the connection fails. This failure should be because the TLS handshake fails.

The test uses xdsCreds, which sets InsecureSkipVerify when configuring the tlsConfig and so skips the TLS handshake check. So I don't think we can test that case now.

Contributor

@arjan-bal arjan-bal Jan 2, 2026

I don't think the authority override configuration should/will cause TLS handshakes to fail. The authority header is part of the HTTP/2 header frame that is sent after the transport credentials have already completed handshaking.

The interaction between the transport credentials and the authority override is this: the transport credentials must validate the authority override header:

if callHdr.Authority != "" {
	auth, ok := t.authInfo.(credentials.AuthorityValidator)
	if !ok {
		return nil, &NewStreamError{Err: status.Errorf(codes.Unavailable, "credentials type %q does not implement the AuthorityValidator interface, but authority override specified with CallAuthority call option", t.authInfo.AuthType())}
	}
	if err := auth.ValidateAuthority(callHdr.Authority); err != nil {
		return nil, &NewStreamError{Err: status.Errorf(codes.Unavailable, "failed to validate authority %q : %v", callHdr.Authority, err)}
	}
	newCallHdr := *callHdr
	newCallHdr.Host = callHdr.Authority
	callHdr = &newCallHdr
}
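
The same check can be reproduced in isolation with only the standard library. The sketch below assumes that validating an authority comes down to a hostname check against the peer certificate. `authorityValidator`, `certAuthInfo`, `plainAuthInfo`, `checkAuthority`, and `selfSignedCert` are stand-ins invented for this example, not grpc-go types.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// authorityValidator is a local stand-in with the same method shape as the
// interface asserted in the snippet above.
type authorityValidator interface {
	ValidateAuthority(authority string) error
}

// certAuthInfo is a hypothetical auth info that holds the peer certificate.
type certAuthInfo struct{ peer *x509.Certificate }

// ValidateAuthority accepts the authority only if the certificate covers it.
func (a certAuthInfo) ValidateAuthority(authority string) error {
	return a.peer.VerifyHostname(authority)
}

// plainAuthInfo deliberately does not implement authorityValidator.
type plainAuthInfo struct{}

// checkAuthority follows the flow of the quoted code: no override means no
// check; otherwise the auth info must be able to validate the override.
func checkAuthority(authInfo any, authority string) error {
	if authority == "" {
		return nil
	}
	v, ok := authInfo.(authorityValidator)
	if !ok {
		return fmt.Errorf("auth info %T cannot validate authority override %q", authInfo, authority)
	}
	if err := v.ValidateAuthority(authority); err != nil {
		return fmt.Errorf("failed to validate authority %q: %w", authority, err)
	}
	return nil
}

// selfSignedCert builds a throwaway certificate for the given DNS names.
func selfSignedCert(dnsNames ...string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "test"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := selfSignedCert("*.test.example.com")
	if err != nil {
		panic(err)
	}
	info := certAuthInfo{peer: cert}
	fmt.Println("x.test.example.com accepted:", checkAuthority(info, "x.test.example.com") == nil)
	fmt.Println("xyz.example.com accepted:", checkAuthority(info, "xyz.example.com") == nil)
	fmt.Println("no validator accepted:", checkAuthority(plainAuthInfo{}, "x.test.example.com") == nil)
}
```

Both rejections surface when the stream is created, after the handshake has finished, which is consistent with the point that an authority override does not fail the handshake itself.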

When using xDS, we don't directly use TLS credentials on the client. Instead we use xdscredentials. The config for TLS needs to be set in the CDS resource.

See the following test as an example of configuring TLS using xDS:

func (s) TestGoodSecurityConfig(t *testing.T) {

On the server side, we can use the regular tlscreds like the CDS test above. We can also use the xds creds and configure TLS in the LDS resource, similar to the following test:

// TestServerSideXDS_FileWatcherCerts is an e2e test which verifies xDS
// credentials with file watcher certificate provider.
//
// The following sequence of events happen as part of this test:
// - An xDS-enabled gRPC server is created and xDS credentials are configured.
// - xDS is enabled on the client by the use of the xds:/// scheme, and xDS
// credentials are configured.
// - Control plane is configured to send security configuration to both the
// client and the server, pointing to the file watcher certificate provider.
// We verify both TLS and mTLS scenarios.
func (s) TestServerSideXDS_FileWatcherCerts(t *testing.T) {


It looks like the xDS creds don't implement the AuthorityValidator interface, so authority overriding will probably fail when using TLS with xDS.

// reloadingCreds is a credentials.TransportCredentials for client
// side mTLS that reloads the server root CA certificate and the client
// certificates from the provider on every client handshake. This is necessary
// because the standard TLS credentials do not support reloading CA
// certificates.
type reloadingCreds struct {
	provider certprovider.Provider
}

func (c *reloadingCreds) ClientHandshake(ctx context.Context, authority string, rawConn net.Conn) (net.Conn, credentials.AuthInfo, error) {
	km, err := c.provider.KeyMaterial(ctx)
	if err != nil {
		return nil, nil, err
	}
	var config *tls.Config
	if km.SPIFFEBundleMap != nil {
		config = &tls.Config{
			InsecureSkipVerify:    true,
			VerifyPeerCertificate: buildSPIFFEVerifyFunc(km.SPIFFEBundleMap),
			Certificates:          km.Certs,
		}
	} else {
		config = &tls.Config{
			RootCAs:      km.Roots,
			Certificates: km.Certs,
		}
	}
	return credentials.NewTLS(config).ClientHandshake(ctx, authority, rawConn)
}

func (c *reloadingCreds) Info() credentials.ProtocolInfo {
	return credentials.ProtocolInfo{SecurityProtocol: "tls"}
}

func (c *reloadingCreds) Clone() credentials.TransportCredentials {
	return &reloadingCreds{provider: c.provider}
}

func (c *reloadingCreds) OverrideServerName(string) error {
	return errors.New("overriding server name is not supported by xDS client TLS credentials")
}

func (c *reloadingCreds) ServerHandshake(net.Conn) (net.Conn, credentials.AuthInfo, error) {
	return nil, nil, errors.New("server handshake is not supported by xDS client TLS credentials")
}

We would need to update the reloadingCreds to generate the required TLS credentials, similar to the ClientHandshake method and call ValidateAuthority on the tls creds object for things to work.

Once the xDS creds implement the AuthorityValidator interface, this test should use the CDS config along with xds credentials to test the authority override behaviour.

Contributor

@Pranjali-2501 apologies for the confusion. I misunderstood and thought that Credentials needed to implement the AuthorityValidator interface, but it is actually AuthInfo. Since XDSCredentials.ClientHandshake delegates to the TLS credentials (which return TLSAuthInfo), AuthorityValidator is effectively implemented when using xDS. The existing test seems correct.
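
The distinction can be illustrated with a small self-contained sketch. All types here (`authInfo`, `transportCreds`, `tlsLikeCreds`, `wrapperCreds`, and so on) are simplified stand-ins invented for the example, not the grpc-go interfaces. The wrapper has no ValidateAuthority method of its own, yet the auth info it returns from the delegated handshake does.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the credential-related interfaces.
type authInfo interface{ AuthType() string }

type authorityValidator interface{ ValidateAuthority(string) error }

type transportCreds interface {
	ClientHandshake(authority string) (authInfo, error)
}

// tlsLikeInfo is the handshake result; it can validate an authority.
type tlsLikeInfo struct{ certHost string }

func (tlsLikeInfo) AuthType() string { return "tls" }

func (i tlsLikeInfo) ValidateAuthority(a string) error {
	if a != i.certHost {
		return fmt.Errorf("authority %q does not match certificate host %q", a, i.certHost)
	}
	return nil
}

// tlsLikeCreds plays the role of plain TLS credentials.
type tlsLikeCreds struct{ certHost string }

func (c tlsLikeCreds) ClientHandshake(string) (authInfo, error) {
	return tlsLikeInfo{certHost: c.certHost}, nil
}

// wrapperCreds plays the role of credentials that delegate the handshake.
// It does not implement authorityValidator itself.
type wrapperCreds struct{ inner transportCreds }

func (w wrapperCreds) ClientHandshake(authority string) (authInfo, error) {
	return w.inner.ClientHandshake(authority)
}

// canValidate reports whether v satisfies authorityValidator.
func canValidate(v any) bool {
	_, ok := v.(authorityValidator)
	return ok
}

func main() {
	creds := wrapperCreds{inner: tlsLikeCreds{certHost: "x.test.example.com"}}
	info, err := creds.ClientHandshake("x.test.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println("credentials implement validator:", canValidate(creds))
	fmt.Println("auth info implements validator:", canValidate(info))
}
```

Because the transport asserts the interface on the auth info produced by the handshake, delegation is enough for validation to work in this model.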

t.Fatalf("Failed to load server key pair: %v", err)
}

pemData, err := os.ReadFile(testdata.Path("x509/client_ca_cert.pem"))
Contributor

We have functions that do all this work of reading from a file etc, right? Somewhere in testutils?

Contributor Author

Right, made the changes.

@easwars easwars assigned Pranjali-2501 and unassigned easwars Dec 19, 2025
NodeID: nodeID,
Host: "localhost",
Port: testutils.ParsePort(t, serverAddr),
SecLevel: e2e.SecurityLevelMTLS,
Contributor

Discussed offline, this needs to change.

Contributor Author

done.

@mbissa
Contributor

mbissa commented Dec 29, 2025

It would help if the description contains details of previous PRs that have already been merged.

Contributor

@mbissa mbissa left a comment

couple of nits, LGTM otherwise.

Contributor

@arjan-bal arjan-bal left a comment

I noticed that with the A81 changes, the config selector creates a new context object for each RPC, even if host rewriting is disabled.

lbCtx = clusterimpl.SetAutoHostRewrite(lbCtx, rt.autoHostRewrite)

This results in one extra heap allocation per RPC for users who don't use the new feature. We should ensure there is no performance impact for users who don't opt-in.

Since the config selector only needs to enable rewriting, can we change SetAutoHostRewrite(ctx, bool) to EnableAutoHostRewrite(ctx)? This way, we can skip the call entirely (avoiding the allocation) when rt.autoHostRewrite is false.

@Pranjali-2501
Contributor Author

I noticed that with the A81 changes, the config selector creates a new context object for each RPC, even if host rewriting is disabled.

lbCtx = clusterimpl.SetAutoHostRewrite(lbCtx, rt.autoHostRewrite)

This results in one extra heap allocation per RPC for users who don't use the new feature. We should ensure there is no performance impact for users who don't opt-in.

Since the config selector only needs to enable rewriting, can we change SetAutoHostRewrite(ctx, bool) to EnableAutoHostRewrite(ctx)? This way, we can skip the call entirely (avoiding the allocation) when rt.autoHostRewrite is false.

Right, it is creating an extra allocation.
I have made the changes to call it only when rt.autoHostRewrite is true.

Comment on lines 1324 to 1327
// Create an xDS resolver with the above bootstrap configuration.
if internal.NewXDSResolverWithConfigForTesting == nil {
t.Fatalf("internal.NewXDSResolverWithConfigForTesting is nil")
}
Contributor

nit: We can omit this check and let the test panic. This function is not expected to be nil so we don't need to check for it.

Contributor Author

We have been using this nil check in all tests. Do you want me to remove it everywhere?

Contributor

I'm fine with removing the nil check from the other tests, though it's not strictly necessary.

@arjan-bal arjan-bal assigned Pranjali-2501 and unassigned arjan-bal Jan 2, 2026
@Pranjali-2501
Contributor Author

Pranjali-2501 commented Jan 3, 2026

When using xDS, we don't directly use TLS credentials on the client. Instead we use xdscredentials. The config for TLS needs to be set in the CDS resource.
See the following test as an example of configuring TLS using xDS:

I think we are already configuring TLS with xDS in this test, TestAuthorityOverridingWithTLS. Additionally, I have added a check for the certificate name.

It looks like the xDS creds don't implement the AuthorityValidator interface, so authority overriding will probably fail when using TLS with xDS.
grpc-go/internal/xds/bootstrap/tlscreds/bundle.go

The test never calls reloadingCreds.ClientHandshake. In fact, it calls credsImpl.ClientHandshake here.

Sorry, I didn't get what exactly you want me to check.

t.Fatalf("Timeout waiting for successful RPC after authority rewriting.")
}
} else {
if err == nil || !strings.Contains(err.Error(), "invalid authority") {
Contributor

We should avoid checking error messages in tests since they may change. See https://google.github.io/styleguide/go/decisions#test-error-semantics

I think checking the status code should be sufficient here.

Contributor Author

Done.

@arjan-bal
Contributor

Mostly LGTM, some minor comments.

@arjan-bal arjan-bal assigned Pranjali-2501 and unassigned arjan-bal Jan 5, 2026
Contributor

@arjan-bal arjan-bal left a comment

LGTM

@arjan-bal arjan-bal assigned Pranjali-2501 and unassigned arjan-bal Jan 5, 2026
@Pranjali-2501 Pranjali-2501 merged commit 319a0fa into grpc:master Jan 5, 2026
14 checks passed
Labels

Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Feature New features or improvements in behavior

4 participants