xds: implement :authority header rewriting in xds_cluster_impl LB policy (gRFC A81)#8779
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #8779 +/- ##
========================================
Coverage 83.39% 83.40%
========================================
Files 419 418 -1
Lines 32566 32910 +344
========================================
+ Hits 27159 27449 +290
- Misses 4023 4072 +49
- Partials 1384 1389 +5
🚀 New features to boost your workflow:
|
|
Since this PR completes the implementation for the gRFC, this should probably contain a release note. |
|
|
||
| // The authority specified via the `CallAuthority` CallOption takes the | ||
| // highest precedence when determining the `:authority` header. | ||
| userAuthorityOverride := "user-override.com" |
| defer server.Stop() | ||
|
|
||
| const xdsAuthorityOverride = "rewritten.example.com" | ||
|
|
| tlsConfig := &tls.Config{ | ||
| Certificates: []tls.Certificate{serverCert}, | ||
| ClientCAs: roots, | ||
| InsecureSkipVerify: true, |
There was a problem hiding this comment.
Why do you need to set this to true?
There was a problem hiding this comment.
Update the test to use xdsCreds instead of tlsCreds.
| Certificates: []tls.Certificate{serverCert}, | ||
| ClientCAs: roots, |
There was a problem hiding this comment.
I don't understand this. Why is the client presenting the serverCert and using x509/client_ca_cert.pem as the root cert to validate the server?
There was a problem hiding this comment.
This was not correct.
I have updated the test to use xdsCreds instead of directly using tlsConfig.
| const xdsAuthorityOverride = "x.test.example.com" | ||
| configureXDSResources(ctx, t, mgmtServer, nodeID, f.Address, xdsAuthorityOverride) |
There was a problem hiding this comment.
Can we also actually check if the RPC fails when:
- the host name override is not specified (as the test does not specify a serverNameOverride in the TLS creds)
- the host name override is invalid (i.e., is expected to be rejected by the TLS creds)
| // Rewriting and TLS Secure Naming. It ensures that when the :authority header | ||
| // is rewritten by the clusterimpl picker, the new authority is correctly | ||
| // validated against the server's TLS certificate before the RPC proceeds. | ||
| func (s) TestAuthorityOverridingWithTLS(t *testing.T) { |
There was a problem hiding this comment.
I think what you might need is the following:
- A server TLS credentials, where the certificate is presented for a domain name like
x.test.exmaple.com - A client TLS credentials, where the roots are set to
x509/server_ca_cert.pem, and theserverNameOverrideis set to something like*.test.example.com - Send an xDS resource where the authority overwrite is set to
x.test.exmaple.comand verify the connection works - Send an xDS resource where the authority overwrite is set to
y.test.exmaple.comand verify the connection fails. This failure should be because the TLS handshake fails. - Send an xDS resource where the authority overwrite is set to
xyz.exmaple.comand verify the connection fails. This failure should be because the TLS creds rejects the authority override.
There was a problem hiding this comment.
I have updated the test and check both cases with valid authority and invalid authorityxyz.exmaple.com.
Send an xDS resource where the authority overwrite is set to y.test.exmaple.com and verify the connection fails. This failure should be because the TLS handshake fails.
Test is using the xdsCreds which will use InsecureSkipVerify while configuring tlsConfig to skip the tls handshake check. So, I don't think we can test that case now.
There was a problem hiding this comment.
I don't think the authority override configuration should/will cause TLS handshakes to fail. The authority header is part of the HTTP/2 header frame that is sent after the transport credentials have already completed hanshaking.
The interaction b/w the transport credentials and the authority override is this: The transport credentials must validate the authority override header:
grpc-go/internal/transport/http2_client.go
Lines 767 to 778 in 88ac703
When using xDS, we don't directly use TLS credentials on the client. Instead we use xdscredentials. The config for TLS needs to be set in the CDS resource.
See the following test as an example of configuring TLS using xDS:
On the server side, we can use the regular tlscreds like the CDS test above. We can can also use the xds creds and configure TLS in the LDS resource, similar to the following test:
grpc-go/test/xds/xds_server_integration_test.go
Lines 229 to 239 in 88ac703
It looks like the xDS creds don't implement the AuthorityValidator interface, so authority overriding will probably fail when using TLS with xDS.
grpc-go/internal/xds/bootstrap/tlscreds/bundle.go
Lines 114 to 158 in 88ac703
We would need to update the reloadingCreds to generate the required TLS credentials, similar to the ClientHandshake method and call ValidateAuthority on the tls creds object for things to work.
Once the xDS creds implement the AuthorityValidator interface, this test should use the CDS config along with xds credentials to test the authority override behaviour.
There was a problem hiding this comment.
@Pranjali-2501 apologies for the confusion. I misunderstood and thought that Credentials needed to implement the AuthorityValidator interface, but it is actually AuthInfo. Since XDSCredentials.ClientHandshake delegates to the TLS credentials (which return TLSAuthInfo), AuthorityValidator is effectively implemented when using xDS. The existing test seems correct.
| t.Fatalf("Failed to load server key pair: %v", err) | ||
| } | ||
|
|
||
| pemData, err := os.ReadFile(testdata.Path("x509/client_ca_cert.pem")) |
There was a problem hiding this comment.
We have functions that do all this work of reading from a file etc, right? Somewhere in testutils?
There was a problem hiding this comment.
Right, made the changes.
| NodeID: nodeID, | ||
| Host: "localhost", | ||
| Port: testutils.ParsePort(t, serverAddr), | ||
| SecLevel: e2e.SecurityLevelMTLS, |
There was a problem hiding this comment.
Discussed offline, this needs to change.
|
It would help if the description contains details of previous PRs that have already been merged. |
mbissa
left a comment
There was a problem hiding this comment.
couple of nits, LGTM otherwise.
There was a problem hiding this comment.
I noticed that with the A81 changes, the config selector creates a new context object for each RPC, even if host rewriting is disabled.
This results in one extra heap allocation per RPC for users who don't use the new feature. We should ensure there is no performance impact for users who don't opt-in.
Since the config selector only needs to enable rewriting, can we change SetAutoHostRewrite(ctx, bool) to EnableAutoHostRewrite(ctx)? This way, we can skip the call entirely (avoiding the allocation) when rt.autoHostRewrite is false.
Right, it is creating an extra allocation. |
| // Create an xDS resolver with the above bootstrap configuration. | ||
| if internal.NewXDSResolverWithConfigForTesting == nil { | ||
| t.Fatalf("internal.NewXDSResolverWithConfigForTesting is nil") | ||
| } |
There was a problem hiding this comment.
nit: We can omit this check and let the test panic. This function is not expected to be nil so we don't need to check for it.
There was a problem hiding this comment.
We have been using this nil check in all test. Do you want me remove it from everywhere?
There was a problem hiding this comment.
I'm fine with removing the nil check from the other tests, though it's not strictly necessary.
| // Rewriting and TLS Secure Naming. It ensures that when the :authority header | ||
| // is rewritten by the clusterimpl picker, the new authority is correctly | ||
| // validated against the server's TLS certificate before the RPC proceeds. | ||
| func (s) TestAuthorityOverridingWithTLS(t *testing.T) { |
There was a problem hiding this comment.
I don't think the authority override configuration should/will cause TLS handshakes to fail. The authority header is part of the HTTP/2 header frame that is sent after the transport credentials have already completed hanshaking.
The interaction b/w the transport credentials and the authority override is this: The transport credentials must validate the authority override header:
grpc-go/internal/transport/http2_client.go
Lines 767 to 778 in 88ac703
When using xDS, we don't directly use TLS credentials on the client. Instead we use xdscredentials. The config for TLS needs to be set in the CDS resource.
See the following test as an example of configuring TLS using xDS:
On the server side, we can use the regular tlscreds like the CDS test above. We can can also use the xds creds and configure TLS in the LDS resource, similar to the following test:
grpc-go/test/xds/xds_server_integration_test.go
Lines 229 to 239 in 88ac703
It looks like the xDS creds don't implement the AuthorityValidator interface, so authority overriding will probably fail when using TLS with xDS.
grpc-go/internal/xds/bootstrap/tlscreds/bundle.go
Lines 114 to 158 in 88ac703
We would need to update the reloadingCreds to generate the required TLS credentials, similar to the ClientHandshake method and call ValidateAuthority on the tls creds object for things to work.
Once the xDS creds implement the AuthorityValidator interface, this test should use the CDS config along with xds credentials to test the authority override behaviour.
I think we are already configuring TLS with xDS in this test
The test never call Sorry, I didn't get what exactly you want me to check? |
| // Create an xDS resolver with the above bootstrap configuration. | ||
| if internal.NewXDSResolverWithConfigForTesting == nil { | ||
| t.Fatalf("internal.NewXDSResolverWithConfigForTesting is nil") | ||
| } |
There was a problem hiding this comment.
I'm fine with removing the nil check from the other tests, though it's not strictly necessary.
| // Rewriting and TLS Secure Naming. It ensures that when the :authority header | ||
| // is rewritten by the clusterimpl picker, the new authority is correctly | ||
| // validated against the server's TLS certificate before the RPC proceeds. | ||
| func (s) TestAuthorityOverridingWithTLS(t *testing.T) { |
There was a problem hiding this comment.
@Pranjali-2501 apologies for the confusion. I misunderstood and thought that Credentials needed to implement the AuthorityValidator interface, but it is actually AuthInfo. Since XDSCredentials.ClientHandshake delegates to the TLS credentials (which return TLSAuthInfo), AuthorityValidator is effectively implemented when using xDS. The existing test seems correct.
| t.Fatalf("Timeout waiting for successful RPC after authority rewriting.") | ||
| } | ||
| } else { | ||
| if err == nil || !strings.Contains(err.Error(), "invalid authority") { |
There was a problem hiding this comment.
We should avoid checking error messages in tests since they may change. See https://google.github.io/styleguide/go/decisions#test-error-semantics
I think checking the status code should be sufficient here.
|
Mostly LGTM, some minor comments. |
This PR implements the xDS :authority header rewriting feature as specified in gRFC A81
Key Changes:
xds_cluster_impl LB Policy:
changes in stream.go:
callHdr.Authorityis updated with hostname.PR relies on the following changes already merged:
trusted_xds_serverserver feature #8692): Added the trusted_xds_server server feature to the bootstrap configuration.RELEASE NOTES:
autoHostRewriteis enabled in the xDS RouteConfiguration, the client will rewrite the HTTP/2 :authority header to the value of the selected endpoint's hostname.