S3 Credentials provider support in DefaultAwsClientFactory #7063#7066
Conversation
|
@jackye1995 @RussellSpitzer Could you please review this PR. TIA |
| @Override | ||
| public S3Client s3() { | ||
| return S3Client.builder() | ||
| .region(this.region) |
There was a problem hiding this comment.
[Suggestion] Shall we follow the applyMutation pattern here
.applyMutation(awsProperties:applyAwsRegionConfiguration)and have the following method in AwsProperties.java
public <T extends AwsSyncClientBuilder> void applyAwsRegionConfiguration(T builder) {
if (awsRegion != null) {
builder.region(Region.of(awsRegion))
}
}In this way, we do not need to do any further initialization in the DefaultAwsClientFactories
What do you think about this change?
There was a problem hiding this comment.
This is good suggestion @JonasJ-ap . I made the change. Please check.
| UrlConnectionHttpClientConfigurations urlConnectionHttpClientConfigurations = | ||
| (UrlConnectionHttpClientConfigurations) | ||
| loadHttpClientConfigurations(UrlConnectionHttpClientConfigurations.class.getName()); | ||
| loadHttpClientConfigurations(UrlConnectionHttpClientConfigurations.class.getName()); |
There was a problem hiding this comment.
Thank you for refactoring this.
|
Thank you very much for your contributions! I leave some comments above. |
|
@amogh-jahagirdar can you take a look at this? Looks we can also close #6847 using this |
| public static final String S3FILEIO_SESSION_TOKEN = "s3.session-token"; | ||
|
|
||
| /** | ||
| * Configure the aws credentials provider used to access S3FileIO. |
There was a problem hiding this comment.
Needs to note that this property expects the full classname of the provider and it would probably help to say that the class must implement "........" here as well.
| properties.put(AwsProperties.AWS_REGION, Region.AWS_GLOBAL.toString()); | ||
| properties.put( | ||
| AwsProperties.S3FILEIO_CREDENTIALS_PROVIDER, | ||
| SystemPropertyCredentialsProvider.class.getName()); |
There was a problem hiding this comment.
Shouldn't we check if this provider is loaded? Usually for things like this I just make a fake class which throws an exception when initialized. That we we can tell the constructor was called correctly.
There was a problem hiding this comment.
@RussellSpitzer Added Class.forName to check class exists in the classpath. Please check. I could not find any existing API in iceberg project like Spark's Utils.classIsLoadable API
There was a problem hiding this comment.
+1 for making a fake class for this. See examples like https://github.com/apache/iceberg/blob/master/core/src/test/java/org/apache/iceberg/TestLocationProvider.java#L185
And do we need Class.forName? I think it will naturally throw ClassNotFound in DynMethods
There was a problem hiding this comment.
Jack has it right, I'm not worried this class does not exist. But I want the test to prove that it successfully loaded the class.
Using a fake class lets us know with out a doubt that the class was loaded and used as the credential provider because we can put traps in it.
There was a problem hiding this comment.
Added check to make sure provider implements AwsCredentialsProvider interface.
@jackye1995 It looks like DynMethods does not throw ClassNotFoundException but DynClasses does. I replaced Class.forName with DynClasses.
There was a problem hiding this comment.
The point is not to test whether the CNFE is thrown, but to make sure the test code is dynamically loading the class specified. We wouldn't have had a way of distinguishing between SystemPropertiesCredentialProvider being loaded and any other provider being loaded. The new tests added are better in that they check (in the error message) that the class we wanted was actually loaded.
There was a problem hiding this comment.
Specifically this test just makes sure AwsClientFactories.from(properties) doesn't fail. It doesn't check that SystemProperitesCredentialsProvider is the credential provider
| * <p>When set, the default client factory will use this provider to get AWS credentials provided | ||
| * instead of reading the default credential chain to get S3 access credentials. | ||
| */ | ||
| public static final String S3FILEIO_CREDENTIALS_PROVIDER = "client.credentials-provider"; |
There was a problem hiding this comment.
can we do it for all clients, instead of just S3FileIO? so variable below should also be clientCredentialsProvider instead of s3CredentialsProvider
There was a problem hiding this comment.
Changed as suggested.
| * Used by {@link org.apache.iceberg.aws.AwsClientFactories.DefaultAwsClientFactory}. If set, all | ||
| * AWS clients except STS client will use * the given region instead of the default region chain. | ||
| */ | ||
| public static final String AWS_CLIENT_REGION = "client.region"; |
There was a problem hiding this comment.
similar comment for variable name, should be clientRegion instead of awsRegion, and CLIENT_CLIENT_REGION instead of AWS_CLIENT_REGION.
There was a problem hiding this comment.
Changed as suggested.
| builder.credentialsProvider( | ||
| credentialsProvider(s3AccessKeyId, s3SecretAccessKey, s3SessionToken)); | ||
| if (this.s3CredentialsProvider != null && this.s3CredentialsProvider.trim().length() > 0) { | ||
| builder.credentialsProvider(credentialsProvider(this.s3CredentialsProvider.trim())); |
There was a problem hiding this comment.
why do we need to trim?
There was a problem hiding this comment.
Just added in case if a user makes a config mistake by adding space. Removed now.
| return (s3AccessKeyId == null) == (s3SecretAccessKey == null); | ||
| } | ||
|
|
||
| private AwsCredentialsProvider credentialsProvider(String credentialsProviderClazz) { |
There was a problem hiding this comment.
you mentioned
Most of the AWS SDK Credential Providers are using create() method instead of constructor
any reference for that?
Because at least for the Trino case it's done by checking constructors. In Iceberg most uses initialize(...) instead of create(...).
There was a problem hiding this comment.
I guess there is one case for create which is
public static S3V4RestSignerClient create(Map<String, String> properties) {
return ImmutableS3V4RestSignerClient.builder().properties(properties).build();
}
in that case I am fine with this. But I think we should still support only 1 method for create(Map) instead of an additional create() without argument.
There was a problem hiding this comment.
In this case I would probably first look for a "map" create method and pass an empty map if we have no properties. If we don't have no-arg create method we can fall back to that if the first method doesn't exist. I think your filter code above guarantees that properties is non-null but we can check that.
@jackye1995 knows more about what credential providers there are, but I'm surprised that having a constructor is that rare. Anyway I definitely think we should always call a "create(map) if that exists"
try {
getCreate(Map)
invokeMapCreate(properties)
} catch (NoSuchMethod) {
try {
getCreate()
invokeCreate()
}
}There was a problem hiding this comment.
Thanks @RussellSpitzer . I am fine with your approach. Once @jackye1995 approves, I can add it.
hi @jackye1995 These are some of the AWS SDK provided credentials provider classes
software.amazon.awssdk.auth.credentials.AnonymousCredentialsProvider
software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider
software.amazon.awssdk.auth.credentials.SystemPropertyCredentialsProvider
software.amazon.awssdk.auth.credentials.WebIdentityTokenFileCredentialsProvider
AWS SDK V2 Apis not using constructors in most of the cases.
There was a problem hiding this comment.
@RussellSpitzer I made change as you suggested by checking create(Map) first and fallback to create().
| public <T extends S3ClientBuilder> void applyS3CredentialConfigurations(T builder) { | ||
| builder.credentialsProvider( | ||
| credentialsProvider(s3AccessKeyId, s3SecretAccessKey, s3SessionToken)); | ||
| if (this.s3CredentialsProvider != null && this.s3CredentialsProvider.trim().length() > 0) { |
There was a problem hiding this comment.
replaced with Strings.isNullOrEmpty().
|
hi @RussellSpitzer @jackye1995 I have addressed your review comments. Could you please check again. Thanks |
| return (s3AccessKeyId == null) == (s3SecretAccessKey == null); | ||
| } | ||
|
|
||
| private AwsCredentialsProvider credentialsProvider(String credentialsProviderClazz) { |
There was a problem hiding this comment.
I would probably not call this "Clazz", we use Class everywhere else
| try { | ||
| Class<?> providerClass = | ||
| DynClasses.builder() | ||
| .loader(this.getClass().getClassLoader()) |
There was a problem hiding this comment.
Any reason why we are specifying the classloader here?
There was a problem hiding this comment.
Removed ClassLoader. By default DynClasses api uses context class loader
| .impl(credentialsProviderClazz) | ||
| .buildChecked(); | ||
|
|
||
| ValidationException.check( |
There was a problem hiding this comment.
This is the wrong exception type, we throw these only for table validation errors.
/**
* Exception raised when validation checks fail.
* <p>
* For example, this is thrown when attempting to create a table with a {@link PartitionSpec} that
* is not compatible with the table {@link Schema}
*/I think this is probably better as a Preconditions.checkArg
There was a problem hiding this comment.
Replaced ValidationException.check with Preconditions. I also noticed that AwsProperties class still has ValidationException.check API call
|
|
||
| ValidationException.check( | ||
| AwsCredentialsProvider.class.isAssignableFrom(providerClass), | ||
| "%s is not an instance of software.amazon.awssdk.auth.credentials.AwsCredentialsProvider, loaded by the classloader for %s", |
There was a problem hiding this comment.
Error message format in iceberg is
"Cannot X because Y. (Remediation)"
Also in this I would probably embed the class name as a variable in the format string. "AwsCredentialsProvider.class.getName"
There was a problem hiding this comment.
Fixed all exception messages as recommended. thanks
| try { | ||
| provider = | ||
| DynMethods.builder("create") | ||
| .hiddenImpl(credentialsProviderClazz, Map.class) |
There was a problem hiding this comment.
This has a a version (String className, classes....)
So you don't actually need to get the class above
There was a problem hiding this comment.
public Builder hiddenImpl(String className, Class<?>... argClasses) {
There was a problem hiding this comment.
@RussellSpitzer DynMethods does not throw CNFE exception if a class does not exists. here is the code.
public Builder impl(String className, String methodName, Class<?>... argClasses) {
// don't do any work if an implementation has been found
if (method != null) {
return this;
}
try {
Class<?> targetClass = Class.forName(className, true, loader);
impl(targetClass, methodName, argClasses);
} catch (ClassNotFoundException e) {
// not the right implementation
}
return this;
}
By calling DynClasses, I am able to validate CNFE and also adding Provider interface is implemented or not.
…Properties::applyS3CredentialConfigurations to awsProperties::applyCredentialConfigurations, so that all other clients can use the same method
|
hi @RussellSpitzer I have added coverage for resolveCredentials() in the junit testcase. could you please check |
RussellSpitzer
left a comment
There was a problem hiding this comment.
Looks good to me, Let's see if @jackye1995 has any more review notes
amogh-jahagirdar
left a comment
There was a problem hiding this comment.
Overall LGTM, just a nit, thanks @dpaani!
…ettings; awsProperties::applyS3CredentialConfigurations and awsProperties::applyCredentialConfigurations. S3Client will use applyS3CredentialConfigurations by taking s3 accessKeyId settings preference first then consider credentials provider then default. Made all test inner classes private
|
hi @jackye1995 I have fixed all your comments. Could you please take a took at it again. TIA |
|
Thanks! I have some more comments but we should be close. Also I notice there is a merge conflict, could you resolve that? |
|
hi @jackye1995 I have resolved the conflicts. Thanks. |
JonasJ-ap
left a comment
There was a problem hiding this comment.
LGTM! Just some nits suggested by @jackye1995 . Thank you for your contribution! @dpaani
jackye1995
left a comment
There was a problem hiding this comment.
Looks good to me, sorry for all the nitpicking, thanks for the work! Just waiting for CI
|
Thank you so much @jackye1995 . Will document all the learnings from this pull request and send them to you for review. This may be helpful for new contributors. |
|
hi @jackye1995 All CI checks passed successfully. |
|
Thanks @dpaani ! Also big thanks to all the reviewers @amogh-jahagirdar , @jackye1995 , @JonasJ-ap ! |
… (#7066) Allows dynamically loading a Credential Provider with the DefaultAwsClientFactory. Co-authored-by: pani <[email protected]>
Fixes #7063