Skip to content

S3 Credentials provider support in DefaultAwsClientFactory #7063#7066

Merged
RussellSpitzer merged 13 commits into
apache:masterfrom
dpaani:master
Mar 23, 2023
Merged

S3 Credentials provider support in DefaultAwsClientFactory #7063#7066
RussellSpitzer merged 13 commits into
apache:masterfrom
dpaani:master

Conversation

@dpaani
Copy link
Copy Markdown
Contributor

@dpaani dpaani commented Mar 10, 2023

Fixes #7063

@github-actions github-actions Bot added the AWS label Mar 10, 2023
@dpaani
Copy link
Copy Markdown
Contributor Author

dpaani commented Mar 10, 2023

@jackye1995 @RussellSpitzer Could you please review this PR. TIA

@Override
public S3Client s3() {
return S3Client.builder()
.region(this.region)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] Shall we follow the applyMutation pattern here

.applyMutation(awsProperties:applyAwsRegionConfiguration)

and have the following method in AwsProperties.java

public <T extends AwsSyncClientBuilder> void applyAwsRegionConfiguration(T builder) {
   if (awsRegion != null) {
       builder.region(Region.of(awsRegion))
  }
}

In this way, we do not need to do any further initialization in the DefaultAwsClientFactories

What do you think about this change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good suggestion @JonasJ-ap . I made the change. Please check.

UrlConnectionHttpClientConfigurations urlConnectionHttpClientConfigurations =
(UrlConnectionHttpClientConfigurations)
loadHttpClientConfigurations(UrlConnectionHttpClientConfigurations.class.getName());
loadHttpClientConfigurations(UrlConnectionHttpClientConfigurations.class.getName());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for refactoring this.

Comment thread aws/src/test/java/org/apache/iceberg/aws/TestAwsClientFactories.java Outdated
@JonasJ-ap
Copy link
Copy Markdown
Contributor

Thank you very much for your contributions! I leave some comments above.

@jackye1995
Copy link
Copy Markdown
Contributor

@amogh-jahagirdar can you take a look at this? Looks we can also close #6847 using this

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
Comment thread aws/src/test/java/org/apache/iceberg/aws/TestAwsClientFactories.java Outdated
public static final String S3FILEIO_SESSION_TOKEN = "s3.session-token";

/**
* Configure the aws credentials provider used to access S3FileIO.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to note that this property expects the full classname of the provider and it would probably help to say that the class must implement "........" here as well.

Copy link
Copy Markdown
Contributor Author

@dpaani dpaani Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated javadoc @RussellSpitzer

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
properties.put(AwsProperties.AWS_REGION, Region.AWS_GLOBAL.toString());
properties.put(
AwsProperties.S3FILEIO_CREDENTIALS_PROVIDER,
SystemPropertyCredentialsProvider.class.getName());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check if this provider is loaded? Usually for things like this I just make a fake class which throws an exception when initialized. That we we can tell the constructor was called correctly.

Copy link
Copy Markdown
Contributor Author

@dpaani dpaani Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RussellSpitzer Added Class.forName to check class exists in the classpath. Please check. I could not find any existing API in iceberg project like Spark's Utils.classIsLoadable API

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for making a fake class for this. See examples like https://github.com/apache/iceberg/blob/master/core/src/test/java/org/apache/iceberg/TestLocationProvider.java#L185

And do we need Class.forName? I think it will naturally throw ClassNotFound in DynMethods

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jack has it right, I'm not worried this class does not exist. But I want the test to prove that it successfully loaded the class.

Using a fake class lets us know with out a doubt that the class was loaded and used as the credential provider because we can put traps in it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check to make sure provider implements AwsCredentialsProvider interface.

@jackye1995 It looks like DynMethods does not throw ClassNotFoundException but DynClasses does. I replaced Class.forName with DynClasses.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is not to test whether the CNFE is thrown, but to make sure the test code is dynamically loading the class specified. We wouldn't have had a way of distinguishing between SystemPropertiesCredentialProvider being loaded and any other provider being loaded. The new tests added are better in that they check (in the error message) that the class we wanted was actually loaded.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically this test just makes sure AwsClientFactories.from(properties) doesn't fail. It doesn't check that SystemProperitesCredentialsProvider is the credential provider

* <p>When set, the default client factory will use this provider to get AWS credentials provided
* instead of reading the default credential chain to get S3 access credentials.
*/
public static final String S3FILEIO_CREDENTIALS_PROVIDER = "client.credentials-provider";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do it for all clients, instead of just S3FileIO? so variable below should also be clientCredentialsProvider instead of s3CredentialsProvider

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed as suggested.

* Used by {@link org.apache.iceberg.aws.AwsClientFactories.DefaultAwsClientFactory}. If set, all
* AWS clients except STS client will use * the given region instead of the default region chain.
*/
public static final String AWS_CLIENT_REGION = "client.region";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar comment for variable name, should be clientRegion instead of awsRegion, and CLIENT_CLIENT_REGION instead of AWS_CLIENT_REGION.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed as suggested.

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
builder.credentialsProvider(
credentialsProvider(s3AccessKeyId, s3SecretAccessKey, s3SessionToken));
if (this.s3CredentialsProvider != null && this.s3CredentialsProvider.trim().length() > 0) {
builder.credentialsProvider(credentialsProvider(this.s3CredentialsProvider.trim()));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to trim?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added in case if a user makes a config mistake by adding space. Removed now.

return (s3AccessKeyId == null) == (s3SecretAccessKey == null);
}

private AwsCredentialsProvider credentialsProvider(String credentialsProviderClazz) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mentioned

Most of the AWS SDK Credential Providers are using create() method instead of constructor

any reference for that?

Because at least for the Trino case it's done by checking constructors. In Iceberg most uses initialize(...) instead of create(...).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is one case for create which is

public static S3V4RestSignerClient create(Map<String, String> properties) {
    return ImmutableS3V4RestSignerClient.builder().properties(properties).build();
  }

in that case I am fine with this. But I think we should still support only 1 method for create(Map) instead of an additional create() without argument.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I would probably first look for a "map" create method and pass an empty map if we have no properties. If we don't have no-arg create method we can fall back to that if the first method doesn't exist. I think your filter code above guarantees that properties is non-null but we can check that.

@jackye1995 knows more about what credential providers there are, but I'm surprised that having a constructor is that rare. Anyway I definitely think we should always call a "create(map) if that exists"

try {
  getCreate(Map)
  invokeMapCreate(properties)
} catch (NoSuchMethod) {
  try {
     getCreate()
     invokeCreate()
   }
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RussellSpitzer . I am fine with your approach. Once @jackye1995 approves, I can add it.

hi @jackye1995 These are some of the AWS SDK provided credentials provider classes

software.amazon.awssdk.auth.credentials.AnonymousCredentialsProvider
software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider
software.amazon.awssdk.auth.credentials.SystemPropertyCredentialsProvider
software.amazon.awssdk.auth.credentials.WebIdentityTokenFileCredentialsProvider

AWS SDK V2 Apis not using constructors in most of the cases.

Copy link
Copy Markdown
Contributor Author

@dpaani dpaani Mar 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RussellSpitzer I made change as you suggested by checking create(Map) first and fallback to create().

public <T extends S3ClientBuilder> void applyS3CredentialConfigurations(T builder) {
builder.credentialsProvider(
credentialsProvider(s3AccessKeyId, s3SecretAccessKey, s3SessionToken));
if (this.s3CredentialsProvider != null && this.s3CredentialsProvider.trim().length() > 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strings.isNullOrEmpty()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced with Strings.isNullOrEmpty().

@dpaani
Copy link
Copy Markdown
Contributor Author

dpaani commented Mar 17, 2023

hi @RussellSpitzer @jackye1995 I have addressed your review comments. Could you please check again. Thanks

return (s3AccessKeyId == null) == (s3SecretAccessKey == null);
}

private AwsCredentialsProvider credentialsProvider(String credentialsProviderClazz) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably not call this "Clazz", we use Class everywhere else

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected.

try {
Class<?> providerClass =
DynClasses.builder()
.loader(this.getClass().getClassLoader())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why we are specifying the classloader here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed ClassLoader. By default DynClasses api uses context class loader

.impl(credentialsProviderClazz)
.buildChecked();

ValidationException.check(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the wrong exception type, we throw these only for table validation errors.

/** 
 * Exception raised when validation checks fail.
 * <p>
 * For example, this is thrown when attempting to create a table with a {@link PartitionSpec} that
 * is not compatible with the table {@link Schema}
 */

I think this is probably better as a Preconditions.checkArg

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced ValidationException.check with Preconditions. I also noticed that AwsProperties class still has ValidationException.check API call

https://github.com/apache/iceberg/blob/master/aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java#L777


ValidationException.check(
AwsCredentialsProvider.class.isAssignableFrom(providerClass),
"%s is not an instance of software.amazon.awssdk.auth.credentials.AwsCredentialsProvider, loaded by the classloader for %s",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message format in iceberg is
"Cannot X because Y. (Remediation)"

Also in this I would probably embed the class name as a variable in the format string. "AwsCredentialsProvider.class.getName"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed all exception messages as recommended. thanks

try {
provider =
DynMethods.builder("create")
.hiddenImpl(credentialsProviderClazz, Map.class)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a a version (String className, classes....)

So you don't actually need to get the class above

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public Builder hiddenImpl(String className, Class<?>... argClasses) {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RussellSpitzer DynMethods does not throw CNFE exception if a class does not exists. here is the code.

    public Builder impl(String className, String methodName, Class<?>... argClasses) {
      // don't do any work if an implementation has been found
      if (method != null) {
        return this;
      }

      try {
        Class<?> targetClass = Class.forName(className, true, loader);
        impl(targetClass, methodName, argClasses);
      } catch (ClassNotFoundException e) {
        // not the right implementation
      }
      return this;
    }

By calling DynClasses, I am able to validate CNFE and also adding Provider interface is implemented or not.

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
Comment thread aws/src/test/java/org/apache/iceberg/aws/TestAwsClientFactories.java Outdated
Comment thread aws/src/test/java/org/apache/iceberg/aws/TestAwsClientFactories.java Outdated
…Properties::applyS3CredentialConfigurations to awsProperties::applyCredentialConfigurations, so that all other clients can use the same method
@dpaani
Copy link
Copy Markdown
Contributor Author

dpaani commented Mar 22, 2023

hi @RussellSpitzer I have added coverage for resolveCredentials() in the junit testcase. could you please check

Copy link
Copy Markdown
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, Let's see if @jackye1995 has any more review notes

Copy link
Copy Markdown
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, just a nit, thanks @dpaani!

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
Comment thread aws/src/test/java/org/apache/iceberg/aws/TestAwsClientFactories.java Outdated
Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
pani added 2 commits March 22, 2023 14:13
…ettings; awsProperties::applyS3CredentialConfigurations and awsProperties::applyCredentialConfigurations. S3Client will use applyS3CredentialConfigurations by taking s3 accessKeyId settings preference first then consider credentials provider then default. Made all test inner classes private
@dpaani
Copy link
Copy Markdown
Contributor Author

dpaani commented Mar 22, 2023

hi @jackye1995 I have fixed all your comments. Could you please take a took at it again. TIA

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
@jackye1995
Copy link
Copy Markdown
Contributor

Thanks! I have some more comments but we should be close. Also I notice there is a merge conflict, could you resolve that?

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
@dpaani
Copy link
Copy Markdown
Contributor Author

dpaani commented Mar 22, 2023

hi @jackye1995 I have resolved the conflicts. Thanks.

Comment thread aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java Outdated
Copy link
Copy Markdown
Contributor

@JonasJ-ap JonasJ-ap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just some nits suggested by @jackye1995 . Thank you for your contribution! @dpaani

Copy link
Copy Markdown
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, sorry for all the nitpicking, thanks for the work! Just waiting for CI

@dpaani
Copy link
Copy Markdown
Contributor Author

dpaani commented Mar 23, 2023

Thank you so much @jackye1995 . Will document all the learnings from this pull request and send them to you for review. This may be helpful for new contributors.

@dpaani
Copy link
Copy Markdown
Contributor Author

dpaani commented Mar 23, 2023

hi @jackye1995 All CI checks passed successfully.

@RussellSpitzer RussellSpitzer merged commit e25ee18 into apache:master Mar 23, 2023
@RussellSpitzer
Copy link
Copy Markdown
Member

Thanks @dpaani ! Also big thanks to all the reviewers @amogh-jahagirdar , @jackye1995 , @JonasJ-ap !

danielcweeks pushed a commit that referenced this pull request Apr 5, 2023
… (#7066)

Allows dynamically loading a Credential Provider with the DefaultAwsClientFactory.

Co-authored-by: pani <[email protected]>
@danielcweeks danielcweeks added this to the Iceberg 1.2.1 milestone Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

S3 Credentials provider support in DefaultAwsClientFactory

6 participants