AWS: Integrate S3 analytics accelerator library #12299
Conversation
|
Hey @SanjayMarreddi, I think this is a cool addition!! Since this will rely on the analytics-accelerator-s3 I'd assume there would be some tests to ensure nothing breaks. I was wondering if your team has any benchmark numbers or performance metrics for this that you can share here? Meanwhile I'll ping some others to get some opinions |
|
Hey @geruh, thanks for the review! I will make changes as you suggested above.
|
0a5ff57 to
361e465
Compare
dbee3c1 to
e6aa71e
Compare
e6aa71e to
e8e2645
Compare
c4814ea to
989ed63
Compare
989ed63 to
8d65c2c
Compare
|
I verified that the async client should only affect S3 FileIO when the feature flag is enabled.
|
Yes, I think in general there is data point supporting using async client & CRT client makes the performance faster, so we would like to gradually complete the whole code path using async, and then let people opt-in through some config like |
Maybe after many versions, once we have enough confidence that it is stable. But probably not in the short term. |
…OIntegration.java Co-authored-by: Kevin Liu <[email protected]>
|
@HonahX could you take a look? Given the fact that we plan to refactor the HTTPClientProperties and other related classes as the next step, it's probably good for you to take a look at this PR and the follow up PRs. |
geruh
left a comment
There was a problem hiding this comment.
Okay changes LGTM! Breaking up the PR to follow up with documentation and refactoring the S3FileIOProperties configurations to work with async clients makes sense.
HonahX
left a comment
There was a problem hiding this comment.
@SanjayMarreddi Thanks for the great contribution! Looking forward to documentation and follow-up PRs
Co-authored-by: Honah (Jonas) J. <[email protected]>
|
Looks like all comments are addressed, thanks @SanjayMarreddi for all the work! Let us know when you have the follow up PRs for async client configs and doc update! |
|
@SanjayMarreddi and @jackye1995 This PR added a new dependency, but it doesn't look like the LICENSE/NOTICE files were updated. This needs to be taken care of before the next release (cc @ajantha-bhat is handling that release). Edit: Looks like there's a NOTICE from that dependency that needs to be evaluated as well. |
|
@SanjayMarreddi: Will you be working a PR to update the Notice and License file for this? |
|
@ajantha-bhat: Yeah, I will be working on it. |
|
Hi,
Looking at this... https://github.com/awslabs/analytics-accelerator-s3/blob/main/doc/CONFIGURATION.md Thanks |
Hey, thanks for reaching out. As of now, when we create the async client that's used with Meanwhile, would you mind creating an opensource issue here on the codebase - so I can discuss with team. |
Description
iceberg/awsTesting
PS: We are also integrating this library into hadoop-aws.