Aws: Add Iceberg version to UserAgent in S3 requests#9963
Conversation
|
Hi @nastra, I've seen you reviewing PRs in the AWS space before, I was wondering if you could take a look at this (or suggest other reviewers who review S3-related code changes?). Thanks! |
| config -> | ||
| config.putAdvancedOption( | ||
| SdkAdvancedClientOption.USER_AGENT_PREFIX, | ||
| "s3fileio " + IcebergBuild.fullVersion())) |
There was a problem hiding this comment.
you might also need to update
There was a problem hiding this comment.
also maybe worth changing the user agent to "s3fileio/" + EnvironmentContext.get(). That way if you're running Spark/Flink, you'll also get information about that engine and its version
There was a problem hiding this comment.
Thanks, these are great callouts!
I factored this lambda out into a separate "mutator" like it is done for the other configs and included a unit test that at least calls the code. I tried to introduce some assertions to check if the correct String is set as User Agent but it felt like I was just re-implementing the original code path with captors so I abandoned that effort.
| } | ||
| } | ||
|
|
||
| private static final String S3_FILE_IO_SIGNATURE = "s3fileio/" + EnvironmentContext.get(); |
There was a problem hiding this comment.
this should be defined at the top of the file (after all the other public static final vars) . Also signature seems rather confusing to me. Why not just use S3_FILE_IO_USER_AGENT?
There was a problem hiding this comment.
Ack, thanks! In my mind signature/fingerprint a software leaves after it touched something mean the same thing, but the term signature also has uses in cryptography and AuthN so I get the confusion.
I agree that user agent will be more descriptive here, changed it now.
This allows developers to monitor which version of Iceberg they have deployed to a cluster (for example, S3 Access Logs contain the user agent field).
|
Looks like the link checker got throttled (HTTP 429s from Medium). Is this a known failure mode? |
This isn't currently a known issue. Could you open an issue for this? I've also seen other PRs fail on this unfortunately (cc @Fokko) |
amogh-jahagirdar
left a comment
There was a problem hiding this comment.
Nice! I think this is a helpful feature to add, and the changes look good to me.
|
Opened #10038 to track. |
This allows developers to monitor which version of Iceberg they have deployed to a cluster (for example, S3 Access Logs contain the user agent field). Co-authored-by: Geza Csenger <[email protected]>
This allows developers to monitor which version of Iceberg they have deployed to a cluster (for example, through the S3 Access Logs, which contain the user agent field).