
Adds a readVectored implementation to S3InputStream. #15581

Open
ahmarsuhail wants to merge 2 commits into apache:main from ahmarsuhail:read-vectored

Conversation

@ahmarsuhail

This is a copy of #14352

Adds a readVectored() implementation to S3InputStream, which ensures that Parquet column chunks are read in parallel.

readVectored() implementations increase TPC-DS benchmark performance by ~10-20%; without it, column chunks are read sequentially, leading to significant IO stalls.
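A minimal, self-contained sketch of the pattern this describes; the FileRange shape and readVectored signature below are illustrative assumptions, not the PR's actual API:

```java
// Illustrative sketch only; the real FileRange and readVectored signatures
// are whatever this PR defines.
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class VectoredReadSketch {
  // A byte range of the file, completed asynchronously once its bytes arrive.
  static final class FileRange {
    final long offset;
    final int length;
    final CompletableFuture<ByteBuffer> data = new CompletableFuture<>();

    FileRange(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  // Dispatch one ranged GET per range on a shared pool instead of issuing
  // them back-to-back on the calling thread.
  static void readVectored(List<FileRange> ranges, ExecutorService pool) {
    for (FileRange range : ranges) {
      pool.submit(
          () -> {
            ByteBuffer buffer = ByteBuffer.allocate(range.length);
            // ... issue GET for bytes [offset, offset + length) and fill buffer ...
            range.data.complete(buffer);
          });
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    // One range per Parquet column chunk (offsets and lengths are made up).
    List<FileRange> chunks =
        List.of(new FileRange(4, 1_048_576), new FileRange(2_000_000, 524_288));
    readVectored(chunks, pool);
    chunks.forEach(range -> range.data.join()); // consume as each completes
    pool.shutdown();
  }
}
```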

@ahmarsuhail
Author

@danielcweeks @pvary @geruh

Still working on this PR and need to make a couple of additional changes, as well as benchmark again, but could I please get an initial review? I would really like to get this merged before the next release, as having a readVectored() implementation makes a significant difference to performance. Thank you!

@steveloughran
Contributor

I can concur with the claimed speedup; we've had s3a running with vectored reads for three years now.

@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

```diff
  }

- private ExecutorService executorService() {
+ private ExecutorService executorService(String name, int size) {
```
Contributor


This change doesn't really work: the executor is a static initialization, but you're also passing a name/size, so basically the first call wins. This isn't really a workable update to handling the executor service.
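A boiled-down illustration of the hazard being described here (hypothetical code, not the PR's): with a lazily-initialized static field, every call after the first silently ignores its arguments.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LazyStaticExecutor {
  private static volatile ExecutorService executor;

  // The first caller's size wins; later callers get the same pool back no
  // matter what name/size they pass.
  private static ExecutorService executorService(String name, int size) {
    if (executor == null) {
      synchronized (LazyStaticExecutor.class) {
        if (executor == null) {
          executor = Executors.newFixedThreadPool(size);
        }
      }
    }
    return executor;
  }
}
```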

Contributor

@danielcweeks Apr 13, 2026


Also, the executor service was recently changed to accommodate credential refresh scheduling, so the service is now more generic, but currently based on the delete threads property. We can probably update this to be more generic along with a more general io threads property. Alternatively, we could have a separate static service explicitly for the S3InputStream, which might be preferable to overloading the FileIO service for multiple operations.
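One possible shape of that alternative, as a sketch; the property name and default below are invented for illustration, not taken from the PR:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class S3InputStreamThreads {
  // Hypothetical dedicated property; the real one would be defined by FileIO.
  static final String READ_THREADS_PROP = "s3.input-stream.read-threads";
  static final int READ_THREADS_DEFAULT = 8;

  // A pool owned by the input stream path only, so vectored reads don't
  // contend with deletes or credential refresh on the shared FileIO service.
  private static final ExecutorService READ_POOL =
      Executors.newFixedThreadPool(READ_THREADS_DEFAULT);

  static ExecutorService readPool() {
    return READ_POOL;
  }
}
```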

```diff
- public FileRange(CompletableFuture<ByteBuffer> byteBuffer, long offset, int length)
-     throws EOFException {
-   Preconditions.checkNotNull(byteBuffer, "byteBuffer can't be null");
+ public FileRange(long offset, int length) {
```
Contributor

@danielcweeks Apr 13, 2026


I'm trying to understand why this is necessary. If we're coalescing ranges, we should probably be using something like Netty's CompositeByteBuf to wrap the file ranges and skipped ranges into a larger write buffer. This would allow us to read the composite ranges together but still populate the buffers directly.

Looking at how we're handling the ranges, the CompositeByteBuf approach probably doesn't make sense, but I have an alternate suggestion.
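For reference, a rough sketch of the shape the CompositeByteBuf idea would have taken with Netty's API (the idea set aside above; none of this is in the PR):

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.CompositeByteBuf;
import io.netty.buffer.Unpooled;

class CompositeRangeSketch {
  // Wrap the wanted range buffers plus a throwaway gap buffer into one
  // composite; filling the composite with the coalesced response lands the
  // bytes in the underlying buffers directly.
  static CompositeByteBuf composite(int firstLen, int gapLen, int secondLen) {
    ByteBuf first = Unpooled.buffer(firstLen).writerIndex(firstLen);    // wanted range
    ByteBuf gap = Unpooled.buffer(gapLen).writerIndex(gapLen);          // skipped bytes
    ByteBuf second = Unpooled.buffer(secondLen).writerIndex(secondLen); // wanted range

    return Unpooled.compositeBuffer().addComponents(true, first, gap, second);
  }
}
```

A single setBytes(0, responseBytes) on the returned composite would then populate first and second in place, while gap is simply released afterwards.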

```java
 * @param linkedRanges map to store linked ranges for each coalesced range
 * @return a new list of coalesced ranges
 */
private List<FileRange> coalesce(
```
Contributor


I think we can do this better by creating a wrapper object for the original FileRanges without impacting the existing public api. Rather than creating new FileRanges and inserting them, just create an inner wrapper class (e.g. CoalescedFileRanges) that contains the ordered list of file ranges that should be combined. Then we can dispatch the read requests for the total combined range and populate internal buffers.
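A minimal sketch of what that wrapper could look like; apart from the CoalescedFileRanges name suggested above, every name and signature here is an assumption made only to keep the example self-contained:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CoalescedFileRanges {
  // Assumed shape of the PR's FileRange, stated here so the sketch compiles.
  interface Range {
    long offset();
    int length();
    CompletableFuture<ByteBuffer> byteBuffer();
  }

  private final List<Range> ranges = new ArrayList<>(); // ordered, ascending offsets
  private long start = -1;
  private long end = -1;

  long offset() { return start; }              // offset of the one combined read
  int length() { return (int) (end - start); } // length of the one combined read

  void add(Range range) {
    if (ranges.isEmpty()) {
      start = range.offset();
    }
    end = range.offset() + range.length();
    ranges.add(range);
  }

  // Slice the combined response back out to the original ranges, so the
  // public FileRange api never has to change.
  void populate(ByteBuffer combined) {
    for (Range range : ranges) {
      ByteBuffer slice = combined.duplicate();
      int from = (int) (range.offset() - start);
      slice.position(from).limit(from + range.length());
      range.byteBuffer().complete(slice.slice());
    }
  }
}
```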

Contributor


Yes, agreed on this, as we did something similar in the Hadoop S3A implementation.

@danielcweeks
Contributor

@ahmarsuhail this looks good in concept, but we need to figure out a better path for the executor service handling and ideally not change the FileRange API.


```diff
-   Preconditions.checkNotNull(byteBuffer, "byteBuffer can't be null");
+ public FileRange(long offset, int length) {
+   Preconditions.checkArgument(
+       length() >= 0, "Invalid length: %s in range (must be >= 0)", length);
```
Contributor


We should use the input length and offset in the preconditions. Someone filed this as a bug: #15926
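A sketch of the suggested fix, validating the constructor's own parameters before any field assignment (the field names and the plain Guava Preconditions import are assumptions):

```java
import com.google.common.base.Preconditions;

public class FileRange {
  private final long offset;
  private final int length;

  public FileRange(long offset, int length) {
    // Check the inputs directly; length()/offset() would read fields that
    // haven't been assigned yet.
    Preconditions.checkArgument(offset >= 0, "Invalid offset: %s in range (must be >= 0)", offset);
    Preconditions.checkArgument(length >= 0, "Invalid length: %s in range (must be >= 0)", length);
    this.offset = offset;
    this.length = length;
  }

  public long offset() { return offset; }
  public int length() { return length; }
}
```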
