Core: Add writer for unordered position deletes#7692
Conversation
@Benchmark
@Threads(1)
public void writeUnpartitionedFanoutPositionDeleteWriterShuffled(Blackhole blackhole)
We should expect 5-15% overhead for the new buffering writer, which can still be beneficial for the job overall if we skip the local ordering for inserts and potentially avoid spilling. Note that this benchmark does not account for the cost of ordering records; it measures only write performance. We will use this writer only if fanout is enabled. We should also explore Puffin delete files that would persist bitmaps directly.
Benchmark Mode Cnt Score Error Units
ParquetWritersBenchmark.writeUnpartitionedClusteredPositionDeleteWriter ss 5 6.004 ± 0.185 s/op
ParquetWritersBenchmark.writeUnpartitionedFanoutPositionDeleteWriter ss 5 6.503 ± 0.171 s/op
ParquetWritersBenchmark.writeUnpartitionedFanoutPositionDeleteWriterShuffled ss 5 6.616 ± 0.204 s/op
We should also explore Puffin delete files that would persist bitmaps directly
+1
About memory overhead (not sure anything measures it now): it should be just the additional space of the map (data_file_path => bitmap)? Will there be cases, especially in fanout, where a writer writes deletes for many data files and starts to stress it?
I ran this benchmark (100 data files, 50k deletes each, 5 million deletes total) with a GC profiler and did not see anything bad. Issues will arise when there are lots of unique data files. That's unlikely, as we distribute by partition, and this writer will still be disabled by default, so users will have to opt in explicitly. It isn't perfect for sure, but there are reasonable use cases for it.
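To make the structure under discussion concrete, here is a minimal stdlib-only sketch of a buffering delete writer: deletes arrive in arbitrary order, are buffered per data file path, and are replayed in (path, position) order on close. This is an illustration, not Iceberg's code; the actual writer buffers positions in a Roaring64Bitmap per path, which is far more compact than the TreeSet used here.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.BiConsumer;

// Simplified model of the buffering position delete writer: the map keyed by
// data file path is the extra memory discussed above.
class UnorderedDeleteBuffer {
  private final Map<String, NavigableSet<Long>> deletesByPath = new TreeMap<>();

  // Accepts deletes in any order.
  void delete(String path, long pos) {
    deletesByPath.computeIfAbsent(path, p -> new TreeSet<>()).add(pos);
  }

  // Replays buffered deletes sorted by (path, position), as the delete file
  // format requires. TreeMap iterates paths in order; TreeSet iterates
  // positions in order.
  void close(BiConsumer<String, Long> sortedWriter) {
    deletesByPath.forEach(
        (path, positions) -> positions.forEach(pos -> sortedWriter.accept(path, pos)));
  }
}
```

Memory grows with the number of unique data file paths, which matches the observation that many unique paths (unlikely when distributing by partition) is the stress case.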
Force-pushed 903e827 to 20ebae5.
singhpk234 left a comment:
LGTM, Thanks @aokolnychyi !
szehon-ho left a comment:
Looks good to me too, left a few comments.
}

public PositionDelete<R> set(CharSequence newPath, long newPos) {
  this.path = newPath;
Nit question: would it be cleaner to have this constructor delegate to the other one?
import org.roaringbitmap.longlong.Roaring64Bitmap;

/**
 * A position delete writer that is capable of handling unordered deletes without rows.
Nit: can we add Javadoc to the PositionDeleteWriter when we get a chance?
Will add in this PR.
this.positionDeleteRows =
    RandomData.generateSpark(DeleteSchemaUtil.pathPosSchema(), NUM_ROWS, 0L);
this.positionDeleteRows = generatePositionDeletes(false /* shuffle */);
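For context, here is a hypothetical sketch of what a `generatePositionDeletes(shuffle)` helper could look like; the class name, signature, and constants here are illustrative assumptions, not the benchmark's actual code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative delete generation: positions increase by a fixed step within
// each data file; shuffling the combined list simulates unordered input for
// the shuffled benchmark variant.
class PositionDeleteGen {
  static List<long[]> generate(int numFiles, int deletesPerFile, int step, boolean shuffle) {
    List<long[]> rows = new ArrayList<>();
    for (int fileIndex = 0; fileIndex < numFiles; fileIndex++) {
      for (int i = 0; i < deletesPerFile; i++) {
        rows.add(new long[] {fileIndex, (long) i * step}); // {file id, position}
      }
    }
    if (shuffle) {
      Collections.shuffle(rows, new Random(0L)); // fixed seed keeps runs comparable
    }
    return rows;
  }
}
```

Passing `shuffle = false` yields rows already clustered by file and ordered by position, while `shuffle = true` exercises the unordered path the new writer targets.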
for (int pathIndex = 0; pathIndex < NUM_DATA_FILES_PER_POSITION_DELETE_FILE; pathIndex++) {
  UTF8String path = UTF8String.fromString("path/to/position/delete/file/" + UUID.randomUUID());
  int step = 10;
Why not just declare this outside the loop?
Thanks, @singhpk234 @szehon-ho!
This PR adds a position delete writer that can handle unordered position deletes. This writer should allow us to avoid a local sort for some MERGE operations. Specifically, consider MERGE operations where 90% of the data are inserts and the table is partitioned but no sort order is defined. Right now, we always request a local sort to order deletes. However, that sort is useless for inserts if no sort order is defined and the fanout writer is enabled. Moreover, ordering inserts may lead to a spill, which is expensive for wide tables and large tasks.