Skip to content

[Improve][connector-starrocks] Improved starrocks source enumerator splits allocation algorithm for subtasks#10867

Open
JeremyXin wants to merge 2 commits into
apache:devfrom
JeremyXin:improve-starrocks-split-balance
Open

[Improve][connector-starrocks] Improved starrocks source enumerator splits allocation algorithm for subtasks#10867
JeremyXin wants to merge 2 commits into
apache:devfrom
JeremyXin:improve-starrocks-split-balance

Conversation

@JeremyXin
Copy link
Copy Markdown
Contributor

Purpose of this pull request

Similar to pr #9108, improving starrocks source enumerator splits allocation algorithm for subtasks and add UT.

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

Copy link
Copy Markdown
Contributor

@DanielLeens DanielLeens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I pulled the latest head locally and reviewed the real StarRocks source enumerator lifecycle instead of only the helper diff.

What this PR fixes

  • User pain: the current owner calculation can leave StarRocks splits uneven across readers.
  • Fix approach: sort split ids first, then assign them round-robin.
  • In one sentence: the goal makes sense, but the current implementation breaks both the recovery contract and the real multi-table runtime path.

Runtime path I checked

normal startup
  -> run() [StartRocksSourceSplitEnumerator.java:81-94]
      -> poll one table
      -> getStarRocksSourceSplit(table) [193-201]
      -> addPendingSplit(newSplits) [151-164]
      -> assignSplit(readers) [167-190]

reader recovery
  -> addSplitsBack(splits, subtaskId) [102-106]
      -> addPendingSplit(splits)
      -> assignSplit(subtaskId)

Problem 1: the returned-split recovery path is no longer correct

  • Location: StartRocksSourceSplitEnumerator.java:102-106, 151-164
  • Why this is a problem: before this change, addSplitsBack() could recompute the same owner from splitId.hashCode() and then immediately assign back to the recovering subtaskId. After the change, owner selection depends on the local round-robin order inside the current addPendingSplit() call. That means a split returned by reader 2 can now be put into reader 0's bucket, while assignSplit(Collections.singletonList(subtaskId)) still only tries to send work back to reader 2.
  • Risk: the split can remain stranded in pendingSplit, so the recovery path can stall even though the source still has unassigned work.
  • Suggested fix:
    • Option A: keep addSplitsBack() pinned to the original subtaskId instead of reusing the new owner calculation.
    • Option B: if you want re-computed ownership, it still needs to be derived from a stable split identity, not from the per-call round-robin position.
  • Severity: high

Problem 2: the new balancing claim does not hold on the real multi-table path

  • Location: StartRocksSourceSplitEnumerator.java:81-94, 151-164
  • Why this is a problem: run() processes one table at a time, but assignCount is reset to 0 on every addPendingSplit() call. So if a table produces a single split, that split always starts again from reader 0. With many small tables, the normal path still concentrates work on low-number readers.
  • Risk: the PR can replace one skew mode with another, especially for common small-table workloads.
  • Suggested fix:
    • Option A: keep assignCount monotonic across the whole enumeration cycle.
    • Option B: aggregate all generated splits first, then do one global round-robin pass.
  • Severity: high

Tests

  • The new test only exercises one synthetic single-batch call to the private addPendingSplit() helper (StarRocksSourceSplitEnumeratorTest.java:40-60). It does not cover the actual run() batching behavior or the addSplitsBack() recovery path, which is exactly where the regressions are.

Conclusion: merge after fixes

  1. Blocking items
  • Problem 1: recovery can strand returned splits in the wrong reader bucket.
  • Problem 2: the real multi-table path still does not produce the balanced behavior the PR is aiming for.
  1. Suggested follow-up
  • Please add lifecycle-level coverage for both run() and addSplitsBack() once the logic is adjusted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants