Skip to content

Conversation

@blp
Copy link
Member

@blp blp commented Jan 8, 2026

This fixes the consistency of ad-hoc queries. It was probably difficult for the inconsistency to be visible, but it's still better to fix it.

blp added 2 commits January 8, 2026 08:51
The cast wasn't necessary in AdHocQueryExecution::children().

The clone written inline in AdHocQueryExecution::with_new_children() can
be more easily written as a real clone.

Signed-off-by: Ben Pfaff <[email protected]>
@blp blp requested a review from gz January 8, 2026 18:32
@blp blp self-assigned this Jan 8, 2026
@blp blp added bug Something isn't working adhoc Issue related to ad hoc query processing rust Pull requests that update Rust code labels Jan 8, 2026
Copilot AI review requested due to automatic review settings January 8, 2026 18:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes consistency issues with ad-hoc queries by ensuring that queries execute against a consistent snapshot of table data. Previously, the snapshot was accessed directly from a shared mutable reference, which could lead to inconsistent reads during query execution.

Key changes:

  • Refactored snapshot management to use immutable snapshots via Arc<BTreeMap<...>> instead of Arc<TokioMutex<BTreeMap<...>>>
  • Modified ad-hoc query execution to capture and use a consistent snapshot through DataFusion's session state
  • Centralized SQL execution logic into a new execute_sql helper function

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
crates/adapters/src/server.rs Updated query handler to pass Controller instead of SessionContext to support consistent snapshots
crates/adapters/src/controller.rs Added consistent_snapshot() method and refactored snapshot management to use immutable Arc-wrapped snapshots
crates/adapters/src/adhoc/table.rs Modified AdHocTable to retrieve snapshot from session state instead of storing it as a field; simplified execution plan cloning
crates/adapters/src/adhoc.rs Added execute_sql helper that captures consistent snapshot in session state before query execution

async fn scan(
&self,
_state: &dyn Session,
state: &dyn Session,
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter was renamed from _state to state to indicate it's now being used. However, consider a more descriptive name like session or df_session to better reflect that it's a DataFusion session containing the consistent snapshot.

Copilot uses AI. Check for mistakes.
projection: Option<&Vec<usize>>,
filters: &[Expr],
limit: Option<usize>,
_limit: Option<usize>,
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The limit parameter is now prefixed with underscore since it's no longer used, but the parameter remains in the function signature. Consider whether this parameter should be removed entirely from the trait implementation if it's not being utilized.

Copilot uses AI. Check for mistakes.
Comment on lines +418 to 423
Ok(Arc::new(Self {
children,
..Arc::unwrap_or_clone(self)
}))
}

Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Arc::unwrap_or_clone with struct spread syntax assumes all fields except children should be cloned from the original. This is fragile if new fields are added to the struct, as they would be automatically included without explicit consideration. Consider explicitly listing all fields that should be preserved to make future modifications safer.

Suggested change
Ok(Arc::new(Self {
children,
..Arc::unwrap_or_clone(self)
}))
}
let mut base = Arc::unwrap_or_clone(self);
base.children = children;
Ok(Arc::new(base))
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@gz gz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't say I still understand this well since it's been a few months but at least didn't see anything wrong

@blp blp added this pull request to the merge queue Jan 9, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 9, 2026
Until now, each table scan during an ad-hoc query separately locked the
table of snapshots.  The table of snapshots could change from one scan to
the next.  This meant that ad-hoc queries that involved multiple tables,
or that scanned a single table multiple times, could work with
inconsistent data.  This fixes the problem.

Signed-off-by: Ben Pfaff <[email protected]>
@blp blp force-pushed the adhoc-consistency branch from c7a9189 to 360e320 Compare January 9, 2026 20:28
@blp blp enabled auto-merge January 9, 2026 20:28
@blp blp added this pull request to the merge queue Jan 9, 2026
Merged via the queue into main with commit 2f8f651 Jan 9, 2026
1 check passed
@blp blp deleted the adhoc-consistency branch January 9, 2026 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

adhoc Issue related to ad hoc query processing bug Something isn't working rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants