fix(review): default scope to main, add --base flag by peyton-alt · Pull Request #1175 · entireio/cli

peyton-alt · 2026-05-10T22:08:37Z

https://entire.io/gh/entireio/cli/trails/350

Summary

Fixes the user-reported failure where entire review would silently scope reviews against random recently-merged feature branches (e.g. origin/soph/add-codex-post-tool) instead of mainline, producing reviews that "had nothing to do with what was done in that branch." Plus a complementary feature that surfaces in-progress agent session context to the reviewer agent.

Scope-detection changes

Default scope is mainline, not "closest merged ancestor" — delete the timestamp-based heuristic. It was designed for stacked-PR workflows but routinely picked unrelated recently-merged feature branches because git fetch mirrors all of origin's branches and merged PR branches often live on before deletion. New default walks the fallback chain origin/HEAD → origin/main → origin/master → main → master. Removes ~130 lines (slowDetectScopeBaseRef, isAncestorOf, candidate selection).
--base <ref> flag for explicit override — stacked-PR review is now a user choice, not an inference. Accepts any rev-parseable ref. Validates via go-git repo.ResolveRevision(plumbing.Revision(ref)) (matching the codebase pattern at explain.go:156, addressing Soph's PR feedback). An unknown ref errors immediately, naming the bad ref.
Uncommitted changes now in review scope — scope clause changed from "review only the commits unique to this branch vs <ref>." to "review the commits unique to this branch vs <ref>, plus any uncommitted changes in the working tree." Without this, agents correctly skipped in-progress edits, which surprised users mid-feature.
Three-dot diff in countFilesChanged — was using git diff base..HEAD which over-counts files when mainline advances past the branch point (upstream-only changes show as reversed deltas). Now uses base...HEAD (merge-base diff). Caught by codex during smoke review of this PR.

In-progress session context (added during smoke iteration)

Without this, reviews of branches with only uncommitted work lose the rich context that makes Entire's review better than vanilla agent reviews — the agent sees the diff but no signal about what each session was for.

reviewCheckpointContext is now a composer over two sources:

reviewCheckpointContext(ctx, worktreeRoot, scopeBaseRef)
  = committed-section + in-progress-section
  where:
    committed-section:  "Checkpoint context from commits in scope:" (unchanged, reads entire/checkpoints/v1)
    in-progress-section: NEW "In-progress session context (uncommitted):" (reads session state + filesystem prompt.txt)

In-progress entries are filtered by:

WorktreePath matches the current worktree (canonicalised via filepath.EvalSymlinks for /var vs /private/var on macOS)
BaseCommit matches current HEAD (work since the last commit only)
!FullyCondensed (still in flight)
Kind != KindAgentReview (don't include the review agent itself)

Per-entry format matches the existing prompt-fallback path (reviewPromptText), reused verbatim — no new truncation or caps. Sessions whose prompt.txt is missing or empty are skipped silently.

Background

User's reproducer (bug #4):

Fresh clone, never checked out the offending branch
Run entire review → banner: Reviewing <branch> vs origin/soph/add-codex-post-tool: 34 commits, 73 files changed
Codex's review covered 30+ unrelated upstream commits, not the actual branch work

Diagnosis: scope detection picked origin/soph/add-codex-post-tool because (a) it was merged into mainline recently so its tip committerdate is newer than older mainline-only commits, (b) it's an ancestor of HEAD via the merge into main, (c) the heuristic prefers most-recent-tip among ancestors. The agent then faithfully reviewed origin/soph/add-codex-post-tool...HEAD, which dragged in everything between that branch's merge and current HEAD.

Fresh-repo reproduction confirmed this isn't about stale local refs — git fetch mirrors origin's branches by default, so every contributor's local repo has the same offending refs.

Smoke-verified end-to-end

After the fix on this branch, codex's review opened with:

Reviewed scope: no commits unique to origin/main; findings are from the uncommitted worktree.

Then for the in-progress session context (separately smoke-tested with a fake codex binary capturing stdin):

In-progress session context (uncommitted):
802b09ab Claude Code (touched: 2 files) prompt:

Both proofs use live data from the same Claude Code session this PR was authored in.

Coverage

The --base flag is generic: accepts any rev-parseable ref. The scope clause covers any non-zero-exit failure mode for the reviewer agent (auth, network, license, segfault, etc. all surface the same way). The session-context section auto-skips condensed sessions, review-kind sessions, and sessions from other worktrees.

What this PR does NOT do

Doesn't change behavior for entire review --fix or entire review --findings. Only affects the main review-run code path.
Doesn't address Ctrl+O drill-in truncation (bug add .entire/logs to .entire/.gitignore, small refactor #3 from the broader observability sweep — separate ticket).
Doesn't add a persistent default base in settings — --base is per-invocation. If users want a project-default base, that's a follow-up design discussion.
Doesn't run a summary-generation LLM call for in-progress sessions (would be option A from the brainstorm). Uses raw latest-prompt fallback path (option C); cheaper and matches existing committed-pipeline prompt-fallback behavior.

Test plan

mise run fmt clean
mise run lint — 0 issues
go test ./cmd/entire/cli/review/ passes
go test ./cmd/entire/cli/ passes (includes 3 new tests for the in-progress session context path)
Manual smoke against fresh-build binary with multiple --base shapes
Real codex run on this branch confirmed correct scope + uncommitted review behavior
In-progress session context smoke-verified against live data

Tests added

Scope/flag:

TestDetectScopeBaseRef_PrefersMainOverAncestorBranches
TestComputeScopeStats_BaseOverrideUsed / _BaseOverrideUnknownRefErrors / _EmptyOverrideUsesMainlineDetection
TestComposeReviewPrompt_ScopeIncludesUncommittedChanges
TestReviewCommandSmoke_BaseFlagThreadsThroughToPromptAndBanner / _BadBaseRefErrorsBeforeAgentSpawn

In-progress session context:

TestReviewSessionContext_IncludesActiveSessionWithLatestPrompt
TestReviewSessionContext_SkipsSessionsOutsideScope
TestReviewCommandSmoke_IncludesInProgressSessionContextInPrompt

One existing test updated (TestDetectScopeBaseRef_ClosestAncestorPreferred replaced with TestDetectScopeBaseRef_PrefersMainOverAncestorBranches); TestIsAncestorOf deleted (function gone).

🤖 Generated with Claude Code

…ommitted changes Four coordinated changes to fix the user-reported failure mode where `entire review` would silently scope reviews against random recently-merged feature branches (e.g., `origin/soph/add-codex-post-tool`) instead of mainline, producing reviews dominated by 30+ unrelated upstream commits. 1. scope.go: delete the "merged-into-HEAD ancestor with most recent committerdate" heuristic. It was designed for stacked-PR workflows but routinely picked unrelated branches because `git fetch` mirrors all of origin's branches by default and merged PR branches often live on for a while before deletion. Replaced with the existing fallback chain origin/HEAD → origin/main → origin/master → main → master. Removed slowDetectScopeBaseRef, isAncestorOf, and the candidate selection loop — ~130 lines of code go away. 2. cmd.go: register `--base <ref>` flag for explicit override (stacked PR review is now an explicit user choice, not an inferred heuristic). Validates the ref via `git rev-parse --verify <ref>^{commit}` before spawning agents; an unknown ref errors immediately with a message naming the bad ref. Threaded baseOverride through runReview → runSingleAgentPath / runMultiAgentPath → detectScope → ComputeScopeStats. 3. prompt.go: scope clause now reads "review the commits unique to this branch vs <ref>, plus any uncommitted changes in the working tree." Without this, agents (correctly) followed the prior commits-only wording and silently skipped uncommitted work — the most common case when a developer is mid-feature. Checkpoint context is preserved as a separate prompt section. 4. scope.go countFilesChanged: switch from two-dot diff (`git diff base..HEAD`) to three-dot (`git diff base...HEAD`). The two-dot form shows the tree diff between base and HEAD, so files modified on mainline AFTER the branch was cut appear as reversed deltas, inflating the banner's file count. Three-dot uses the merge-base, excluding upstream-only changes. Caught by codex during the smoke test review of this very PR. Tests: - TestDetectScopeBaseRef_PrefersMainOverAncestorBranches: replaces the obsolete TestDetectScopeBaseRef_ClosestAncestorPreferred which asserted the opposite (now-deleted) behavior. - TestComputeScopeStats_BaseOverrideUsed / _BaseOverrideUnknownRefErrors / _EmptyOverrideUsesMainlineDetection: unit-level coverage for the override path. - TestComposeReviewPrompt_ScopeIncludesUncommittedChanges: asserts the new scope wording. - TestReviewCommandSmoke_BaseFlagThreadsThroughToPromptAndBanner / _BadBaseRefErrorsBeforeAgentSpawn: end-to-end command-level tests that catch wiring bugs (would have caught the silentErr suppression bug I had to fix mid-smoke). - Deleted TestIsAncestorOf (function is gone). Smoke-verified end-to-end on this branch: codex's review correctly identified `Reviewed scope: no commits unique to origin/main; findings are from the uncommitted worktree.` — proof that the right base was picked, uncommitted changes are in scope, and the wrong-branch-review failure mode is structurally fixed. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Entire-Checkpoint: 5600c1e3f60e

Copilot

Pull request overview

This PR improves entire review scoping so reviews default to mainline (instead of heuristically picking recently-merged feature branches), adds an explicit --base override for stacked-PR workflows, and updates the review prompt/scope stats to include uncommitted working-tree changes.

Changes:

Replace timestamp/ancestor-branch heuristic with a predictable mainline fallback chain, plus a --base <ref> override validated up-front.
Expand the prompt’s scope clause to include uncommitted working-tree changes.
Fix file-change counting to use a three-dot diff (base...HEAD) so upstream-only changes aren’t miscounted.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
cmd/entire/cli/review/scope.go	Reworks base ref selection logic; validates `--base`; uses `base...HEAD` for changed-file counting.
cmd/entire/cli/review/scope_test.go	Updates/extends unit tests for mainline-first base detection and `--base` override behavior.
cmd/entire/cli/review/prompt.go	Updates prompt scope clause to include uncommitted working-tree changes.
cmd/entire/cli/review/prompt_test.go	Adjusts prompt expectations and adds coverage for uncommitted-scope wording.
cmd/entire/cli/review/cmd.go	Wires `--base` through single- and multi-agent review flows; changes failure behavior for invalid overrides.
cmd/entire/cli/review_context_test.go	Adds command-level smoke tests to ensure `--base` is threaded through and invalid refs fail before agent spawn.

- cmd/entire/cli/agent/codex/reviewer_test.go: update scope-clause assertion to match the new wording. Was the only test outside cmd/entire/cli/review/ that asserted the old "review only the commits unique to..." string; missed it in the first sweep. - cmd.go: --base help text and flag description now list the full fallback chain (origin/HEAD → origin/main → origin/master → main → master), aligned with the implementation in fallbackScopeRef. - scope.go: ComputeScopeStats doc comment now mentions the `^{commit}` peel applied to the rev-parse --verify check, matching the actual code. - scope_test.go: remove stale comment block that still described the old "closest ancestor preferred" behavior above the renamed test. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Entire-Checkpoint: 4e5cb1b17589

peyton-alt · 2026-05-10T22:22:22Z

@BugBot review

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

Comment @cursor review or bugbot run to trigger another review on this PR

^{Reviewed by Cursor Bugbot for commit a2e951c. Configure here.}

Cursor Bugbot caught an inconsistency in detectScope: when git.PlainOpen fails AND baseOverride is set, the function silently returned ("", nil) and the run proceeded in degraded mode — dropping the user's explicit --base flag without warning. That contradicts the fail-loud contract the function's own doc comment establishes for invalid overrides on the ComputeScopeStats path a few lines below. Now mirror the same pattern at the PlainOpen branch: on PlainOpen failure with a non-empty baseOverride, return an error naming both the override and the worktree path so the user can diagnose. Auto-detection failures (no --base) still degrade gracefully. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Entire-Checkpoint: c21ae2d65511

Soph

otherwise 👍

Extends the existing checkpoint-context pipeline to surface uncommitted agent work alongside committed checkpoint summaries. Without this, reviews of branches with only uncommitted work (common case: developer mid-feature) lose the rich context that makes Entire's review better than vanilla agent reviews — agents see the diff but no signal about what each session was for. reviewCheckpointContext is now a thin composer over two sources: reviewCheckpointContext(ctx, worktreeRoot, scopeBaseRef) = committed-section + in-progress-section where: committed-section: existing "Checkpoint context from commits in scope:" block, sourced from entire/checkpoints/v1 in-progress-section: NEW "In-progress session context (uncommitted):" block, sourced from session state files (.git/entire-sessions/<id>.json) + the on-filesystem prompt.txt that lifecycle.go appends to every turn Inclusion criteria for in-progress sessions (reviewSessionContext): - state.WorktreePath == current worktree (canonicalised via filepath.EvalSymlinks so /var vs /private/var on macOS matches) - state.BaseCommit == current HEAD (work since last commit only) - !state.FullyCondensed (still in flight) - state.Kind != KindAgentReview (don't include the review agent) Per-session entry format mirrors the existing prompt-fallback path (reviewPromptText), so the agent-facing output is consistent whether the work is committed or in-progress: <sessionID[:8]> <AgentType> [(touched: N file(s))] prompt: <latest> Reuses reviewPromptText for prompt rendering — no new truncation logic, no new caps. Sessions whose prompt.txt is missing or empty are skipped silently. The compose layer (joinReviewContextSections) emits non-empty sections joined by a blank line; ComposeReviewPrompt already skips empty sections cleanly, so no plumbing change needed for the "0 commits + 0 sessions" degenerate case. Smoke-verified end-to-end on this branch: built binary, ran review with a fake codex that captures stdin, observed the section in the captured prompt with live data from the current claude-code session (correct session ID prefix, agent name, touched file count, and latest prompt from the actual conversation). Tests added: - TestReviewSessionContext_IncludesActiveSessionWithLatestPrompt: unit-level, asserts the section text contains the session ID prefix, agent display name, and latest prompt. - TestReviewSessionContext_SkipsSessionsOutsideScope: four-case exclusion check (FullyCondensed, WrongWorktree, WrongBaseCommit, KindAgentReview). - TestReviewCommandSmoke_IncludesInProgressSessionContextInPrompt: end-to-end through the cobra command, asserts the in-progress block reaches the captured agent prompt. Catches wiring regressions between reviewSessionContext, the deps bridge, and ComposeReviewPrompt that the unit tests alone cannot. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Entire-Checkpoint: d08b2e17d004

Soph noted on PR #1175 that the existing codebase pattern for ref validation is repo.ResolveRevision (explain.go:156), not shelling out to git rev-parse. Switch ComputeScopeStats's --base override check to follow that pattern. ResolveRevision handles all the cases the previous rev-parse call did — branches, tags (with annotated-tag dereferencing), abbreviated SHAs, HEAD~N — and avoids the subprocess overhead plus the git-on-PATH dependency that runGit carried. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Entire-Checkpoint: 1f14090406c1

…-mainline-default

Soph pointed out the doc comment still referenced `git rev-parse --verify <ref>^{commit}` after the validation switched to `repo.ResolveRevision` in b22552c. Update the comment to match the actual code. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Entire-Checkpoint: e31eb25c73e6

… message, regression test) - Sweep "closest ancestor" framing and dead git for-each-ref reference from 5 sites (prompt.go, scope.go, types/reviewer.go, CLAUDE.md) left behind by the heuristic deletion. - Set cmd.SilenceUsage = true on the WorktreeRoot and currentHeadSHA prerequisite branches in both runSingleAgentPath and runMultiAgentPath so prerequisite failures no longer dump cobra's usage block at users. - Improve the fallback-chain failure error in fallbackScopeRef: name the refs tried and point users at --base <ref> or `git fetch`. - Add TestCountFilesChanged_ThreeDotIgnoresUpstreamOnlyChanges — a regression test where main advances past the branch point, so the three-dot vs two-dot diff distinction is observable. Existing tests use fast-forward branches and would pass under either dot count. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Entire-Checkpoint: 945292726919

…-mainline-default

Copilot AI review requested due to automatic review settings May 10, 2026 22:08

peyton-alt requested a review from a team as a code owner May 10, 2026 22:08

Copilot started reviewing on behalf of peyton-alt May 10, 2026 22:09 View session

Copilot AI reviewed May 10, 2026

View reviewed changes

Comment thread cmd/entire/cli/review/cmd.go Outdated

Comment thread cmd/entire/cli/review/cmd.go Outdated

Comment thread cmd/entire/cli/review/scope_test.go Outdated

Comment thread cmd/entire/cli/review/scope.go Outdated

cursor Bot reviewed May 10, 2026

View reviewed changes

Comment thread cmd/entire/cli/review/cmd.go

Soph reviewed May 11, 2026

View reviewed changes

Comment thread cmd/entire/cli/review/scope.go Outdated

peyton-alt and others added 3 commits May 11, 2026 13:05

Merge remote-tracking branch 'origin/main' into feat/review-base-flag…

ebb1c78

…-mainline-default

peyton-alt enabled auto-merge (squash) May 11, 2026 17:20

Soph previously approved these changes May 11, 2026

View reviewed changes

Comment thread cmd/entire/cli/review/scope.go Outdated

peyton-alt dismissed Soph’s stale review via d78dafa May 11, 2026 17:31

peyton-alt mentioned this pull request May 12, 2026

review: clearer failure reason when findings aren't saved #1185

Merged

7 tasks

peyton-alt and others added 2 commits May 12, 2026 00:14

Merge remote-tracking branch 'origin/main' into feat/review-base-flag…

300436f

…-mainline-default

peyton-alt changed the title ~~feat(review): default scope to mainline, add --base flag, include uncommitted changes~~ fix(review): default scope to main, add --base flag May 12, 2026

Soph approved these changes May 12, 2026

View reviewed changes

peyton-alt merged commit 56bc6e7 into main May 12, 2026
9 checks passed

peyton-alt deleted the feat/review-base-flag-mainline-default branch May 12, 2026 15:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(review): default scope to main, add --base flag#1175

fix(review): default scope to main, add --base flag#1175
peyton-alt merged 9 commits into
mainfrom
feat/review-base-flag-mainline-default

peyton-alt commented May 10, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

peyton-alt commented May 10, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

Soph left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

Conversation

peyton-alt commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Scope-detection changes

In-progress session context (added during smoke iteration)

Background

Smoke-verified end-to-end

Coverage

What this PR does NOT do

Test plan

Tests added

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

peyton-alt commented May 10, 2026

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Soph left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

peyton-alt commented May 10, 2026 •

edited

Loading