Skip to content

fix(review): default scope to main, add --base flag#1175

Merged
peyton-alt merged 9 commits into
mainfrom
feat/review-base-flag-mainline-default
May 12, 2026
Merged

fix(review): default scope to main, add --base flag#1175
peyton-alt merged 9 commits into
mainfrom
feat/review-base-flag-mainline-default

Conversation

@peyton-alt
Copy link
Copy Markdown
Contributor

@peyton-alt peyton-alt commented May 10, 2026

https://entire.io/gh/entireio/cli/trails/350

Summary

Fixes the user-reported failure where entire review would silently scope reviews against random recently-merged feature branches (e.g. origin/soph/add-codex-post-tool) instead of mainline, producing reviews that "had nothing to do with what was done in that branch." Plus a complementary feature that surfaces in-progress agent session context to the reviewer agent.

Scope-detection changes

  1. Default scope is mainline, not "closest merged ancestor" — delete the timestamp-based heuristic. It was designed for stacked-PR workflows but routinely picked unrelated recently-merged feature branches because git fetch mirrors all of origin's branches and merged PR branches often live on before deletion. New default walks the fallback chain origin/HEAD → origin/main → origin/master → main → master. Removes ~130 lines (slowDetectScopeBaseRef, isAncestorOf, candidate selection).

  2. --base <ref> flag for explicit override — stacked-PR review is now a user choice, not an inference. Accepts any rev-parseable ref. Validates via go-git repo.ResolveRevision(plumbing.Revision(ref)) (matching the codebase pattern at explain.go:156, addressing Soph's PR feedback). An unknown ref errors immediately, naming the bad ref.

  3. Uncommitted changes now in review scope — scope clause changed from "review only the commits unique to this branch vs <ref>." to "review the commits unique to this branch vs <ref>, plus any uncommitted changes in the working tree." Without this, agents correctly skipped in-progress edits, which surprised users mid-feature.

  4. Three-dot diff in countFilesChanged — was using git diff base..HEAD which over-counts files when mainline advances past the branch point (upstream-only changes show as reversed deltas). Now uses base...HEAD (merge-base diff). Caught by codex during smoke review of this PR.

In-progress session context (added during smoke iteration)

Without this, reviews of branches with only uncommitted work lose the rich context that makes Entire's review better than vanilla agent reviews — the agent sees the diff but no signal about what each session was for.

reviewCheckpointContext is now a composer over two sources:

reviewCheckpointContext(ctx, worktreeRoot, scopeBaseRef)
  = committed-section + in-progress-section
  where:
    committed-section:  "Checkpoint context from commits in scope:" (unchanged, reads entire/checkpoints/v1)
    in-progress-section: NEW "In-progress session context (uncommitted):" (reads session state + filesystem prompt.txt)

In-progress entries are filtered by:

  • WorktreePath matches the current worktree (canonicalised via filepath.EvalSymlinks for /var vs /private/var on macOS)
  • BaseCommit matches current HEAD (work since the last commit only)
  • !FullyCondensed (still in flight)
  • Kind != KindAgentReview (don't include the review agent itself)

Per-entry format matches the existing prompt-fallback path (reviewPromptText), reused verbatim — no new truncation or caps. Sessions whose prompt.txt is missing or empty are skipped silently.

Background

User's reproducer (bug #4):

  • Fresh clone, never checked out the offending branch
  • Run entire review → banner: Reviewing <branch> vs origin/soph/add-codex-post-tool: 34 commits, 73 files changed
  • Codex's review covered 30+ unrelated upstream commits, not the actual branch work

Diagnosis: scope detection picked origin/soph/add-codex-post-tool because (a) it was merged into mainline recently so its tip committerdate is newer than older mainline-only commits, (b) it's an ancestor of HEAD via the merge into main, (c) the heuristic prefers most-recent-tip among ancestors. The agent then faithfully reviewed origin/soph/add-codex-post-tool...HEAD, which dragged in everything between that branch's merge and current HEAD.

Fresh-repo reproduction confirmed this isn't about stale local refs — git fetch mirrors origin's branches by default, so every contributor's local repo has the same offending refs.

Smoke-verified end-to-end

After the fix on this branch, codex's review opened with:

Reviewed scope: no commits unique to origin/main; findings are from the uncommitted worktree.

Then for the in-progress session context (separately smoke-tested with a fake codex binary capturing stdin):

In-progress session context (uncommitted):
802b09ab Claude Code (touched: 2 files) prompt:

Both proofs use live data from the same Claude Code session this PR was authored in.

Coverage

The --base flag is generic: accepts any rev-parseable ref. The scope clause covers any non-zero-exit failure mode for the reviewer agent (auth, network, license, segfault, etc. all surface the same way). The session-context section auto-skips condensed sessions, review-kind sessions, and sessions from other worktrees.

What this PR does NOT do

  • Doesn't change behavior for entire review --fix or entire review --findings. Only affects the main review-run code path.
  • Doesn't address Ctrl+O drill-in truncation (bug add .entire/logs to .entire/.gitignore, small refactor #3 from the broader observability sweep — separate ticket).
  • Doesn't add a persistent default base in settings — --base is per-invocation. If users want a project-default base, that's a follow-up design discussion.
  • Doesn't run a summary-generation LLM call for in-progress sessions (would be option A from the brainstorm). Uses raw latest-prompt fallback path (option C); cheaper and matches existing committed-pipeline prompt-fallback behavior.

Test plan

  • mise run fmt clean
  • mise run lint — 0 issues
  • go test ./cmd/entire/cli/review/ passes
  • go test ./cmd/entire/cli/ passes (includes 3 new tests for the in-progress session context path)
  • Manual smoke against fresh-build binary with multiple --base shapes
  • Real codex run on this branch confirmed correct scope + uncommitted review behavior
  • In-progress session context smoke-verified against live data

Tests added

Scope/flag:

  • TestDetectScopeBaseRef_PrefersMainOverAncestorBranches
  • TestComputeScopeStats_BaseOverrideUsed / _BaseOverrideUnknownRefErrors / _EmptyOverrideUsesMainlineDetection
  • TestComposeReviewPrompt_ScopeIncludesUncommittedChanges
  • TestReviewCommandSmoke_BaseFlagThreadsThroughToPromptAndBanner / _BadBaseRefErrorsBeforeAgentSpawn

In-progress session context:

  • TestReviewSessionContext_IncludesActiveSessionWithLatestPrompt
  • TestReviewSessionContext_SkipsSessionsOutsideScope
  • TestReviewCommandSmoke_IncludesInProgressSessionContextInPrompt

One existing test updated (TestDetectScopeBaseRef_ClosestAncestorPreferred replaced with TestDetectScopeBaseRef_PrefersMainOverAncestorBranches); TestIsAncestorOf deleted (function gone).

🤖 Generated with Claude Code

…ommitted changes

Four coordinated changes to fix the user-reported failure mode where
`entire review` would silently scope reviews against random
recently-merged feature branches (e.g., `origin/soph/add-codex-post-tool`)
instead of mainline, producing reviews dominated by 30+ unrelated
upstream commits.

1. scope.go: delete the "merged-into-HEAD ancestor with most recent
   committerdate" heuristic. It was designed for stacked-PR workflows
   but routinely picked unrelated branches because `git fetch` mirrors
   all of origin's branches by default and merged PR branches often
   live on for a while before deletion. Replaced with the existing
   fallback chain origin/HEAD → origin/main → origin/master → main →
   master. Removed slowDetectScopeBaseRef, isAncestorOf, and the
   candidate selection loop — ~130 lines of code go away.

2. cmd.go: register `--base <ref>` flag for explicit override (stacked
   PR review is now an explicit user choice, not an inferred heuristic).
   Validates the ref via `git rev-parse --verify <ref>^{commit}` before
   spawning agents; an unknown ref errors immediately with a message
   naming the bad ref. Threaded baseOverride through runReview →
   runSingleAgentPath / runMultiAgentPath → detectScope →
   ComputeScopeStats.

3. prompt.go: scope clause now reads "review the commits unique to
   this branch vs <ref>, plus any uncommitted changes in the working
   tree." Without this, agents (correctly) followed the prior
   commits-only wording and silently skipped uncommitted work — the
   most common case when a developer is mid-feature. Checkpoint
   context is preserved as a separate prompt section.

4. scope.go countFilesChanged: switch from two-dot diff
   (`git diff base..HEAD`) to three-dot (`git diff base...HEAD`).
   The two-dot form shows the tree diff between base and HEAD, so
   files modified on mainline AFTER the branch was cut appear as
   reversed deltas, inflating the banner's file count. Three-dot uses
   the merge-base, excluding upstream-only changes. Caught by codex
   during the smoke test review of this very PR.

Tests:
- TestDetectScopeBaseRef_PrefersMainOverAncestorBranches: replaces
  the obsolete TestDetectScopeBaseRef_ClosestAncestorPreferred which
  asserted the opposite (now-deleted) behavior.
- TestComputeScopeStats_BaseOverrideUsed /
  _BaseOverrideUnknownRefErrors /
  _EmptyOverrideUsesMainlineDetection: unit-level coverage for the
  override path.
- TestComposeReviewPrompt_ScopeIncludesUncommittedChanges: asserts
  the new scope wording.
- TestReviewCommandSmoke_BaseFlagThreadsThroughToPromptAndBanner /
  _BadBaseRefErrorsBeforeAgentSpawn: end-to-end command-level tests
  that catch wiring bugs (would have caught the silentErr
  suppression bug I had to fix mid-smoke).
- Deleted TestIsAncestorOf (function is gone).

Smoke-verified end-to-end on this branch: codex's review correctly
identified `Reviewed scope: no commits unique to origin/main; findings
are from the uncommitted worktree.` — proof that the right base was
picked, uncommitted changes are in scope, and the wrong-branch-review
failure mode is structurally fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Entire-Checkpoint: 5600c1e3f60e
Copilot AI review requested due to automatic review settings May 10, 2026 22:08
@peyton-alt peyton-alt requested a review from a team as a code owner May 10, 2026 22:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves entire review scoping so reviews default to mainline (instead of heuristically picking recently-merged feature branches), adds an explicit --base override for stacked-PR workflows, and updates the review prompt/scope stats to include uncommitted working-tree changes.

Changes:

  • Replace timestamp/ancestor-branch heuristic with a predictable mainline fallback chain, plus a --base <ref> override validated up-front.
  • Expand the prompt’s scope clause to include uncommitted working-tree changes.
  • Fix file-change counting to use a three-dot diff (base...HEAD) so upstream-only changes aren’t miscounted.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
cmd/entire/cli/review/scope.go Reworks base ref selection logic; validates --base; uses base...HEAD for changed-file counting.
cmd/entire/cli/review/scope_test.go Updates/extends unit tests for mainline-first base detection and --base override behavior.
cmd/entire/cli/review/prompt.go Updates prompt scope clause to include uncommitted working-tree changes.
cmd/entire/cli/review/prompt_test.go Adjusts prompt expectations and adds coverage for uncommitted-scope wording.
cmd/entire/cli/review/cmd.go Wires --base through single- and multi-agent review flows; changes failure behavior for invalid overrides.
cmd/entire/cli/review_context_test.go Adds command-level smoke tests to ensure --base is threaded through and invalid refs fail before agent spawn.

Comment thread cmd/entire/cli/review/cmd.go Outdated
Comment thread cmd/entire/cli/review/cmd.go Outdated
Comment thread cmd/entire/cli/review/scope_test.go Outdated
Comment thread cmd/entire/cli/review/scope.go Outdated
- cmd/entire/cli/agent/codex/reviewer_test.go: update scope-clause
  assertion to match the new wording. Was the only test outside
  cmd/entire/cli/review/ that asserted the old "review only the
  commits unique to..." string; missed it in the first sweep.
- cmd.go: --base help text and flag description now list the full
  fallback chain (origin/HEAD → origin/main → origin/master → main →
  master), aligned with the implementation in fallbackScopeRef.
- scope.go: ComputeScopeStats doc comment now mentions the `^{commit}`
  peel applied to the rev-parse --verify check, matching the actual
  code.
- scope_test.go: remove stale comment block that still described the
  old "closest ancestor preferred" behavior above the renamed test.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Entire-Checkpoint: 4e5cb1b17589
@peyton-alt
Copy link
Copy Markdown
Contributor Author

@BugBot review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit a2e951c. Configure here.

Comment thread cmd/entire/cli/review/cmd.go
Cursor Bugbot caught an inconsistency in detectScope: when git.PlainOpen
fails AND baseOverride is set, the function silently returned ("", nil)
and the run proceeded in degraded mode — dropping the user's explicit
--base flag without warning. That contradicts the fail-loud contract
the function's own doc comment establishes for invalid overrides on
the ComputeScopeStats path a few lines below.

Now mirror the same pattern at the PlainOpen branch: on PlainOpen
failure with a non-empty baseOverride, return an error naming both
the override and the worktree path so the user can diagnose.
Auto-detection failures (no --base) still degrade gracefully.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Entire-Checkpoint: c21ae2d65511
Copy link
Copy Markdown
Collaborator

@Soph Soph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise 👍

Comment thread cmd/entire/cli/review/scope.go Outdated
peyton-alt and others added 3 commits May 11, 2026 13:05
Extends the existing checkpoint-context pipeline to surface
uncommitted agent work alongside committed checkpoint summaries.
Without this, reviews of branches with only uncommitted work
(common case: developer mid-feature) lose the rich context that
makes Entire's review better than vanilla agent reviews — agents
see the diff but no signal about what each session was for.

reviewCheckpointContext is now a thin composer over two sources:

  reviewCheckpointContext(ctx, worktreeRoot, scopeBaseRef)
    = committed-section + in-progress-section
    where:
      committed-section: existing "Checkpoint context from commits
        in scope:" block, sourced from entire/checkpoints/v1
      in-progress-section: NEW "In-progress session context
        (uncommitted):" block, sourced from session state files
        (.git/entire-sessions/<id>.json) + the on-filesystem
        prompt.txt that lifecycle.go appends to every turn

Inclusion criteria for in-progress sessions (reviewSessionContext):
  - state.WorktreePath == current worktree (canonicalised via
    filepath.EvalSymlinks so /var vs /private/var on macOS matches)
  - state.BaseCommit == current HEAD (work since last commit only)
  - !state.FullyCondensed (still in flight)
  - state.Kind != KindAgentReview (don't include the review agent)

Per-session entry format mirrors the existing prompt-fallback path
(reviewPromptText), so the agent-facing output is consistent
whether the work is committed or in-progress:

  <sessionID[:8]> <AgentType> [(touched: N file(s))] prompt: <latest>

Reuses reviewPromptText for prompt rendering — no new truncation
logic, no new caps. Sessions whose prompt.txt is missing or empty
are skipped silently.

The compose layer (joinReviewContextSections) emits non-empty
sections joined by a blank line; ComposeReviewPrompt already
skips empty sections cleanly, so no plumbing change needed for
the "0 commits + 0 sessions" degenerate case.

Smoke-verified end-to-end on this branch: built binary, ran review
with a fake codex that captures stdin, observed the section in the
captured prompt with live data from the current claude-code
session (correct session ID prefix, agent name, touched file count,
and latest prompt from the actual conversation).

Tests added:
- TestReviewSessionContext_IncludesActiveSessionWithLatestPrompt:
  unit-level, asserts the section text contains the session ID
  prefix, agent display name, and latest prompt.
- TestReviewSessionContext_SkipsSessionsOutsideScope: four-case
  exclusion check (FullyCondensed, WrongWorktree, WrongBaseCommit,
  KindAgentReview).
- TestReviewCommandSmoke_IncludesInProgressSessionContextInPrompt:
  end-to-end through the cobra command, asserts the in-progress
  block reaches the captured agent prompt. Catches wiring
  regressions between reviewSessionContext, the deps bridge, and
  ComposeReviewPrompt that the unit tests alone cannot.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Entire-Checkpoint: d08b2e17d004
Soph noted on PR #1175 that the existing codebase pattern for ref
validation is repo.ResolveRevision (explain.go:156), not shelling
out to git rev-parse. Switch ComputeScopeStats's --base override
check to follow that pattern.

ResolveRevision handles all the cases the previous rev-parse call
did — branches, tags (with annotated-tag dereferencing), abbreviated
SHAs, HEAD~N — and avoids the subprocess overhead plus the
git-on-PATH dependency that runGit carried.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Entire-Checkpoint: 1f14090406c1
@peyton-alt peyton-alt enabled auto-merge (squash) May 11, 2026 17:20
Soph
Soph previously approved these changes May 11, 2026
Comment thread cmd/entire/cli/review/scope.go Outdated
Soph pointed out the doc comment still referenced `git rev-parse
--verify <ref>^{commit}` after the validation switched to
`repo.ResolveRevision` in b22552c. Update the comment to
match the actual code.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Entire-Checkpoint: e31eb25c73e6
peyton-alt and others added 2 commits May 12, 2026 00:14
… message, regression test)

- Sweep "closest ancestor" framing and dead git for-each-ref reference
  from 5 sites (prompt.go, scope.go, types/reviewer.go, CLAUDE.md) left
  behind by the heuristic deletion.
- Set cmd.SilenceUsage = true on the WorktreeRoot and currentHeadSHA
  prerequisite branches in both runSingleAgentPath and runMultiAgentPath
  so prerequisite failures no longer dump cobra's usage block at users.
- Improve the fallback-chain failure error in fallbackScopeRef: name the
  refs tried and point users at --base <ref> or `git fetch`.
- Add TestCountFilesChanged_ThreeDotIgnoresUpstreamOnlyChanges — a
  regression test where main advances past the branch point, so the
  three-dot vs two-dot diff distinction is observable. Existing tests
  use fast-forward branches and would pass under either dot count.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Entire-Checkpoint: 945292726919
@peyton-alt peyton-alt changed the title feat(review): default scope to mainline, add --base flag, include uncommitted changes fix(review): default scope to main, add --base flag May 12, 2026
@peyton-alt peyton-alt merged commit 56bc6e7 into main May 12, 2026
9 checks passed
@peyton-alt peyton-alt deleted the feat/review-base-flag-mainline-default branch May 12, 2026 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants