Skip to content

Fix v2 migration generation packing#1124

Merged
pfleidi merged 14 commits into
mainfrom
fix/checkpoints-v2-generation-refs-during-migration
May 6, 2026
Merged

Fix v2 migration generation packing#1124
pfleidi merged 14 commits into
mainfrom
fix/checkpoints-v2-generation-refs-during-migration

Conversation

@pfleidi
Copy link
Copy Markdown
Contributor

@pfleidi pfleidi commented May 6, 2026

https://entire.io/gh/entireio/cli/trails/306

What

Fix v2 checkpoint migration so under-threshold raw transcript batches stay in refs/entire/checkpoints/v2/full/current instead of creating an archived refs/entire/checkpoints/v2/full/ generation. This keeps migration behavior aligned with normal v2 generation rotation.

How

  • Reworked migration full-transcript packing to archive only full batches and leave the final partial batch in /full/current.
  • Preserved chronological packing and raw transcript readability, including reruns and force migration cleanup.
  • Added compare-and-set guards around /full/current updates so concurrent writers are not overwritten.
  • Centralized archived generation ref formatting in the checkpoint package.

Trade-offs and scope

  • The rotation threshold remains a soft limit. If /full/current already contains checkpoints, appending the final migrated partial batch can briefly push it past the threshold before a later rotation. Splitting that final batch to fill the current generation exactly was left out because it adds more complexity than the small, already-tolerated overshoot is worth.
  • This PR does not try to make concurrent migrations impossible. The /full/current updates are guarded so they avoid overwriting a changed ref, but repo-level migration locking is intentionally left for a separate change.
  • The migration path still has its own packing flow for raw transcripts because migrated /main writes intentionally skip raw transcript writes. The change keeps that flow scoped to deciding whether a batch becomes an archived generation or remains in /full/current.
  • The PR does not redesign normal v2 generation rotation, retention, cleanup policy, or v1 metadata migration. It only changes the migrated raw-transcript packing behavior and directly related repair/pruning paths.

Verification

  • mise run build
  • mise run lint
  • mise run test
  • mise run test:integration

Note

Medium Risk
Changes v2 migration packing and /full/current ref update/rotation behavior, including new compare-and-set semantics and shelling out to git update-ref, which can affect data integrity under concurrency if incorrect.

Overview
Fixes v2 checkpoint migration packing so only full batches become archived refs/entire/checkpoints/v2/full/<n> generations, while any final under-threshold batch remains in refs/entire/checkpoints/v2/full/current (including force-migration reruns).

Hardens generation rotation and migration writes with centralized archived ref naming (ArchivedGenerationRefName) plus compare-and-set ref updates for /full/current (using go-git CAS and a git update-ref fallback for create-if-missing), and adds raw-transcript eviction during migration to avoid leaving stale chunk files behind.

Updates/expands tests to cover under-threshold behavior, final-batch rotation when existing /full/current data pushes over the threshold, and concurrent-change rejection for both rotation reset and /full/current updates.

Reviewed by Cursor Bugbot for commit 33dab74. Configure here.

pfleidi added 12 commits May 5, 2026 14:25
…o fix/checkpoints-v2-generation-refs-during-migration
Entire-Checkpoint: 3c53429995f2
Entire-Checkpoint: aca92a1bf611
Drop the flattenMigratedFullEntries wrapper that built a map only to copy
it into a slice that the caller iterated back into a map. FlattenTree
now writes directly into the destination map, and GenerationFileName is
removed with a single delete.

Flatten the nested if/else around GetRefState into two early-exit blocks;
GetRefState already returns ZeroHash on not-found, so no manual reset of
parentHash is needed.

Entire-Checkpoint: 4eeda0c359bc
Entire-Checkpoint: 9987534d89d5
Entire-Checkpoint: 23eeb20faa1c
Entire-Checkpoint: db62fad97d90
Replace migratedFullEntrySet.toMap with mergeInto so writeMigratedFullGeneration
and writeMigratedFinalFullCurrent share the raw-overrides-task merge rule.
Move the per-session raw artifact eviction into a free-standing
evictMigratedRawArtifacts helper so migratedFullEntrySet no longer carries
rawArtifactPaths state for a single consumer.

Also tighten the comment placement on sortMigratableCheckpoints, simplify a
trailing return, and refresh a stale doc string on
collectMissingFullCheckpointForPacking.

Entire-Checkpoint: 984a336eb487
Entire-Checkpoint: 641bcd407edc
Entire-Checkpoint: d283a23cea17
Copilot AI review requested due to automatic review settings May 6, 2026 01:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reworks v2 checkpoint migration so migrated raw transcripts use the same /full/current vs archived-generation packing behavior as normal v2 writes, while also adding some concurrency protection around /full/current ref updates.

Changes:

  • Replaced the old migration packer flow with explicit batch handling so only full batches are archived and the final partial batch is written to /full/current.
  • Added compare-and-set style handling for /full/current updates during migration and generation rotation.
  • Centralized archived generation ref naming and updated tests to cover under-threshold, rotation, force, and repair scenarios.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
cmd/entire/cli/migrate.go Reworks migration batching, /full/current writes, and ref update flow for migrated raw transcripts.
cmd/entire/cli/migrate_test.go Adds regression tests for under-threshold packing, threshold rotation, force behavior, and concurrent ref-update failures.
cmd/entire/cli/checkpoint/v2_generation.go Adds archived ref helper and exposes conditional current-generation rotation with extra concurrency checks.
cmd/entire/cli/checkpoint/v2_generation_test.go Adds tests for archived ref naming and concurrent reset protection during rotation.

Comment thread cmd/entire/cli/migrate.go
Comment thread cmd/entire/cli/migrate.go
Comment thread cmd/entire/cli/checkpoint/v2_generation.go
Comment thread cmd/entire/cli/migrate.go
pfleidi added 2 commits May 6, 2026 10:40
Entire-Checkpoint: c6698f8102c7
…neration-refs-during-migration

# Conflicts:
#	cmd/entire/cli/migrate.go
@pfleidi pfleidi marked this pull request as ready for review May 6, 2026 18:05
@pfleidi pfleidi requested a review from a team as a code owner May 6, 2026 18:05
@pfleidi
Copy link
Copy Markdown
Contributor Author

pfleidi commented May 6, 2026

Bugbot run

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 33dab74. Configure here.

@pfleidi pfleidi merged commit b04cfcb into main May 6, 2026
14 checks passed
@pfleidi pfleidi deleted the fix/checkpoints-v2-generation-refs-during-migration branch May 6, 2026 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants