Merged
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-05-12
@@ -0,0 +1,16 @@
## Why

`gx branch finish --cleanup` (and the underlying `agent-worktree-prune.sh`) deleted the active Codex agent worktree while the Codex TUI was still running inside it. Once the cwd disappeared, Codex tried to refresh skills / run stop hooks and crashed with `No such file or directory (os error 2)` / `failed to reload config`. The operator lost the session and any unsaved in-flight work in that pane.

## What Changes

- Add `has_live_process_in_worktree()` to `templates/scripts/agent-worktree-prune.sh`. It walks `/proc/*/cwd` and returns true when any live process's cwd resolves to the managed worktree or to a path inside it. The ` (deleted)` suffix that Linux appends to the link target once the directory has been unlinked is stripped before the comparison.
- Call it from `process_entry()` BEFORE any branch/worktree removal. When a live process is detected, the worktree is skipped and a clear `[agent-worktree-prune] Skipping live process worktree: <path>` line is logged. The `skipped_active` counter is incremented for the summary.
- Add a regression test (`test/doctor.test.js`) that spawns a long-running Node child process inside a detached agent worktree, runs `gx doctor` cleanup, and asserts the worktree is preserved and the log line is emitted.
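
The matching rule from the first bullet can be sketched in isolation. This is a minimal illustration, not the shipped function: `cwd_is_inside_worktree` is a hypothetical helper name, and it takes an already-resolved cwd string as its argument instead of reading `/proc` itself.

```shell
# Hypothetical helper (not part of agent-worktree-prune.sh).
# Succeeds when the cwd string is the worktree itself or a path beneath it.
cwd_is_inside_worktree() {
  local cwd="$1"
  local wt="$2"

  # Linux appends " (deleted)" to the link target once the directory is unlinked.
  cwd="${cwd% (deleted)}"

  case "$cwd" in
    "$wt" | "$wt"/*) return 0 ;;
  esac
  return 1
}

# Example call (paths are made up for illustration):
#   cwd_is_inside_worktree "/repo/wt/src (deleted)" "/repo/wt" && echo "inside"
```

Requiring the `/` separator in the second pattern is what keeps a sibling such as `/repo/wt-other` from being treated as part of `/repo/wt`.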

## Impact

- Affects `templates/scripts/agent-worktree-prune.sh` (the script copied into managed repos by `gx setup`) and the doctor cleanup path it drives.
- No public API change; just a stricter precondition before destructive cleanup.
- Risk: if `/proc` is unavailable (non-Linux), the live-process check returns false and prune behaves exactly as before. Fail-open is the correct posture here: we would rather over-clean on platforms without `/proc` than block all cleanup permanently.
- Rollout: no migration. New behavior takes effect the moment users pick up the updated `agent-worktree-prune.sh` (via `gx setup` / template copy or via a fresh clone of the repo).
@@ -0,0 +1,22 @@
## ADDED Requirements

### Requirement: agent-worktree-prune skips worktrees with live processes

The `agent-worktree-prune.sh` cleanup script SHALL NOT remove a managed agent worktree, nor delete its branch, while any live process on the host has its current working directory resolved to a path inside that worktree.

#### Scenario: Live process inside detached agent worktree preserves the worktree

- **GIVEN** a managed agent worktree at `<repo>/.omc/agent-worktrees/<slug>` is in detached-HEAD state and would otherwise satisfy the prune criteria
- **AND** a live process on the host has its cwd inside that worktree (as reported by `/proc/*/cwd`)
- **WHEN** `agent-worktree-prune.sh` runs against the parent repo
- **THEN** the worktree directory continues to exist after the run
- **AND** a `[agent-worktree-prune] Skipping live process worktree: <path>` line is emitted to stdout
- **AND** the `skipped_active` counter is incremented in the run summary
- **AND** regressions are covered by a `test/doctor.test.js` case that spawns a child process inside a detached worktree and asserts both the preservation and the log line.

#### Scenario: No /proc available falls back to legacy behavior

- **GIVEN** the host does not expose `/proc` (e.g., the script runs on a platform without procfs)
- **WHEN** `agent-worktree-prune.sh` runs
- **THEN** the live-process check returns false (fail-open)
- **AND** the rest of the prune flow proceeds exactly as it did before this change, so cleanup on non-Linux hosts is not permanently blocked.
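
The fail-open scenario can be exercised without a non-Linux host if the procfs root is made a parameter. The sketch below assumes exactly that: `has_live_process_under` and its `proc_root` argument are hypothetical and exist only for illustration, while the shipped script checks `/proc` directly.

```shell
# Hypothetical variant for illustration; the real script hard-codes /proc.
has_live_process_under() {
  local proc_root="$1"
  local wt="$2"
  local link=""
  local target=""

  # Fail open: with no procfs directory, report "no live process" so prune proceeds.
  [ -d "$proc_root" ] || return 1

  for link in "$proc_root"/[0-9]*/cwd; do
    # Skips unmatched globs and entries that cannot be inspected.
    [ -e "$link" ] || continue
    target="$(readlink "$link" 2>/dev/null || true)"
    [ -n "$target" ] || continue
    target="${target% (deleted)}"
    case "$target" in
      "$wt" | "$wt"/*) return 0 ;;
    esac
  done

  return 1
}

# Example calls (paths are made up for illustration):
#   has_live_process_under /proc "/repo/.omc/agent-worktrees/demo"
#   has_live_process_under /nonexistent "/repo/.omc/agent-worktrees/demo"   # always fails open
```

Pointing `proc_root` at a directory that does not exist reproduces the "no `/proc`" host, and a temporary directory of symlinks can stand in for live processes.
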
@@ -0,0 +1,34 @@
## Definition of Done

This change is complete only when **all** of the following are true:

- Every checkbox below is checked.
- The agent branch reaches `MERGED` state on `origin` and the PR URL + state are recorded in the completion handoff.
- If any step blocks (test failure, conflict, ambiguous result), append a `BLOCKED:` line under section 4 explaining the blocker and **STOP**. Do not tick remaining cleanup boxes; do not silently skip the cleanup pipeline.

## Handoff

- Handoff: change=`agent-codex-skip-live-codex-worktree-cleanup-2026-05-13-01-13`; branch=`agent/<your-name>/<branch-slug>`; scope=`TODO`; action=`continue this sandbox or finish cleanup after a usage-limit/manual takeover`.
- Copy prompt: Continue `agent-codex-skip-live-codex-worktree-cleanup-2026-05-13-01-13` on branch `agent/<your-name>/<branch-slug>`. Work inside the existing sandbox, review `openspec/changes/agent-codex-skip-live-codex-worktree-cleanup-2026-05-13-01-13/tasks.md`, continue from the current state instead of creating a new sandbox, and when the work is done run `gx branch finish --branch agent/<your-name>/<branch-slug> --base dev --via-pr --wait-for-merge --cleanup`.

## 1. Specification

- [x] 1.1 Finalize proposal scope and acceptance criteria for `agent-codex-skip-live-codex-worktree-cleanup-2026-05-13-01-13`.
- [x] 1.2 Define normative requirements in `specs/skip-live-codex-worktree-cleanup/spec.md`.

## 2. Implementation

- [x] 2.1 Implement scoped behavior changes (`has_live_process_in_worktree` + `process_entry` precondition in `templates/scripts/agent-worktree-prune.sh`).
- [x] 2.2 Add/update focused regression coverage (`test/doctor.test.js` — "preserves detached agent worktrees with live processes").

## 3. Verification

- [x] 3.1 Run targeted project verification commands. Evidence: `node --test --test-name-pattern="preserves detached agent worktrees with live processes" test/doctor.test.js` → 1 passed; sibling tests ("auto-prunes detached-HEAD agent worktrees", "preserves stranded worktrees when GUARDEX_SKIP_AUTO_WORKTREE_PRUNE=1") also pass (3/3).
- [x] 3.2 Run `openspec validate agent-codex-skip-live-codex-worktree-cleanup-2026-05-13-01-13 --type change --strict`. Evidence: "Change ... is valid".
- [x] 3.3 Run `openspec validate --specs`. Evidence: "No items found to validate" (no main spec deltas in this change).

## 4. Cleanup (mandatory; run before claiming completion)

- [ ] 4.1 Run the cleanup pipeline: `gx branch finish --branch agent/codex/skip-live-codex-worktree-cleanup-2026-05-13-01-13 --base main --via-pr --wait-for-merge --cleanup`. This handles commit -> push -> PR create -> merge wait -> worktree prune in one invocation.
- [ ] 4.2 Record the PR URL and final merge state (`MERGED`) in the completion handoff.
- [ ] 4.3 Confirm the sandbox worktree is gone (`git worktree list` no longer shows the agent path; `git branch -a` shows no surviving local/remote refs for the branch).
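
The worktree half of step 4.3 can be checked mechanically. This is a minimal sketch, assuming the porcelain output of `git worktree list` is piped in on stdin; `worktree_is_listed` is a hypothetical helper name, not something `gx` provides.

```shell
# Hypothetical helper: reads `git worktree list --porcelain` output on stdin
# and succeeds if the given path is still registered as a worktree.
worktree_is_listed() {
  local wanted="$1"
  local line=""

  while IFS= read -r line; do
    # Each porcelain record starts with a line of the form "worktree <path>".
    if [ "$line" = "worktree $wanted" ]; then
      return 0
    fi
  done
  return 1
}

# Example call (the path variable is a placeholder):
#   if git worktree list --porcelain | worktree_is_listed "$agent_worktree_path"; then
#     echo "worktree still present; cleanup is not finished"
#   fi
```

The comparison is an exact match on the whole line, so the parent repository path does not count as a hit for a worktree nested beneath it.
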
25 changes: 25 additions & 0 deletions templates/scripts/agent-worktree-prune.sh
@@ -382,6 +382,26 @@ read_branch_activity_epoch() {

skipped_recent=0

has_live_process_in_worktree() {
  local wt="$1"
  local proc_cwd=""

  [[ -d /proc ]] || return 1

  for proc_cwd in /proc/[0-9]*/cwd; do
    [[ -e "$proc_cwd" ]] || continue
    local live_cwd=""
    live_cwd="$(readlink "$proc_cwd" 2>/dev/null || true)"
    [[ -n "$live_cwd" ]] || continue
    live_cwd="${live_cwd% (deleted)}"
    if [[ "$live_cwd" == "$wt" || "$live_cwd" == "${wt}"/* ]]; then
      return 0
    fi
  done

  return 1
}

branch_idle_gate() {
  local branch="$1"
  local wt="$2"
@@ -501,6 +521,11 @@ process_entry() {
    echo "[agent-worktree-prune] Skipping active cwd worktree: ${wt}"
    return
  fi
  if has_live_process_in_worktree "$wt"; then
    skipped_active=$((skipped_active + 1))
    echo "[agent-worktree-prune] Skipping live process worktree: ${wt}"
    return
  fi

  local remove_reason=""
  local branch_delete_mode="safe"
39 changes: 39 additions & 0 deletions test/doctor.test.js
@@ -1141,6 +1141,45 @@ test('gx doctor auto-prunes detached-HEAD agent worktrees under .omc/agent-workt
);
});


test('gx doctor preserves detached agent worktrees with live processes', async () => {
  const repoDir = initRepoOnBranch('main');
  seedCommit(repoDir);

  const worktreeRoot = path.join(repoDir, '.omc', 'agent-worktrees');
  fs.mkdirSync(worktreeRoot, { recursive: true });
  const liveWorktree = path.join(worktreeRoot, 'live-agent-worktree');

  let result = runHumanCmd('git', ['branch', 'agent/claude/live-demo'], repoDir);
  assert.equal(result.status, 0, result.stderr || result.stdout);
  result = runHumanCmd('git', ['worktree', 'add', liveWorktree, 'agent/claude/live-demo'], repoDir);
  assert.equal(result.status, 0, result.stderr || result.stdout);
  result = runHumanCmd('git', ['-C', liveWorktree, 'checkout', '--detach', 'HEAD'], repoDir);
  assert.equal(result.status, 0, result.stderr || result.stdout);
  result = runHumanCmd('git', ['branch', '-D', 'agent/claude/live-demo'], repoDir);
  assert.equal(result.status, 0, result.stderr || result.stdout);

  const child = cp.spawn(process.execPath, ['-e', 'setInterval(() => {}, 1000);'], {
    cwd: liveWorktree,
    stdio: 'ignore',
  });
  const exitPromise = new Promise((resolve) => {
    child.once('exit', (code, signal) => resolve({ code, signal }));
  });

  try {
    assert.equal(isPidAlive(child.pid), true, 'live worktree process should be running');
    result = runNode(['doctor', '--target', repoDir, '--skip-agents', '--no-global-install'], repoDir);
    assert.equal(result.status, 0, result.stderr || result.stdout);
    const combined = `${result.stdout}\n${result.stderr}`;
    assert.match(combined, /Skipping live process worktree/);
    assert.equal(fs.existsSync(liveWorktree), true, 'live process worktree should be preserved');
  } finally {
    child.kill('SIGTERM');
    await exitPromise;
  }
});

test('gx doctor preserves stranded worktrees when GUARDEX_SKIP_AUTO_WORKTREE_PRUNE=1', () => {
  const repoDir = initRepoOnBranch('main');
  seedCommit(repoDir);