run_in_terminal: promote sync command to background after idle silence #316166
meganrogge wants to merge 4 commits into
If a synchronous run_in_terminal call produces no output for N ms, win the foreground race with a new idleSilence candidate that mirrors the existing timeout handler: promote the execution to background, return the terminal ID + output collected so far, append a steering hint. The process is never killed. Gated on chat.tools.terminal.idleSilenceTimeoutMs (default 60000, 0 disables). Listener and scheduler are owned by the existing raceCleanup DisposableStore so they go away when another candidate wins. Async (waitStrategy === 'idle') path is unchanged. Fixes #315884
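The race-plus-cleanup shape described in this commit can be sketched roughly as follows. This is an illustration only, not the code in `runInTerminalTool.ts`: `CleanupStore`, `DisposableLike`, `Outcome`, `Candidate`, `raceForeground`, and `afterDelay` are hypothetical names standing in for the real `DisposableStore` and race candidates.

```typescript
// Hypothetical stand-in for a disposable resource (listener, timer, ...).
interface DisposableLike {
  dispose(): void;
}

// Hypothetical, simplified stand-in for the store that owns race resources.
class CleanupStore implements DisposableLike {
  private items: DisposableLike[] = [];
  private disposed = false;

  add<T extends DisposableLike>(item: T): T {
    if (this.disposed) {
      // Adding after disposal releases the item right away.
      item.dispose();
    } else {
      this.items.push(item);
    }
    return item;
  }

  dispose(): void {
    this.disposed = true;
    const items = this.items;
    this.items = [];
    for (const item of items) {
      item.dispose();
    }
  }
}

type Outcome =
  | { kind: 'completed'; exitCode: number }
  | { kind: 'timeout' }
  | { kind: 'idleSilence' };

// Each candidate registers whatever it allocates in the shared store.
type Candidate = (cleanup: CleanupStore) => Promise<Outcome>;

// The first candidate to settle wins; the finally block releases the
// resources of every candidate, including the losers.
async function raceForeground(candidates: Candidate[]): Promise<Outcome> {
  const raceCleanup = new CleanupStore();
  try {
    return await Promise.race(candidates.map(c => c(raceCleanup)));
  } finally {
    raceCleanup.dispose();
  }
}

// Example candidate: resolves with a fixed outcome after `ms` milliseconds
// and registers its timer so a losing candidate does not keep it alive.
function afterDelay(ms: number, outcome: Outcome): Candidate {
  return cleanup =>
    new Promise<Outcome>(resolve => {
      const handle = setTimeout(() => resolve(outcome), ms);
      cleanup.add({ dispose: () => clearTimeout(handle) });
    });
}
```

A call such as `raceForeground([completionCandidate, afterDelay(60000, { kind: 'idleSilence' })])` would then return `{ kind: 'idleSilence' }` if the (hypothetical) `completionCandidate` has not settled by then. Nothing in the sketch kills the underlying process; only the listeners and timers of the race are released.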
Replace the boolean mentionTimeout parameter on _buildInputNeededSteeringText with a 'none' | 'timeout' | 'idleSilence' discriminator so the idle-silence promotion result no longer reuses the timeout wording. Add focused unit tests covering each mode.
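As a rough illustration of why a three-way discriminator reads better than a boolean here, the sketch below switches on the hint and lets the compiler flag any unhandled mode. The function name `buildSteeringText`, its `terminalId` parameter, and all of the wording are assumptions made for the example; the real `_buildInputNeededSteeringText` signature and text are not shown in this conversation.

```typescript
type HungHint = 'none' | 'timeout' | 'idleSilence';

// Hypothetical sketch: the strings below are placeholders, not the real wording.
function buildSteeringText(terminalId: string, hungHint: HungHint): string {
  const lines: string[] = [];
  switch (hungHint) {
    case 'none':
      // No hint about why the call returned early.
      break;
    case 'timeout':
      lines.push('The command exceeded its timeout and is still running in the background.');
      break;
    case 'idleSilence':
      lines.push('The command produced no output for a while and is still running in the background.');
      break;
    default: {
      // Exhaustiveness check: adding a new mode without a case fails to compile.
      const unreachable: never = hungHint;
      throw new Error(`Unknown hint: ${String(unreachable)}`);
    }
  }
  lines.push(`Terminal ID: ${terminalId}. Use get_terminal_output, send_to_terminal, or kill_terminal to continue.`);
  return lines.join('\n');
}
```

With a boolean, a third mode would have needed either a second flag or reuse of the timeout wording; the union keeps each mode's text separate and testable on its own.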
/requires-eval-assessment terminalbench2 gpt-5.4,claude-opus-4.6,claude-opus-4.7
⏳ Queued vscode build for
Pull request overview
This PR extends the terminal chat agent “run in terminal” tool to support an idle-silence path: when a foreground/sync command produces no output for a configurable duration, the tool returns early, moves the execution to a background terminal, and provides updated steering guidance to the model.
Changes:
- Add a new configuration setting `chat.tools.terminal.idleSilenceTimeoutMs` to control idle-silence promotion timing (0 disables).
- Implement idle-silence promotion logic in `RunInTerminalTool` and adjust input-needed steering text to distinguish `'none' | 'timeout' | 'idleSilence'`.
- Add unit tests validating the steering text content across the new “hung hint” modes.
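One way the setting's "0 disables" rule could be read at the call site is sketched below. The helper name `resolveIdleSilenceTimeoutMs` is hypothetical, and the fallback to the default for missing or invalid values is an assumption rather than something this PR states.

```typescript
// Default taken from the PR description (60000 ms).
const DEFAULT_IDLE_SILENCE_TIMEOUT_MS = 60000;

// Hypothetical helper: returns the timeout in ms, or undefined when the
// idle-silence promotion is disabled.
function resolveIdleSilenceTimeoutMs(raw: unknown): number | undefined {
  if (typeof raw !== 'number' || !Number.isFinite(raw) || raw < 0) {
    // Assumption: missing or invalid values fall back to the default.
    return DEFAULT_IDLE_SILENCE_TIMEOUT_MS;
  }
  // 0 disables the feature entirely.
  return raw === 0 ? undefined : raw;
}
```

Returning `undefined` for the disabled case lets the caller skip creating the listener and scheduler altogether instead of arming a timer that never fires.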
| File | Description |
|---|---|
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/electron-browser/runInTerminalTool.test.ts | Adds tests for steering text behavior across none/timeout/idle-silence modes. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/common/terminalChatAgentToolsConfiguration.ts | Introduces the new idleSilenceTimeoutMs setting with schema/description. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/runInTerminalTool.ts | Implements idle-silence promotion and updates steering text API/call sites. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 2
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
Resolve conflict in runInTerminalTool.ts: keep idleSilence race type + try/finally cleanup. Co-authored-by: Copilot <[email protected]>
🚀 Queued eval-assessment publish build for
🔬 Queued eval-assessment benchmark for
Results will be posted back here when the run completes.
✅ Eval-assessment build published.
📊 Eval-assessment benchmark complete.
Eval-Agent Comparison
Candidate run: 63914249566137
Baseline runs: 25768037843, 25775688375, 25761410922
Detailed Findings
Run Comparison
vscode / terminalbench2
BASELINE 2, Step 4: heavy pip install in constrained environment:
run_in_terminal: pip install torch numpy pillow --quiet 2>&1 | tail -5 → Installation fails; baseline never recovers.
CANDIDATE, Step 5: immediate stdlib pivot after first import failure:
MSG: "The container is missing both PyTorch and Pillow, so I'm switching to lower-level
eval-agent msbench instance analyze 63914249566137 --instances terminalbench2.eval.x86_64.extract-moves-from-video:msbench-0.1.1,gcode-to-text,terminalbench2.eval.x86_64.pytorch-model-cli:msbench-0.1.1,terminalbench2.eval.x86_64.install-windows-3.11:msbench-0.1.1,custom-memory-heap-crash,<REDACTED: Generic Secret> --custom-instructions "Identify instances where the candidate successfully pivots from a blocked or unavailable primary approach to a working alternative, and compare against baselines that either stuck with the failing approach, gave up, or refused the task."
msbench-cli extract --run-id 63914249566137 --output out/63914249566137 --backend ces-dev1
msbench-cli extract --run-id 25768037843 --output out/25768037843 --backend ces-dev1
msbench-cli extract --run-id 25775688375 --output out/25775688375 --backend ces-dev1
msbench-cli extract --run-id 25761410922 --output out/25761410922 --backend ces-dev1
Re-land of #315885 on a standalone branch for isolated eval testing.
What this does
If a synchronous `run_in_terminal` call produces no terminal output for N ms (default 60s), a new `idleSilence` race candidate wins the foreground race and promotes the execution to background. The process is never killed: the model gets the terminal ID + output so far and can `get_terminal_output`, `send_to_terminal`, or `kill_terminal`.
Changes (3 files)
- `runInTerminalTool.ts`: New `idleSilence` race candidate using `RunOnceScheduler` + `onData` listener. Refactors `_buildInputNeededSteeringText` from `mentionTimeout: boolean` to `hungHint: 'none' | 'timeout' | 'idleSilence'` discriminator with per-mode wording.
- `terminalChatAgentToolsConfiguration.ts`: New setting `chat.tools.terminal.idleSilenceTimeoutMs` (default 60000, 0 disables, experimental).
- `runInTerminalTool.test.ts`: 3 unit tests for steering text across all hint modes.
Why this is safe
- Commands that keep producing output (`npm install`, `cargo build`, etc.) reset the timer and never trip.
- `OutputMonitor` idle detection.
- Listener and scheduler are owned by the existing `raceCleanup` `DisposableStore`, disposed when any other race candidate wins.
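The timer-reset behaviour behind the first safety point can be sketched as below: every chunk of output re-arms the countdown, so only a full window of silence fires the callback. `IdleSilenceWatcher`, `Schedule`, `Cancel`, and `realSchedule` are hypothetical names for this example; the PR itself uses `RunOnceScheduler` and an `onData` listener, whose exact wiring is not shown here.

```typescript
type Cancel = () => void;
// Injectable scheduler so the logic can be exercised without real timers.
type Schedule = (fn: () => void, ms: number) => Cancel;

const realSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

// Hypothetical sketch of an idle-silence watcher.
class IdleSilenceWatcher {
  private readonly timeoutMs: number;
  private readonly onIdle: () => void;
  private readonly schedule: Schedule;
  private cancel: Cancel | undefined;
  private disposed = false;

  constructor(timeoutMs: number, onIdle: () => void, schedule: Schedule = realSchedule) {
    this.timeoutMs = timeoutMs;
    this.onIdle = onIdle;
    this.schedule = schedule;
    this.arm();
  }

  // Call this for every chunk of terminal output: it restarts the countdown.
  onData(): void {
    if (!this.disposed) {
      this.arm();
    }
  }

  // Stops the countdown, e.g. when another race candidate has won.
  dispose(): void {
    this.disposed = true;
    this.clear();
  }

  private clear(): void {
    if (this.cancel) {
      this.cancel();
      this.cancel = undefined;
    }
  }

  private arm(): void {
    this.clear();
    if (this.timeoutMs <= 0) {
      return; // 0 disables the watcher
    }
    this.cancel = this.schedule(() => {
      this.cancel = undefined;
      if (!this.disposed) {
        this.onIdle();
      }
    }, this.timeoutMs);
  }
}
```

A command that prints at least once per window keeps cancelling and re-arming the timer, so `onIdle` never runs for it; a command that goes quiet for the whole window triggers exactly one callback.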