
feat(llm): preserve reasoning blocks across Anthropic and Gemini turns (#200)#205

Merged
JackChen-me merged 2 commits into open-multi-agent:main from MyPrototypeWhat:feat/reasoning-preservation-phase-1
May 3, 2026

Conversation

@MyPrototypeWhat
Contributor

Summary

Phase 1 of #200 — close the protocol gap where multi-turn extended thinking silently drops the signatures Anthropic and Gemini 3 require to continue a conversation. Scope follows the owner's approval comment verbatim.

IR changes (src/types.ts)

  • ReasoningBlock.signature — Anthropic thinking.signature and Gemini 3 thought-summary thoughtSignature. Required to echo back unchanged on Anthropic; recommended on Gemini 3.
  • ReasoningBlock.redactedData — Anthropic redacted_thinking.data. Block must still be echoed back on subsequent turns.
  • ToolUseBlock.signature — Gemini thoughtSignature for functionCall Parts. Required by Gemini 3+ ("required for function calling signatures" per Gemini docs).
  • AgentConfig.thinking: { enabled, budgetTokens?, effort? } — explicit config. effort carried separately from budgetTokens per the owner's open-question resolution. (All four additions are sketched after this list.)
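
For orientation, a minimal sketch of the four IR additions (field names follow the PR; the surrounding shapes are abridged and illustrative, not the exact source):

```ts
// Sketch only: abridged IR shapes, field names per this PR.
interface ReasoningBlock {
  type: 'reasoning';
  text: string;
  /** Anthropic thinking.signature / Gemini 3 thoughtSignature; echo back unchanged. */
  signature?: string;
  /** Anthropic redacted_thinking.data: opaque, but must still round-trip. */
  redactedData?: string;
}

interface ToolUseBlock {
  type: 'tool_use';
  id: string;
  name: string;
  input: Record<string, unknown>;
  /** Gemini thoughtSignature on functionCall Parts; required on Gemini 3+. */
  signature?: string;
}

interface ThinkingConfig {
  enabled: boolean;
  /** Maps to Anthropic budget_tokens / Gemini thinkingBudget. */
  budgetTokens?: number;
  /** Carried separately from budgetTokens per the owner's resolution. */
  effort?: 'low' | 'medium' | 'high';
}

interface AgentConfig {
  // ...existing fields elided...
  thinking?: ThinkingConfig;
}
```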

Adapter changes

src/llm/anthropic.ts — extract signature from thinking blocks; new redacted_thinking branch on incoming; outgoing toAnthropicContentBlockParam handles reasoning blocks (signature → thinking block param, redactedData → redacted_thinking block param, neither → drop). New toAnthropicThinkingParam helper forwards thinking: { type: 'enabled', budget_tokens } on both chat() and stream(). Default budget 1024 (SDK comment: must be ≥1024 and less than max_tokens).
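
A minimal sketch of that mapping, assuming the SDK's ContentBlockParam / ThinkingConfigParam type names (helper names follow the PR; local types abridged, bodies illustrative):

```ts
import type Anthropic from '@anthropic-ai/sdk';

type ReasoningBlock = { text: string; signature?: string; redactedData?: string };
type ThinkingConfig = { enabled: boolean; budgetTokens?: number };

function toAnthropicContentBlockParam(
  block: ReasoningBlock,
): Anthropic.ContentBlockParam | null {
  if (block.signature) {
    // Signed reasoning: thinking block param, signature echoed unchanged.
    return { type: 'thinking', thinking: block.text, signature: block.signature };
  }
  if (block.redactedData) {
    // Redacted reasoning: redacted_thinking block param, data stays opaque.
    return { type: 'redacted_thinking', data: block.redactedData };
  }
  return null; // neither: unsigned reasoning is dropped
}

function toAnthropicThinkingParam(
  thinking?: ThinkingConfig,
): Anthropic.ThinkingConfigParam | undefined {
  if (!thinking?.enabled) return undefined;
  // SDK: budget_tokens must be >= 1024 and less than max_tokens.
  return { type: 'enabled', budget_tokens: thinking.budgetTokens ?? 1024 };
}
```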

src/llm/gemini.ts — thoughtSignature is a top-level Part field per the @google/genai Part schema (verified at genai.d.ts:8675), NOT nested under functionCall. Extracted on both functionCall and thought-summary text Parts (Gemini 3 may sign both); echoed back on outgoing when present, dropped when unsigned to keep context lean. New toGeminiThinkingConfig helper forwards thinkingConfig: { includeThoughts, thinkingBudget }. fromGeminiPart helper extracted so chat() and stream() share one parsing implementation.
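
A sketch of the shared parsing and the config helper, assuming the @google/genai Part shape with top-level thoughtSignature (IR shape abridged; bodies illustrative):

```ts
import type { Part } from '@google/genai';

type Block =
  | { type: 'reasoning'; text: string; signature?: string }
  | { type: 'tool_use'; name: string; input: Record<string, unknown>; signature?: string }
  | { type: 'text'; text: string };

function fromGeminiPart(part: Part): Block | null {
  if (part.functionCall) {
    // thoughtSignature lives at the top level of the Part, NOT under functionCall.
    return {
      type: 'tool_use',
      name: part.functionCall.name ?? '',
      input: (part.functionCall.args ?? {}) as Record<string, unknown>,
      signature: part.thoughtSignature,
    };
  }
  if (part.thought && part.text !== undefined) {
    // Thought-summary text Part; Gemini 3 may sign these too.
    return { type: 'reasoning', text: part.text, signature: part.thoughtSignature };
  }
  if (part.text !== undefined) {
    return { type: 'text', text: part.text };
  }
  return null;
}

function toGeminiThinkingConfig(t?: { enabled: boolean; budgetTokens?: number }) {
  if (!t?.enabled) return undefined;
  return { includeThoughts: true, thinkingBudget: t.budgetTokens };
}
```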

src/llm/bedrock.ts — comment refresh only. IR now carries the signature fields the previous comment said were missing; Bedrock-side wire-up to reasoningContent.reasoningText.signature is tracked as a follow-up.

src/agent/agent.ts + src/agent/runner.ts — one-line wire-ups to thread thinking from AgentConfig through RunnerOptions into LLMChatOptions.

Tests

+23 across tests/anthropic-adapter.test.ts (+12) and tests/gemini-adapter-contract.test.ts (+11):

  • Incoming extraction (signature, redactedData, thoughtSignature on both functionCall and thought-summary parts)
  • Outgoing round-trip echo for all signed blocks
  • Request-param forwarding (default 1024 for Anthropic, includeThoughts default for Gemini, omitted when disabled/absent)
  • Negative path: unsigned reasoning dropped on outgoing for both adapters

756/756 passing (baseline 733 + 23 new). npm run lint clean for this PR (3 pre-existing bedrock.ts errors are from a missing optional peer dep @aws-sdk/client-bedrock-runtime, unrelated and present on main).

Cross-provider asymmetry to flag

Important detail captured in the IR JSDoc: the same signature field name carries different protocol semantics depending on origin — Anthropic signs ReasoningBlocks (required round-trip), Gemini signs Parts containing functionCalls (required) or text+thought:true summaries (recommended). The IR doesn't try to model the difference; the adapters know how to interpret their own signatures.

Deferred to follow-up PRs (per owner)

  • OpenAI reasoning_effort forwarding (explicit effort on ThinkingConfig, owner asked to keep separate from this PR).
  • Anthropic adaptive thinking (Claude 4.7+ accepts only {type: 'adaptive'} and rejects enabled with HTTP 400). JSDoc on toAnthropicThinkingParam documents the constraint.
  • Anthropic 4.5 interleaved-thinking-2025-05-14 beta header detection.
  • Bedrock signature wire-up (separate adapter rewrite).

Streaming signature_delta accumulation is intentionally NOT separate work — the existing stream.finalMessage() path returns the assembled SDK content with signatures intact, so the new fromAnthropicContentBlock extraction handles streaming "for free" (covered by the extracts signature from streamed final message test).
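
A sketch of why that holds (model and prompt values illustrative; finalMessage() is the SDK's stream-assembly helper):

```ts
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();
const stream = client.messages.stream({
  model: 'claude-sonnet-4-5', // illustrative
  max_tokens: 2048,
  thinking: { type: 'enabled', budget_tokens: 1024 },
  messages: [{ role: 'user', content: 'hello' }],
});

// finalMessage() resolves to the fully assembled SDK message: the SDK has
// already folded signature_delta events into each thinking block's signature,
// so the same fromAnthropicContentBlock extraction used by chat() applies.
const final = await stream.finalMessage();
```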

Test plan

  • npm test — 756/756 (baseline 733 + 23 new)
  • npm run lint — no new errors (3 pre-existing bedrock-related errors confirmed via stash)
  • Anthropic implementation cross-checked against platform.claude.com/docs/en/docs/build-with-claude/extended-thinking and SDK type definitions in @anthropic-ai/[email protected]
  • Gemini implementation cross-checked against ai.google.dev/gemini-api/docs/thinking and @google/genai Part schema (verified thoughtSignature is top-level on Part, not nested)

feat(llm): preserve reasoning blocks across Anthropic and Gemini turns (open-multi-agent#200)

Phase 1 of open-multi-agent#200 — close the protocol gap where multi-turn extended thinking
silently drops the signatures Anthropic and Gemini 3 require.

IR additions (src/types.ts):
- ReasoningBlock.signature / redactedData
- ToolUseBlock.signature
- AgentConfig.thinking: { enabled, budgetTokens?, effort? }

Anthropic adapter:
- Extract signature from thinking blocks, surface redacted_thinking as
  ReasoningBlock with redactedData; echo both back unchanged on subsequent
  turns; forward thinking: { type: 'enabled', budget_tokens } on chat() and
  stream(). Default budget 1024 (SDK-enforced minimum).

Gemini adapter:
- Extract thoughtSignature from Part top-level (verified against @google/genai
  Part schema — NOT nested under functionCall); echo back when present on
  both functionCall and thought-summary text Parts; forward
  thinkingConfig: { includeThoughts, thinkingBudget }. Unsigned reasoning is
  dropped on outgoing to keep context lean.

Bedrock: comment refresh — IR now carries the signature fields, but
adapter-side preservation is tracked as a follow-up.

+23 tests across anthropic-adapter and gemini-adapter-contract suites,
covering incoming extraction, outgoing round-trip, request-param forwarding,
and the unsigned-reasoning drop path.

Deferred to follow-up PRs (per owner's reply on open-multi-agent#200):
- OpenAI reasoning_effort forwarding
- Anthropic adaptive thinking (Claude 4.7+)
- Anthropic 4.5 interleaved-thinking-2025-05-14 beta header detection
- Bedrock signature wire-up
@JackChen-me
Member

toAnthropicThinkingParam defaults budget_tokens to 1024 without knowing maxTokens. Anthropic requires budget_tokens < max_tokens, so thinking: { enabled: true } with maxTokens <= 1024, or any budgetTokens >= maxTokens, is a guaranteed 400 at runtime. Pass maxTokens in and clamp, or validate and throw before the request.

Rest looks right for the phase-1 scope.

fix(anthropic): validate thinking.budgetTokens against API constraints

Per @JackChen-me's review on open-multi-agent#205: `toAnthropicThinkingParam` defaulted
budget_tokens to 1024 without considering maxTokens, so any caller passing
`thinking: { enabled: true }` with `maxTokens <= 1024` (or any explicit
budgetTokens >= maxTokens) would hit a runtime 400 from Anthropic.

Validate both constraints before the request:
- budget_tokens >= 1024 (SDK-documented minimum)
- budget_tokens < max_tokens (docs: "budget_tokens must be set to a value
  less than max_tokens"; the interleaved-thinking exception is not yet
  wired up — see RFC open-multi-agent#200 phase 2)

Throws with a clear message identifying which constraint failed and how
to fix it, rather than letting Anthropic's cryptic 400 propagate.

Threaded effectiveMaxTokens (= options.maxTokens ?? 4096) into both
chat() and stream() call sites so the validator sees the same value the
request will use.

+3 tests covering the two new throw paths and the regression scenario
from the review (default budget 1024 + maxTokens 1024). Existing tests
that relied on the chatOpts() default maxTokens=1024 were updated to
specify a larger maxTokens so they exercise the happy path.

759/759 passing.
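
A minimal sketch of the validate-and-throw behaviour described above (function name and error wording hypothetical; the constraints are the documented ones):

```ts
function validateThinkingBudget(budgetTokens: number, effectiveMaxTokens: number): void {
  if (budgetTokens < 1024) {
    throw new Error(
      `thinking.budgetTokens must be >= 1024 (got ${budgetTokens}); ` +
        `1024 is the SDK-documented minimum.`,
    );
  }
  if (budgetTokens >= effectiveMaxTokens) {
    throw new Error(
      `thinking.budgetTokens (${budgetTokens}) must be less than maxTokens ` +
        `(${effectiveMaxTokens}); raise maxTokens or lower budgetTokens.`,
    );
  }
}

// Both chat() and stream() thread the same value the request will use:
//   const effectiveMaxTokens = options.maxTokens ?? 4096;
//   validateThinkingBudget(options.thinking?.budgetTokens ?? 1024, effectiveMaxTokens);
```
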
@MyPrototypeWhat
Contributor Author

Good catch, you're right. Verified against the docs ("budget_tokens must be set to a value less than max_tokens") — the interleaved-thinking exception doesn't apply since the beta header isn't wired up in this PR.

Pushed a0c721c taking the validate-and-throw path:

  • budget < 1024 → throws (SDK-documented minimum)
  • budget >= maxTokens → throws with a message identifying which constraint failed and how to fix it
  • Threaded effectiveMaxTokens = options.maxTokens ?? 4096 into both chat() and stream() so the validator sees the same value the request uses
  • +3 tests including a regression for the exact scenario you flagged (default budget 1024 + maxTokens 1024)

Went with throw rather than clamp because OMA's pattern is explicit-over-magic and silently changing a user-specified budgetTokens would surprise debuggers. Happy to switch to clamp if you'd prefer.

759/759 still green.

@JackChen-me JackChen-me merged commit 24f1180 into open-multi-agent:main May 3, 2026
4 checks passed
CodingBangboo pushed a commit to CodingBangboo/open-multi-agent that referenced this pull request May 4, 2026
feat(llm): preserve reasoning blocks across Anthropic and Gemini turns (open-multi-agent#200) (open-multi-agent#205)
MyPrototypeWhat added a commit to MyPrototypeWhat/open-multi-agent that referenced this pull request May 6, 2026
fix(copilot, azure-openai): backfill sampling-param forwarding to match openai
JackChen-me pushed a commit that referenced this pull request May 6, 2026
…m parity in copilot/azure (#200) (#209)

* feat(openai): forward thinking.effort as reasoning_effort (#200)

Phase-1 follow-up to #205. Owner explicitly carved this out as separate
work in their RFC #200 reply: "OpenAI reasoning_effort is worth adding,
but keep it separate from the phase-1 protocol fix."

The IR field `AgentConfig.thinking.effort` was added in #205 but only
wired up for Anthropic / Gemini. This PR adds the OpenAI side.

Changes:
- openai.ts chat() and stream(): one-line `reasoning_effort: options.thinking?.effort`
  added before `...options.extraBody` so extraBody can override (consistent with
  how the other sampling params are ordered; sketched after this list).
- types.ts AgentConfig.thinking JSDoc: refreshed to document per-provider
  mapping now that OpenAI is wired up.
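
A sketch of that placement (option shapes abridged and illustrative):

```ts
type ChatOptions = {
  model: string;
  temperature?: number;
  thinking?: { effort?: 'low' | 'medium' | 'high' };
  extraBody?: Record<string, unknown>;
};

function buildParams(options: ChatOptions, messages: unknown[]) {
  return {
    temperature: options.temperature,
    reasoning_effort: options.thinking?.effort,
    // extraBody is spread after reasoning_effort, so it can override it,
    // consistent with how the other sampling params are ordered.
    ...options.extraBody,
    model: options.model,
    messages,
  };
}
```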

Design notes:
- The `'minimal'` effort value (gpt-5) isn't yet in the SDK 4.104 type
  union (`'low' | 'medium' | 'high' | null`), but the existing `as
  ChatCompletionCreateParamsNonStreaming` cast at the bottom of the params
  object already covers this, consistent with how `top_k` / `min_p` are
  forwarded. Test verifies pass-through.
- `thinking.enabled` and `thinking.budgetTokens` are intentionally ignored
  by the OpenAI adapter — OpenAI's reasoning surface is qualitative
  (`effort`) only, and the explicit-effort design was the owner's
  open-question resolution in #200.
- Scope deliberately tight: openai.ts only. Other OpenAI-compatible
  adapters (copilot, azure-openai, deepseek, grok, minimax, qiniu) are
  unchanged — they each route to providers with their own reasoning
  semantics, so a separate PR per adapter is the right unit.

+6 tests covering: chat forwarding, stream forwarding, gpt-5 'minimal'
pass-through, omission when thinking absent, omission when effort unset
(with budgetTokens set, to verify enabled/budgetTokens are correctly
ignored), extraBody override of effort.

770/770 passing.

* feat(copilot, azure-openai): forward thinking.effort as reasoning_effort

Both adapters implement LLMAdapter directly (rather than extending
OpenAIAdapter like grok/minimax/deepseek/qiniu do), so they didn't pick
up the reasoning_effort forwarding from the openai.ts change in this PR.
Mirror the same pattern to close the gap:

- Add `reasoning_effort: options.thinking?.effort` to chat() and stream()
  on both adapters.
- Add `as ChatCompletionCreateParams{NonStreaming,Streaming}` casts on
  the params objects (didn't exist before because neither adapter
  forwarded any param outside the SDK's declared union — the cast is
  needed to pass the IR's `'minimal'` value through, same as openai.ts).

+8 tests across copilot-adapter and azure-openai-adapter suites: chat
forwarding, stream forwarding, gpt-5 'minimal' pass-through, omission
when thinking absent / effort unset.

778/778 passing.

* fix(copilot, azure-openai): backfill sampling-param forwarding to match openai

The Copilot and Azure-OpenAI adapters were quietly dropping seven
AgentConfig fields that openai.ts has always forwarded:

  frequency_penalty, presence_penalty, top_p, top_k, min_p,
  parallel_tool_calls, extraBody (spread)

A user setting `topP: 0.9` or `extraBody: { logit_bias: {...} }` on a
Copilot/Azure agent would see no effect on the wire, with no warning —
the AgentConfig accepted the field, the adapter silently ignored it.

Mirror the openai.ts param object exactly: same field set, same ordering
(sampling first, ...extraBody, then structural fields so extraBody can
override sampling defaults but cannot clobber model/messages/tools/stream).
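
The ordering contract, sketched (field set abridged; option names illustrative):

```ts
type SamplingOptions = {
  model: string;
  temperature?: number;
  frequencyPenalty?: number;
  presencePenalty?: number;
  topP?: number;
  parallelToolCalls?: boolean;
  extraBody?: Record<string, unknown>;
};

function buildChatParams(options: SamplingOptions, messages: unknown[], tools?: unknown[]) {
  return {
    // 1) sampling params first...
    temperature: options.temperature,
    frequency_penalty: options.frequencyPenalty,
    presence_penalty: options.presencePenalty,
    top_p: options.topP,
    parallel_tool_calls: options.parallelToolCalls,
    // 2) ...then extraBody, which may override any sampling default...
    ...options.extraBody,
    // 3) ...then structural fields, which extraBody can never clobber.
    model: options.model,
    messages,
    tools,
    stream: false,
  };
}
```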

This was discovered while tracing the reasoning_effort change — it's not
strictly part of #200, but the Copilot/Azure adapters were already in
this PR's surface area for the reasoning_effort fix and the silent-drop
behaviour is the same bug class as the budget-vs-maxTokens issue you
caught on #205. Folding the parity fix in while the diff is open.

+6 tests (3 per adapter):
- forwards full sampling-param set + extraBody
- extraBody overrides sampling params (field-ordering contract)
- extraBody cannot override structural fields (model/stream)

784/784 passing.

* fix(copilot, azure-openai): drop top_k / min_p — vLLM-only, not accepted by these endpoints

Mid-PR correction. The previous "mirror openai.ts exactly" reasoning was
sloppy: openai.ts forwards `top_k` and `min_p` deliberately for the
local-server case (`provider: 'openai', baseURL: 'http://localhost:8000'`
pointing at vLLM), but Copilot and Azure OpenAI both have hard-coded cloud
endpoints with no equivalent escape hatch.

Verified against real docs:
- Azure OpenAI Chat Completions reference lists `frequency_penalty` /
  `presence_penalty` / `top_p` / `parallel_tool_calls` / `logit_bias`,
  but explicitly does NOT include `top_k` or `min_p`.
- vLLM docs confirm `top_k` / `min_p` are vLLM-only sampling extensions
  not part of the OpenAI Chat Completions spec.
- Copilot uses `api.githubcopilot.com` — also a fixed cloud proxy that
  normalises to standard OpenAI Chat Completions.

So forwarding these as top-level fields on Copilot / Azure would be dead
weight at best and a 400 at worst. Users who genuinely need them on
either adapter can still pass via `extraBody` (e.g. for a local proxy
that re-routes to vLLM).

Removed from copilot.ts and azure-openai.ts param objects in both chat()
and stream(); updated cast comments to drop the `top_k` / `min_p`
mention; replaced the "forwards full set" tests with two narrower tests
each: (a) forwards the OpenAI-cloud-compatible subset, (b) does NOT
forward `top_k` / `min_p` (regression guard).

786/786 passing.

* docs(copilot): disclose that the API spec is reverse-engineered, not official

After verifying my assumptions per-field (per @MyPrototypeWhat's review
push), confirmed:

- Azure OpenAI: docs explicitly list frequency_penalty / presence_penalty
  / top_p / parallel_tool_calls (verbatim quotes from learn.microsoft.com).
- Copilot: GitHub doesn't publish a public chat/completions API
  reference. The available references are community-reverse-engineered
  (e.g. Alorse/copilot-to-api), and the documented request examples only
  show messages/model/max_tokens/temperature/stream/tools/tool_choice —
  the four sampling fields aren't called out.

Forwarding them anyway because:
1. Copilot's design promise is OpenAI Chat Completions compatibility.
2. The four fields are standard OpenAI fields, so the proxy promise
   should cover them.
3. Per-model behaviour (some routed reasoning models may ignore them) is
   out of scope for the adapter — same situation as openai.ts forwarding
   topP to OpenAI's o-series, which the model itself ignores.
4. Removing them would re-introduce the silent-drop bug for users whose
   models DO honour the fields.

Added a JSDoc caveat in the chat() params block so future maintainers
see the verification gap and don't mistake the forwarding for a verified
contract.

* revert(copilot): keep param surface narrow — no public spec for the 4 sampling fields

Per @MyPrototypeWhat's review: "既然没有公开的证据证明,那就保持原状呗?"
("If there's no public evidence, just keep the original state, right?")

Agreed — Copilot's API isn't publicly documented by GitHub, and the
reverse-engineered references only show model / messages / max_tokens /
temperature / tools / stream / tool_choice. Forwarding additional
sampling fields (frequency_penalty, presence_penalty, top_p,
parallel_tool_calls) and extraBody on the assumption that "Copilot is
OpenAI-compatible" was a guess, not evidence.

Reverting the Copilot side of the sampling-param backfill from this PR.
The Copilot adapter now stays at its pre-PR field surface PLUS the new
`reasoning_effort` (which is the actual #200 work and verified against
the SDK's typed union).

Replaced the "forwards full set" + "does NOT forward top_k/min_p" tests
with one consolidated "does NOT forward unverified fields" regression
test, plus a positive test that the documented subset (temperature /
max_tokens / tools) still works.

Azure-OpenAI side is unchanged — Microsoft Learn explicitly documents
frequency_penalty / presence_penalty / top_p / parallel_tool_calls /
logit_bias on the Azure OpenAI Chat Completions endpoint, so that
backfill stays.

784/784 passing.

* refactor(copilot, azure-openai): replace object-level cast with field-level cast on reasoning_effort

The object-level `as ChatCompletionCreateParams{NonStreaming,Streaming}`
cast was originally added to cover the IR's `'minimal'` reasoning effort
value (gpt-5), which isn't in SDK 4.104's type union. After last commit's
revert, that's the ONLY remaining type mismatch in either adapter — every
other field they forward is already in the SDK union.

A blanket object-level cast is overkill for a single field. Replace with
a narrow field-level `as 'low' | 'medium' | 'high' | null | undefined` on
the `reasoning_effort` line itself. Side benefit: TypeScript still
type-checks the rest of the params object, so a future typo / wrong-type
field would now be caught at compile time instead of slipping past the
broad cast.

(Note: openai.ts keeps its object-level cast because it forwards `top_k`
and `min_p` — those are field NAMES not in the SDK type at all, not
just unsupported values, so a field-level cast doesn't help.)
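
Sketched, with an abridged params object (option shape illustrative):

```ts
// Before: a blanket object-level cast silenced type checking for every field.
//   return { ... } as ChatCompletionCreateParamsNonStreaming;
// After: only the mismatched field is cast; the rest of the object stays checked.
function buildParams(options: { thinking?: { effort?: string } }, model: string) {
  return {
    model,
    reasoning_effort: options.thinking?.effort as
      | 'low'
      | 'medium'
      | 'high'
      | null
      | undefined,
  };
}
```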

Removed the now-unused `ChatCompletionCreateParamsNonStreaming` and
`ChatCompletionCreateParamsStreaming` imports from both files.

784/784 passing. Lint clean.

* docs(azure-openai): trim verbose comments — keep only the WHY, drop verification process

The previous comment block leaked PR-review process into source code:
- "Mirrors the field ordering contract in openai.ts" — decorative pointer
- "(Confirmed against the Azure OpenAI Chat Completions API reference,
  which lists frequency_penalty/presence_penalty/top_p/parallel_tool_calls
  but not top_k or min_p.)" — verification work, belongs in commit/PR
  description, not in long-lived source

Kept the two genuinely load-bearing parts:
- Field ordering rationale (sampling first → extraBody → structural)
  — without this, future contributors will reorder
- Single-line note on why top_k/min_p are absent — defends against
  future "you forgot these" PRs

* docs: trim verbose comments per CLAUDE.md ("only when WHY is non-obvious")

After @MyPrototypeWhat's audit ("是否必须的?" — "Is this necessary?"), trimmed three categories:

1. Copilot chat() narrow-surface comment: 9 lines → 3 lines.
   The explicit list of forwarded/not-forwarded fields was redundant with
   the source code immediately below it. Kept the load-bearing WHY
   ("Copilot's API isn't publicly documented").

2. Three test rationale comments removed entirely (4-6 lines each).
   The test names already convey both WHAT and WHY:
   - "forwards the OpenAI-cloud-compatible sampling params and extraBody"
   - "does NOT forward vLLM-only top_k / min_p (Azure runs MS-hosted OpenAI models)"
   - "does NOT forward sampling params or extraBody that lack public Copilot spec coverage"
   The verification process and bug history live in PR #209's description
   and the per-commit messages, not in long-lived test source.

Net -21 lines. 784/784 still passing. Lint clean.

* refactor(thinking): drop 'minimal' from IR — route via extraBody (per @JackChen-me)

Per owner's call on #209: option 2. Narrow `ThinkingConfig.effort` to the
SDK 4.104-declared values (`'low' | 'medium' | 'high'`) and document
that newer API values (gpt-5 `'minimal'`, GPT-5.5 `'none'`) travel via
`extraBody: { reasoning_effort: '<value>' }` — the same escape hatch
already used for vLLM-only `top_k` / `min_p`.
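
The two routes, sketched (config shapes illustrative):

```ts
// Typed surface: effort values the SDK 4.104 union declares.
const typedAgent = {
  provider: 'openai',
  model: 'gpt-5',
  thinking: { enabled: true, effort: 'high' as const },
};

// Escape hatch: newer API values not yet in the SDK union travel via extraBody,
// the same route already used for vLLM-only top_k / min_p.
const extraBodyAgent = {
  provider: 'openai',
  model: 'gpt-5',
  extraBody: { reasoning_effort: 'minimal' },
};
```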

Cleanup:
- src/types.ts: ThinkingConfig.effort union narrowed; JSDoc points users
  to extraBody for newer SDK-untyped values.
- src/llm/openai.ts: object-level cast comment trimmed (no longer needs
  to mention 'minimal'; cast still required for top_k/min_p/extraBody).
- src/llm/copilot.ts + src/llm/azure-openai.ts: field-level cast on
  `reasoning_effort` removed entirely. The IR type now satisfies the SDK
  type union, so no boundary cast is needed.
- tests/openai-adapter.test.ts + tests/azure-openai-adapter.test.ts:
  rewrote the 'minimal' tests to verify the extraBody route (the now-
  supported path) instead of the old IR-direct route.
- tests/copilot-adapter.test.ts: deleted the 'minimal' test outright.
  Copilot doesn't forward extraBody (narrow surface, no public spec),
  so neither route exists for Copilot users wanting 'minimal'. They
  use 'low' instead, or wait for SDK bump.

783/783 passing. Lint clean. Net -7 lines.

No behaviour change for users on `'low' | 'medium' | 'high'` (the entire
SDK-typed surface).
