fix(react): resolve response_format conflict with native tool calling by qinxuye · Pull Request #175 · xorbitsai/xagent

qinxuye · 2026-03-17T13:17:00Z

Remove response_format when tools are available to enable native tool calls
Fix vision tool parameter names (use 'images' and 'question' instead of 'image' and 'prompt')
Add file information propagation in DAG step execution
Add tool parameter JSON schema display in plan and step stages
Add tests for file info propagation

This resolves the issue where Gemini returned empty responses when tools were present due to response_format conflicting with the tools parameter.

rogercloud

Code Review Summary

Issues Found: 5

1 CRITICAL: String type="tool_call" falls through to text handling
3 MINOR: Code edge cases and refactoring opportunities
1 INFO: Breaking changes documentation

Overall Assessment

This PR fixes an important bug (Gemini empty responses with tools) by enabling native tool calling. The core fix is sound, but there's one critical edge case that should be addressed.

Detailed Issues

1. CRITICAL - Missing `type="tool_call"` handling in string response parsing

File: src/xagent/core/agent/pattern/react.py

When a string response contains valid JSON with type="tool_call", it incorrectly falls through to "direct text response" handling instead of being rejected.

Current code (line 1683):

if isinstance(parsed, dict) and parsed.get("type") == "final_answer":
    # handles final_answer
    return action
except (json.JSONDecodeError, AttributeError):
    pass  # Only catches JSON errors

# BUG: If parsed has type="tool_call", falls through here!

Suggested fix:

if isinstance(parsed, dict):
    action_type = parsed.get("type")
    if action_type == "final_answer":
        # ... handle final_answer
    elif action_type == "tool_call":
        # Should not happen with native tool calling - reject
        raise PatternExecutionError(...)

2. MINOR - Potential IndexError on empty `accumulated_tool_calls`

File: src/xagent/core/model/chat/basic/claude.py

Line 887: tool_id = list(accumulated_tool_calls.keys())[0]

If we reach this branch because accumulated_tool_calls is empty (falsy), this raises IndexError. Add a guard check.

3. MINOR - anyOf/oneOf code duplication

File: src/xagent/core/model/chat/basic/claude.py (lines 55-139)

The anyOf and oneOf handling blocks contain ~80 lines of nearly identical logic. Consider extracting to a helper function like _handle_union_type(schema, key).

4. MINOR - Empty string tool_id creates dict entry

File: src/xagent/core/model/chat/basic/openai.py (line 966-977)

The comment says this handles "qwen's empty string id format", which appears intentional. However, if association by index fails, a tool call entry with "" as the key is created. Consider skipping such chunks instead.

5. INFO - Breaking changes in vision_tool.py

File: src/xagent/core/tools/adapters/vibe/vision_tool.py

Function signatures changed:

image parameter removed (renamed to images)
prompt parameter removed (renamed to question)
Both are now required (no default values)

These are breaking changes for existing code. Consider adding a CHANGELOG entry or backward compatibility shim.

Test Results

All 33 tests pass. The test expectations were updated to match the new behavior (invalid JSON treated as text response rather than triggering retry).

Recommendation

Address Issue #1 before merge. Issues #2-#5 are edge cases/documentation and can be addressed in follow-ups if needed.

src/xagent/core/agent/pattern/react.py

src/xagent/core/model/chat/basic/claude.py

src/xagent/core/model/chat/basic/openai.py

src/xagent/core/tools/adapters/vibe/vision_tool.py

rogercloud

Re-Review: Additional Issues Found

Reviewed the diff beyond the previously reported issues. Found 3 new minor issues.

New Issues

MINOR: Repeated instructional text inflates context (observation_content)
MINOR: Missing required reasoning field causes unnecessary retries
MINOR: Stale/misleading comment block

None are blocking, but addressing them improves code health.

src/xagent/core/agent/pattern/react.py

- Remove response_format when tools are available to enable native tool calls - Fix vision tool parameter names (use 'images' and 'question' instead of 'image' and 'prompt') - Add file information propagation in DAG step execution - Add tool parameter JSON schema display in plan and step stages - Add tests for file info propagation This resolves the issue where Gemini returned empty responses when tools were present due to response_format conflicting with the tools parameter.

- Add debug logging for better tracing in ReAct pattern - Simplify action requirements format when tools are available - Improve anyOf/oneOf handling for Claude tool schemas - Handle Union[str, List[str]] types properly - General code formatting improvements from ruff

- Add stream_chat support to MockReActLLM and MockLLM in tests - Fix string response parsing to prioritize JSON extraction - Properly handle final_answer with success/error fields - Update test expectations for invalid/truncated JSON in native mode - Add create_mock_stream_chat helper for DAG tests All 30 tests now pass with native tool calling enabled.

- Extract reasoning from full_content before tool calls - Pass reasoning through response object to action converter - Use actual LLM reasoning instead of hardcoded placeholder - Update prompt to only require type and reasoning in JSON - Add reasoning field to action execution traces - Ensure reasoning is available in final answer results This fixes the issue where LLM reasoning was lost when using native function calling, as the OpenAI API format doesn't include reasoning in tool_calls. Now reasoning is captured from the text content LLM generates before invoking tools.

This change separates the action type decision from tool invocation: 1. First LLM call: Determine action type (tool_call vs final_answer) - Returns JSON with action type and reasoning - No tool schemas passed 2. Second LLM call (if tool_call): Invoke specific tool via native calling - Pass tool schemas to LLM - LLM uses native function calling API - Returns tool_calls with tool_name and arguments Benefits: - Clearer separation of concerns - Better reasoning capture (not lost in native tool call format) - More predictable LLM behavior - Easier to debug and trace Tests updated to reflect 3-call pattern: - Call 1: Action type - Call 2: Native tool invocation - Call 3: Final answer

- Fix: Clean messages_with_prompt instead of messages in second call - Simplify first call prompt to focus on action type decision only - Remove confusing instructions about tool calling in first call - Clarify that system will guide tool invocation in follow-up call

…ling

Remove verbose debug logging from the first and second LLM calls while preserving important warnings and errors. The changes include: - Remove tool availability and name logs - Remove response type/value debug logs - Remove json_repair success logs - Remove token chunk logs - Remove second call info logs Keep critical logs for validation failures, unknown types, and errors.

…empty Add guard check before accessing accumulated_tool_calls.keys()[0] when event.index is not available. If accumulated_tool_calls is empty, skip the chunk instead of raising IndexError.

…tibility - Only use two-phase tool calling when step_id == "main" (top-level ReAct) - Skip second call for DAG steps to maintain backward compatibility - This fixes DAG tests that expect single-phase tool calling behavior - Also fix potential IndexError in claude.py when accumulated_tool_calls is empty

In DAG mode (step_id != "main"), use traditional single-phase tool calling where tools are passed to the LLM from the start. This maintains backward compatibility with DAG tests and avoids infinite retry loops. In main ReAct mode (step_id == "main"), use two-phase tool calling: - First call: JSON format to decide action type - Second call: Native tool calling to invoke the tool This fixes the issue where DAG steps were executing tools in a loop.

- refactor(claude): extract duplicated anyOf/oneOf handling into _handle_union_type() helper (~85 lines reduced) - fix(openai): skip unassociable tool call chunks instead of creating entries with empty string keys - fix(react): shorten observation instructional text while keeping behavioral nudge to prevent tool loops - fix(react): add missing required 'reasoning' field in _normalize_action_data() fallback dicts to prevent ValidationError - fix(react): remove stale comment about missing task_id/step_id access - fix(test): prevent infinite tool call loop in DAG sequential test by tracking executed tools in mock LLM

XprobeBot added the bug Something isn't working label Mar 17, 2026

qinxuye force-pushed the enh/tool_call branch from 4384073 to 8034dd0 Compare March 27, 2026 13:38

qinxuye marked this pull request as ready for review March 27, 2026 17:09

qinxuye force-pushed the enh/tool_call branch 2 times, most recently from 982a9bd to e354056 Compare March 27, 2026 18:08

rogercloud reviewed Mar 28, 2026

View reviewed changes

src/xagent/core/agent/pattern/react.py Outdated Show resolved Hide resolved

src/xagent/core/agent/pattern/react.py Outdated Show resolved Hide resolved

src/xagent/core/agent/pattern/react.py Outdated Show resolved Hide resolved

qinxuye and others added 12 commits March 29, 2026 23:44

debug(react): add logging to track tool availability in two-phase cal…

7a9c0a9

…ling

fix(claude): fix potential IndexError when accumulated_tool_calls is …

7711cf7

…empty Add guard check before accessing accumulated_tool_calls.keys()[0] when event.index is not available. If accumulated_tool_calls is empty, skip the chunk instead of raising IndexError.

qinxuye force-pushed the enh/tool_call branch from 7aaa8da to d9243d0 Compare March 29, 2026 15:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(react): resolve response_format conflict with native tool calling#175

fix(react): resolve response_format conflict with native tool calling#175
qinxuye wants to merge 12 commits intoxorbitsai:mainfrom
qinxuye:enh/tool_call

qinxuye commented Mar 17, 2026

Uh oh!

rogercloud left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

rogercloud left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

qinxuye commented Mar 17, 2026

Uh oh!

rogercloud left a comment

Choose a reason for hiding this comment

Code Review Summary

Issues Found: 5

Overall Assessment

Detailed Issues

1. CRITICAL - Missing type="tool_call" handling in string response parsing

2. MINOR - Potential IndexError on empty accumulated_tool_calls

3. MINOR - anyOf/oneOf code duplication

4. MINOR - Empty string tool_id creates dict entry

5. INFO - Breaking changes in vision_tool.py

Test Results

Recommendation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

rogercloud left a comment

Choose a reason for hiding this comment

Re-Review: Additional Issues Found

New Issues

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

1. CRITICAL - Missing `type="tool_call"` handling in string response parsing

2. MINOR - Potential IndexError on empty `accumulated_tool_calls`