fix: stop prepending reasoning_content to model output #64
Open
li-xiu-qi wants to merge 1 commit into
When using cloud reasoning models (e.g. step-3.7-flash) via the Chat Completions
API, ask_llm_v2.py was wrapping reasoning_content as a <think> block and
prepending it to the model's content output. But the model's content already
contains <THINK> tags (required by the parser prompt), so this created two
layers of THINK tags:
<think>reasoning_content from API field</think>
<THINK>model's own THINK from prompt</THINK>
explain:... action:CLICK point:500,300 summary:...
The parser's str2action() uses split('</THINK>')[1] to extract key-value
pairs, which grabbed the second THINK block text instead of the actual
action parameters, causing:
ValueError: ui_action must contain 'action' or 'action_type'
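The failure can be reproduced in isolation with a minimal sketch. Only the `split('</THINK>')[1]` step and the error message are taken from the description above; `str2action_sketch`, the tab-separated field loop, and the sample strings are hypothetical stand-ins, not the project's actual parser. The sketch also assumes the prepended wrapper's closing tag matches the delimiter the parser splits on.

```python
# Hypothetical stand-in for the parser's str2action(). Only the
# split('</THINK>')[1] step and the error text come from the PR description;
# the field parsing below is an assumption made for illustration.
def str2action_sketch(output: str) -> dict:
    # Take whatever follows the FIRST closing tag.
    tail = output.split("</THINK>")[1]
    fields = {}
    for part in tail.split("\t"):
        key, sep, value = part.strip().partition(":")
        if sep:  # keep only well-formed "key:value" items
            fields[key.strip()] = value.strip()
    if "action" not in fields and "action_type" not in fields:
        raise ValueError("ui_action must contain 'action' or 'action_type'")
    return fields


# Model content in the expected single-layer format (sample values).
CONTENT = (
    "<THINK>plan the tap</THINK>"
    "\texplain:tap the icon\taction:CLICK\tpoint:500,300\tsummary:opened app"
)

# Same content with a reasoning block prepended. The matching tag case is an
# assumption: the parser only misfires if it splits on the wrapper's tag.
DOUBLED = "<THINK>reasoning from API field</THINK>\n" + CONTENT
```

Calling `str2action_sketch(CONTENT)` yields the action fields, while `str2action_sketch(DOUBLED)` lands on the model's own THINK text, finds no key-value pairs, and raises the ValueError shown above.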
Fix: log reasoning_content for debugging but return model's content as-is.
The content already follows the expected format:
<THINK>...</THINK>\texplain:...\taction:...\tsummary:...
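A rough sketch of the fixed behaviour, under stated assumptions: the function name `extract_model_output`, the logger name, and the plain-dict message shape are hypothetical and do not reflect the real code in tools/ask_llm_v2.py. It only illustrates the idea of logging reasoning_content while returning content unchanged.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("ask_llm_sketch")


def extract_model_output(message: dict) -> str:
    """Return the model's content unchanged; log reasoning_content if present.

    `message` is assumed to be a dict with an optional "reasoning_content"
    key and a "content" key, mirroring the fields named in the description.
    """
    reasoning = message.get("reasoning_content")
    if reasoning:
        # Kept for debugging only; never merged into the returned text.
        logger.debug("reasoning_content (%d chars): %s", len(reasoning), reasoning)
    return message.get("content") or ""
```

With this shape, the returned string contains exactly one THINK layer, so a parser that splits on the first closing tag reaches the key-value section directly.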
Verified with step-3.7-flash controlling Android device:
- Task: open WeChat and find specific group chat
- Result: 5 steps, completed successfully (was failing at step 1 before fix)
Problem
Fixes #62
When using cloud reasoning models (e.g. step-3.7-flash) via the Chat Completions API, the parser fails on the first step.
Root Cause
ask_llm_v2.py detects reasoning_content in the API response and wraps it as a <think> block, prepending it to the model's content. But the model's content already contains <THINK> tags (required by the parser prompt), so this creates two layers.
The parser's str2action() uses split("</THINK>")[1] to extract key-value pairs, which grabs the content after the first </THINK>, i.e., the second THINK block text, not the action parameters.
Three API Comparison
- reasoning_content field → prepended by ask_llm_v2.py
- thinking type block (separate)
- reasoning type output item (separate)
The model's content always has exactly one <THINK> layer. The problem is purely in ask_llm_v2.py's handling.
Fix
Stop prepending reasoning_content to the result. Log it for debugging, but return content as-is. The content already follows the expected <THINK>...</THINK>\texplain:...\taction:...\tsummary:... format.
Only tools/ask_llm_v2.py is modified (1 file, 19 insertions, 26 deletions).
Verification
Tested with step-3.7-flash controlling a real Android device (iQOO Z9 Turbo+). Both tasks failed before the fix (parser error on step 1) and succeeded after.