Description
When an assistant step makes parallel tool calls, the tool-result parts are recorded in completion order (whichever execute resolves first), not call order. toResponseMessages (packages/ai/src/generate-text/to-response-messages.ts) then builds the tool message by iterating step content in that arrival order, with no canonical sort.
Completion order is nondeterministic across requests. The same logical history therefore serializes with tool_result blocks in different orders depending on how it was produced:
- Mid-run steps: results appear in completion order
- History rebuilt from persistence (e.g. convertToModelMessages over stored UIMessages): results appear in call order, because UIMessage tool parts are created at call time
Providers pair tool_use/tool_result by id, so responses are unaffected - but prompt caching is byte-sensitive.
The first parallel batch whose order differs between requests invalidates the cached prefix from that point onward. In a long agent loop this shows up as the cache being re-written almost every turn; we measured ~47k tokens of redundant cache writes per request on Bedrock/Anthropic before tracing the cause to this.
Repro:
- streamText with two tools, both called in parallel in one step; make tool A slower than tool B.
- Capture the request body of the following step: tool_result blocks appear B, A (completion order).
- Rebuild the same conversation from persisted UIMessages via convertToModelMessages and send: blocks appear A, B (call order).
- Byte-compare - identical content, different block order → provider cache miss from that point.
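The byte-compare step can be sketched roughly as follows. This is only an illustration: the block shape is a simplified stand-in for what appears in a captured request body, and compareBlocks plus the sample ids and contents are hypothetical names, not part of the SDK.

```typescript
// Simplified tool_result block; an assumption covering only the fields compared here.
type ToolResultBlock = { type: 'tool_result'; tool_use_id: string; content: string };

// Hypothetical helper: reports whether two block lists serialize identically,
// and whether they hold the same blocks once order is ignored.
function compareBlocks(a: ToolResultBlock[], b: ToolResultBlock[]) {
  const byId = (x: ToolResultBlock, y: ToolResultBlock): number =>
    x.tool_use_id < y.tool_use_id ? -1 : x.tool_use_id > y.tool_use_id ? 1 : 0;
  return {
    bytesEqual: JSON.stringify(a) === JSON.stringify(b),
    contentEqual:
      JSON.stringify([...a].sort(byId)) === JSON.stringify([...b].sort(byId)),
  };
}

const resultA: ToolResultBlock = { type: 'tool_result', tool_use_id: 'call_A', content: 'slow result' };
const resultB: ToolResultBlock = { type: 'tool_result', tool_use_id: 'call_B', content: 'fast result' };

const midRun = [resultB, resultA];  // completion order: B finished first
const rebuilt = [resultA, resultB]; // call order: rebuilt from persisted history

const comparison = compareBlocks(midRun, rebuilt);
console.log(comparison);
```

Same blocks but different serialized bytes is the combination that causes the cache miss.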
Proposed fix
In toResponseMessages, emit the tool message's tool-result/tool-error parts sorted by the order of the corresponding tool-call parts in the assistant content (stable sort, unknown ids last). The order of results within a step is not semantically meaningful, so this is behavior-preserving - it just makes serialization deterministic and cache-friendly.
Workaround we have implemented:
Re-sorting in prepareStep:
prepareStep: ({ messages }) => ({ messages: canonicaliseToolResultOrder(messages) })
where the helper sorts each tool message's results to match the preceding assistant's tool-call order. Works, but every caching-sensitive consumer has to rediscover this independently.
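For anyone needing the same workaround, a minimal sketch of such a helper might look like the following. It is not our exact implementation: the Part and Message types are simplified local stand-ins for the SDK's model message shape (an assumption, so the snippet runs without importing ai), and it only looks at the assistant message immediately preceding each tool message.

```typescript
// Simplified local types; an assumption standing in for the SDK's message shape.
type Part = { type: string; toolCallId?: string; [key: string]: unknown };
type Message = { role: string; content: string | Part[] };

function canonicaliseToolResultOrder(messages: Message[]): Message[] {
  return messages.map((message, index) => {
    const previous = messages[index - 1];
    // Only touch tool messages that directly follow an assistant message with parts.
    if (
      message.role !== 'tool' ||
      typeof message.content === 'string' ||
      previous === undefined ||
      previous.role !== 'assistant' ||
      typeof previous.content === 'string'
    ) {
      return message;
    }

    // Position of each tool call within the preceding assistant message.
    const callOrder = new Map<string, number>();
    for (const part of previous.content) {
      if (part.type === 'tool-call' && part.toolCallId !== undefined) {
        callOrder.set(part.toolCallId, callOrder.size);
      }
    }

    // Parts with unknown ids rank last.
    const rank = (part: Part): number =>
      callOrder.get(part.toolCallId ?? '') ?? Number.POSITIVE_INFINITY;

    // Array.prototype.sort is stable (ES2019+), so ties keep arrival order.
    // Compare explicitly: Infinity - Infinity would be NaN.
    const sorted = [...message.content].sort((a, b) => {
      const ra = rank(a);
      const rb = rank(b);
      return ra === rb ? 0 : ra < rb ? -1 : 1;
    });

    return { ...message, content: sorted };
  });
}
```

The input array is not mutated, so the helper can be applied on every step without side effects on stored history.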
AI SDK Version
Version: ai@6.0.182