Parallel tool results are serialized in completion order, silently breaking provider prompt caching #16567

Description

@Kripu77


When an assistant step makes parallel tool calls, the tool-result parts are recorded in completion order (whichever tool's execute resolves first), not call order. toResponseMessages (packages/ai/src/generate-text/to-response-messages.ts) then builds the tool message by iterating the step content in that arrival order, with no canonical sort.

Completion order is nondeterministic across requests. The same logical history therefore serializes with tool_result blocks in different orders depending on how it was produced:

  • Mid-run steps: results appear in completion order
  • History rebuilt from persistence (e.g. convertToModelMessages over stored UIMessages): results appear in call order, because UIMessage tool parts are created at call time

Providers pair tool_use/tool_result blocks by id, so model responses are unaffected, but prompt caching is byte-sensitive: the cached prefix only matches when the blocks serialize in the same order.

The first parallel batch whose order differs between requests invalidates the cache from that point onward. In a long agent loop this shows up as the cache being rewritten almost every turn; we measured ~47k tokens of redundant cache writes per request on Bedrock/Anthropic before tracing it to this ordering difference.

Repro:

  1. streamText with two tools, both called in parallel in one step; make tool A slower than tool B.
  2. Capture the request body of the following step: tool_result blocks appear B, A (completion order).
  3. Rebuild the same conversation from persisted UIMessages via convertToModelMessages and send: blocks appear A, B (call order).
  4. Byte-compare - identical content, different block order → provider cache miss from that point.
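
The effect of step 4 can be illustrated without the SDK or a provider. The sketch below is a simulation under stated assumptions: the `ToolCall`/`ToolResult` shapes, the `latencyMs` field, and the helper names are all hypothetical, and completion order is modelled simply as "lowest latency first". It only shows that the same results serialize to different bytes depending on ordering.

```typescript
// Hypothetical, simplified shapes; these are NOT the AI SDK's actual types.
interface ToolCall { toolCallId: string; toolName: string; latencyMs: number }
interface ToolResult { type: "tool-result"; toolCallId: string; output: string }

const toResult = (call: ToolCall): ToolResult => ({
  type: "tool-result",
  toolCallId: call.toolCallId,
  output: `${call.toolName} done`,
});

// Call order: the order in which the model emitted the tool calls.
function inCallOrder(calls: ToolCall[]): ToolResult[] {
  return calls.map(toResult);
}

// Completion order (simulated): the call with the lower latency finishes first.
function inCompletionOrder(calls: ToolCall[]): ToolResult[] {
  return [...calls].sort((a, b) => a.latencyMs - b.latencyMs).map(toResult);
}

// Order-insensitive view, used only to confirm the content is the same.
const byId = (results: ToolResult[]): string =>
  JSON.stringify([...results].sort((a, b) => a.toolCallId.localeCompare(b.toolCallId)));

const calls: ToolCall[] = [
  { toolCallId: "call_A", toolName: "toolA", latencyMs: 200 }, // slower
  { toolCallId: "call_B", toolName: "toolB", latencyMs: 50 },  // faster
];

const midRun = inCompletionOrder(calls); // B, A
const rebuilt = inCallOrder(calls);      // A, B

const sameBytes = JSON.stringify(midRun) === JSON.stringify(rebuilt);
const sameContent = byId(midRun) === byId(rebuilt);

// sameBytes is false while sameContent is true: identical content, different order.
console.log({ sameBytes, sameContent });
```

In a real run the ordering comes from whichever execute promise resolves first, so the mismatch is intermittent rather than fixed as it is here.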

Proposed fix

In toResponseMessages, emit the tool message's tool-result/tool-error parts sorted by the order of the corresponding tool-call parts in the assistant content (stable sort, unknown ids last). Result order within a step is not semantically meaningful, so this is behavior-preserving; it just makes serialization deterministic and cache-friendly.
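
A minimal sketch of that sort, assuming simplified part shapes. The `AssistantPart` and `ToolPart` types and the `sortByCallOrder` name are hypothetical stand-ins for illustration, not the code in to-response-messages.ts:

```typescript
// Hypothetical, simplified part shapes (assumptions, not the AI SDK's types).
type AssistantPart =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string };

type ToolPart =
  | { type: "tool-result"; toolCallId: string; output: unknown }
  | { type: "tool-error"; toolCallId: string; error: unknown };

function sortByCallOrder(assistantContent: AssistantPart[], toolParts: ToolPart[]): ToolPart[] {
  // Rank each tool call by its position among the assistant's tool-call parts.
  const rank = new Map<string, number>();
  for (const part of assistantContent) {
    if (part.type === "tool-call" && !rank.has(part.toolCallId)) {
      rank.set(part.toolCallId, rank.size);
    }
  }

  // Unknown ids get an infinite rank so they sort last.
  const rankOf = (part: ToolPart): number =>
    rank.get(part.toolCallId) ?? Number.POSITIVE_INFINITY;

  // Array.prototype.sort is stable (ES2019+), so parts with equal rank keep
  // their arrival order. Compare explicitly: Infinity - Infinity would be NaN.
  return [...toolParts].sort((a, b) => {
    const ra = rankOf(a);
    const rb = rankOf(b);
    return ra === rb ? 0 : ra < rb ? -1 : 1;
  });
}

const assistantContent: AssistantPart[] = [
  { type: "text", text: "Calling both tools." },
  { type: "tool-call", toolCallId: "call_A", toolName: "toolA" },
  { type: "tool-call", toolCallId: "call_B", toolName: "toolB" },
];

// Arrival (completion) order: B first, then an id with no matching call, then A.
const arrived: ToolPart[] = [
  { type: "tool-result", toolCallId: "call_B", output: "b" },
  { type: "tool-error", toolCallId: "call_X", error: "no matching call" },
  { type: "tool-result", toolCallId: "call_A", output: "a" },
];

const sorted = sortByCallOrder(assistantContent, arrived);
console.log(sorted.map((part) => part.toolCallId)); // call_A, call_B, call_X
```

The input array is copied before sorting, so the step content itself is left untouched.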

Workaround we have implemented:

Re-sorting in prepareStep:

prepareStep: ({ messages }) => ({ messages: canonicaliseToolResultOrder(messages) })

where the helper sorts each tool message's results to match the preceding assistant's tool-call order. Works, but every caching-sensitive consumer has to rediscover this independently.
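
For anyone needing the same workaround, here is a rough sketch of what such a helper could look like. It is not our exact implementation: the `Part` and `Message` types below are simplified assumptions (the real ModelMessage types carry more fields), and the body is one possible way to do the re-sort.

```typescript
// Hypothetical, simplified message shapes (assumptions for illustration only).
type Part = { type: string; toolCallId?: string; [key: string]: unknown };

type Message =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | Part[] }
  | { role: "tool"; content: Part[] };

function canonicaliseToolResultOrder(messages: Message[]): Message[] {
  // Call ranks taken from the most recent assistant message seen so far.
  let rank = new Map<string, number>();

  return messages.map((message) => {
    if (message.role === "assistant") {
      rank = new Map<string, number>();
      if (Array.isArray(message.content)) {
        for (const part of message.content) {
          if (part.type === "tool-call" && part.toolCallId !== undefined && !rank.has(part.toolCallId)) {
            rank.set(part.toolCallId, rank.size);
          }
        }
      }
      return message;
    }

    if (message.role !== "tool") return message;

    // Parts without a known call id sort last; the sort is stable, so ties
    // keep their existing relative order.
    const callRank = rank;
    const rankOf = (part: Part): number =>
      (part.toolCallId !== undefined ? callRank.get(part.toolCallId) : undefined) ??
      Number.MAX_SAFE_INTEGER;

    const content = [...message.content].sort((a, b) => rankOf(a) - rankOf(b));
    return { ...message, content };
  });
}

const history: Message[] = [
  { role: "user", content: "Run both tools." },
  {
    role: "assistant",
    content: [
      { type: "tool-call", toolCallId: "call_A", toolName: "toolA" },
      { type: "tool-call", toolCallId: "call_B", toolName: "toolB" },
    ],
  },
  {
    role: "tool",
    content: [
      { type: "tool-result", toolCallId: "call_B", output: "b" }, // finished first
      { type: "tool-result", toolCallId: "call_A", output: "a" },
    ],
  },
];

const fixed = canonicaliseToolResultOrder(history);
```

The function returns new tool messages rather than mutating the input, so it can be dropped into the prepareStep callback shown above without touching the stored history.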

AI SDK Version

Version: ai@6.0.182

