Support OpenAI Responses incremental inputs (automatic previous_response_id chaining in the step loop)

### Description

**Background**

The OpenAI Responses API offers a [WebSocket mode](https://developers.openai.com/api/docs/guides/websocket-mode) whose main benefit is an *incremental inputs* fast path: instead of re-sending the full conversation each turn, you set `previous_response_id` to the prior response and put **only the new items** (tool outputs + next user message) in `input`. The server keeps the most recent response in a connection-local in-memory cache.

A WebSocket transport already exists (the `ai-sdk-openai-websocket-fetch` shim referenced in the docs), but it only delivers **connection reuse** — the *incremental inputs* fast path is still unreachable through the SDK, because:

1. **The automatic multi-step loop never chains `previous_response_id`.** It is only populated from a caller-supplied `providerOptions.openai.previousResponseId`; the internal step loop captures `response.id` into result metadata but does not feed it back into the next request. So every step resends the **full** accumulated `input`.

Adding `previous_response_id` at the transport/fetch layer is unsafe given (1): the server would prepend the cached prior turn *and* receive the full history → duplicated context. Correct incremental inputs require the SDK to both set `previous_response_id` **and** trim `input` to only-new-items, which only the SDK is positioned to do reliably (it owns item transformation: `function_call` id mapping, reasoning / encrypted-reasoning items, ordering).
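To make the "set `previous_response_id` **and** trim `input`" requirement concrete, here is a minimal sketch of the request-building step. All names (`InputItem`, `ChainState`, `coveredItemCount`, `buildRequestBody`) are hypothetical and are not the SDK's internal types. It assumes the SDK tracks how many leading history items the server already holds, meaning the previously sent input plus the prior response's own output items.

```typescript
// Hypothetical shapes for illustration; not the SDK's internal types.
type InputItem = { type: string; [key: string]: unknown };

interface ChainState {
  // id of the most recent response on this connection
  previousResponseId: string;
  // number of leading history items the server already holds
  // (previously sent input plus the prior response's output items)
  coveredItemCount: number;
}

interface RequestBody {
  previous_response_id: string | null;
  input: InputItem[];
}

function buildRequestBody(history: InputItem[], chain?: ChainState): RequestBody {
  if (chain === undefined) {
    // No chain yet (first step): send the full history.
    return { previous_response_id: null, input: history };
  }
  // Chained step: send only the items the server has not seen.
  return {
    previous_response_id: chain.previousResponseId,
    input: history.slice(chain.coveredItemCount),
  };
}

// Second step of a tool loop: the user message and the model's function_call
// are already covered by resp_1, so only the tool output is sent.
const history: InputItem[] = [
  { type: 'message', role: 'user', content: 'What is the weather in Paris?' },
  { type: 'function_call', call_id: 'call_1', name: 'getWeather' },
  { type: 'function_call_output', call_id: 'call_1', output: '{"tempC":18}' },
];
const chained = buildRequestBody(history, {
  previousResponseId: 'resp_1',
  coveredItemCount: 2,
});
```

Doing only the first half at the fetch layer (setting the id but leaving `input` untouched) would produce the duplicated context described above, which is why the trimming has to live next to the item transformation.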

**Current behaviour**

- `@ai-sdk/openai` responses model sends the full message list on every step.
- `previous_response_id` is set only from a caller-supplied `providerOptions.openai.previousResponseId`.
- Using the WebSocket fetch shim therefore yields connection reuse but not incremental inputs.
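As a rough illustration of what the full resend costs, the sketch below counts how many input items go over the wire per step under each strategy. The function name and its parameters are made up for this example, and item count is only a proxy for payload size.

```typescript
// newItems[i]: items the client appends before step i (user message, tool outputs).
// outputItems[i]: items the model returns in step i (e.g. function_call),
// which join the history for later steps.
function itemsSentPerStep(
  newItems: number[],
  outputItems: number[],
): { fullResend: number[]; incremental: number[] } {
  const fullResend: number[] = [];
  const incremental: number[] = [];
  let historySize = 0;
  for (let step = 0; step < newItems.length; step++) {
    historySize += newItems[step];
    fullResend.push(historySize); // current behaviour: whole history each step
    incremental.push(newItems[step]); // chained: only the new items
    historySize += outputItems[step] ?? 0;
  }
  return { fullResend, incremental };
}

// Four steps, one new client item and one model output item per step.
const example = itemsSentPerStep([1, 1, 1, 1], [1, 1, 1, 1]);
// example.fullResend  -> [1, 3, 5, 7]
// example.incremental -> [1, 1, 1, 1]
```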

**Desired behaviour**

An opt-in mode where the automatic step loop chains `previous_response_id` and sends only new items (the streaming + tool-call loop being the primary win), so the WebSocket incremental-inputs path is actually usable.

Please also consider the reconnect semantics: with `store: false` / ZDR, a dropped socket (idle timeout, the 60-minute connection cap, or a transport error) invalidates the in-memory chain and an uncached `previous_response_id` returns `previous_response_not_found`. Any chaining mode should fall back to resending full context with `previous_response_id: null` on that error.
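A minimal sketch of that fallback follows. It assumes the transport surfaces the server's error code on a `code` property of the thrown value, which may not match how the SDK actually reports it. `sendWithFallback`, `fullContextBody`, and the `send` callback are hypothetical names.

```typescript
type InputItem = { type: string; [key: string]: unknown };

interface RequestBody {
  previous_response_id: string | null;
  input: InputItem[];
}

// Assumption: the server error code is exposed as `error.code`.
function isPreviousResponseNotFound(error: unknown): boolean {
  return (
    typeof error === 'object' &&
    error !== null &&
    (error as { code?: unknown }).code === 'previous_response_not_found'
  );
}

// Full-context request that restarts the chain after a cache miss.
function fullContextBody(history: InputItem[]): RequestBody {
  return { previous_response_id: null, input: history };
}

async function sendWithFallback<T>(
  send: (body: RequestBody) => Promise<T>, // hypothetical transport call
  incremental: RequestBody,
  history: InputItem[],
): Promise<T> {
  try {
    return await send(incremental);
  } catch (error) {
    // Retry only when the request was chained and the server lost the chain.
    if (incremental.previous_response_id !== null && isPreviousResponseNotFound(error)) {
      return send(fullContextBody(history));
    }
    throw error;
  }
}
```

The retry is deliberately limited to one attempt with the full history. Any other error, or a failure of the full-context request itself, propagates to the caller unchanged.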

**Use case**

Lower-latency multi-step tool loops and long conversations. Using the WebSocket fetch shim we can realize connection-reuse savings, but not incremental inputs, because the SDK resends full `input` and doesn't chain `previous_response_id`.

**Related**

- #14807 — the manual-loop counterpart (injecting tool results via `providerOptions` while staying on the `previousResponseId` chain).


### AI SDK Version

- ai@6.0.208
- @ai-sdk/openai@3.0.74

### Code of Conduct

- [x] I agree to follow this project's Code of Conduct
