Add Realtime Reasoning API support for gpt-realtime-2 #284
Includes Realtime Reasoning session and response-create types for reasoning effort and parallel tool calls while preserving existing Performance Realtime call sites. Decodes phased Realtime output for commentary and final answer items across response completion, output item, and conversation item events. Documents the current Realtime schema mapping and README examples, and removes obsolete Realtime GA/beta terminology. Adds focused encoding and decoding tests for the new wire shapes and compatibility behavior. Co-authored-by: Cursor <cursoragent@cursor.com>
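As a rough illustration of the phased decoding described above, the sketch below decodes a list of output items and picks out the final-answer ones. The `OutputPhase` and `OutputItem` types and the `id` / `phase` field names are hypothetical stand-ins, not the library's types or a confirmed wire schema; only the `commentary` and `final_answer` values come from this PR.

```swift
import Foundation

// Hypothetical enum; the two raw values are the phase names mentioned in this PR.
enum OutputPhase: String, Decodable {
    case commentary
    case finalAnswer = "final_answer"
}

// Hypothetical output item. Field names are assumptions, not the confirmed schema.
struct OutputItem: Decodable {
    let id: String
    // Optional so that items without a phase still decode.
    let phase: OutputPhase?
}

// Made-up sample payload for illustration only.
let payload = """
[
  {"id": "item_1", "phase": "commentary"},
  {"id": "item_2", "phase": "final_answer"},
  {"id": "item_3"}
]
"""

let items = try JSONDecoder().decode([OutputItem].self, from: Data(payload.utf8))

// Keep only the items marked as the final answer.
let finalAnswers = items.filter { $0.phase == .finalAnswer }.map(\.id)
print(finalAnswers)
```

Making `phase` optional is one way to keep items that carry no phase decodable by the same type. Note that with this sketch an unrecognized phase string would cause decoding to throw.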
Also removed references to "GA" from the previous work, since OpenAI have closed the beta wire as of May 12, 2026.
Looking at this a couple days later with fresh eyes, I think there are still improvements to be made to ergonomics. But I have smoke tested this on device, and it's functional.
/// `response.create` for Realtime Reasoning models.
nonisolated public struct OpenAIRealtimeReasoningResponseCreate: Encodable {
I'm curious if we need a new response.create versus updating the existing OpenAIRealtimeResponseCreatedEvent. Right now that type has a responseID? on it, but it seems like we could put a Response? on it too
Oh, disregard. I confused response.create with response.created
/// Configuration for OpenAI Realtime Reasoning models such as `gpt-realtime-2`.
nonisolated public struct OpenAIRealtimeReasoningConfiguration: Encodable, Sendable {
I think the main thing I'd like to understand before merging is if we need this separate ReasoningConfiguration, and separate initializer in OpenAIRealtimeSession. IIUC, a more surgical change would be to modify OpenAIRealtimeSessionConfiguration by adding a member: let reasoning: OpenAIRealtimeReasoning?.
The OpenAIRealtimeReasoning type would have a single member, effort, much like your current type OpenAIRealtimeReasoningConfiguration.
I don't see any real control flow or network sequencing differences between reasoning and non-reasoning versions right now, so I think this would be a simpler change. Let me know if I'm missing something @richarddas
And a nit: for any new types that you do create, can you use one file per public type and pull them into a new folder OpenAI/Realtime (you can see the existing example of OpenAI/Conversations)? I want to start organizing the Realtime files for our eventual split of this repo into several single-purpose clients. That will make the work down the road a bit easier.
My original thinking was around keeping Performance and Reasoning explicit at the callsite, but you make a solid point that the rest of the sequencing collapses the two anyway. Since models are provided as strings, the wrapper also doesn’t actually enforce that gpt-realtime-2 uses the Reasoning config. So the wrapper is probably overkill.
I’ll fold reasoning and parallelToolCalls into the existing session and response-create types, while keeping reasoning as a grouped value so the Reasoning intent is still explicit at the callsite.
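A minimal sketch of what the folded shape discussed in this thread might look like. The type names `RealtimeReasoning` and `RealtimeSessionConfiguration`, the `instructions` field, and the `"low"` effort value are placeholders chosen for illustration, not the library's actual declarations; only `reasoning`, `effort`, and `parallel_tool_calls` are taken from this PR.

```swift
import Foundation

// Hypothetical grouped value with a single member, as suggested in review.
struct RealtimeReasoning: Encodable, Sendable {
    let effort: String  // accepted values are an assumption here
}

// Hypothetical stand-in for the existing session configuration,
// extended with two optional members.
struct RealtimeSessionConfiguration: Encodable, Sendable {
    var instructions: String? = nil
    var reasoning: RealtimeReasoning? = nil
    var parallelToolCalls: Bool? = nil

    enum CodingKeys: String, CodingKey {
        case instructions
        case reasoning
        case parallelToolCalls = "parallel_tool_calls"
    }
}

func encodeJSON<T: Encodable>(_ value: T) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]  // stable key order for inspection
    return String(decoding: try encoder.encode(value), as: UTF8.self)
}

// A Performance-style call site: no new arguments needed.
let performance = RealtimeSessionConfiguration(instructions: "Be brief.")

// A Reasoning-style call site: the grouped value keeps the intent visible.
let reasoning = RealtimeSessionConfiguration(
    instructions: "Be brief.",
    reasoning: RealtimeReasoning(effort: "low"),
    parallelToolCalls: true
)

let performanceJSON = try encodeJSON(performance)
let reasoningJSON = try encodeJSON(reasoning)
print(performanceJSON)
print(reasoningJSON)
```

Because the synthesized `Encodable` conformance skips `nil` optionals, the first payload carries no `reasoning` or `parallel_tool_calls` key, which is the property that would leave existing call sites unaffected under this design.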
Summary
Adds first-class support for OpenAI Realtime Reasoning models such as gpt-realtime-2, while keeping existing Performance Realtime call sites (gpt-realtime-1.5, etc.) unchanged.
- New types: OpenAIRealtimeReasoningConfiguration, OpenAIRealtimeReasoningSessionConfiguration, OpenAIRealtimeReasoningResponseCreate
- New OpenAIService.realtimeSession overload for Reasoning session configuration
- Encodes reasoning.effort and parallel_tool_calls into the existing session.update / response.create session payload
- Decodes phased output (commentary vs final_answer) on response.done, output item events, and conversation item events
- README examples for gpt-realtime-2 and phased responses; schema matrix at Documentation/OpenAI/RealtimeSchemaMatrix.md
Why
Callers using reasoning voice models need reasoning.effort and parallel_tool_calls on session and response-create, and need to handle phased output items. Performance Realtime usage should keep the same API surface, with no new required parameters.
Test plan
- Smoke test on device with gpt-realtime-2 (session connect, Reasoning configuration, phased output)
- swift test --filter OpenAIRealtime
- swift test
Migration notes
Performance (unchanged):
Reasoning (new):
Reasoning session configuration requires an explicit session: argument so existing configuration: .init() call sites continue to resolve to Performance configuration without ambiguity.
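To illustrate why a distinct argument label avoids ambiguity, here is a small sketch using hypothetical stand-in types. `Service`, `PerformanceConfiguration`, `ReasoningConfiguration`, and both method signatures are assumptions made for this example; the library's real signatures may differ.

```swift
// Hypothetical stand-ins; not the library's actual types.
struct PerformanceConfiguration { init() {} }
struct ReasoningConfiguration { init() {} }

struct Service {
    // Existing-style entry point, labelled `configuration:`.
    func realtimeSession(model: String, configuration: PerformanceConfiguration) -> String {
        "performance"
    }

    // Reasoning entry point with a distinct `session:` label.
    // If both overloads used `configuration:`, a bare `.init()` argument
    // would match either type and the call would be ambiguous.
    func realtimeSession(model: String, session: ReasoningConfiguration) -> String {
        "reasoning"
    }
}

let service = Service()

// The label alone selects the overload, so `.init()` resolves cleanly.
let existing = service.realtimeSession(model: "gpt-realtime-1.5", configuration: .init())
let added = service.realtimeSession(model: "gpt-realtime-2", session: .init())
print(existing, added)
```

This reflects the overload-based design as described in these notes. If reasoning is instead folded into the existing configuration type, as discussed in the review thread, the second overload would no longer be needed.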