Add Realtime Reasoning API support for gpt-realtime-2 by richarddas · Pull Request #284 · AIProxyTeam/AIProxySwift

richarddas · 2026-06-02T19:05:16Z

Summary

Adds first-class support for OpenAI Realtime Reasoning models such as gpt-realtime-2, while keeping existing Performance Realtime call sites (gpt-realtime-1.5, etc.) unchanged.

New types: OpenAIRealtimeReasoningConfiguration, OpenAIRealtimeReasoningSessionConfiguration, OpenAIRealtimeReasoningResponseCreate
OpenAIService.realtimeSession overload for Reasoning session configuration
Wire encoding merges reasoning.effort and parallel_tool_calls into the existing session.update / response.create session payload
Decodes phased Realtime output (commentary vs final_answer) on response.done, output item events, and conversation item events
README examples for gpt-realtime-2 and phased responses; schema matrix at Documentation/OpenAI/RealtimeSchemaMatrix.md
Encoding/decoding tests for Reasoning fields and phases, plus Performance regression tests
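
As a rough illustration of the wire-encoding item above, here is a minimal sketch of how optional Reasoning fields could ride along in a `session.update` payload. The `Sketch…` types, the `voice` field, and the `medium`/`high` effort cases are assumptions made for this example; only `reasoning.effort`, `parallel_tool_calls`, and the `session.update` event name come from this PR, and the real types in the diff may be shaped differently.

```swift
import Foundation

// Hypothetical stand-ins for illustration; not the types added by this PR.
struct SketchReasoning: Encodable {
    // Only `.low` appears in the PR description; the other cases are assumed.
    enum Effort: String, Encodable { case low, medium, high }
    let effort: Effort
}

struct SketchSession: Encodable {
    var voice: String?
    var reasoning: SketchReasoning?
    var parallelToolCalls: Bool?

    // Map the Swift property name to the snake_case wire key.
    enum CodingKeys: String, CodingKey {
        case voice
        case reasoning
        case parallelToolCalls = "parallel_tool_calls"
    }
}

struct SketchSessionUpdate: Encodable {
    let type = "session.update"
    let session: SketchSession
}

// Encodes a session.update event. Nil optionals are omitted by the
// synthesized Encodable conformance, so a session without Reasoning
// fields produces the same payload shape as before.
func encodeSessionUpdate(_ session: SketchSession) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys] // stable key order for comparison
    let data = try encoder.encode(SketchSessionUpdate(session: session))
    return String(decoding: data, as: UTF8.self)
}
```

Because the Reasoning fields are optional, a Performance-style session in this sketch encodes without `reasoning` or `parallel_tool_calls` keys at all, which is one way to keep existing payloads byte-compatible.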

Why

Callers using reasoning voice models need reasoning.effort and parallel_tool_calls on session and response-create, and need to handle phased output items. Performance Realtime usage should remain the same API surface with no new required parameters.
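
To make "phased output items" concrete, the following is a minimal decoding sketch. The phase values `commentary` and `final_answer` come from the summary above; the JSON key name `phase`, the `id` field, and the `Sketch…` types are assumptions for illustration and may not match the actual wire schema or the types in this PR.

```swift
import Foundation

// Hypothetical phase type; unknown values are preserved instead of failing
// so that a new server-side phase does not break decoding.
enum SketchPhase: Decodable, Equatable {
    case commentary
    case finalAnswer
    case unknown(String)

    init(from decoder: Decoder) throws {
        let raw = try decoder.singleValueContainer().decode(String.self)
        switch raw {
        case "commentary": self = .commentary
        case "final_answer": self = .finalAnswer
        default: self = .unknown(raw)
        }
    }
}

// Hypothetical output item. `phase` is optional so items that carry no
// phase (assumed here for Performance models) still decode.
struct SketchOutputItem: Decodable {
    let id: String
    let phase: SketchPhase?
}

func decodeItem(_ json: String) throws -> SketchOutputItem {
    try JSONDecoder().decode(SketchOutputItem.self, from: Data(json.utf8))
}
```

A caller could then switch on `item.phase` to, for example, route `.commentary` items to a progress indicator and treat `.finalAnswer` items as the response proper.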

Test plan

On-device smoke test with gpt-realtime-2 (session connect, Reasoning configuration, phased output)
swift test --filter OpenAIRealtime
swift test

Migration notes

Performance (unchanged):

let session = try await openAIService.realtimeSession(
    model: "gpt-realtime-1.5",
    configuration: .init(),
    logLevel: .info
)

Reasoning (new):

let session = try await openAIService.realtimeSession(
    model: "gpt-realtime-2",
    configuration: OpenAIRealtimeReasoningSessionConfiguration(
        session: OpenAIRealtimeSessionConfiguration(
            outputModalities: [.audio],
            voice: .builtin("alloy")
        ),
        reasoning: .init(effort: .low),
        parallelToolCalls: true
    ),
    logLevel: .info
)

Reasoning session configuration requires an explicit session: argument so existing configuration: .init() call sites continue to resolve to Performance configuration without ambiguity.
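
The overload-resolution point can be shown with a small, self-contained sketch. All names here (`SketchPerformanceConfig`, `SketchReasoningConfig`, `openSession`) are hypothetical stand-ins rather than the library's API; the sketch only demonstrates why a required `session:` argument keeps `.init()` unambiguous.

```swift
// Hypothetical stand-in for the existing Performance configuration.
// Every property has a default, so `.init()` is a valid initializer.
struct SketchPerformanceConfig {
    var voice: String? = nil
}

// Hypothetical stand-in for the Reasoning configuration.
// `session` has no default, so this type has no zero-argument initializer.
struct SketchReasoningConfig {
    let session: SketchPerformanceConfig
    var effort: String = "low"
}

// Two overloads that differ only in the configuration type.
func openSession(model: String, configuration: SketchPerformanceConfig) -> String {
    "performance"
}

func openSession(model: String, configuration: SketchReasoningConfig) -> String {
    "reasoning"
}

// `.init()` can only be satisfied by SketchPerformanceConfig, so the
// compiler picks the first overload.
let existingCall = openSession(model: "gpt-realtime-1.5", configuration: .init())

// `.init(session:)` only exists on SketchReasoningConfig, so this picks
// the second overload.
let reasoningCall = openSession(model: "gpt-realtime-2", configuration: .init(session: .init()))
```

If the Reasoning type also offered a zero-argument initializer, `configuration: .init()` would match both overloads and existing call sites would stop compiling with an ambiguity error.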

Made with Cursor

Includes Realtime Reasoning session and response-create types for reasoning effort and parallel tool calls while preserving existing Performance Realtime call sites. Decodes phased Realtime output for commentary and final answer items across response completion, output item, and conversation item events. Documents the current Realtime schema mapping and README examples, and removes obsolete Realtime GA/beta terminology. Adds focused encoding and decoding tests for the new wire shapes and compatibility behavior.

Co-authored-by: Cursor <cursoragent@cursor.com>

richarddas · 2026-06-02T19:10:55Z

Also removed references to "GA" from the previous work, since OpenAI have closed the beta wire as of May 12, 2026.

richarddas · 2026-06-04T16:59:32Z

Looking at this a couple days later with fresh eyes, I think there are still improvements to be made to ergonomics. But I have smoke tested this on device, and it's functional.

lzell · 2026-06-08T13:27:50Z

+//
+
+/// `response.create` for Realtime Reasoning models.
+nonisolated public struct OpenAIRealtimeReasoningResponseCreate: Encodable {


I'm curious if we need a new response.create versus updating the existing OpenAIRealtimeResponseCreatedEvent. Right now that type has a responseID? on it, but it seems like we could put a Response? on it too

Oh, disregard. I confused response.create with response.created

lzell · 2026-06-10T12:57:37Z

+//
+
+/// Configuration for OpenAI Realtime Reasoning models such as `gpt-realtime-2`.
+nonisolated public struct OpenAIRealtimeReasoningConfiguration: Encodable, Sendable {


I think the main thing I'd like to understand before merging is if we need this separate ReasoningConfiguration, and separate initializer in OpenAIRealtimeSession. IIUC, a more surgical change would be to modify OpenAIRealtimeSessionConfiguration by adding a member: let reasoning: OpenAIRealtimeReasoning?.

The OpenAIRealtimeReasoning type would have a single member, effort, much like your current type OpenAIRealtimeReasoningConfiguration.

I don't see any real control flow or network sequencing differences between reasoning and non-reasoning versions right now, so I think this would be a simpler change. Let me know if I'm missing something @richarddas

And a nit: for any new types that you do create, can you use one file per public type and put them in a new folder, OpenAI/Realtime? (You can see the existing example of OpenAI/Conversations.) I want to start organizing the Realtime files ahead of our eventual split of this repo into several single-purpose clients. That will make the work down the road a bit easier.

richarddas · 2026-06-10

My original thinking was around keeping Performance and Reasoning explicit at the callsite, but you make a solid point that the rest of the sequencing collapses the two anyway. Since models are provided as strings, the wrapper also doesn’t actually enforce that gpt-realtime-2 uses the Reasoning config. So the wrapper is probably overkill.

I’ll fold reasoning and parallelToolCalls into the existing session and response-create types, while keeping reasoning as a grouped value so the Reasoning intent is still explicit at the callsite.
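
A minimal sketch of what that folded shape might look like at the call site, assuming the grouped `reasoning` value suggested in the review. The `Sketch…` names, the `voice` property, and the `medium`/`high` effort cases are hypothetical; the final types in the PR may differ.

```swift
// Hypothetical grouped value with a single member, as proposed in review.
struct SketchRealtimeReasoning: Equatable {
    // Only `.low` appears in this thread; the other cases are assumed.
    enum Effort: String { case low, medium, high }
    let effort: Effort
}

// Hypothetical single session configuration covering both model families.
struct SketchSessionConfiguration {
    var voice: String?
    var reasoning: SketchRealtimeReasoning?
    var parallelToolCalls: Bool?

    // All parameters default to nil so existing `.init()` call sites
    // would keep compiling unchanged.
    init(
        voice: String? = nil,
        reasoning: SketchRealtimeReasoning? = nil,
        parallelToolCalls: Bool? = nil
    ) {
        self.voice = voice
        self.reasoning = reasoning
        self.parallelToolCalls = parallelToolCalls
    }
}

// Performance-style call site: nothing new to pass.
let performanceConfig = SketchSessionConfiguration()

// Reasoning-style call site: the grouped value keeps the intent visible.
let reasoningConfig = SketchSessionConfiguration(
    voice: "alloy",
    reasoning: .init(effort: .low),
    parallelToolCalls: true
)
```

The trade-off discussed in the thread still applies: with a single type, nothing stops a caller from pairing a Performance model string with a `reasoning` value, so any validation would have to happen elsewhere, if at all.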

richarddas mentioned this pull request Jun 2, 2026

Feature request: GA Realtime audio models gpt-realtime-translate & gpt-realtime-whisper #283

Open

lzell self-assigned this Jun 3, 2026

lzell reviewed Jun 8, 2026

lzell reviewed Jun 10, 2026

richarddas wants to merge 1 commit into AIProxyTeam:main from richarddas:feature/realtime-reasoning-api-parity
