
harness codex: spawn E2BIG when a large host tool catalog is passed through the codex config command line #16379

@BilalAtique

Description

When using @ai-sdk/harness-codex with a large host tool set (around 170 tools, each carrying a full JSON input schema), every turn fails immediately with spawn E2BIG and the model process never launches. The host stream shows an empty turn (zero steps, zero output), so to a consumer it looks like the turn silently completed with no content.

Root cause

The codex bridge places the host tools into config.mcp_servers["harness-tools"].env.TOOL_SCHEMAS as one JSON string, then passes that config to new Codex({ config }). The codex SDK (@openai/codex-sdk) flattens the config object and serializes the whole MCP block into a single command-line argument of the form:

--config mcp_servers.harness-tools.env.TOOL_SCHEMAS="<entire tool catalog JSON>"

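With a large catalog, that single argument string is well over 340 KB, which exceeds the Linux per-argument limit (MAX_ARG_STRLEN, 128 KiB, i.e. 131072 bytes). This limit applies to each individual argument and is separate from the total size limit on arguments plus environment, so spawn throws E2BIG before codex runs.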
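The size check behind this failure can be sketched in a few lines. This is only an illustration: checkConfigArg is a hypothetical helper, not part of the harness or the codex SDK, and the quoting of the value via JSON.stringify is an assumption that approximates the flattened argument shown above.

```typescript
// Linux caps each individual argv string at MAX_ARG_STRLEN (32 pages of 4 KiB).
const MAX_ARG_STRLEN = 32 * 4096; // 131072 bytes, terminating NUL included

// Hypothetical helper: rebuilds an approximation of the flattened `--config`
// value and reports whether it would fit in a single argument.
function checkConfigArg(toolSchemasJson: string): { bytes: number; fits: boolean } {
  // Assumption: the SDK quotes the value roughly like a JSON string.
  const arg = `mcp_servers.harness-tools.env.TOOL_SCHEMAS=${JSON.stringify(toolSchemasJson)}`;
  // The kernel measures bytes, not UTF-16 code units, and counts the NUL.
  const bytes = new TextEncoder().encode(arg).length + 1;
  return { bytes, fits: bytes <= MAX_ARG_STRLEN };
}
```

A guard like this, run before spawning, would let the bridge fail with a descriptive error (or switch to a file-based path) instead of hitting E2BIG.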

Captured stack (from the bridge event log)

spawn E2BIG
    at ChildProcess.spawn (node:internal/child_process:420:11)
    at spawn (node:child_process:787:9)
    at CodexExec.run (.../@openai/codex-sdk@0.130.0/dist/index.js:238:19)
    at runTurn (.../harness/codex/bridge.mjs:617:22)
    at handleInbound (.../harness/codex/bridge.mjs:261:11)

Why it is hard to notice

The real error frame is also lost on the host side. waitForBridgeReady acquires a reader on proc.stdout and never releases it, so the subsequent drainRest(proc.stdout) call fails with "Invalid state: The ReadableStream is locked", that failure is only logged, and the underlying error is swallowed. The turn appears to complete empty. The only reliable place the real error survives is the on-disk bridge event log (<workdir>/.agent-runs/<sessionId>/bridge/event-log.ndjson). It would help a lot if a fatal bridge error were surfaced to the host stream rather than dropped.
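The locking behaviour described above can be reproduced with plain web streams. The sketch below is not the harness code: readFirstChunk and drainRemaining are hypothetical stand-ins for waitForBridgeReady and drainRest, and it only shows that releasing the reader lets a second consumer attach.

```typescript
// Hypothetical stand-in for a "wait until ready" step: read one chunk, then
// release the lock so a later consumer can attach to the same stream.
async function readFirstChunk(
  stream: ReadableStream<Uint8Array>,
): Promise<Uint8Array | undefined> {
  const reader = stream.getReader();
  try {
    const result = await reader.read();
    return result.done ? undefined : result.value;
  } finally {
    reader.releaseLock(); // without this, stream.locked stays true
  }
}

// Hypothetical stand-in for a drain step: collect whatever is left.
async function drainRemaining(stream: ReadableStream<Uint8Array>): Promise<Uint8Array[]> {
  const chunks: Uint8Array[] = [];
  const reader = stream.getReader(); // throws TypeError if the stream is still locked
  try {
    for (;;) {
      const result = await reader.read();
      if (result.done) break;
      chunks.push(result.value);
    }
  } finally {
    reader.releaseLock();
  }
  return chunks;
}
```

With the releaseLock call in place, the drain step can read any error frame that follows the first chunk and forward it to the host stream.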

Reproduction

  1. Build a HarnessAgent on the codex harness with a tool set whose combined serialized JSON schemas exceed roughly 128 KB (about 150 or more tools with non-trivial input schemas).
  2. Run any turn.
  3. The turn ends with zero steps and no assistant content. The bridge event log contains the spawn E2BIG frame above.

Suggested fix

Avoid putting the large tool catalog on the command line. Two options that both work and keep the full catalog:

  1. Write mcp_servers to $CODEX_HOME/config.toml on disk (codex reads it natively per the OpenAI config reference) instead of passing it through new Codex({ config }).
  2. Or write TOOL_SCHEMAS to a file and pass only a short file path to the MCP server, having the MCP server read the file.

We applied option 1 locally as a patch and codex now runs cleanly with the full catalog.
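For comparison, option 2 could look roughly like the sketch below. It is an assumption-laden illustration, not the patch we applied: the TOOL_SCHEMAS_FILE variable name, the tool-schemas.json file name, and both helper functions are hypothetical, and the MCP server would need a matching change to read the file.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Bridge side (hypothetical): persist the catalog and return an env block
// that carries only a short path instead of the full JSON.
function writeToolSchemasFile(
  toolSchemas: unknown[],
  dir: string = mkdtempSync(join(tmpdir(), "harness-tools-")),
): Record<string, string> {
  const file = join(dir, "tool-schemas.json");
  writeFileSync(file, JSON.stringify(toolSchemas), { mode: 0o600 });
  return { TOOL_SCHEMAS_FILE: file };
}

// MCP server side (hypothetical): load the catalog from the path in env.
function loadToolSchemas(env: Record<string, string | undefined>): unknown[] {
  const file = env.TOOL_SCHEMAS_FILE;
  if (!file) throw new Error("TOOL_SCHEMAS_FILE is not set");
  return JSON.parse(readFileSync(file, "utf8"));
}
```

The value that ends up on the command line is then a path of a few dozen bytes, regardless of how many tools the host registers.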

AI SDK Version

  • @ai-sdk/harness-codex: 1.0.0-beta.24
  • @ai-sdk/harness: 1.0.0-beta.27
  • @openai/codex-sdk (bundled by the bridge): 0.130.0
  • Runtime: Node 22 in a Vercel Sandbox (Linux)
