10 changes: 10 additions & 0 deletions .changeset/grok-build-host-tools.md
@@ -0,0 +1,10 @@
---
'@ai-sdk/harness-grok-build': patch
---

feat(harness-grok-build): drive the `grok agent stdio` ACP surface

Move the adapter to ACP (JSON-RPC over stdio): tool-call, tool-result, and
file-change events; token usage and a structured finish reason on finish;
host-defined custom tools via an in-sandbox MCP server; and built-in tool
approvals through the ACP `session/request_permission` flow.
76 changes: 64 additions & 12 deletions content/providers/02-ai-sdk-harnesses/04-grok-build.mdx
@@ -6,9 +6,11 @@ description: Learn how to use the Grok Build harness adapter.
# Grok Build Harness

The Grok Build harness adapter connects `HarnessAgent` to the `grok` CLI. The
-adapter runs a bridge inside the sandbox and streams the CLI's
-`--output-format streaming-json` events back to the host over a sandbox-exposed
-WebSocket.
+adapter drives `grok agent stdio` over the Agent Client Protocol (ACP/JSON-RPC)
+through a bridge inside the sandbox and streams its events back to the host over
+a sandbox-exposed WebSocket. This surfaces text and reasoning, tool-call,
+tool-result, and file-change events, token usage on finish, a structured finish
+reason, host-defined (custom) tools, and built-in tool approvals.
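
As a rough illustration of what "JSON-RPC over stdio" involves, the sketch below splits a stream of text chunks into individual JSON-RPC messages. It assumes one JSON message per line, which is how ACP's stdio transport is commonly described. `LineFramer` and `JsonRpcMessage` are hypothetical names for this example and are not part of `@ai-sdk/harness-grok-build`.

```typescript
// Hypothetical helper, not part of the adapter.
// Assumption: messages are newline-delimited JSON, one message per line.
type JsonRpcMessage = {
  jsonrpc: '2.0';
  id?: number | string;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string };
};

class LineFramer {
  private buffer = '';

  // Feed a chunk of text; returns every complete message it completed.
  push(chunk: string): JsonRpcMessage[] {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    // The last element is an unfinished line (or ''), so keep it buffered.
    this.buffer = lines.pop() ?? '';
    return lines
      .map(line => line.trim())
      .filter(line => line.length > 0)
      .map(line => JSON.parse(line) as JsonRpcMessage);
  }
}

// A message split across two chunks is only emitted once its newline arrives.
const framer = new LineFramer();
const first = framer.push('{"jsonrpc":"2.0","id":1,"method":"session/request_');
const second = framer.push(
  'permission","params":{}}\n{"jsonrpc":"2.0","id":1,"result":{}}\n',
);
```

Buffering the trailing partial line matters because a pipe can deliver a message in arbitrary pieces; parsing each chunk directly would fail on the first split message.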

<Note>
Harness packages are **experimental**. Expect breaking changes between
@@ -103,15 +105,16 @@ Use `createGrokBuild()` to configure the runtime:

```ts
const harness = createGrokBuild({
-  model: 'grok-code-fast-1',
-  planMode: true,
+  model: 'grok-build-0.1',
+  reasoningEffort: 'high',
});
```

Settings:

- `auth`: xAI or AI Gateway authentication settings.
- `model`: Grok model id. If omitted, the adapter uses its pinned default.
- `reasoningEffort`: reasoning effort (`'low' | 'medium' | 'high'`), passed to the CLI's `--reasoning-effort`.
- `planMode`: run the CLI in plan mode.
- `port`: bridge port override.
- `startupTimeoutMs`: maximum time to wait for the bridge to start.
@@ -153,15 +156,64 @@ const sandbox = createVercelSandbox({
});
```

-## Known limitations
+## Tools

-The grok CLI's `--output-format streaming-json` surface is narrow:
+Host-defined (custom) tools passed to `agent.tools` are exposed to the CLI
+through an in-sandbox MCP server and executed on the host:

-- Streams reasoning and text only — no tool-call, tool-result, or file-change
-  events, and no token usage.
-- Allow-all permission mode only. The CLI runs with `--always-approve` and
-  executes tools itself; use `permissionMode: 'allow-all'`.
-- No compaction.
```ts
import { tool } from 'ai';
import { z } from 'zod';

const weather = tool({
  description: 'Get the current temperature for a city.',
  inputSchema: z.object({ city: z.string() }),
  execute: async ({ city }) => ({ city, celsius: 12 }),
});

const agent = new HarnessAgent({
  harness: grokBuild,
  sandbox,
  tools: { weather },
});
```

The adapter also exposes these common Grok Build built-ins through `agent.tools`:

- `read`
- `write`
- `edit`
- `bash`
- `glob`
- `grep`
- `webSearch`

Tool-call, tool-result, and file-change events appear in the stream, and token
usage is reported on finish alongside a structured finish reason.
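
To illustrate how a host might consume these events, the following sketch folds a stream into a small summary. The `StreamEvent` union and its field names are assumptions made for this example; the actual part types emitted by `HarnessAgent` may differ, so check the package's types before relying on them.

```typescript
// Hypothetical event shapes for illustration only; real stream parts may differ.
type StreamEvent =
  | { type: 'text'; text: string }
  | { type: 'tool-call'; toolName: string; input: unknown }
  | { type: 'tool-result'; toolName: string; output: unknown }
  | { type: 'file-change'; path: string }
  | {
      type: 'finish';
      finishReason: string;
      usage: { inputTokens: number; outputTokens: number };
    };

type Summary = {
  text: string;
  toolCalls: string[];
  changedFiles: string[];
  finishReason?: string;
  totalTokens?: number;
};

async function summarize(events: AsyncIterable<StreamEvent>): Promise<Summary> {
  const summary: Summary = { text: '', toolCalls: [], changedFiles: [] };
  for await (const event of events) {
    switch (event.type) {
      case 'text':
        summary.text += event.text;
        break;
      case 'tool-call':
        summary.toolCalls.push(event.toolName);
        break;
      case 'file-change':
        summary.changedFiles.push(event.path);
        break;
      case 'finish':
        // Usage and the finish reason only arrive once, at the end of the turn.
        summary.finishReason = event.finishReason;
        summary.totalTokens =
          event.usage.inputTokens + event.usage.outputTokens;
        break;
      // 'tool-result' events are ignored in this summary.
    }
  }
  return summary;
}

// Stand-in stream used to exercise the function without a sandbox.
async function* demoEvents(): AsyncGenerator<StreamEvent> {
  yield { type: 'tool-call', toolName: 'weather', input: { city: 'Paris' } };
  yield { type: 'tool-result', toolName: 'weather', output: { celsius: 12 } };
  yield { type: 'file-change', path: 'notes.md' };
  yield { type: 'text', text: 'It is 12 degrees in Paris.' };
  yield {
    type: 'finish',
    finishReason: 'stop',
    usage: { inputTokens: 40, outputTokens: 10 },
  };
}

const summaryPromise = summarize(demoEvents());
```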

## Tool approvals

Grok Build requests approval before running a tool via the ACP
`session/request_permission` flow when `permissionMode` is `allow-reads` or
`allow-edits` (use `allow-all` to auto-approve). The adapter surfaces each
request to the host so it can be approved or rejected.

ACP approval is **synchronous**: Grok pauses the turn and waits for the reply on
the same live connection. Per the ACP specification, a prompt turn cannot be
paused and resumed later — cancellation ends it. Approval therefore only works
when the host answers inline over a connection that stays open for the whole
turn.
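
The inline reply described above can be sketched as a plain JSON-RPC response that reuses the id of the pending request. The message shapes below follow the ACP `session/request_permission` exchange as understood from the public ACP specification and should be treated as assumptions; `answerPermission` is a hypothetical helper, not an adapter API.

```typescript
// Assumed ACP shapes; verify field names against the ACP specification.
type PermissionOption = {
  optionId: string;
  name: string;
  kind: 'allow_once' | 'allow_always' | 'reject_once' | 'reject_always';
};

type PermissionRequest = {
  jsonrpc: '2.0';
  id: number;
  method: 'session/request_permission';
  params: {
    sessionId: string;
    toolCall: { toolCallId: string; title?: string };
    options: PermissionOption[];
  };
};

type PermissionResponse = {
  jsonrpc: '2.0';
  id: number;
  result: {
    outcome:
      | { outcome: 'selected'; optionId: string }
      | { outcome: 'cancelled' };
  };
};

// Hypothetical helper: builds the reply the agent is blocked waiting for.
function answerPermission(
  request: PermissionRequest,
  approve: boolean,
): PermissionResponse {
  const wanted = approve ? 'allow_once' : 'reject_once';
  const option = request.params.options.find(o => o.kind === wanted);
  return {
    jsonrpc: '2.0',
    id: request.id, // Same id: this is what ties the reply to the paused turn.
    result: {
      outcome: option
        ? { outcome: 'selected', optionId: option.optionId }
        : { outcome: 'cancelled' }, // No matching option was offered.
    },
  };
}

// Example request with made-up ids and option names.
const sampleRequest: PermissionRequest = {
  jsonrpc: '2.0',
  id: 7,
  method: 'session/request_permission',
  params: {
    sessionId: 'session-1',
    toolCall: { toolCallId: 'call-1', title: 'bash: ls' },
    options: [
      { optionId: 'allow', name: 'Allow once', kind: 'allow_once' },
      { optionId: 'reject', name: 'Reject', kind: 'reject_once' },
    ],
  },
};

const approved = answerPermission(sampleRequest, true);
const rejected = answerPermission(sampleRequest, false);
```

Because the reply must carry the id of a request that is still pending on the agent side, it has to travel over the same connection that delivered the request, which is why a second HTTP request cannot supply it.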

<Note>
This means per-tool approval works in single-stream setups (e.g. a TUI, or a
server route backed by a persistent SSE/WebSocket connection), but **not** in a
request/response HTTP route that ends one response at the approval and resumes
in a second request. The standard AI SDK `toolApproval: 'user-approval'`
split-request pattern cannot drive Grok Build approvals over plain HTTP,
because Grok's turn is mid-flight and ACP cannot resume it. For such routes,
run with `permissionMode: 'allow-all'` so the turn never blocks, or keep the
connection open for the turn's lifetime and answer approvals inline.
</Note>

## Related

@@ -0,0 +1,35 @@
import { HarnessAgent } from '@ai-sdk/harness/agent';
import { createGrokBuild } from '@ai-sdk/harness-grok-build';
import { printFullStream } from '../../lib/print-full-stream';
import { run } from '../../lib/run';
import { createVercelSandbox } from '@ai-sdk/sandbox-vercel';

run(async () => {
  const sandbox = createVercelSandbox({
    runtime: 'node24',
    ports: [4000],
    timeout: 10 * 60 * 1000,
  });
  const agent = new HarnessAgent({
    harness: createGrokBuild({ reasoningEffort: 'high' }),
    sandbox,
  });

  let exitCode = 0;
  const session = await agent.createSession();
  try {
    const result = await agent.stream({
      session,
      prompt:
        'Plan a multi-step path from A to B where A=(0,0) and B=(3,4) on a grid, moving only N/S/E/W. ' +
        'Explain your reasoning, then give the final path.',
    });
    await printFullStream({ result });
  } catch (err) {
    exitCode = 1;
    console.error('[example] failed:', err);
  } finally {
    await session.destroy();
    process.exit(exitCode);
  }
});
54 changes: 54 additions & 0 deletions examples/ai-functions/src/harness-agent/grok-build/with-tools.ts
@@ -0,0 +1,54 @@
import { HarnessAgent } from '@ai-sdk/harness/agent';
import { grokBuild } from '@ai-sdk/harness-grok-build';
import { createVercelSandbox } from '@ai-sdk/sandbox-vercel';
import { tool } from 'ai';
import { z } from 'zod';
import { printFullStream } from '../../lib/print-full-stream';
import { run } from '../../lib/run';

run(async () => {
  const sandbox = createVercelSandbox({
    runtime: 'node24',
    ports: [4000],
    timeout: 10 * 60 * 1000,
  });
  const weather = tool({
    description: 'Get the current temperature for a city.',
    inputSchema: z.object({ city: z.string() }),
    execute: async ({ city }: { city: string }) => {
      const temps: Record<string, number> = {
        Paris: 12,
        Tokyo: 18,
        Reykjavik: 3,
      };
      return { city, celsius: temps[city] ?? 20 };
    },
  });

  const agent = new HarnessAgent({
    harness: grokBuild,
    sandbox,
    tools: { weather },
    permissionMode: 'allow-all',
  });

  let exitCode = 0;
  const session = await agent.createSession();
  try {
    const result = await agent.stream({
      session,
      prompt:
        'What is the weather in Paris and Reykjavik? Use the `weather` tool, then summarize in one sentence.',
    });

    await printFullStream({ result });

    console.log('steps:', (await result.steps).length);
  } catch (err) {
    exitCode = 1;
    console.error('[example] failed:', err);
  } finally {
    await session.destroy();
    process.exit(exitCode);
  }
});
@@ -0,0 +1,45 @@
import { weatherTool } from '@/lib/tools/weather-tool';
import {
  WEATHER_CODES_REFERENCE,
  weatherCodesSkill,
  weatherForecastSkill,
  weatherInstructions,
} from '@/lib/weather-utils';
import {
  HarnessAgent,
  createFileReporter,
  createTraceTreeReporter,
} from '@ai-sdk/harness/agent';
import { grokBuild } from '@ai-sdk/harness-grok-build';
import { createVercelSandbox } from '@ai-sdk/sandbox-vercel';
import type { InferUITools, UIMessage } from 'ai';

export const weatherGrokBuildHarnessAgent = new HarnessAgent({
  harness: grokBuild,
  instructions: weatherInstructions,
  skills: [weatherForecastSkill, weatherCodesSkill],
  tools: { get_weather: weatherTool },
  sandbox: createVercelSandbox({
    runtime: 'node24',
    ports: [4000],
  }),
  onSandboxSession: async ({ session, sessionWorkDir, abortSignal }) => {
    await session.writeTextFile({
      path: `${sessionWorkDir}/weather-codes.md`,
      content: WEATHER_CODES_REFERENCE,
      abortSignal,
    });
  },
  telemetry: {
    integrations: [
      createTraceTreeReporter(),
      createFileReporter({ dir: '.harness-observability/grok-build/weather' }),
    ],
  },
});

export type WeatherGrokBuildHarnessAgentMessage = UIMessage<
  unknown,
  never,
  InferUITools<typeof weatherGrokBuildHarnessAgent.tools>
>;
@@ -0,0 +1,41 @@
import { weatherGrokBuildHarnessAgent } from '@/agent/harness/grok-build/weather-agent';
import {
  detachAndPersist,
  resumeOrCreateSession,
} from '@/util/harness-resume-store';
import {
  convertToModelMessages,
  createUIMessageStreamResponse,
  toUIMessageStream,
  type UIMessage,
} from 'ai';

export async function POST(request: Request) {
  const body: {
    id?: string;
    messages: UIMessage[];
  } = await request.json();

  if (!body.id) {
    return new Response('Missing chat id', { status: 400 });
  }
  const chatId = body.id;
  const messages = await convertToModelMessages(body.messages);

  const session = await resumeOrCreateSession(
    weatherGrokBuildHarnessAgent,
    chatId,
  );

  const result = await weatherGrokBuildHarnessAgent.stream({
    session,
    messages,
  });

  return createUIMessageStreamResponse({
    stream: toUIMessageStream({
      stream: result.stream,
      onFinish: () => detachAndPersist(chatId, session),
    }),
  });
}
19 changes: 19 additions & 0 deletions examples/harness-e2e-next/app/harness/grok-build/weather/page.tsx
@@ -0,0 +1,19 @@
import ChatIdProvider from '@/components/chat-id-provider';
import WeatherGrokBuildHarnessChat from '@/components/weather-grok-build-harness-chat';

export const metadata = {
  title: 'Grok Build — Weather',
};

const STORAGE_KEY = 'harness-grok-build-weather-chat-id';

export default function HarnessGrokBuildWeatherPage() {
  return (
    <ChatIdProvider storageKey={STORAGE_KEY}>
      <WeatherGrokBuildHarnessChat
        apiRoute="/api/harness/grok-build/weather"
        exampleLabel="Weather"
      />
    </ChatIdProvider>
  );
}
2 changes: 1 addition & 1 deletion examples/harness-e2e-next/app/page.tsx
@@ -45,7 +45,7 @@ const HARNESSES = [
  {
    slug: 'grok-build',
    label: 'Grok Build',
-    variants: ['basic', 'basic-with-stop', 'ai-sdk-coding'],
+    variants: ['basic', 'basic-with-stop', 'ai-sdk-coding', 'weather'],
  },
] as const;
