feat(tools): add defineClientTool for client-resolved (HITL) tools by keesvandorp · Pull Request #204 · vercel/eve

keesvandorp · 2026-06-23T09:11:37Z

Fixes #203.

Problem

Authored tools are required to provide an execute: both the compiler
(normalizeToolDefinition → expectFunction(record.execute)) and the runtime
(resolveToolDefinition → expectFunction(resolvedRecord.execute)) reject a
tool without one. That makes it impossible to author a tool that participates in
the human-in-the-loop input flow the way the built-in ask_question does — no
executor, the model emits the call, the harness parks it, and the user's answer
becomes its single tool_result.

The practical consequence (from #203): overriding ask_question to widen its
(fixed, .strict()) input schema for typed HITL pickers forces an execute,
whose auto-result collides with the input response. The resumed turn then
carries two tool_result blocks for one tool_use id and the provider
rejects it:

each tool_use must have a single result. Found multiple `tool_result` blocks with id: toolu_…
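
The collision can be pictured as a check over the history that is about to be sent on resume. The sketch below is only an illustration: the block shapes and the `findResultViolations` helper are hypothetical and are not eve's or any provider's real types.

```typescript
// Hypothetical, simplified block shapes (assumptions for illustration only).
type ToolUseBlock = { type: "tool_use"; id: string; name: string };
type ToolResultBlock = { type: "tool_result"; toolUseId: string; content: string };
type Block = ToolUseBlock | ToolResultBlock;

// Returns the ids of tool_use calls that do not have exactly one tool_result.
// Intended for a resumed turn, where every earlier call should be settled.
function findResultViolations(history: Block[]): string[] {
  const counts = new Map<string, number>();
  for (const block of history) {
    if (block.type === "tool_use") {
      // Register the call so that a missing result is also reported.
      counts.set(block.id, counts.get(block.id) ?? 0);
    } else {
      counts.set(block.toolUseId, (counts.get(block.toolUseId) ?? 0) + 1);
    }
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n !== 1)
    .map(([id]) => id);
}

// The failure mode from #203: the executor's auto-result and the user's
// answer both land on the same call id.
const brokenHistory: Block[] = [
  { type: "tool_use", id: "call_1", name: "ask_question" },
  { type: "tool_result", toolUseId: "call_1", content: "auto-result from execute" },
  { type: "tool_result", toolUseId: "call_1", content: "user's answer" },
];

// With no executor, only the user's answer is recorded.
const fixedHistory: Block[] = [
  { type: "tool_use", id: "call_1", name: "ask_question" },
  { type: "tool_result", toolUseId: "call_1", content: "user's answer" },
];
```

Under these assumptions `findResultViolations(brokenHistory)` reports `call_1`, while `fixedHistory` yields no violations.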

Change

Add defineClientTool({ description, inputSchema, outputSchema? }) — an
authored tool with no execute, stamped clientResolved: true. eve never
runs it; the call parks for input and resolves out-of-band, producing exactly
one result.

internal/authored-definition/schema-backed.ts — allow omitting execute
when clientResolved; every other tool still requires it.
runtime/resolve-tool.ts — skip reattaching a live execute for
client-resolved tools.
public/definitions/tool.ts — defineClientTool + ClientToolDefinition;
passing execute throws.
public/tools/index.ts — export from eve/tools.

No harness change is needed: the runtime already surfaces executeless tools as
client-side (buildToolSet / wrapToolExecute return undefined) and
ResolvedToolDefinition.execute is already Optional. This PR just lets
authored tools reach that existing path. defineTool is unchanged and still
requires execute.
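
As a rough sketch of the authoring helper described above (not the code in public/definitions/tool.ts): the name `defineClientToolSketch` and the function body are assumptions, and only the `clientResolved: true` stamp and the throw on `execute` come from this description.

```typescript
// Sketch only: field names mirror the PR description, the body is an assumption.
type ClientToolDefinition<TInput> = {
  readonly description: string;
  readonly inputSchema: TInput;
  readonly outputSchema?: unknown;
  readonly clientResolved: true;
};

function defineClientToolSketch<TInput>(config: {
  description: string;
  inputSchema: TInput;
  outputSchema?: unknown;
}): ClientToolDefinition<TInput> {
  // A caller that bypasses the type system could still pass execute;
  // reject it at authoring time instead of dropping it silently.
  if ("execute" in config) {
    throw new TypeError("client-resolved tools must not define execute");
  }
  // Stamp the marker; no execute is ever attached.
  return Object.freeze({ ...config, clientResolved: true as const });
}
```

The point of the marker is that downstream validation can tell "deliberately executeless" apart from "forgot to write execute".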

Authoring agent/tools/ask_question.ts with defineClientTool overrides the
built-in question tool with a wider, typed schema while keeping native
pause/resume — the parked input.requested carries the full typed input, so a
client can render a dedicated widget from it.

import { defineClientTool } from "eve/tools";
import { z } from "zod";

export default defineClientTool({
  description: "Ask the user to pick a template.",
  inputSchema: z.object({
    prompt: z.string(),
    ui: z.object({ kind: z.literal("template_picker") }).passthrough(),
  }),
});
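
On the client side, the typed input could drive widget selection roughly as follows. This is a hedged sketch: the `ParkedInput` shape and the widget names are hypothetical, and the real input.requested payload may be structured differently.

```typescript
// Hypothetical payload shape for a parked call (assumption, not eve's type).
type ParkedInput = {
  callId: string;
  input: { prompt: string; ui: { kind: string; [key: string]: unknown } };
};

// Chooses a widget name from the typed `ui.kind` carried by the parked input.
function pickWidget(parked: ParkedInput): string {
  switch (parked.input.ui.kind) {
    case "template_picker":
      return "TemplatePicker";
    default:
      // Unknown kinds fall back to a plain question prompt.
      return "GenericQuestion";
  }
}
```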

Tests

defineClientTool brands the definition, marks it clientResolved, carries
no execute, and throws when execute is supplied.
normalizeToolDefinition accepts a client-resolved tool without execute and
still rejects a non-client tool that omits it.
pnpm --filter eve typecheck / oxlint / unit tests green.

Verified end-to-end in a downstream app (Next.js + useEveAgent): an
executeless ask_question override parks (input.requested → session.waiting),
resumes from the user's structured answer, produces a single tool_result, and
the turn continues — no duplicate-result 400.

Notes

Scope is intentionally minimal (issue #203, solution 1). Generalising the
input-extraction park to executeless tools with arbitrary names (so HITL
tools needn't be named ask_question) is a natural follow-up, kept out of
this PR to keep the primitive focused.

vercel · 2026-06-23T09:11:42Z

@keesvandorp is attempting to deploy a commit to the Vercel Team on Vercel.

A member of the Team first needs to authorize it.

keesvandorp · 2026-06-23T11:39:41Z

Thanks — addressed in d839f7b.

Reject mixed shapes (compiler + runtime). A client-resolved tool that also defines execute now throws at both normalizeToolDefinition (compile) and resolveToolDefinition (runtime), rather than silently dropping the executor. The non-client-without-marker direction was already rejected. (defineClientTool still throws at authoring time too.) Unit-tested.
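
The shape rule both checks enforce can be sketched as a small classifier. `checkToolShape` and `AuthoredRecord` are hypothetical names for illustration; the real checks live in normalizeToolDefinition and resolveToolDefinition and may be written differently.

```typescript
// Hypothetical record shape: only the two fields relevant to the rule.
type AuthoredRecord = { clientResolved?: boolean; execute?: unknown };

// Classifies a tool record, throwing on the two invalid combinations.
function checkToolShape(
  name: string,
  record: AuthoredRecord,
): "client-resolved" | "executable" {
  if (record.clientResolved === true) {
    // Mixed shape: fail loudly rather than dropping the executor.
    if (record.execute !== undefined) {
      throw new Error(`${name}: client-resolved tool must not define execute`);
    }
    return "client-resolved";
  }
  // Every other tool still needs a callable execute.
  if (typeof record.execute !== "function") {
    throw new Error(`${name}: execute must be a function`);
  }
  return "executable";
}
```

Of the four combinations of marker and executor, exactly two are accepted: marker without `execute`, and `execute` without marker.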

Regression 1 — the one that matters. New e2e/fixtures/agent-tools-hitl eval client-resolved-question: an authored, widened ask_question (defineClientTool + typed ui) parks, resumes from a structured answer, and continues into a downstream note tool. The single-result invariant is asserted operationally — before the fix the authored override carried an execute, so the resume reconstructed two tool_result blocks for one id and the provider 400'd; a green expectOk() + downstream calledTool('note') + completed() can only happen with exactly one result for the call.

Regression 2 — separation. New eval approval-vs-client-resolved: guarded-echo (approval-gated executable) parks for approval and resolves via its executor (token in the result); ask_question (client-resolved) parks for input and the user's answer is the result, no executor. Same parking machinery, opposite result sources.
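
The "opposite result sources" distinction can be sketched like this. All types and the `settle` function are hypothetical and only model the idea; they are not the harness's parking implementation.

```typescript
// Hypothetical models of a parked call and the user's response (assumptions).
type ParkedCall =
  | { reason: "approval"; callId: string; input: string; execute: (input: string) => string }
  | { reason: "input"; callId: string };
type UserResponse = { kind: "approve" } | { kind: "answer"; value: string };
type ToolResult = { callId: string; content: string; source: "executor" | "client" };

// Produces the single result for a parked call, bound to the same call id.
function settle(parked: ParkedCall, response: UserResponse): ToolResult {
  if (parked.reason === "approval") {
    if (response.kind !== "approve") {
      throw new Error("approval-gated call expects an approval");
    }
    // Approval only permits the executor to run; the executor supplies the result.
    return { callId: parked.callId, content: parked.execute(parked.input), source: "executor" };
  }
  if (response.kind !== "answer") {
    throw new Error("client-resolved call expects an answer");
  }
  // No executor exists; the user's answer is the result.
  return { callId: parked.callId, content: response.value, source: "client" };
}
```

Either branch returns one result per call id, which is the invariant the evals assert operationally.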

Verified: eve typecheck + unit tests (incl. the mixed-shape rejection) + oxlint; the HITL fixture typechecks and eve builds with the override. I haven't run the evals themselves (they need a gateway) — they mirror the existing ask-question-select / approve-then-no-regate evals.

rpelevin · 2026-06-23T12:40:46Z

Nice update. The important thing I would preserve before merge is that the mixed-shape rejection and the resume regression stay paired.

The compiler and runtime checks close the construction path, but the fixture is what proves the runtime history is actually reconstructing one result for the original parked call.

The remaining review lens I would use is:

1. The client-resolved marker is the only path that allows no executor.
2. Any exported execute on that shape fails before it can be silently dropped.
3. The resumed structured input is bound to the same call id.
4. The reconstructed provider history contains exactly one result for that call.
5. The approval-gated executable path still proves the opposite source of truth: approval permits the executor to run; client input supplies the result.

If the gateway-backed evals cannot run in ordinary CI, I would at least keep the fixture build/typecheck plus the mixed-shape unit tests as merge blockers, and treat live eval execution as release evidence before closing the original issue.

Boundary: architecture and test feedback only; no claim about using this project or running its code.

keesvandorp · 2026-06-23T13:19:04Z

Agreed on all five, and on keeping the rejection and the resume regression paired — the compiler/runtime checks guard construction, the fixture proves the reconstructed history settles to one result for the parked call. Neither substitutes for the other.

The gating maps cleanly onto the existing CI:

Deterministic merge blockers (no gateway): ci.yml runs lint → typecheck → test-unit (pnpm test:unit, where the mixed-shape rejection tests live) → test-integration on every PR. That covers invariants 1–2 (only the marker permits no executor; an exported execute on that shape fails, not silently dropped) plus the construction surface.
Release evidence (gateway-backed): e2e-local.yml triggers on the e2e/** change. It builds eve + the HITL fixture (so construction errors still gate there), then runs eve eval --strict. The eval execution (invariants 3–5: same-call-id binding, exactly-one-result on resume, and the approval path proving the opposite source of truth) needs a model gateway, so I'd treat a green eval run as the evidence before closing #203 rather than a PR gate.

Locally verified on the branch: pnpm --filter eve typecheck + test:unit (incl. the mixed-shape rejection) + oxlint, and the agent-tools-hitl fixture both tsc-typechecks and eve builds with the defineClientTool override.

If it'd help, I'm happy to split the fixture's typecheck/build into an explicit gateway-free job so the construction guarantee gates independently of eval execution — just say the word.

rpelevin · 2026-06-23T14:02:47Z

Yes, I would split that gateway-free job.

The useful boundary is:

A merge-blocking deterministic job proves construction: client-resolved tools can omit execute, mixed client-resolved plus execute fails, and the authored HITL fixture typechecks and builds.
Gateway-backed eval remains release evidence: it proves resumed history creates exactly one result for the parked call and keeps approval-gated executable output separate from client-resolved input.
Closing the original bug should wait for both pieces: deterministic construction gate green, and strict eval evidence green.

That split keeps CI fast while preserving the invariant that no executor-less path ships without a concrete fixture compiled against the authored surface.

I would make the job name explicit enough that future maintainers know what it protects, for example client-resolved-hitl-construction, and keep it pinned to the authored ask_question override plus the approval-vs-client-resolved fixture build.

Boundary: architecture and test feedback only; no claim about using this project or running its code.

Splits the construction contract from the gateway-backed evals (per review on vercel#204).

New merge-blocking job proves, with no model gateway:
- the client-resolved omit-execute + mixed-shape rejection unit guards, and
- that the authored HITL fixture (ask_question override + approval-vs-client-resolved fixture) typechecks and builds against the authored surface.

The gateway-backed `eve eval` (e2e-local / e2e-vercel) stays as runtime release evidence (single result on resume; approval-gated execution kept separate from client-resolved input). Keeps CI fast while guaranteeing no executor-less path ships without a fixture compiled against it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kees van Dorp <keesvandorp@me.com>

keesvandorp · 2026-06-23T15:03:12Z

Done — split out as a dedicated merge-blocking job in 32782ea: client-resolved-hitl-construction (in ci.yml, gateway-free).

It proves construction, deterministically:

the omit-execute + mixed-shape rejection unit guards (define-client-tool + schema-backed), and
pnpm --filter agent-tools-hitl run typecheck (eve build && tsc) — so the authored ask_question override and the approval-vs-client-resolved fixture must compile and build against the authored surface.

The gateway-backed eve eval (e2e-local / e2e-vercel) stays as runtime release evidence: exactly one result on resume, and approval-gated execution kept separate from client-resolved input. So closing #203 waits for both — construction gate green here, strict eval green as release evidence.

Job name is intentionally explicit and pinned to that fixture pair, with a header comment stating what it protects, so it's legible to future maintainers. Verified locally: unit guards 18/18, and the fixture eve build && tsc clean.

Authored tools previously had to provide an `execute` (the compiler's `normalizeToolDefinition` and the runtime's `resolveToolDefinition` both called `expectFunction(execute)`). That made it impossible to author a human-in-the-loop tool the way the built-in `ask_question` works: no executor, the call parks for input and resolves out-of-band. Overriding `ask_question` to widen its input schema forced an `execute`, whose auto-result collided with the input response: two `tool_result` blocks for one `tool_use` id, which the provider rejects on resume ("each tool_use must have a single result").

Add `defineClientTool({ description, inputSchema, outputSchema? })`, which stamps `clientResolved: true` and carries no `execute`:
- normalize-tool / schema-backed: allow omitting `execute` when `clientResolved`; every other tool still requires it.
- resolve-tool: skip reattaching a live `execute` for client-resolved tools.
- The runtime already surfaces executeless tools as client-side (buildToolSet / wrapToolExecute return undefined), so no harness change is needed; the resolved definition's `execute` is already Optional.

`defineTool` is unchanged and still requires `execute`. Passing `execute` to `defineClientTool` throws.

Fixes vercel#203

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kees van Dorp <keesvandorp@me.com>

…ressions

Address review feedback on the defineClientTool contract:
- Reject mixed shapes at BOTH the compiler (normalize-tool) and runtime (resolve-tool): a client-resolved tool that also defines `execute` now throws instead of silently dropping the executor. (A non-client tool that omits `execute` was already rejected.)
- e2e HITL fixture regressions:
  - client-resolved-question: an authored, widened `ask_question` (defineClientTool + typed `ui`) parks, resumes from a structured answer, and continues into a downstream `note` tool, with exactly one tool_result for the parked call id. Before the fix this resume 400'd ("each tool_use must have a single result"); a green resume + downstream call proves the single result.
  - approval-vs-client-resolved: proves executable-with-approval and client-resolved input are separate paths: approval runs the executor; client input supplies the result.

Verified: eve typecheck + unit tests (incl. the mixed-shape rejection) + oxlint; the HITL fixture typechecks (tsc) and `eve build`s with the override.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Kees van Dorp <keesvandorp@me.com>

Public API added in this PR needs docs (per CONTRIBUTING). Add a "Custom client-resolved tools" section to the human-in-the-loop page covering defineClientTool: no execute, the ask_question override for typed pickers, the parked-input contract, and that defineTool/defineClientTool are mutually exclusive (exactly one result). Cross-link from the tools overview. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kees van Dorp <keesvandorp@me.com>

keesvandorp mentioned this pull request Jun 23, 2026

No way to author a HITL / client-resolved tool: authored tools require execute, which yields a duplicate tool_result when the same call parks for input #203

Open

keesvandorp force-pushed the feat/client-resolved-tools branch from 06c8930 to 4bc13be Compare June 23, 2026 09:18

keesvandorp force-pushed the feat/client-resolved-tools branch from 32782ea to c1f8a63 Compare June 23, 2026 15:07

keesvandorp and others added 4 commits June 23, 2026 17:20

keesvandorp force-pushed the feat/client-resolved-tools branch from 6e8a3b2 to 6b54d8f Compare June 23, 2026 15:20

ya5huk mentioned this pull request Jun 24, 2026

needsApproval on an authored (local) executable tool crashes on approve-resume with the OpenAI Responses provider: No tool output found for function call #236

Open

keesvandorp wants to merge 4 commits into vercel:main from keesvandorp:feat/client-resolved-tools
