Feat/cursor agent add Cursor Agent CLI support #152
- Implement CursorAgent (src/core/agents/cursor.ts) that spawns cursor-agent with -p --output-format stream-json --force --trust, appends a JSON-schema final-output contract to the prompt, parses assistant text events (with fallback to the terminal result event), and recovers fenced or prose-wrapped JSON via parseAgentJson.
- Parse the cursor-agent result.usage field for input/output/cached token totals so per-iteration usage shows up in the renderer on par with the other native agents.
- Detect "invalid model" exits as PermanentAgentError so misconfigured models abort immediately instead of triggering the retry loop, using a new classifyExitError hook on setupChildProcessHandlers that other agents can reuse.
- Wire CursorAgent into the agent factory, register cursor in AGENT_NAMES, reserve cursor's gnhf-managed CLI flags (-p/--print/--output-format/--stream-partial-output/--resume/--continue/--workspace/--api-key) in config validation, and refresh the YAML bootstrap template.
- Add unit tests for CursorAgent (spawn args, schema, Windows shell detection, abort handling, JSON parsing happy paths and recovery, usage extraction, permanent-error classification), plus factory and config cases for cursor and stream-utils coverage for the new classifyExitError hook.
- Update README.md and AGENTS.md to document the cursor agent.

Co-authored-by: Cursor <cursoragent@cursor.com>
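To illustrate what "recovers fenced or prose-wrapped JSON" could mean in practice, here is a minimal sketch. It is not the PR's `parseAgentJson`; the name `recoverJson`, the `FENCE` constant, and the three-step fallback order (strict parse, then fenced block, then outermost braces) are assumptions made for illustration.

```typescript
// Hypothetical helper for illustration; NOT the actual parseAgentJson from the PR.
// Built with repeat() so this snippet does not contain a literal code fence.
const FENCE = "`".repeat(3);

function recoverJson(raw: string): unknown {
  const text = raw.trim();

  // 1. The happy path: the agent returned bare JSON.
  try {
    return JSON.parse(text);
  } catch {
    // Not bare JSON; fall through to the recovery steps.
  }

  // 2. JSON wrapped in a fenced block, with or without a "json" language tag.
  const fenced = text.match(
    new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i"),
  );
  if (fenced && fenced[1] !== undefined) {
    try {
      return JSON.parse(fenced[1].trim());
    } catch {
      // Fence contents were not valid JSON; try the next strategy.
    }
  }

  // 3. JSON embedded in prose: take the span from the first "{" to the last "}".
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start !== -1 && end > start) {
    try {
      return JSON.parse(text.slice(start, end + 1));
    } catch {
      // Still not parseable; give up below.
    }
  }

  throw new Error("no JSON object found in agent output");
}
```

The brace-span step is deliberately crude: it can fail when the prose itself contains braces, which is one reason a real implementation may be more careful than this sketch.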
cursor-agent's stream-json output only emits authoritative token counts on the terminal `result.usage` event, which never arrives if the run is aborted and only lands at the end of a successful run. That left the gnhf renderer stuck at zero tokens while the preview kept showing thinking deltas, assistant messages, and tool calls.

Port the same live-estimation strategy ACP uses:

- Seed an initial prompt-only input estimate as soon as the iteration starts so the renderer is non-zero immediately.
- Grow the input estimate by 2000 tokens per `tool_call:started` event (the heuristic ACP uses; tool results feed back as input on the next round).
- Grow the output estimate from accumulated character counts across `thinking` deltas and `assistant` text events (chars / 4).
- Flag every live update with `estimated: true` so the renderer prefixes the display with "~" (existing convention shared with ACP).
- Graduate to authoritative numbers when `result.usage` arrives and drop the estimated flag on the final onUsage callback; if the run aborts or errors before that, the resolved usage stays the final estimate rather than a stuck zero.

Update the cursor tests to cover the new live-update behavior: prompt-seed estimate, tool-call growth (with `completed` not double counting), thinking-delta growth (with `completed` without `text` not firing), preservation of estimates when `result.usage` is empty, errored-result behavior, and the authoritative-result happy path now correctly drops the estimated flag on the last call.

Co-authored-by: Cursor <cursoragent@cursor.com>
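The estimation rules in this commit can be sketched roughly as follows. This is an illustrative sketch, not the code in `cursor.ts`: the `LiveUsageEstimator` class, the `StreamEvent` shape (`type`, `subtype`, `text`), and the use of chars / 4 with rounding up for the prompt seed are assumptions. Only the 2000-tokens-per-tool-call and chars / 4 output heuristics come from the commit message.

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Assumed event shape; the real stream-json events may carry different fields.
interface StreamEvent {
  type: string;
  subtype?: string;
  text?: string;
}

const TOKENS_PER_TOOL_CALL = 2000; // heuristic named in the commit message
const CHARS_PER_TOKEN = 4; // chars / 4, per the commit message (assumed for the seed too)

class LiveUsageEstimator {
  private inputTokens: number;
  private outputChars = 0;

  constructor(prompt: string) {
    // Seed a prompt-only estimate so the counter is non-zero from the start.
    this.inputTokens = Math.ceil(prompt.length / CHARS_PER_TOKEN);
  }

  // Fold one stream event into the running estimate and return the new totals.
  observe(event: StreamEvent): Usage {
    if (event.type === "tool_call" && event.subtype === "started") {
      // Count only "started" so the matching "completed" event is not double counted.
      this.inputTokens += TOKENS_PER_TOOL_CALL;
    } else if (
      (event.type === "thinking" || event.type === "assistant") &&
      typeof event.text === "string"
    ) {
      // Events without text (for example a thinking "completed" marker) add nothing.
      this.outputChars += event.text.length;
    }
    return this.snapshot();
  }

  snapshot(): Usage {
    return {
      inputTokens: this.inputTokens,
      outputTokens: Math.ceil(this.outputChars / CHARS_PER_TOKEN),
    };
  }

  // Prefer authoritative numbers when they arrive; otherwise keep the estimate.
  finalize(resultUsage?: Usage): Usage {
    return resultUsage ?? this.snapshot();
  }
}
```

A caller would invoke `observe` for each parsed stream line and forward the returned totals to its usage callback, then call `finalize` with whatever the terminal result event reported.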
The estimated:true flag was making cursor's token counts render with a "~" prefix in the renderer, which made it visually inconsistent with every other native agent (claude, codex, copilot, opencode, rovodev, pi). They all just report whatever their CLI gives them, including zeros for fields the CLI omits.

Keep the live estimation logic (prompt-token seed, +2000 tokens per tool_call:started, char-derived output tokens from thinking and assistant deltas) so the counters still climb during a run, but stop flagging the values as estimated. cursor now matches the display convention of the other native CLI agents; ACP remains the only place that uses the flag, since its adapters can be entirely heuristic.

Update tests to assert estimated is undefined on every onUsage call and on the resolved usage, across the seed, tool-call growth, thinking-delta growth, empty-usage, errored-result, and live-estimate paths.

Co-authored-by: Cursor <cursoragent@cursor.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3a11d6058b
    arg === "--yolo" ||
    arg === "--sandbox" ||
    arg.startsWith("--sandbox=") ||
    arg === "--trust",
Do not disable default force mode when only --trust is set
Including --trust in userSpecifiedPermissionMode makes gnhf drop its default --force flag whenever users pass only --trust via agentArgsOverride.cursor. Cursor docs distinguish these: command/file execution still needs force mode in print/headless runs, while trust only bypasses workspace-trust prompts. In this path, iterations can lose command/write permission and fail or stall in non-interactive runs even though the user only intended to skip trust prompts.
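One way to address this, sketched under assumptions: treat `--trust` as independent of the permission mode, so passing it alone does not suppress the default `--force`. The function names below and the inclusion of `--force` in the detection list are hypothetical; only `--yolo`, `--sandbox`, `--sandbox=`, and `--trust` appear in the reviewed excerpt.

```typescript
// Hypothetical sketch; names and structure are not taken from the PR.
// True when the user picked a permission mode themselves.
// Note that "--trust" is intentionally absent from this list.
function userSpecifiedPermissionMode(args: readonly string[]): boolean {
  return args.some(
    (arg) =>
      arg === "--force" || // assumed: an explicit --force also counts as a choice
      arg === "--yolo" ||
      arg === "--sandbox" ||
      arg.startsWith("--sandbox="),
  );
}

// Default flags to add alongside the user's own arguments.
function defaultPermissionFlags(userArgs: readonly string[]): string[] {
  const flags: string[] = [];
  if (!userSpecifiedPermissionMode(userArgs)) {
    // Keep force mode so headless runs can still execute commands and write files.
    flags.push("--force");
  }
  if (!userArgs.includes("--trust")) {
    // Workspace trust is handled separately from the permission mode.
    flags.push("--trust");
  }
  return flags;
}
```

With this split, `defaultPermissionFlags(["--trust"])` still yields `--force`, while a user who passes `--sandbox` keeps control of the permission mode.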
What Changed
Risk Assessment
✅ Low: The change is well-bounded to adding a new Cursor native agent path, with config/factory wiring and focused tests covering argument construction, stream parsing, error classification, and usage reporting.
Testing
export PATH=/opt/homebrew/bin:/usr/local/bin:$PATH; CI=true pnpm install --frozen-lockfile && pnpm test
Pipeline
Updates from git push no-mistakes
⏭️ **intent** - skipped
Round 1 - passed ✅
✅ **Rebase** - passed
Round 1 - passed ✅
✅ **Review** - passed
Round 1 - passed ✅
✅ **Test** - passed
Round 1 - passed ✅
export PATH=/opt/homebrew/bin:/usr/local/bin:$PATH; CI=true pnpm install --frozen-lockfile && pnpm test
🔧 **Document** - 3 issues found → auto-fixed
Round 1 - found 3 warnings
- `skills/gnhf/SKILL.md:63` - The bundled GNHF skill's launch example omits the new `cursor` agent from the supported `--agent` choices. Since this skill is shipped in the npm package as agent-facing usage documentation, it should list `cursor` alongside the other native agents.
- `skills/gnhf/SKILL.md:152` - The Morning Review process search does not include `cursor` or `cursor-agent`, so agents following the skill may miss an active Cursor-backed GNHF run. Update the `pgrep` pattern to include the new Cursor agent process.
- `skills/gnhf/SKILL.md:161` - The Agent selection guidance has entries for every existing native agent except the new Cursor Agent CLI. Add a `cursor` entry describing when to choose it, consistent with the README's new Cursor support.
Round 2 (auto-fix) - passed ✅
✅ **Lint** - passed
Round 1 - passed ✅
✅ **Push** - passed
Round 1 - passed ✅