Feat/cursor agent add Cursor Agent CLI support #152

Open
gizm0duck wants to merge 13 commits into kunchenguid:main from gizm0duck:feat/cursor-agent

@gizm0duck

What Changed

  • Added Cursor Agent CLI as a native agent option, including factory/config wiring, reserved API-key alias handling, streaming JSON parsing, assistant delta accumulation, result-output fallback, and token usage reporting.
  • Added focused coverage for Cursor agent behavior, factory selection, config validation, and shared stream utilities.
  • Updated README, AGENTS guidance, and the bundled GNHF skill to document Cursor support and include Cursor-backed runs in review/process guidance.

Risk Assessment

✅ Low: The change is well-bounded to adding a new Cursor native agent path, with config/factory wiring and focused tests covering argument construction, stream parsing, error classification, and usage reporting.

Testing

  • export PATH=/opt/homebrew/bin:/usr/local/bin:$PATH; CI=true pnpm install --frozen-lockfile && pnpm test
  • Outcome: ✅ passed across 1 run (29.0s)

Pipeline

Updates from git push no-mistakes

⏭️ **intent** - skipped

Round 1 - passed ✅

✅ **Rebase** - passed

Round 1 - passed ✅

✅ **Review** - passed

Round 1 - passed ✅

✅ **Test** - passed

Round 1 - passed ✅

  • export PATH=/opt/homebrew/bin:/usr/local/bin:$PATH; CI=true pnpm install --frozen-lockfile && pnpm test

🔧 **Document** - 3 issues found → auto-fixed

Round 1 - found 3 warnings

  • ⚠️ skills/gnhf/SKILL.md:63 - The bundled GNHF skill's launch example omits the new cursor agent from the supported --agent choices. Since this skill is shipped in the npm package as agent-facing usage documentation, it should list cursor alongside the other native agents.
  • ⚠️ skills/gnhf/SKILL.md:152 - The Morning Review process search does not include cursor or cursor-agent, so agents following the skill may miss an active Cursor-backed GNHF run. Update the pgrep pattern to include the new Cursor agent process.
  • ⚠️ skills/gnhf/SKILL.md:161 - The Agent selection guidance has entries for every existing native agent except the new Cursor Agent CLI. Add a cursor entry describing when to choose it, consistent with the README's new Cursor support.

Round 2 (auto-fix) - passed ✅

✅ **Lint** - passed

Round 1 - passed ✅

✅ **Push** - passed

Round 1 - passed ✅

gizm0duck and others added 11 commits May 22, 2026 08:18
- Implement CursorAgent (src/core/agents/cursor.ts) that spawns
  cursor-agent with -p --output-format stream-json --force --trust,
  appends a JSON-schema final-output contract to the prompt, parses
  assistant text events (with fallback to the terminal result event),
  and recovers fenced or prose-wrapped JSON via parseAgentJson.
- Parse the cursor-agent result.usage field for input/output/cached
  token totals so per-iteration usage shows up in the renderer on par
  with the other native agents.
- Detect "invalid model" exits as PermanentAgentError so misconfigured
  models abort immediately instead of triggering the retry loop, using
  a new classifyExitError hook on setupChildProcessHandlers that other
  agents can reuse.
- Wire CursorAgent into the agent factory, register cursor in
  AGENT_NAMES, reserve cursor's gnhf-managed CLI flags
  (-p/--print/--output-format/--stream-partial-output/--resume/
  --continue/--workspace/--api-key) in config validation, and refresh
  the YAML bootstrap template.
- Add unit tests for CursorAgent (spawn args, schema, Windows shell
  detection, abort handling, JSON parsing happy paths and recovery,
  usage extraction, permanent-error classification), plus factory and
  config cases for cursor and stream-utils coverage for the new
  classifyExitError hook.
- Update README.md and AGENTS.md to document the cursor agent.
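
For reviewers unfamiliar with the recovery step in the first bullet, here is a minimal sketch of how fenced or prose-wrapped JSON might be recovered. It is not the project's parseAgentJson: the helper name `recoverJson`, the `FENCE` constant, and the order in which candidates are tried are all assumptions made for illustration.

```typescript
// Hypothetical helper, NOT the project's parseAgentJson.
// FENCE is built at runtime so this sketch contains no literal code fence.
const FENCE = "`".repeat(3);

function recoverJson(raw: string): unknown {
  // Candidate 1: the whole output, for agents that return clean JSON.
  const candidates: string[] = [raw.trim()];

  // Candidate 2: the body of the first fenced block (optionally tagged json).
  const fencePattern = new RegExp(
    FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)" + FENCE,
    "i",
  );
  const inner = fencePattern.exec(raw)?.[1];
  if (inner !== undefined) candidates.push(inner.trim());

  // Candidate 3: the widest {...} span, for JSON wrapped in prose.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start !== -1 && end > start) candidates.push(raw.slice(start, end + 1));

  for (const candidate of candidates) {
    try {
      return JSON.parse(candidate);
    } catch {
      // Not valid JSON; fall through to the next, looser candidate.
    }
  }
  throw new Error("no parseable JSON found in agent output");
}
```

Trying the strictest candidate first keeps well-formed output on the fast path; the brace-span fallback only runs when the agent wraps its answer in prose.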

Co-authored-by: Cursor <cursoragent@cursor.com>

cursor-agent's stream-json output only emits authoritative token counts
on the terminal `result.usage` event, which never arrives if the run is
aborted and only lands at the end of a successful run. That left the
gnhf renderer stuck at zero tokens while the preview kept showing
thinking deltas, assistant messages, and tool calls.

Port the same live-estimation strategy ACP uses:

- Seed an initial prompt-only input estimate as soon as the iteration
  starts so the renderer is non-zero immediately.
- Grow the input estimate by 2000 tokens per `tool_call:started` event
  (the heuristic ACP uses; tool results feed back as input on the next
  round).
- Grow the output estimate from accumulated character counts across
  `thinking` deltas and `assistant` text events (chars / 4).
- Flag every live update with `estimated: true` so the renderer
  prefixes the display with "~" (existing convention shared with ACP).
- Graduate to authoritative numbers when `result.usage` arrives and
  drop the estimated flag on the final onUsage callback; if the run
  aborts or errors before that, the resolved usage stays the final
  estimate rather than a stuck zero.
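
The strategy in the list above can be sketched roughly as follows. Only the 2000-token tool-call heuristic and the chars / 4 output ratio come from the commit message. The event shapes, the `LiveUsageEstimator` class, and the chars / 4 prompt seed are assumptions for illustration, not the actual cursor-agent stream-json schema or the code in src/core/agents/cursor.ts. The `estimated` flag is left out for brevity.

```typescript
// Illustrative only: these event shapes are assumed, not cursor-agent's schema.
type StreamEvent =
  | { type: "tool_call"; subtype: "started" | "completed" }
  | { type: "thinking"; text?: string }
  | { type: "assistant"; text: string }
  | { type: "result"; usage?: Usage };

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

const TOKENS_PER_TOOL_CALL = 2000; // heuristic named in the commit message
const CHARS_PER_TOKEN = 4; // chars / 4, as in the commit message

class LiveUsageEstimator {
  private inputTokens: number;
  private outputChars = 0;
  private authoritative: Usage | undefined = undefined;
  private readonly onUsage: (usage: Usage) => void;

  constructor(prompt: string, onUsage: (usage: Usage) => void) {
    this.onUsage = onUsage;
    // Seed a prompt-only estimate (assumed here to use chars / 4 as well).
    this.inputTokens = Math.ceil(prompt.length / CHARS_PER_TOKEN);
    this.onUsage(this.current());
  }

  // Authoritative totals win once they exist; otherwise report the estimate.
  current(): Usage {
    return (
      this.authoritative ?? {
        inputTokens: this.inputTokens,
        outputTokens: Math.ceil(this.outputChars / CHARS_PER_TOKEN),
      }
    );
  }

  handle(event: StreamEvent): void {
    if (event.type === "tool_call") {
      if (event.subtype !== "started") return; // "completed" must not double count
      this.inputTokens += TOKENS_PER_TOOL_CALL;
    } else if (event.type === "thinking" || event.type === "assistant") {
      if (!event.text) return; // nothing to count, so no update fires
      this.outputChars += event.text.length;
    } else {
      if (!event.usage) return; // empty result.usage: keep the last estimate
      this.authoritative = event.usage;
    }
    this.onUsage(this.current());
  }
}
```

Because `current()` falls back to the running estimate, a run that aborts before the result event still resolves to the last estimate instead of zero.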

Update the cursor tests to cover the new live-update behavior:
prompt-seed estimate, tool-call growth (with `completed` not double
counting), thinking-delta growth (with `completed` without `text` not
firing), preservation of estimates when `result.usage` is empty,
errored-result behavior, and the authoritative-result happy path now
correctly drops the estimated flag on the last call.

Co-authored-by: Cursor <cursoragent@cursor.com>

The estimated:true flag was making cursor's token counts render with a
"~" prefix in the renderer, which made it visually inconsistent with
every other native agent (claude, codex, copilot, opencode, rovodev,
pi). They all just report whatever their CLI gives them, including
zeros for fields the CLI omits.

Keep the live estimation logic (prompt-token seed, +2000 tokens per
tool_call:started, char-derived output tokens from thinking and
assistant deltas) so the counters still climb during a run, but stop
flagging the values as estimated. cursor now matches the display
convention of the other native CLI agents; ACP remains the only place
that uses the flag, since its adapters can be entirely heuristic.

Update tests to assert estimated is undefined on every onUsage call
and on the resolved usage, across the seed, tool-call growth,
thinking-delta growth, empty-usage, errored-result, and live-estimate
paths.

Co-authored-by: Cursor <cursoragent@cursor.com>

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3a11d6058b

Comment thread src/core/agents/cursor.ts
arg === "--yolo" ||
arg === "--sandbox" ||
arg.startsWith("--sandbox=") ||
arg === "--trust",

P1: Do not disable default force mode when only --trust is set

Including --trust in userSpecifiedPermissionMode makes gnhf drop its default --force flag whenever users pass only --trust via agentArgsOverride.cursor. Cursor docs distinguish these: command/file execution still needs force mode in print/headless runs, while trust only bypasses workspace-trust prompts. In this path, iterations can lose command/write permission and fail or stall in non-interactive runs even though the user only intended to skip trust prompts.
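
One way the suggested change could look, as a hedged sketch rather than the actual code in src/core/agents/cursor.ts: `--trust` is checked on its own, so passing it no longer suppresses the default `--force`. The function names are hypothetical, and the presence of `--force` in the permission check is an assumption, since only part of the original condition is visible in the thread.

```typescript
// Hypothetical names; not taken from src/core/agents/cursor.ts.
function specifiesPermissionMode(userArgs: readonly string[]): boolean {
  // "--trust" is deliberately absent: per the review, it only bypasses
  // workspace-trust prompts and does not grant command/write permission.
  return userArgs.some(
    (arg) =>
      arg === "--force" || // assumed to be part of the real check
      arg === "--yolo" ||
      arg === "--sandbox" ||
      arg.startsWith("--sandbox="),
  );
}

// Returns the default permission flags to add alongside the user's own args.
function buildPermissionArgs(userArgs: readonly string[]): string[] {
  const defaults: string[] = [];
  if (!specifiesPermissionMode(userArgs)) defaults.push("--force");
  if (!userArgs.includes("--trust")) defaults.push("--trust");
  return defaults;
}
```

With this split, a user who overrides only `--trust` still gets the default `--force`, while a user who picks an explicit permission mode such as `--yolo` keeps full control of it.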
