Cursor Agent Support for Ralph

---
branch: ralph-cursor-agent
---
# Spec: Cursor Agent Support for Ralph

## Overview

Ralph's Python codebase already has the agent abstraction plumbed through (--agent flag, per-agent tokens, proxy ports, sandbox names, Dockerfile paths). However, the actual agent CLI invocation and several supporting functions are hardcoded to "claude". This spec adds cursor-agent as a second supported agent, enabling `ralph --agent cursor --issue <N>`.

Key differences from Claude:
- **Auth**: cursor-agent uses `CURSOR_API_KEY` env var (simple API key, not OAuth). No proxy-based credential injection — cursor-agent doesn't support HTTP proxy configuration.
- **CLI flags**: `cursor-agent -p --force --trust --model <model>` (no `--effort`, no `--dangerously-skip-permissions`)
- **Permissions config**: cursor-agent needs `~/.cursor/cli-config.json` with `{"permissions": {"allow": ["*"], "deny": []}}` baked into the Dockerfile
- **Sandbox**: Uses `docker sandbox create shell` with a custom template (no built-in cursor agent in Docker sandbox)
- **Network**: Needs `*.cursor.sh` hosts allowed (api2, api5, etc.) instead of Anthropic hosts
- **Secret delivery**: API key written to a temp file in the sandbox, read into env var and deleted *before* cursor-agent starts

## Architecture

```
ralph --agent cursor --issue 42
  │
  ├─ 1. Token: store_token/ensure_token
  │   ├─ Claude: run_claude_setup_token() → OAuth token → Keychain
  │   └─ Cursor: prompt for API key → Keychain (service: "cursor-token")
  │
  ├─ 2. Proxy (Claude only)
  │   ├─ Claude: start proxy on port 18080, inject OAuth token into API requests
  │   └─ Cursor: NO proxy — API key injected via secret file (see step 4)
  │
  ├─ 3. Sandbox
  │   ├─ Image: docker/agent-loop/cursor/Dockerfile (FROM sandbox-templates:shell + cursor-agent)
  │   ├─ Project layer: .agent-loop/Dockerfile.sandbox or .agent-loop/dependencies (same as claude)
  │   ├─ Create: docker sandbox create shell -t <tag> --name <name> <workspace>
  │   └─ Network: deny-by-default + allow api2.cursor.sh, api5.cursor.sh, sentry.io
  │
  ├─ 4. Secret file lifecycle (per iteration)
  │   ├─ Write API key to /tmp/.cursor-api-key in sandbox via docker sandbox exec
  │   ├─ Iteration command is a shell wrapper:
  │   │     sh -c 'export CURSOR_API_KEY=$(cat /tmp/.cursor-api-key) &&
  │   │            rm /tmp/.cursor-api-key &&
  │   │            exec cursor-agent -p --force --trust ...'
  │   └─ Key is deleted from disk BEFORE cursor-agent starts (exec replaces shell)
  │
  └─ 5. Iteration
      └─ cursor-agent -p --force --trust --model <model> --output-format text <prompt>
```

## Scope

**In scope:**
- Cursor agent support for Docker sandboxes
- Agent-specific token setup, sandbox creation, network policy, and CLI invocation
- Secret file lifecycle with pre-exec cleanup
- Project-level Dockerfile/dependencies support for cursor (same `.agent-loop/` config as claude)
- Parameterize all hardcoded "claude" references that block cursor support

**Out of scope:**
- Tart sandbox backend (Docker only for now)
- Cursor-specific selftest (selftest remains claude-only; cursor selftest can be added later)
- Proxy support for cursor (cursor-agent doesn't support HTTP proxies)
- MCP configuration inside the sandbox

---

## 1. Agent Abstraction

Each agent needs different behavior for:
- **Token setup**: Claude uses `setup-token` interactive TUI; Cursor prompts for an API key
- **Token validation**: Claude calls `claude -p --model haiku ok`; Cursor calls `cursor-agent -p --force --model auto ok` (or skips validation — API keys are long-lived)
- **Proxy**: Claude needs a proxy; Cursor does not
- **Sandbox creation**: Claude uses `docker sandbox create claude`; Cursor uses `docker sandbox create shell`
- **Network policy**: Different allowed hosts per agent
- **CLI invocation**: Different commands and flags
- **Secret delivery**: Claude uses proxy phantom token; Cursor uses secret file with read-and-delete-before-exec pattern

The implementation should use a per-agent configuration dict or similar structure to centralize these differences, rather than scattering if/else blocks.

## 2. CLI Flags

`cursor-agent` headless invocation:
```
cursor-agent -p --force --trust --model <model> --output-format text "<prompt>"
```

- `-p` / `--print` — headless mode (like Claude's `-p`)
- `--force` — allow file writes (like Claude's `--dangerously-skip-permissions`)
- `--trust` — trust the workspace (skip workspace trust prompt)
- `--model <model>` — model selection (default: `auto`)
- `--output-format text` — human-readable output

Additionally, `~/.cursor/cli-config.json` must exist with permissive tool permissions:
```json
{
  "permissions": {
    "allow": ["*"],
    "deny": []
  }
}
```
This is baked into the Cursor Dockerfile.

No equivalent of Claude's `--effort high`.
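
To make the flag list above concrete, here is a minimal sketch of assembling the headless argv as a Python list. The function name `build_cursor_argv` is hypothetical and not part of Ralph; the flags are the ones listed in this section.

```python
from typing import List


def build_cursor_argv(prompt: str, model: str = "auto") -> List[str]:
    """Return the argv list for one headless cursor-agent run (illustrative helper)."""
    return [
        "cursor-agent",
        "-p",                        # headless / print mode
        "--force",                   # allow file writes
        "--trust",                   # skip the workspace trust prompt
        "--model", model,            # defaults to "auto" per this spec
        "--output-format", "text",   # human-readable output
        prompt,                      # one argv element, so no shell quoting is needed here
    ]
```

Passing the prompt as a single list element only works when the command is executed directly; the `sh -c` wrapper described in Step 4 needs explicit shell quoting instead.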

## 3. Network Policy

Per-agent allowed hosts:
- **Claude**: `api.anthropic.com`, `statsig.anthropic.com`, `sentry.io`
- **Cursor**: `api2.cursor.sh`, `api5.cursor.sh`, `sentry.io`

Note: Cursor may use additional subdomains (api3, api4, gcpp). Verify during implementation by running cursor-agent with network logging and add any missing hosts.
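
The verification step in the note can be sketched as a small comparison between hosts observed in network logs and the per-agent allow-list. The `ALLOWED_HOSTS` table mirrors the lists above; `missing_hosts` is a hypothetical helper, and how the observed hosts are collected is left open.

```python
from typing import Dict, Iterable, List

# Per-agent allow-lists as given in this section.
ALLOWED_HOSTS: Dict[str, List[str]] = {
    "claude": ["api.anthropic.com", "statsig.anthropic.com", "sentry.io"],
    "cursor": ["api2.cursor.sh", "api5.cursor.sh", "sentry.io"],
}


def missing_hosts(agent: str, observed: Iterable[str]) -> List[str]:
    """Return observed hosts that the agent's allow-list does not cover yet."""
    allowed = set(ALLOWED_HOSTS[agent])
    return sorted({host.lower() for host in observed} - allowed)
```

For example, `missing_hosts("cursor", ["api2.cursor.sh", "api3.cursor.sh"])` would report `api3.cursor.sh` as a candidate to add.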

## 4. Project-Level Image Layers

The existing project image system (`.agent-loop/Dockerfile.sandbox` and `.agent-loop/dependencies`) already uses `ARG BASE_IMAGE` / `FROM ${BASE_IMAGE}`, so it layers generically on top of whichever agent's base image is in use. This must work for cursor the same way it works for claude — the cursor base image tag is passed as `BASE_IMAGE` when building the project layer.

No code changes should be needed for this, but the parameterized `sandbox_agent` subcommand (step 3) must be used consistently through `ensure_sandbox` → `ensure_project_image` → `_docker_sandbox_create`.
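
As a rough illustration of the layering, the project build can be expressed as a `docker build` argv with the agent's base image passed through `BASE_IMAGE`. The function name and the tag values in the usage note are hypothetical; the real `ensure_project_image` may assemble the command differently.

```python
from typing import List


def project_build_argv(
    base_image: str,
    project_tag: str,
    dockerfile: str = ".agent-loop/Dockerfile.sandbox",
    context: str = ".",
) -> List[str]:
    """Build the argv for layering a project image on top of an agent base image."""
    return [
        "docker", "build",
        "--build-arg", f"BASE_IMAGE={base_image}",  # consumed by ARG BASE_IMAGE / FROM ${BASE_IMAGE}
        "-f", dockerfile,
        "-t", project_tag,
        context,
    ]
```

Calling `project_build_argv("ralph-cursor:base", "ralph-cursor:project")` yields the same command shape as for claude, with only the base image tag differing.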

---

## Implementation Plan

### Step 1: Add agent configuration registry [done]

**Files:**
- `tools/ralph/src/ralph/agents.py` — New file: agent configuration registry

**Implement:**
1. Create a module with per-agent configuration dicts containing:
   - `cli_command`: the CLI binary name (`"claude"` or `"cursor-agent"`)
   - `sandbox_agent`: the `docker sandbox create` subcommand (`"claude"` or `"shell"`)
   - `cli_flags`: function that returns agent-specific flags for iteration (e.g., `["--dangerously-skip-permissions", "--effort", "high"]` for claude, `["--force", "--trust", "--output-format", "text"]` for cursor)
   - `allowed_hosts`: list of hosts for network policy
   - `default_model`: default model name (`"sonnet"` for claude, `"auto"` for cursor)
   - `uses_proxy`: boolean (`True` for claude, `False` for cursor)
   - `env_var_name`: the token env var name (`"CLAUDE_CODE_OAUTH_TOKEN"` for claude, `"CURSOR_API_KEY"` for cursor)
2. Provide a `get_agent(name)` lookup function that raises a clear error for unknown agents
3. Keep the valid agent names list in one place for CLI validation

**Acceptance:**
- `get_agent("claude")` returns claude config
- `get_agent("cursor")` returns cursor config
- `get_agent("unknown")` raises `ValueError`
- `pytest tests/test_agents.py -v` passes
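
A minimal sketch of what such a registry could look like, using the field names and values listed above. The layout (a module-level `AGENTS` dict, a `VALID_AGENTS` tuple, and the exact error wording) is an assumption for illustration and not necessarily what `agents.py` contains.

```python
from typing import Any, Dict, List


def _claude_flags() -> List[str]:
    return ["--dangerously-skip-permissions", "--effort", "high"]


def _cursor_flags() -> List[str]:
    return ["--force", "--trust", "--output-format", "text"]


AGENTS: Dict[str, Dict[str, Any]] = {
    "claude": {
        "cli_command": "claude",
        "sandbox_agent": "claude",
        "cli_flags": _claude_flags,
        "allowed_hosts": ["api.anthropic.com", "statsig.anthropic.com", "sentry.io"],
        "default_model": "sonnet",
        "uses_proxy": True,
        "env_var_name": "CLAUDE_CODE_OAUTH_TOKEN",
    },
    "cursor": {
        "cli_command": "cursor-agent",
        "sandbox_agent": "shell",
        "cli_flags": _cursor_flags,
        "allowed_hosts": ["api2.cursor.sh", "api5.cursor.sh", "sentry.io"],
        "default_model": "auto",
        "uses_proxy": False,
        "env_var_name": "CURSOR_API_KEY",
    },
}

# Single source of truth for CLI validation of --agent.
VALID_AGENTS = tuple(AGENTS)


def get_agent(name: str) -> Dict[str, Any]:
    """Look up an agent config, raising ValueError for unknown names."""
    try:
        return AGENTS[name]
    except KeyError:
        valid = ", ".join(VALID_AGENTS)
        raise ValueError(f"ralph: unknown agent {name!r} (valid: {valid})") from None
```

Keeping the flags behind a function rather than a plain list leaves room for flags that depend on runtime options later, at the cost of one extra call at the use site.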

### Step 2: Create Cursor Dockerfile and sandbox template [done]

**Files:**
- `docker/agent-loop/cursor/Dockerfile` — New Cursor sandbox image

**Implement:**
1. Create `docker/agent-loop/cursor/Dockerfile`:
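   ```dockerfile
   # Base image: Docker's generic shell sandbox template
   FROM docker/sandbox-templates:shell
   USER root
   # Install cursor-agent with the upstream installer script
   RUN curl https://cursor.com/install -fsS | bash
   # Build tools and utilities; clear the apt lists to keep the layer small
   RUN apt-get update && apt-get install -y --no-install-recommends \
       build-essential jq openssh-client fd-find \
       && rm -rf /var/lib/apt/lists/*
   # Drop back to the unprivileged sandbox user
   USER agent
   # Grant cursor-agent full tool permissions (see section 2, CLI Flags)
   RUN mkdir -p ~/.cursor && \
       echo '{"permissions":{"allow":["*"],"deny":[]}}' > ~/.cursor/cli-config.json
   ```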
2. Verify the cursor installer works in the Docker build context and places `cursor-agent` on PATH
3. If the installer requires a different approach in Docker (e.g., direct binary download), adapt accordingly
4. The `cli-config.json` gives cursor-agent full tool permissions (equivalent to Claude's `--dangerously-skip-permissions`)

**Acceptance:**
- `docker build -t test-cursor docker/agent-loop/cursor/` succeeds
- `docker run --rm test-cursor which cursor-agent` finds the binary
- `docker run --rm test-cursor cursor-agent --version` returns a version
- `docker run --rm test-cursor sh -c 'cat ~/.cursor/cli-config.json'` shows the permissions config (the `sh -c` wrapper makes `~` expand inside the container rather than in the host shell)

### Step 3: Parameterize sandbox creation to support shell agent [done]

**Files:**
- `tools/ralph/src/ralph/sandbox/docker.py` — Update `_docker_sandbox_create` and `apply_network_policy`

**Implement:**
1. Update `_docker_sandbox_create` to accept an `agent` parameter and use the agent config's `sandbox_agent` value instead of hardcoded `"claude"`
2. Update `apply_network_policy` to accept an `agent` parameter and use the agent config's `allowed_hosts` instead of hardcoded Anthropic hosts
3. Thread the `agent` parameter through `ensure_sandbox` → `_docker_sandbox_create` and `apply_network_policy`
4. Verify that `ensure_project_image` continues to work — it already passes the base image tag generically via `ARG BASE_IMAGE`, so no changes should be needed there, but confirm the full path works: cursor base image → project layer → `docker sandbox create shell -t <project-tag>`

**Acceptance:**
- Claude sandboxes still created with `docker sandbox create claude`
- Cursor sandboxes created with `docker sandbox create shell`
- Network policy uses agent-specific allowed hosts
- Project-level image layers work for both agents (`.agent-loop/Dockerfile.sandbox` and `.agent-loop/dependencies`)
- `pytest tests/test_sandbox_docker.py -v` passes
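
The parameterized create call can be sketched as follows. The command shape follows the Architecture section (`docker sandbox create shell -t <tag> --name <name> <workspace>`); using the same flag layout for claude is an assumption, and `sandbox_create_argv` is a hypothetical stand-in for `_docker_sandbox_create`.

```python
from typing import Dict, List

# sandbox_agent values from the agent registry (Step 1).
SANDBOX_AGENT: Dict[str, str] = {"claude": "claude", "cursor": "shell"}


def sandbox_create_argv(agent: str, image_tag: str, name: str, workspace: str) -> List[str]:
    """Build the `docker sandbox create` argv for the given agent."""
    return [
        "docker", "sandbox", "create",
        SANDBOX_AGENT[agent],   # "claude" or "shell", never hardcoded at the call site
        "-t", image_tag,        # base image tag, or the project-layer tag when one exists
        "--name", name,
        workspace,
    ]
```

With this shape, the only per-agent difference in sandbox creation is the subcommand and the image tag, which is what lets the project-layer path stay generic.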

### Step 4: Parameterize agent CLI invocation in run_iteration [done]

**Files:**
- `tools/ralph/src/ralph/sandbox/docker.py` — Update `run_iteration`
- `tools/ralph/src/ralph/sandbox/__init__.py` — Update `SandboxBackend.run_iteration` signature
- `tools/ralph/src/ralph/loop.py` — Update env_vars and run_iteration call

**Implement:**
1. Add `agent` parameter to `run_iteration` in the `SandboxBackend` base class and `DockerSandbox`
2. For **claude**: use existing direct exec pattern — `docker sandbox exec ... claude -p <prompt> --model <model> --dangerously-skip-permissions --effort high`
3. For **cursor**: implement the secret file lifecycle:
   a. Write API key to `/tmp/.cursor-api-key` inside sandbox via `docker sandbox exec -i ... tee`
   b. Build a shell wrapper command: `sh -c 'export CURSOR_API_KEY=$(cat /tmp/.cursor-api-key) && rm /tmp/.cursor-api-key && exec cursor-agent -p --force --trust --model <model> --output-format text "<prompt>"'`
   c. The `exec` replaces the shell process, so the key exists only in the env of the cursor-agent process (not as a file on disk)
4. In `loop.py`, make env_vars agent-aware:
   - Claude: `CLAUDE_CODE_OAUTH_TOKEN=phantom`, `ANTHROPIC_BASE_URL=...`, `ANTHROPIC_CUSTOM_MODEL_OPTION=...` (existing)
   - Cursor: no env vars passed via `-e` flags — the API key is injected via the secret file + shell wrapper pattern above
5. Pass the API key string (from Keychain) through to `run_iteration` so it can write the secret file

**Acceptance:**
- Claude iterations still call `claude -p ... --dangerously-skip-permissions --effort high`
- Cursor iterations: secret file written, read into env var, deleted, then cursor-agent execs
- The API key file does not exist on disk while cursor-agent is running
- `pytest tests/test_sandbox_docker.py -v` passes
- `pytest tests/test_loop.py -v` passes

**Implementation notes:**
- Secret file path is `/tmp/.agent-api-key` (generic, not cursor-specific); this replaces the `/tmp/.cursor-api-key` path named in the plan above
- Uses `shlex.quote` for shell quoting in the `sh -c` wrapper (consistent with tart.py)
- All values interpolated into `sh -c` string (prompt, model, flags) are shell-quoted for defense in depth
- TartSandbox.run_iteration signature updated to accept `agent`/`api_key` kwargs for compatibility
- Renamed `test_agent_codex_uses_correct_names` → `test_agent_cursor_uses_correct_names` since "codex" is not a valid agent in the registry
- `read_token_from_keychain` imported in loop.py for non-proxy agents to retrieve the raw API key
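
The read-and-delete-before-exec wrapper can be sketched like this, with every interpolated value passed through `shlex.quote` as the notes describe. `build_wrapper` and its parameters are hypothetical names; the environment variable name is interpolated unquoted on the assumption that it comes from the agent registry and never from user input.

```python
import shlex
from typing import List

SECRET_PATH = "/tmp/.agent-api-key"


def build_wrapper(
    cli_command: str,
    env_var: str,
    args: List[str],
    secret_path: str = SECRET_PATH,
) -> str:
    """Return the script passed to `sh -c` for one iteration.

    The script reads the key into an env var, deletes the file, then execs
    the agent so the shell process is replaced by the agent process.
    """
    path = shlex.quote(secret_path)
    agent_cmd = " ".join(shlex.quote(part) for part in [cli_command, *args])
    return (
        f"export {env_var}=$(cat {path}) && "
        f"rm {path} && "
        f"exec {agent_cmd}"
    )


def build_exec_argv(wrapper: str) -> List[str]:
    """argv to run inside the sandbox; the whole wrapper is one argument."""
    return ["sh", "-c", wrapper]
```

Because each step is chained with `&&`, a missing or unreadable secret file stops the wrapper before the agent starts. Quoting the prompt matters most: prompts routinely contain quotes, semicolons, and newlines that would otherwise be interpreted by the shell.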

### Step 5: Add cursor-specific token management [done]

**Files:**
- `tools/ralph/src/ralph/token.py` — Add cursor token setup alongside claude

**Implement:**
1. Add `prompt_for_api_key()` function that:
   - Prompts user: "Enter your Cursor API key (from cursor.com/dashboard → Integrations → User API Keys):"
   - Reads the key from stdin
   - Stores in Keychain under `cursor-token` service as JSON: `{"accessToken": "<key>", "expiresAt": <far-future>}`
2. Make `store_token` dispatch based on agent:
   - `claude` → existing `run_claude_setup_token()` flow
   - `cursor` → `prompt_for_api_key()` flow
3. Make `ensure_token` dispatch based on agent:
   - `claude` → existing flow (run setup-token if missing)
   - `cursor` → prompt for API key if missing
4. Make `_parse_and_store_token` agent-aware:
   - Claude: validate via `claude -p --model haiku ok` with `CLAUDE_CODE_OAUTH_TOKEN`
   - Cursor: skip validation (API keys are long-lived and can't be validated without a full agent run) OR validate via `cursor-agent -p --force --model auto "ok"` with `CURSOR_API_KEY` if cursor-agent is installed on host
5. Update error messages to be agent-specific instead of hardcoding "claude setup-token"

**Acceptance:**
- `ralph store-token --agent claude` still runs `claude setup-token`
- `ralph store-token --agent cursor` prompts for an API key
- `ralph check-token --agent cursor` reports status from Keychain
- `pytest tests/test_token.py -v` passes

**Implementation notes:**
- Dispatch uses `agent_config["uses_proxy"]` rather than agent name comparison — claude (proxy) validates tokens, cursor (non-proxy) skips validation
- `prompt_for_api_key` uses `input()` (not `getpass`) for consistency with how `claude setup-token` echoes output
- Error messages changed from "running claude setup-token..." to "requesting new token..." (generic)
- Existing test using `"codex"` agent updated to `"cursor"` since `store_token` now calls `get_agent()` which validates agent names
- `_parse_and_store_token` validation uses `agent_config["cli_command"]` and `agent_config["env_var_name"]` for the validation subprocess call, not hardcoded "claude"
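
A minimal sketch of building the Keychain payload for a Cursor API key. The JSON shape comes from the step above; the helper name, the 100-year horizon, and the choice of milliseconds since the epoch for `expiresAt` are assumptions, since the spec only says "far-future".

```python
import json
import time
from typing import Optional

# Assumption: "far future" is modelled as roughly 100 years from now.
FAR_FUTURE_SECONDS = 100 * 365 * 24 * 60 * 60


def build_keychain_payload(api_key: str, now: Optional[float] = None) -> str:
    """Return the JSON string to store under the `cursor-token` service."""
    key = api_key.strip()  # drop the trailing newline left by stdin input
    if not key:
        raise ValueError("ralph: empty API key")
    if now is None:
        now = time.time()
    # Assumption: expiresAt is expressed in milliseconds since the epoch.
    expires_at = int((now + FAR_FUTURE_SECONDS) * 1000)
    return json.dumps({"accessToken": key, "expiresAt": expires_at})
```

Reusing the `accessToken` / `expiresAt` shape lets the existing Keychain read path treat both agents uniformly, with the far-future expiry standing in for "does not expire".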

### Step 6: Make proxy conditional (skip for cursor) [done]

**Files:**
- `tools/ralph/src/ralph/cli.py` — Conditionally start proxy
- `tools/ralph/src/ralph/loop.py` — Conditionally use proxy env vars

**Implement:**
1. In `cli.py` main flow, check `agent_config["uses_proxy"]`:
   - If True (claude): ensure_proxy, start_proxy_keepalive as before
   - If False (cursor): skip proxy entirely, set proxy_port to None
2. In `loop.py` `process_issue`, build env_vars based on agent:
   - Claude: phantom token + proxy base URL + custom model option (existing)
   - Cursor: no proxy env vars — API key delivery handled in `run_iteration` (Step 4)
3. In `loop.py`, skip proxy health check / restart for non-proxy agents
4. Pass the raw API key (from `ensure_token`) through to `process_issue` → `run_iteration` for cursor's secret file injection

**Acceptance:**
- `ralph --agent claude --issue N` starts proxy as before
- `ralph --agent cursor --issue N` skips proxy, delivers API key via secret file in run_iteration
- `pytest tests/test_cli.py -v` passes
- `pytest tests/test_loop.py -v` passes

**Implementation notes:**
- Only `cli.py` needed changes — `loop.py` already had agent-conditional env_vars and proxy health check logic from Step 4
- `proxy_port=None` for non-proxy agents is safe: all downstream uses in `loop.py` are gated behind `agent_config["uses_proxy"]`
- Updated `test_agent_flag_passed_through` from "codex" to "cursor" since `get_agent()` now validates agent names
- Added `test_cursor_skips_proxy` and `test_claude_starts_proxy` for explicit proxy conditional coverage
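
The conditional can be sketched as a small function that takes the proxy starter as a callable, which also makes it easy to test without a real proxy. `resolve_proxy_port` and the `start_proxy` parameter are hypothetical; in Ralph the equivalent logic sits inline in `cli.py`.

```python
from typing import Any, Callable, Dict, Optional


def resolve_proxy_port(
    agent_config: Dict[str, Any],
    start_proxy: Callable[[], int],
) -> Optional[int]:
    """Start the proxy only for agents that use one; return None otherwise."""
    if agent_config["uses_proxy"]:
        return start_proxy()
    return None  # downstream code must gate every use behind uses_proxy
```

A test in the spirit of `test_cursor_skips_proxy` would pass a callable that records its invocations and assert that it was never called for the cursor config.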

### Step 7: Update default model handling [done]

**Files:**
- `tools/ralph/src/ralph/cli.py` — Use agent config for default model

**Implement:**
1. After agent is determined, set default model from `agent_config["default_model"]` instead of hardcoded `"sonnet"`
2. Only override if user didn't specify `--model` explicitly

**Acceptance:**
- `ralph --agent claude` defaults to model `sonnet`
- `ralph --agent cursor` defaults to model `auto`
- `ralph --agent cursor --model gpt-5` uses `gpt-5`
- `pytest tests/test_cli.py -v` passes

**Implementation notes:**
- `model` initialized to `None` instead of `"sonnet"`, then resolved from `agent_config["default_model"]` after arg parsing completes
- Moved `get_agent(agent)` call earlier (before validation) so `agent_config` is available for model resolution — also catches unknown agent names sooner
- Removed duplicate `get_agent(agent)` call that was further down in the function
- Updated usage text from "Claude model (default: sonnet)" to "Model name (default: per-agent, e.g. sonnet for claude)"
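
The resolution rule described in these notes amounts to a few lines. `resolve_model` is a hypothetical name; the sketch assumes `--model` parses to `None` when the flag is absent.

```python
from typing import Any, Dict, Optional


def resolve_model(cli_model: Optional[str], agent_config: Dict[str, Any]) -> str:
    """Prefer an explicit --model value; otherwise use the agent's default."""
    if cli_model is not None:
        return cli_model
    return agent_config["default_model"]
```

Using `None` as the "not specified" sentinel, rather than comparing against `"sonnet"`, keeps an explicit `--model sonnet` distinguishable from the default.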

### Step 8: Run all checks [done]

**Acceptance:**
- `pytest tests/ -v` — all tests pass
- `python3 -m py_compile tools/ralph/src/ralph/agents.py` — no syntax errors
- `python3 -m py_compile tools/ralph/src/ralph/token.py` — no syntax errors
- `python3 -m py_compile tools/ralph/src/ralph/sandbox/docker.py` — no syntax errors
- `python3 -m py_compile tools/ralph/src/ralph/loop.py` — no syntax errors
- `python3 -m py_compile tools/ralph/src/ralph/cli.py` — no syntax errors
- No regressions in claude agent behavior

**Implementation notes:**
- All 5 py_compile checks pass clean
- 415 tests pass, 6 skipped (integration tests requiring Docker)
- Fixed 7 pre-existing test failures:
  - `test_contains_step_structure`: updated assertion to match current ITERATION_PROMPT wording ("For each task, follow this workflow")
  - 6 tart sandbox tests: removed `"--"` separator from expected `tart exec` commands to match tart 2.x syntax change (commit cd425ae), and adjusted command index offsets accordingly

---

## Conventions

- **Language:** Python 3 (stdlib only, no third-party dependencies)
- **Tests:** pytest with monkeypatching for subprocess calls
- **Error messages:** Prefix with `ralph:`
- **Exit codes:** 0=success, 1=runtime error, 2=usage error
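
The error-message and exit-code conventions could be captured in one small helper, sketched here with stdlib only. The constant and function names are hypothetical and not taken from Ralph's code.

```python
import sys
from typing import NoReturn

EXIT_OK = 0
EXIT_RUNTIME = 1
EXIT_USAGE = 2


def format_error(message: str) -> str:
    """Apply the `ralph:` prefix used for all error messages."""
    return f"ralph: {message}"


def die(message: str, code: int = EXIT_RUNTIME) -> NoReturn:
    """Print a prefixed error to stderr and exit with the given code."""
    print(format_error(message), file=sys.stderr)
    raise SystemExit(code)
```

An unknown `--agent` value would map to `die("unknown agent 'codex'", EXIT_USAGE)`, while a failed sandbox command would use the default runtime code.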

