branch: ralph-pi-agent
Spec: Add pi coding agent to Ralph
Overview
Add pi as a new agent type in Ralph's agent loop. Pi is a third-party AI coding agent (@mariozechner/pi-coding-agent) that supports multiple model providers. The initial implementation uses the claude-sdk provider (subscription-backed via the Claude Agent SDK), with the architecture designed so future providers (OpenAI, Cursor API, etc.) can be added without restructuring.
The pi provider is configured per-project in .agent-loop/config.json. The provider selection determines the Docker base image, sandbox type, auth mechanism, and network policy — all driven by a providers dict in the agent config.
For the claude-sdk provider: pi runs in a Docker sandbox based on docker/sandbox-templates:claude-code (which provides Claude Code + Node.js). The provider spawns a Claude Code subprocess via the Agent SDK; env vars (CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_BASE_URL) propagate through the process tree so the existing credential proxy handles auth unchanged.
Future providers (e.g., OpenAI) would use a lightweight node:22-slim base, "shell" sandbox type, and direct API key injection — no proxy needed.
Architecture
ralph --agent pi --issue 42
│
├─ cli.py: ensure_token() + ensure_proxy() (based on resolved provider config)
│
├─ loop.py: process_issue()
│ ├─ load_runtime_config() → reads .agent-loop/config.json
│ │ including pi.provider (default: "claude-sdk")
│ ├─ resolve_agent_config() → merges provider-specific config into agent config
│ ├─ ensure_sandbox() → builds pi Docker image (base image varies by provider)
│ └─ run_iteration() → docker sandbox exec ... pi -p "<prompt>" ...
│
└─ Docker sandbox:
pi -p "<ITERATION_PROMPT>" --model claude-sonnet-4-6 --no-skills --no-session
└─ claude-sdk-provider extension
└─ sdk.query() → Claude Code subprocess
└─ API calls → proxy (host.docker.internal:18080) → api.anthropic.com
Docker build context assembly
The pi Docker image needs source from two locations: the Dockerfile/settings from docker/agent-loop/pi/ and extensions from pi/packages/. A prepare_build_context(agent) method on DockerImageMixin assembles a temp directory containing both before running docker build. All directories in pi/packages/ are auto-discovered — adding a new extension requires no wiring.
temp-context/
Dockerfile ← from docker/agent-loop/pi/
settings.json ← from docker/agent-loop/pi/
packages/
claude-sdk-provider/ ← from pi/packages/claude-sdk-provider/ (sans node_modules, dist)
future-provider/ ← auto-discovered from pi/packages/
Provider-aware agent config
The AGENTS["pi"] entry uses a providers dict keyed by provider name. Each provider defines its own uses_proxy, env_var_name, allowed_hosts, sandbox_agent, base_image, and auth config. A helper resolves the effective config by merging the selected provider's settings into the base agent config.
AGENTS["pi"] = {
"cli_command": "pi",
"cli_flags": _pi_cli_flags,
"default_model": "claude-sonnet-4-6",
"default_provider": "claude-sdk",
"providers": {
"claude-sdk": {
"sandbox_agent": "claude",
"base_image": "docker/sandbox-templates:claude-code",
"uses_proxy": True,
"allowed_hosts": ["api.anthropic.com", "statsig.anthropic.com", "sentry.io"],
"env_var_name": "CLAUDE_CODE_OAUTH_TOKEN",
"default_auth_mode": "oauth",
"auth_modes": { ... }, # same as claude agent
},
# Future example:
# "openai": {
# "sandbox_agent": "shell",
# "base_image": "node:22-slim",
# "uses_proxy": False,
# "allowed_hosts": ["api.openai.com"],
# "env_var_name": "OPENAI_API_KEY",
# },
},
}
Dockerfile with parameterized base image
The Dockerfile uses ARG BASE_IMAGE so the build process can pass the provider-appropriate base:
ARG BASE_IMAGE
FROM ${BASE_IMAGE}
# ... install pi, extensions, settings
The build_image method passes --build-arg BASE_IMAGE=<provider.base_image> when building.
Per-project config
.agent-loop/config.json gains an optional pi section:
{
"type": "docker-sandbox",
"pi": {
"provider": "claude-sdk",
"model": "claude-sonnet-4-6"
}
}
Both fields are optional and fall back to the AGENTS dict defaults.
1. Build Context Assembly
DockerImageMixin gains a prepare_build_context(agent) method:
- For agents without extensions: returns the existing
build_context(agent) path unchanged.
- For
pi: creates a temp directory, copies docker/agent-loop/pi/* into it, then copies each subdirectory of pi/packages/ (excluding node_modules/, dist/, .git/) into a packages/ subdirectory.
- Returns the assembled path.
The content_hash computation must also incorporate the extension source so cache invalidation works when extensions change. Hash the filtered contents of pi/packages/ alongside the Dockerfile and base digest.
build_image calls prepare_build_context and builds from the returned path. If a temp directory was created, it is cleaned up after the build.
2. Pi Docker Image
The Dockerfile uses ARG BASE_IMAGE so the build process controls the base per provider. For claude-sdk, the base is docker/sandbox-templates:claude-code (Claude Code + Node.js). Future providers would pass node:22-slim.
The image installs pi globally, builds all extensions from packages/, symlinks them into pi's extension discovery path, and copies the settings file.
3. Provider-Aware Agent Config
The pi agent introduces a provider abstraction. Each provider in the providers dict defines sandbox_agent, base_image, uses_proxy, allowed_hosts, env_var_name, and optionally auth_modes. A resolve_agent_config() function merges the selected provider's config into a flat dict compatible with the rest of ralph's agent dispatch.
4. Per-Project Config
load_runtime_config() parses the optional pi section from .agent-loop/config.json and validates the provider against known providers.
5. Constraints
- No new dependencies: ralph remains stdlib-only Python.
- No changes to existing agents: claude and cursor behavior must be unchanged.
- Proxy always starts for pi in this iteration: since only claude-sdk is implemented, the proxy is always needed. Future providers that skip the proxy will require refactoring token/proxy bootstrap order in cli.py — out of scope.
- Extension auto-discovery: adding a new pi extension package to
pi/packages/ must not require any code changes in ralph — only a Docker image rebuild.
Implementation Plan
Step 1: Build context assembly in DockerImageMixin [done]
Notes:
prepare_build_context(agent) returns a tuple (path, cleanup) where cleanup is either a tempfile.TemporaryDirectory instance (whose .cleanup() should be invoked when the build is done) or None for the non-pi case. This keeps the path-returning contract while letting build_image clean up the temp dir afterward.
- Added
_EXTENSION_EXCLUDES = ("node_modules", "dist", ".git") and extensions_dir(agent) helper. extensions_dir returns the path to pi/packages/ for the pi agent and None for all other agents (backwards compatible — content_hash for non-pi agents is identical to before).
_hash_extensions_dir walks the dir streaming file contents into a sha256 (with relative paths and NUL separators, sorted entries) for stable, deterministic hashing.
content_hash gained an optional extensions_hash="" argument; empty string yields the legacy two-argument result so existing tests still pass.
build_image gained an optional base_image=None argument; when set, --build-arg BASE_IMAGE=<base_image> is passed through. Other existing call sites pass nothing and behave unchanged.
- Pytest is unavailable in the iteration sandbox (pypi downloads blocked). The added pytest tests were exercised via stdlib
unittest equivalents (13/13 passed) that drive the same code paths.
Files:
tools/ralph/src/ralph/runtime/__init__.py — add prepare_build_context(), update hashing, update build_image
Implement:
- Add
prepare_build_context(agent) to DockerImageMixin. For agent == "pi", create a tempfile.TemporaryDirectory, copy docker/agent-loop/pi/* into it, then copy each subdir of pi/packages/ (excluding node_modules, dist, .git) into packages/<name>/. Return the temp dir path. For other agents, return self.build_context(agent).
- Add
_hash_extensions_dir(packages_dir) that walks pi/packages/ (respecting the same exclusions) and produces a stable content hash of all file contents.
- Update
compute_tag to incorporate the extensions hash for the pi agent, so image tags change when extension source changes.
- Update
build_image to accept an optional base_image arg, call prepare_build_context, pass --build-arg BASE_IMAGE=<base_image> when provided, build from the returned path, and clean up the temp dir afterward.
Acceptance:
prepare_build_context("claude") returns the existing path unchanged.
prepare_build_context("pi") returns a temp dir containing Dockerfile, settings.json, and packages/claude-sdk-provider/ (without node_modules or dist).
compute_tag("pi") produces different tags when extension source changes.
- Existing claude/cursor image builds are unaffected.
pytest tools/ralph/tests/ -v -k build_context passes.
Step 2: Pi Docker image and settings [done]
Notes:
- The build step uses plain
npm install (not npm install --production) because the claude-sdk-provider extension's build runs tsc, and typescript lives in devDependencies. --production would skip dev deps and break the build. The extra dev deps remain in the image (no prune step) — this is consistent with how the host installs the extension via roles/pi/install.
- Extension build loop runs in a single
RUN with set -eu so any failure aborts the build cleanly.
- Symlink uses
${pkg%/} to strip the trailing slash from the for-loop expansion before passing to ln -sfn so the link target is a directory path (not a path ending in /).
- Single
chown -R agent:agent /home/agent/.pi at the end fixes ownership of everything created as root: the extensions/ dir, the symlinks, and the COPY'd settings.json.
- Acceptance test for "actual
docker build succeeds" cannot be run in the iteration sandbox (image registry pulls are network-blocked). The Dockerfile passes docker buildx build --check for syntax — only warning is InvalidDefaultArgInFrom, which matches the existing generate_project_dockerfile() pattern in the codebase.
- Added
TestPiDockerfileShipped test class covering: Dockerfile uses ARG BASE_IMAGE/FROM ${BASE_IMAGE}, installs pi + extensions, installs all required apt deps, and settings.json parses with the four expected default fields.
Files:
docker/agent-loop/pi/Dockerfile — new
docker/agent-loop/pi/settings.json — new
tools/ralph/tests/test_runtime_docker_sandbox.py — new TestPiDockerfileShipped class
Implement:
- Create
docker/agent-loop/pi/Dockerfile:
ARG BASE_IMAGE / FROM ${BASE_IMAGE}
- Install system deps as root:
build-essential jq openssh-client fd-find
- Install pi:
npm install -g @mariozechner/pi-coding-agent
COPY packages/ /opt/pi-extensions/
- Build each extension: loop over
/opt/pi-extensions/*/, run npm install && npm run build (deviation from spec — see Notes)
- Symlink extensions into
~/.pi/agent/extensions/
COPY settings.json /home/agent/.pi/agent/settings.json
chown -R agent:agent /home/agent/.pi
- Switch to
USER agent
- Create
docker/agent-loop/pi/settings.json with defaultProvider: "claude-sdk", defaultModel: "claude-sonnet-4-6", defaultThinkingLevel: "high", hideThinkingBlock: true.
Acceptance:
docker build --build-arg BASE_IMAGE=docker/sandbox-templates:claude-code succeeds with a valid assembled build context.
- The image contains
pi on PATH, the claude-sdk-provider extension built and symlinked, and settings.json at /home/agent/.pi/agent/settings.json.
Step 3: Pi agent config with provider abstraction [done]
Notes:
resolve_agent_config(name, provider=None) returns the base config dict unchanged (same identity) for agents without a providers key — keeping claude/cursor behavior 100% backwards-compatible.
- For pi, a shallow copy of the base config is built (with
providers stripped out) and the selected provider's keys are merged in. The selected provider name is also stored as provider on the result so downstream code can branch on it.
get_auth_mode was updated to delegate to resolve_agent_config instead of get_agent. This is a no-op for claude/cursor (resolve returns base) but lets pi reach into its claude-sdk provider's auth_modes. No call sites needed to change.
- Pytest is unavailable in the iteration sandbox (pypi blocked), so acceptance assertions were exercised via stdlib equivalents driving the same code paths. The pytest tests added to
test_agents.py follow the file's existing class/style conventions.
Files:
tools/ralph/src/ralph/agents.py — add pi agent entry, resolve_agent_config()
Implement:
- Add
_pi_cli_flags(model) returning ["--no-skills", "--no-session"].
- Add
AGENTS["pi"] with cli_command: "pi", default_model: "claude-sonnet-4-6", default_provider: "claude-sdk", and a providers dict containing the claude-sdk provider config (sandbox_agent, base_image, uses_proxy, allowed_hosts, env_var_name, default_auth_mode, auth_modes).
- Add
resolve_agent_config(name, provider=None):
- Gets base config via
get_agent(name).
- If
providers dict exists, selects provider (fallback to default_provider), validates it exists, returns a shallow copy with provider-specific keys merged in.
- If no
providers dict, returns base config as-is (backwards-compatible for claude/cursor).
- Raises
ValueError for unknown providers.
- Update
VALID_AGENTS to include "pi".
Acceptance:
get_agent("pi") returns the pi config dict with providers key.
resolve_agent_config("pi") returns a flat dict with uses_proxy: True, sandbox_agent: "claude", allowed_hosts from claude-sdk provider, etc.
resolve_agent_config("claude") returns the existing claude config unchanged.
resolve_agent_config("pi", provider="unknown") raises ValueError.
pytest tools/ralph/tests/ -v -k agent passes.
Step 4: Per-project config extension [done]
Notes:
AGENTS is imported lazily inside load_runtime_config() (rather than at module top) to keep the runtime module's import graph free of any agents-module dependency. Functionally identical, but avoids forcing every consumer of runtime to load agents.
- Added a defensive type check: a non-dict
pi value (e.g. {"pi": "claude-sdk"}) raises ValueError rather than crashing later when .get() is called on a string. The error message prefix matches the project convention (ralph: 'pi' section in <path> must be an object).
- Empty
pi: {} is treated as "no pi config" (no pi_provider/pi_model keys surface). This keeps the keys absent rather than None, so downstream callers can use a single cfg.get(...) lookup with their own default.
pi_model is not validated — pi/Claude has many model aliases and the agent itself rejects unknowns. Validating provider only is consistent with the spec.
- pytest is unavailable in the iteration sandbox (pypi blocked); the new pytest tests follow the existing class style in
test_runtime_docker_sandbox.py and were exercised end-to-end via stdlib equivalents that drive the same code paths (9/9 passed).
Files:
tools/ralph/src/ralph/runtime/__init__.py — extend load_runtime_config()
tools/ralph/tests/test_runtime_docker_sandbox.py — added pi-section coverage to TestLoadRuntimeConfig
Implement:
- In
load_runtime_config(), parse the optional pi section from the config dict. Extract pi.provider and pi.model if present.
- Validate that
pi.provider (if present) is a known provider by importing and checking against AGENTS["pi"]["providers"].
- Store
pi_provider and pi_model in the returned config dict.
Acceptance:
- Config without
pi section works unchanged.
- Config with
{"pi": {"provider": "claude-sdk"}} parses correctly, returns pi_provider: "claude-sdk".
- Config with
{"pi": {"provider": "unknown"}} raises ValueError.
pytest tools/ralph/tests/ -v -k runtime_config passes.
Step 5: Loop integration [done]
Notes:
cli.py now uses resolve_agent_config(agent) for the uses_proxy check. For pi this resolves the default provider (claude-sdk), giving uses_proxy=True. The proxy reuses port 18080 via proxy_port_for_agent("pi") (DEFAULT_PROXY_PORT fallback) and writes its PID to /tmp/ralph-proxy-pi.pid. ensure_proxy("pi", ...) will reuse a healthy claude proxy on the same port (mode/version match), so concurrent ralph invocations on the same auth mode share one proxy.
--model precedence is tracked via a new explicit_model boolean: cli.py sets it to True only when the user passed --model. It's forwarded to process_issue / poll_loop, where the order becomes: explicit --model > pi.model from .agent-loop/config.json > agent default. This avoids interpreting "user passed the default value explicitly" as "use pi.model instead", because explicit-ness is tracked separately from the resolved value.
pi_provider / pi_model are popped from the runtime config dict in process_issue before the remaining kwargs are forwarded to create_runtime. Otherwise unknown keys would leak into runtime constructors (the existing kwargs are forwarded with **config).
runtime.ensure_sandbox gained a base_image=None kwarg that is threaded into ensure_image → pull_base_image / compute_tag / build_image. Loop only forwards base_image=... when the resolved agent config has a non-empty base_image field, keeping the existing claude/cursor assert_called_once_with(...) assertions matching exactly.
DockerImageMixin._effective_base_image(agent, base_image) is the new resolution helper. It returns the override when supplied; otherwise it parses the Dockerfile's FROM directive and rejects ARG-style placeholders (e.g. ${BASE_IMAGE}). This way, agents that template their base must supply an override at the call site instead of silently producing a broken docker pull.
compute_tag was kept tolerant of ARG-style FROM with no override — it falls back to an empty base_digest so existing TestComputeTagPiExtensions (which doesn't pass base_image) continues to pass and exercise the extensions-hash logic.
token.py::_resolve_mode_string switched from get_agent to resolve_agent_config, so pi's auth_modes (which live on the claude-sdk provider) surface correctly when the user runs ralph store-token --agent pi.
- pytest is unavailable in the iteration sandbox; the new pytest tests in
test_loop.py::TestProcessIssuePi were exercised end-to-end via stdlib unittest.mock equivalents (7/7 passed) plus targeted runtime checks for _effective_base_image and ensure_image with/without base_image.
Files:
tools/ralph/src/ralph/loop.py — pop pi keys, re-resolve agent config, model precedence, base_image kwarg
tools/ralph/src/ralph/cli.py — resolve_agent_config, track explicit_model, forward to loop
tools/ralph/src/ralph/runtime/__init__.py — _effective_base_image helper, thread base_image through ensure_image / compute_tag / pull_base_image / needs_rebuild
tools/ralph/src/ralph/runtime/docker_sandbox.py — resolve_agent_config; ensure_sandbox accepts base_image
tools/ralph/src/ralph/runtime/container.py — resolve_agent_config; ensure_sandbox accepts base_image
tools/ralph/src/ralph/token.py — _resolve_mode_string uses resolve_agent_config
tools/ralph/tests/test_loop.py — new TestProcessIssuePi class
Implement:
- In
process_issue(), after load_runtime_config(), extract pi_provider and pi_model from the config.
- Call
resolve_agent_config(agent, provider=pi_provider) to get the effective config. Use the resolved config for all downstream calls: proxy env building, ensure_sandbox (pass base_image from resolved config), run_iteration, network policy.
- If
pi_model is set in config, use it as the model (CLI --model takes precedence if explicitly provided).
- In
ensure_sandbox, pass the resolved base_image to build_image so the correct base is used per provider.
- In
cli.py, ensure ensure_token and ensure_proxy use the resolved agent config. Since pi reuses claude's auth_modes, this should work without changes, but verify the code path.
Acceptance:
ralph --agent pi --issue <N> runs pi in a Docker sandbox with the claude-sdk provider.
- Pi receives
CLAUDE_CODE_OAUTH_TOKEN=phantom and ANTHROPIC_BASE_URL=http://host.docker.internal:18080 as env vars.
- The iteration runs:
pi -p "<prompt>" --model claude-sonnet-4-6 --no-skills --no-session.
- Spec file is written to
/tmp/spec.md and read back after iteration.
- Claude and cursor agents are unaffected.
Step 6: Run all checks [done]
Notes:
- Full pytest collection surfaced four regressions that the per-step pytest filters in steps 1–5 (
-k build_context, -k agent, -k runtime_config) had skipped past. All four were fixed in this step:
TestSecretFileLifecycle's parametrize fixture filtered VALID_AGENTS by get_agent(a)["uses_proxy"], which KeyErrors on pi (its uses_proxy lives under the claude-sdk provider). Switched to resolve_agent_config.
TestSandboxEnsureSandbox / TestContainerEnsureSandbox test_force_rebuild_passed_through assertions used assert_called_once_with("claude", force_rebuild=True). The Step 5 implementation note promised the existing claude/cursor call signature would be preserved; in practice both ensure_sandbox methods unconditionally forwarded base_image=None to ensure_image. Updated ensure_sandbox to only forward base_image when it's set, matching the spec's intent and the existing loop.py pattern.
test_rejects_invalid_env_var_name patched ralph.runtime.container.get_agent but Step 5 switched the call site to resolve_agent_config. Updated the patch target.
test_proxy_recovery_passes_auth_mode was failing as of commit a90c82e ("proxy resilience for idle shutdown") — that commit added a per-issue ensure_proxy(...) call before the recovery path, so ensure_proxy is now called twice (once at startup health check, once on recovery) but the test still used assert_called_once_with. Rewrote the assertion to verify both calls forward auth_mode="api_key".
- After fixes: 880 passed, 14 skipped (integration/docker-gated). Pre-existing collection errors in
tests/test_sandbox_prune.py and tests/test_tart_sandbox.py (top-level dotfiles tests, unrelated to ralph/pi) reference a ralph.sandbox.tart module that no longer exists; verified they failed identically on 0bd0801~1 so they are out of scope for this spec.
Acceptance:
pytest tools/ralph/tests/ -v — all tests pass ✓ (880 passed, 14 skipped)
python3 -m py_compile tools/ralph/src/ralph/agents.py — no syntax errors ✓
python3 -m py_compile tools/ralph/src/ralph/runtime/__init__.py — no syntax errors ✓
python3 -m py_compile tools/ralph/src/ralph/loop.py — no syntax errors ✓
python3 -m py_compile tools/ralph/src/ralph/cli.py — no syntax errors ✓
Conventions
- Language: Python 3 (stdlib only, no third-party deps) for ralph; Dockerfile for image
- Tests: pytest, files in
tools/ralph/tests/
- Error messages: Prefix with
ralph:
- Exit codes: 0=success, 1=runtime error, 2=usage error
branch: ralph-pi-agent
Spec: Add pi coding agent to Ralph
Overview
Add
pias a new agent type in Ralph's agent loop. Pi is a third-party AI coding agent (@mariozechner/pi-coding-agent) that supports multiple model providers. The initial implementation uses theclaude-sdkprovider (subscription-backed via the Claude Agent SDK), with the architecture designed so future providers (OpenAI, Cursor API, etc.) can be added without restructuring.The pi provider is configured per-project in
.agent-loop/config.json. The provider selection determines the Docker base image, sandbox type, auth mechanism, and network policy — all driven by aprovidersdict in the agent config.For the
claude-sdkprovider: pi runs in a Docker sandbox based ondocker/sandbox-templates:claude-code(which provides Claude Code + Node.js). The provider spawns a Claude Code subprocess via the Agent SDK; env vars (CLAUDE_CODE_OAUTH_TOKEN,ANTHROPIC_BASE_URL) propagate through the process tree so the existing credential proxy handles auth unchanged.Future providers (e.g., OpenAI) would use a lightweight
node:22-slimbase,"shell"sandbox type, and direct API key injection — no proxy needed.Architecture
Docker build context assembly
The pi Docker image needs source from two locations: the Dockerfile/settings from
docker/agent-loop/pi/and extensions frompi/packages/. Aprepare_build_context(agent)method onDockerImageMixinassembles a temp directory containing both before runningdocker build. All directories inpi/packages/are auto-discovered — adding a new extension requires no wiring.Provider-aware agent config
The
AGENTS["pi"]entry uses aprovidersdict keyed by provider name. Each provider defines its ownuses_proxy,env_var_name,allowed_hosts,sandbox_agent,base_image, and auth config. A helper resolves the effective config by merging the selected provider's settings into the base agent config.Dockerfile with parameterized base image
The Dockerfile uses
ARG BASE_IMAGEso the build process can pass the provider-appropriate base:The
build_imagemethod passes--build-arg BASE_IMAGE=<provider.base_image>when building.Per-project config
.agent-loop/config.jsongains an optionalpisection:{ "type": "docker-sandbox", "pi": { "provider": "claude-sdk", "model": "claude-sonnet-4-6" } }Both fields are optional and fall back to the AGENTS dict defaults.
1. Build Context Assembly
DockerImageMixingains aprepare_build_context(agent)method:build_context(agent)path unchanged.pi: creates a temp directory, copiesdocker/agent-loop/pi/*into it, then copies each subdirectory ofpi/packages/(excludingnode_modules/,dist/,.git/) into apackages/subdirectory.The
content_hashcomputation must also incorporate the extension source so cache invalidation works when extensions change. Hash the filtered contents ofpi/packages/alongside the Dockerfile and base digest.build_imagecallsprepare_build_contextand builds from the returned path. If a temp directory was created, it is cleaned up after the build.2. Pi Docker Image
The Dockerfile uses
ARG BASE_IMAGEso the build process controls the base per provider. Forclaude-sdk, the base isdocker/sandbox-templates:claude-code(Claude Code + Node.js). Future providers would passnode:22-slim.The image installs pi globally, builds all extensions from
packages/, symlinks them into pi's extension discovery path, and copies the settings file.3. Provider-Aware Agent Config
The pi agent introduces a provider abstraction. Each provider in the
providersdict definessandbox_agent,base_image,uses_proxy,allowed_hosts,env_var_name, and optionallyauth_modes. Aresolve_agent_config()function merges the selected provider's config into a flat dict compatible with the rest of ralph's agent dispatch.4. Per-Project Config
load_runtime_config()parses the optionalpisection from.agent-loop/config.jsonand validates the provider against known providers.5. Constraints
pi/packages/must not require any code changes in ralph — only a Docker image rebuild.Implementation Plan
Step 1: Build context assembly in DockerImageMixin [done]
Notes:
prepare_build_context(agent)returns a tuple(path, cleanup)wherecleanupis either atempfile.TemporaryDirectoryinstance (whose.cleanup()should be invoked when the build is done) orNonefor the non-pi case. This keeps the path-returning contract while lettingbuild_imageclean up the temp dir afterward._EXTENSION_EXCLUDES = ("node_modules", "dist", ".git")andextensions_dir(agent)helper.extensions_dirreturns the path topi/packages/for the pi agent andNonefor all other agents (backwards compatible —content_hashfor non-pi agents is identical to before)._hash_extensions_dirwalks the dir streaming file contents into asha256(with relative paths and NUL separators, sorted entries) for stable, deterministic hashing.content_hashgained an optionalextensions_hash=""argument; empty string yields the legacy two-argument result so existing tests still pass.build_imagegained an optionalbase_image=Noneargument; when set,--build-arg BASE_IMAGE=<base_image>is passed through. Other existing call sites pass nothing and behave unchanged.unittestequivalents (13/13 passed) that drive the same code paths.Files:
tools/ralph/src/ralph/runtime/__init__.py— addprepare_build_context(), update hashing, updatebuild_imageImplement:
prepare_build_context(agent)toDockerImageMixin. Foragent == "pi", create atempfile.TemporaryDirectory, copydocker/agent-loop/pi/*into it, then copy each subdir ofpi/packages/(excludingnode_modules,dist,.git) intopackages/<name>/. Return the temp dir path. For other agents, returnself.build_context(agent)._hash_extensions_dir(packages_dir)that walkspi/packages/(respecting the same exclusions) and produces a stable content hash of all file contents.compute_tagto incorporate the extensions hash for the pi agent, so image tags change when extension source changes.build_imageto accept an optionalbase_imagearg, callprepare_build_context, pass--build-arg BASE_IMAGE=<base_image>when provided, build from the returned path, and clean up the temp dir afterward.Acceptance:
prepare_build_context("claude")returns the existing path unchanged.prepare_build_context("pi")returns a temp dir containingDockerfile,settings.json, andpackages/claude-sdk-provider/(withoutnode_modulesordist).compute_tag("pi")produces different tags when extension source changes.pytest tools/ralph/tests/ -v -k build_contextpasses.Step 2: Pi Docker image and settings [done]
Notes:
npm install(notnpm install --production) because theclaude-sdk-providerextension's build runstsc, andtypescriptlives indevDependencies.--productionwould skip dev deps and break the build. The extra dev deps remain in the image (no prune step) — this is consistent with how the host installs the extension viaroles/pi/install.RUNwithset -euso any failure aborts the build cleanly.${pkg%/}to strip the trailing slash from the for-loop expansion before passing toln -sfnso the link target is a directory path (not a path ending in/).chown -R agent:agent /home/agent/.piat the end fixes ownership of everything created as root: theextensions/dir, the symlinks, and the COPY'dsettings.json.docker buildsucceeds" cannot be run in the iteration sandbox (image registry pulls are network-blocked). The Dockerfile passesdocker buildx build --checkfor syntax — only warning isInvalidDefaultArgInFrom, which matches the existinggenerate_project_dockerfile()pattern in the codebase.TestPiDockerfileShippedtest class covering: Dockerfile usesARG BASE_IMAGE/FROM ${BASE_IMAGE}, installspi+ extensions, installs all required apt deps, andsettings.jsonparses with the four expected default fields.Files:
docker/agent-loop/pi/Dockerfile— newdocker/agent-loop/pi/settings.json— newtools/ralph/tests/test_runtime_docker_sandbox.py— newTestPiDockerfileShippedclassImplement:
docker/agent-loop/pi/Dockerfile:ARG BASE_IMAGE/FROM ${BASE_IMAGE}build-essential jq openssh-client fd-findnpm install -g @mariozechner/pi-coding-agentCOPY packages/ /opt/pi-extensions//opt/pi-extensions/*/, runnpm install && npm run build(deviation from spec — see Notes)~/.pi/agent/extensions/COPY settings.json /home/agent/.pi/agent/settings.jsonchown -R agent:agent /home/agent/.piUSER agentdocker/agent-loop/pi/settings.jsonwithdefaultProvider: "claude-sdk",defaultModel: "claude-sonnet-4-6",defaultThinkingLevel: "high",hideThinkingBlock: true.Acceptance:
docker build --build-arg BASE_IMAGE=docker/sandbox-templates:claude-codesucceeds with a valid assembled build context.pion PATH, theclaude-sdk-providerextension built and symlinked, and settings.json at/home/agent/.pi/agent/settings.json.Step 3: Pi agent config with provider abstraction [done]
Notes:
resolve_agent_config(name, provider=None)returns the base config dict unchanged (same identity) for agents without aproviderskey — keeping claude/cursor behavior 100% backwards-compatible.providersstripped out) and the selected provider's keys are merged in. The selected provider name is also stored asprovideron the result so downstream code can branch on it.get_auth_modewas updated to delegate toresolve_agent_configinstead ofget_agent. This is a no-op for claude/cursor (resolve returns base) but lets pi reach into its claude-sdk provider'sauth_modes. No call sites needed to change.test_agents.pyfollow the file's existing class/style conventions.Files:
tools/ralph/src/ralph/agents.py— add pi agent entry,resolve_agent_config()Implement:
_pi_cli_flags(model)returning["--no-skills", "--no-session"].AGENTS["pi"]withcli_command: "pi",default_model: "claude-sonnet-4-6",default_provider: "claude-sdk", and aprovidersdict containing theclaude-sdkprovider config (sandbox_agent,base_image,uses_proxy,allowed_hosts,env_var_name,default_auth_mode,auth_modes).resolve_agent_config(name, provider=None):get_agent(name).providersdict exists, selects provider (fallback todefault_provider), validates it exists, returns a shallow copy with provider-specific keys merged in.providersdict, returns base config as-is (backwards-compatible for claude/cursor).ValueErrorfor unknown providers.VALID_AGENTSto include"pi".Acceptance:
get_agent("pi")returns the pi config dict withproviderskey.resolve_agent_config("pi")returns a flat dict withuses_proxy: True,sandbox_agent: "claude",allowed_hostsfrom claude-sdk provider, etc.resolve_agent_config("claude")returns the existing claude config unchanged.resolve_agent_config("pi", provider="unknown")raisesValueError.pytest tools/ralph/tests/ -v -k agentpasses.Step 4: Per-project config extension [done]
Notes:
AGENTSis imported lazily insideload_runtime_config()(rather than at module top) to keep the runtime module's import graph free of any agents-module dependency. Functionally identical, but avoids forcing every consumer ofruntimeto loadagents.pivalue (e.g.{"pi": "claude-sdk"}) raisesValueErrorrather than crashing later when.get()is called on a string. The error message prefix matches the project convention (ralph: 'pi' section in <path> must be an object).pi: {}is treated as "no pi config" (nopi_provider/pi_modelkeys surface). This keeps the keys absent rather thanNone, so downstream callers can use a singlecfg.get(...)lookup with their own default.pi_modelis not validated — pi/Claude has many model aliases and the agent itself rejects unknowns. Validating provider only is consistent with the spec.test_runtime_docker_sandbox.pyand were exercised end-to-end via stdlib equivalents that drive the same code paths (9/9 passed).Files:
tools/ralph/src/ralph/runtime/__init__.py— extendload_runtime_config()tools/ralph/tests/test_runtime_docker_sandbox.py— added pi-section coverage toTestLoadRuntimeConfigImplement:
load_runtime_config(), parse the optionalpisection from the config dict. Extractpi.providerandpi.modelif present.pi.provider(if present) is a known provider by importing and checking againstAGENTS["pi"]["providers"].pi_providerandpi_modelin the returned config dict.Acceptance:
pisection works unchanged.{"pi": {"provider": "claude-sdk"}}parses correctly, returnspi_provider: "claude-sdk".{"pi": {"provider": "unknown"}}raisesValueError.pytest tools/ralph/tests/ -v -k runtime_configpasses.Step 5: Loop integration [done]
Notes:
cli.pynow usesresolve_agent_config(agent)for theuses_proxycheck. For pi this resolves the default provider (claude-sdk), givinguses_proxy=True. The proxy reuses port 18080 viaproxy_port_for_agent("pi")(DEFAULT_PROXY_PORT fallback) and writes its PID to/tmp/ralph-proxy-pi.pid.ensure_proxy("pi", ...)will reuse a healthy claude proxy on the same port (mode/version match), so concurrent ralph invocations on the same auth mode share one proxy.--modelprecedence is tracked via a newexplicit_modelboolean: cli.py sets it toTrueonly when the user passed--model. It's forwarded toprocess_issue/poll_loop, where the order becomes: explicit--model>pi.modelfrom.agent-loop/config.json> agent default. This avoids interpreting "user passed the default value explicitly" as "use pi.model instead", because explicit-ness is tracked separately from the resolved value.pi_provider/pi_modelare popped from the runtime config dict inprocess_issuebefore the remaining kwargs are forwarded tocreate_runtime. Otherwise unknown keys would leak into runtime constructors (the existing kwargs are forwarded with**config).runtime.ensure_sandboxgained abase_image=Nonekwarg that is threaded intoensure_image→pull_base_image/compute_tag/build_image. Loop only forwardsbase_image=...when the resolved agent config has a non-emptybase_imagefield, keeping the existing claude/cursorassert_called_once_with(...)assertions matching exactly.DockerImageMixin._effective_base_image(agent, base_image)is the new resolution helper. It returns the override when supplied; otherwise it parses the Dockerfile'sFROMdirective and rejects ARG-style placeholders (e.g.${BASE_IMAGE}). This way, agents that template their base must supply an override at the call site instead of silently producing a brokendocker pull.compute_tagwas kept tolerant of ARG-style FROM with no override — it falls back to an emptybase_digestso existingTestComputeTagPiExtensions(which doesn't passbase_image) continues to pass and exercise the extensions-hash logic.token.py::_resolve_mode_stringswitched fromget_agenttoresolve_agent_config, so pi's auth_modes (which live on theclaude-sdkprovider) surface correctly when the user runsralph store-token --agent pi.test_loop.py::TestProcessIssuePiwere exercised end-to-end via stdlibunittest.mockequivalents (7/7 passed) plus targeted runtime checks for_effective_base_imageandensure_imagewith/withoutbase_image.Files:
tools/ralph/src/ralph/loop.py— pop pi keys, re-resolve agent config, model precedence, base_image kwargtools/ralph/src/ralph/cli.py—resolve_agent_config, trackexplicit_model, forward to looptools/ralph/src/ralph/runtime/__init__.py—_effective_base_imagehelper, threadbase_imagethroughensure_image/compute_tag/pull_base_image/needs_rebuildtools/ralph/src/ralph/runtime/docker_sandbox.py—resolve_agent_config;ensure_sandboxacceptsbase_imagetools/ralph/src/ralph/runtime/container.py—resolve_agent_config;ensure_sandboxacceptsbase_imagetools/ralph/src/ralph/token.py—_resolve_mode_stringusesresolve_agent_configtools/ralph/tests/test_loop.py— newTestProcessIssuePiclassImplement:
process_issue(), afterload_runtime_config(), extractpi_providerandpi_modelfrom the config.resolve_agent_config(agent, provider=pi_provider)to get the effective config. Use the resolved config for all downstream calls: proxy env building,ensure_sandbox(passbase_imagefrom resolved config),run_iteration, network policy.pi_modelis set in config, use it as the model (CLI--modeltakes precedence if explicitly provided).ensure_sandbox, pass the resolvedbase_imagetobuild_imageso the correct base is used per provider.cli.py, ensureensure_tokenandensure_proxyuse the resolved agent config. Since pi reuses claude's auth_modes, this should work without changes, but verify the code path.Acceptance:
ralph --agent pi --issue <N>runs pi in a Docker sandbox with the claude-sdk provider.CLAUDE_CODE_OAUTH_TOKEN=phantomandANTHROPIC_BASE_URL=http://host.docker.internal:18080as env vars.pi -p "<prompt>" --model claude-sonnet-4-6 --no-skills --no-session./tmp/spec.mdand read back after iteration.Step 6: Run all checks [done]
Notes:
-k build_context,-k agent,-k runtime_config) had skipped past. All four were fixed in this step:TestSecretFileLifecycle's parametrize fixture filteredVALID_AGENTSbyget_agent(a)["uses_proxy"], which KeyErrors on pi (itsuses_proxylives under theclaude-sdkprovider). Switched toresolve_agent_config.TestSandboxEnsureSandbox/TestContainerEnsureSandboxtest_force_rebuild_passed_throughassertions usedassert_called_once_with("claude", force_rebuild=True). The Step 5 implementation note promised the existing claude/cursor call signature would be preserved; in practice bothensure_sandboxmethods unconditionally forwardedbase_image=Nonetoensure_image. Updatedensure_sandboxto only forwardbase_imagewhen it's set, matching the spec's intent and the existing loop.py pattern.test_rejects_invalid_env_var_namepatchedralph.runtime.container.get_agentbut Step 5 switched the call site toresolve_agent_config. Updated the patch target.test_proxy_recovery_passes_auth_modewas failing as of commita90c82e("proxy resilience for idle shutdown") — that commit added a per-issueensure_proxy(...)call before the recovery path, soensure_proxyis now called twice (once at startup health check, once on recovery) but the test still usedassert_called_once_with. Rewrote the assertion to verify both calls forwardauth_mode="api_key".tests/test_sandbox_prune.pyandtests/test_tart_sandbox.py(top-level dotfiles tests, unrelated to ralph/pi) reference aralph.sandbox.tartmodule that no longer exists; verified they failed identically on0bd0801~1so they are out of scope for this spec.Acceptance:
pytest tools/ralph/tests/ -v— all tests pass ✓ (880 passed, 14 skipped)python3 -m py_compile tools/ralph/src/ralph/agents.py— no syntax errors ✓python3 -m py_compile tools/ralph/src/ralph/runtime/__init__.py— no syntax errors ✓python3 -m py_compile tools/ralph/src/ralph/loop.py— no syntax errors ✓python3 -m py_compile tools/ralph/src/ralph/cli.py— no syntax errors ✓Conventions
tools/ralph/tests/ralph: