svmbench is a benchmark and agent harness for finding and exploiting solana program bugs.
how it works | security | key services | repo layout | quickstart (local dev)
fork changes from evmbench
- solana/anchor instead of evm/solidity
- opencode instead of openai codex: opencode is the agent runtime
- multi-model support - run audits with any openrouter-compatible model (claude, gpt, gemini, deepseek, etc.)
- effort levels - low/medium/high runtime presets for cost control
- minimal worker image - no solana/anchor cli installed; agent only reads source code
- no foundry/slither - evm tooling removed since this is solana-focused
upload anchor/solana program source code, select an agent, and receive a structured vulnerability report rendered in the ui.
```
frontend (next.js)
  │
  ├─ POST /v1/jobs/start ───► backend api (fastapi, port 1337)
  │                             ├─► postgresql (job state)
  ├─ GET /v1/jobs/{id}          ├─► secrets service (port 8081)
  │                             └─► rabbitmq (job queue)
  └─ GET /v1/jobs/history       │
                                ▼
                       instancer (consumer)
                                │
                      ┌─────────┴──────────┐
                      ▼                    ▼
               docker backend     k8s backend (optional)
                      │                    │
                      └─────────┬──────────┘
                                ▼
                        worker container
                         ├─► secrets service (fetch bundle)
                         ├─► (optional) oai proxy (port 8084) ──► openrouter api
                         └─► results service (port 8083)
```
- user uploads a zip of program files via the frontend. the ui sends the archive, selected model key, and api key to `/v1/jobs/start`.
- the backend creates a job record in postgres, stores a secret bundle in the secrets service, and publishes a message to rabbitmq.
- the instancer consumes the job and starts a worker (docker locally; kubernetes backend is optional).
- the worker fetches its bundle from the secrets service, unpacks the uploaded zip to `audit/`, then runs opencode in "detect-only" mode:
  - prompt: `backend/worker_runner/detect.md` (copied to `$HOME/AGENTS.md` inside the container)
  - model map: `backend/worker_runner/model_map.json` (maps ui model keys to openrouter model ids)
  - command wrapper: `backend/worker_runner/run_opencode.sh`
- the agent writes `submission/audit.md`. the worker validates that the output contains parseable json with `{"vulnerabilities": [...]}` and then uploads it to the results service.
- the frontend polls job status and renders the report with file navigation and annotations.
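the validation step can be pictured with a small sketch like the one below. this is illustrative only, not the actual worker/resultsvc code; the idea of extracting a json object from the markdown (fenced block first, then any `{...}` span) is an assumption:

```python
import json
import re

def extract_report(audit_md: str):
    """Return the parsed report dict if audit_md contains a json object
    with a top-level "vulnerabilities" list, else None.
    (sketch; the real worker's parsing rules may differ.)"""
    candidates = []
    # prefer a fenced json block, then fall back to any {...} span
    fence = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", audit_md, re.DOTALL)
    if fence:
        candidates.append(fence.group(1))
    brace = re.search(r"\{.*\}", audit_md, re.DOTALL)
    if brace:
        candidates.append(brace.group(0))
    for cand in candidates:
        try:
            report = json.loads(cand)
        except json.JSONDecodeError:
            continue
        if isinstance(report.get("vulnerabilities"), list):
            return report
    return None
```

a report that fails this check would be rejected rather than uploaded, so a malformed agent run surfaces as a failed job instead of a garbage result.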
svmbench runs an llm-driven agent against uploaded, untrusted code. treat the worker runtime (filesystem, logs, outputs) as an untrusted environment.
see SECURITY.md for the full trust model and operational guidance.
api credential handling:
- direct byok (default): worker receives a plaintext openrouter key (`OPENROUTER_API_KEY`).
- proxy-token mode (optional): worker receives an opaque token and routes requests through `oai_proxy` (the plaintext key stays outside the worker).
enabling proxy-token mode:

```shell
cd backend
cp .env.example .env
# set BACKEND_OAI_KEY_MODE=proxy and OAI_PROXY_AES_KEY=...
docker compose --profile proxy up -d --build
```

operational note: worker runtime is bounded by default; override the max audit runtime with `AUDIT_TIMEOUT` (default: 900 seconds).
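the `AUDIT_TIMEOUT` bound can be sketched as a wrapper that kills the agent process once the limit elapses. this is an illustration of the concept, not the actual worker entrypoint, which may enforce the limit differently:

```python
import os
import subprocess

def run_bounded(cmd, default_timeout=900):
    """Run cmd, killing it once AUDIT_TIMEOUT seconds elapse.
    Returns the exit code, or None if the run timed out.
    (sketch; the real worker's enforcement may differ.)"""
    timeout = int(os.environ.get("AUDIT_TIMEOUT", default_timeout))
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout)
        return proc.returncode
    except subprocess.TimeoutExpired:
        return None  # treat as a timed-out (failed) audit
```

treating a timeout as a distinct failure mode (rather than a crash) lets the backend mark the job as expired instead of leaving it stuck in a running state.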
| service | default port | role |
|---|---|---|
| backend | 1337 | main api: job submission, status, history, auth |
| secretsvc | 8081 | stores and serves per-job secret bundles (zip + key material) |
| resultsvc | 8083 | receives worker results, validates/parses, persists to db |
| oai_proxy | 8084 | optional openai proxy for proxy-token mode |
| instancer | (n/a) | rabbitmq consumer that starts worker containers/pods |
| worker | (n/a) | executes the detect-only agent and uploads results |
| postgres | 5432 | job state persistence |
| rabbitmq | 5672 | job queue |
```
.
├── README.md
├── SECURITY.md
├── LICENSE
├── frontend/            next.js ui (upload zip, select model, view results)
├── backend/
│   ├── api/             main fastapi api (jobs, auth, integration)
│   ├── instancer/       rabbitmq consumer; starts workers (docker/k8s)
│   ├── secretsvc/       bundle storage service
│   ├── resultsvc/       results ingestion + persistence
│   ├── oai_proxy/       optional openai proxy (proxy-token mode)
│   ├── prunner/         optional cleanup of stale workers
│   ├── worker_runner/   detect prompts (effort levels) + model map + opencode runner
│   ├── docker/
│   │   ├── base/        base image: opencode, node, ripgrep
│   │   ├── backend/     backend services image
│   │   └── worker/      worker image + entrypoint
│   └── compose.yml      full stack (db/mq + services)
└── deploy/              optional deployment scripts/examples
```
ensure docker and bun are available.

build the base and worker images first (required before starting the stack):

```shell
cd backend
docker build -t svmbench/base:latest -f docker/base/Dockerfile .
docker build -t svmbench/worker:latest -f docker/worker/Dockerfile .
```

start the backend stack (api + dependencies):

```shell
cp .env.example .env
# for local dev, the placeholder secrets in .env.example are sufficient.
# for internet-exposed deployments, replace them with strong values.
docker compose up -d --build
```

start the frontend dev server:

```shell
cd frontend
bun install
bun dev
```

open:

- http://127.0.0.1:3000 (frontend)
- http://127.0.0.1:1337/v1/integration/frontend (backend config endpoint)
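once the stack is up, job status can be polled the same way the frontend does (`GET /v1/jobs/{id}`). below is a transport-agnostic sketch: `fetch_status` is a hypothetical callable wrapping that http request, and the terminal state names `"completed"`/`"failed"` are assumptions, not the api's documented values:

```python
import time

def poll_job(fetch_status, job_id, interval=2.0, max_attempts=30):
    """Poll until the job reaches a terminal state, then return it.
    fetch_status(job_id) wraps GET /v1/jobs/{job_id} and returns a
    status string; returns None if the job never finished in time."""
    for _ in range(max_attempts):
        status = fetch_status(job_id)
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
    return None
```

injecting the fetch function keeps the loop testable without a running backend and independent of the http client the caller prefers.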
this is a fork of evmbench.