neko/svmbench

[svmbench cover image]

svmbench is a benchmark and agent harness for finding and exploiting solana program bugs.

how it works | security | key services | repo layout | quickstart (local dev)

fork changes from evmbench

  • solana/anchor instead of evm/solidity
  • opencode instead of openai codex - opencode is the agent runtime
  • multi-model support - run audits with any openrouter-compatible model (claude, gpt, gemini, deepseek, etc.)
  • effort levels - low/medium/high runtime presets for cost control
  • minimal worker image - no solana/anchor cli installed; agent only reads source code
  • no foundry/slither - evm tooling removed since this is solana-focused

upload anchor/solana program source code, select an agent, and receive a structured vulnerability report rendered in the ui.

how it works

architecture

frontend (next.js)
    │
    ├─ POST /v1/jobs/start ───► backend api (fastapi, port 1337)
    │                               ├─► postgresql (job state)
    ├─ GET  /v1/jobs/{id}           ├─► secrets service (port 8081)
    │                               └─► rabbitmq (job queue)
    └─ GET  /v1/jobs/history                │
                                             ▼
                                        instancer (consumer)
                                              │
                                    ┌─────────┴──────────┐
                                    ▼                    ▼
                              docker backend       k8s backend (optional)
                                    │                    │
                                    └────────┬───────────┘
                                             ▼
                                      worker container
                                        ├─► secrets service (fetch bundle)
                                        ├─► (optional) oai proxy (port 8084) ──► openrouter api
                                        └─► results service (port 8083)

end-to-end flow

  1. user uploads a zip of program files via the frontend. the ui sends the archive, selected model key, and api key to /v1/jobs/start.
  2. the backend creates a job record in postgres, stores a secret bundle in the secrets service, and publishes a message to rabbitmq.
  3. the instancer consumes the job and starts a worker (docker locally; kubernetes backend is optional).
  4. the worker fetches its bundle from the secrets service, unpacks the uploaded zip to audit/, then runs opencode in "detect-only" mode:
    • prompt: backend/worker_runner/detect.md (copied to $HOME/AGENTS.md inside the container)
    • model map: backend/worker_runner/model_map.json (maps ui model keys to openrouter model ids)
    • command wrapper: backend/worker_runner/run_opencode.sh
  5. the agent writes submission/audit.md. the worker validates that the output contains parseable json with {"vulnerabilities": [...]} and then uploads it to the results service.
  6. the frontend polls job status and renders the report with file navigation and annotations.
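the validation in step 5 can be sketched as follows. this is a minimal illustration, not the worker's actual code: it accepts a report whether the agent emitted the json inside a markdown code fence or as the raw file body (the function name and fallback order are assumptions).

```python
import json
import re

# matches a fenced code block: three backticks, optional "json" tag, body, three backticks
FENCE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)

def extract_report(audit_md: str) -> dict:
    """Pull the first json object carrying a "vulnerabilities" list out of
    the agent's audit.md, whether inside a markdown code fence or inline."""
    # prefer fenced candidates, then fall back to treating the whole file as json
    for candidate in FENCE.findall(audit_md) + [audit_md]:
        try:
            report = json.loads(candidate.strip())
        except json.JSONDecodeError:
            continue
        if isinstance(report, dict) and isinstance(report.get("vulnerabilities"), list):
            return report
    raise ValueError("audit.md contains no parseable vulnerability report")
```

if no candidate parses as json with a "vulnerabilities" list, the job is treated as having produced no valid report.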

security

svmbench runs an llm-driven agent against uploaded, untrusted code. treat the worker runtime (filesystem, logs, outputs) as an untrusted environment.

see SECURITY.md for the full trust model and operational guidance.

api credential handling:

  • direct byok (default): worker receives a plaintext openrouter key (OPENROUTER_API_KEY).
  • proxy-token mode (optional): worker receives an opaque token and routes requests through oai_proxy (plaintext key stays outside the worker).
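the two modes differ only in which credential the worker holds and which base url it talks to. a hedged sketch of that decision: OPENROUTER_API_KEY is the variable named above, while OAI_PROXY_TOKEN and OAI_PROXY_URL are illustrative names, not the project's actual configuration.

```python
def llm_client_config(env: dict) -> dict:
    """Pick the llm endpoint and credential for the worker.

    direct byok: the worker holds a plaintext openrouter key.
    proxy-token: the worker holds an opaque token and talks to oai_proxy,
    so the plaintext key never enters the worker. OAI_PROXY_TOKEN and
    OAI_PROXY_URL are hypothetical variable names for illustration.
    """
    if "OAI_PROXY_TOKEN" in env:  # proxy-token mode
        return {
            "base_url": env.get("OAI_PROXY_URL", "http://oai_proxy:8084/v1"),
            "api_key": env["OAI_PROXY_TOKEN"],  # opaque; useless outside the proxy
        }
    # direct byok (default): plaintext key goes straight to openrouter
    return {
        "base_url": "https://openrouter.ai/api/v1",
        "api_key": env["OPENROUTER_API_KEY"],
    }
```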

enabling proxy-token mode:

cd backend
cp .env.example .env
# set BACKEND_OAI_KEY_MODE=proxy and OAI_PROXY_AES_KEY=...
docker compose --profile proxy up -d --build

operational note: worker runtime is bounded by default; override the max audit runtime with AUDIT_TIMEOUT (default: 900 seconds).

key services

service    default port  role
backend    1337          main api: job submission, status, history, auth
secretsvc  8081          stores and serves per-job secret bundles (zip + key material)
resultsvc  8083          receives worker results, validates/parses, persists to db
oai_proxy  8084          optional openai proxy for proxy-token mode
instancer  (n/a)         rabbitmq consumer that starts worker containers/pods
worker     (n/a)         executes the detect-only agent and uploads results
postgres   5432          job state persistence
rabbitmq   5672          job queue

repo layout

.
├── README.md
├── SECURITY.md
├── LICENSE
├── frontend/                 next.js ui (upload zip, select model, view results)
├── backend/
│   ├── api/                  main fastapi api (jobs, auth, integration)
│   ├── instancer/            rabbitmq consumer; starts workers (docker/k8s)
│   ├── secretsvc/            bundle storage service
│   ├── resultsvc/            results ingestion + persistence
│   ├── oai_proxy/            optional openai proxy (proxy-token mode)
│   ├── prunner/              optional cleanup of stale workers
│   ├── worker_runner/        detect prompts (effort levels) + model map + opencode runner
│   ├── docker/
│   │   ├── base/             base image: opencode, node, ripgrep
│   │   ├── backend/          backend services image
│   │   └── worker/           worker image + entrypoint
│   └── compose.yml           full stack (db/mq + services)
└── deploy/                   optional deployment scripts/examples

quickstart (local dev)

ensure docker and bun are available.

build the base and worker images first (required before starting the stack):

cd backend
docker build -t svmbench/base:latest -f docker/base/Dockerfile .
docker build -t svmbench/worker:latest -f docker/worker/Dockerfile .

start backend stack (api + dependencies):

cp .env.example .env
# for local dev, the placeholder secrets in .env.example are sufficient.
# for internet-exposed deployments, replace them with strong values.
docker compose up -d --build

start frontend dev server:

cd frontend
bun install
bun dev

open:

  • http://127.0.0.1:3000 (frontend)
  • http://127.0.0.1:1337/v1/integration/frontend (backend config endpoint)
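once the stack is up, job status can also be polled programmatically, mirroring what the frontend does against GET /v1/jobs/{id}. a hedged sketch of that loop: the terminal status values ("done"/"failed") are assumptions, and the status fetcher is injected rather than hard-coded so the loop stays self-contained.

```python
import time

def wait_for_job(job_id: str, fetch_status, poll_interval: float = 2.0,
                 timeout: float = 900.0) -> dict:
    """Poll until the job reaches a terminal state or the timeout expires.

    fetch_status(job_id) -> dict like {"status": ...}; the terminal values
    "done" and "failed" are illustrative assumptions, not the real schema.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status(job_id)
        if job.get("status") in ("done", "failed"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

with requests installed, fetch_status could be something like `lambda jid: requests.get(f"http://127.0.0.1:1337/v1/jobs/{jid}", timeout=10).json()` (again, check the api for the actual response shape).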

acknowledgments

this is a fork of evmbench.

apache-2.0 license
