audit

An 8-stage vulnerability-discovery agent, driven by your Claude Pro / Max subscription through the official Claude Code Agent SDK. Many narrow agents, deliberate disagreement, and an explicit reachability gate.

MIT-licensed. No API key needed if you already use claude login.

Origin

This project is a from-scratch reimplementation of the pipeline described in Cloudflare's Project Glasswing post, which tested Anthropic's Mythos preview LLM against Cloudflare's own codebase. The blog argues that real-world vulnerability discovery does not come from asking one big model "find bugs here" — it comes from:

Many narrow agents working in parallel on tightly-scoped questions ("Look for command injection in this specific function, with this trust boundary above it") rather than one exhaustive agent.
Deliberate disagreement — a second agent, on a different model, that tries to disprove the first agent's findings.
A reachability trace as the gating step — most "is this code buggy?" findings are noise unless an attacker-controlled input can actually reach the sink from outside the system.
A feedback loop so reachable bugs in one place automatically seed hunts for the same pattern elsewhere.

This repo packages that pipeline into a runnable agent. The Cloudflare post showed the architecture; this codebase ships the prompts, schemas, state store, and orchestrator.

The 8 stages

_{Diagram from Cloudflare's Project Glasswing post, reproduced here for reference.}

#	Stage	Default model	Purpose
1	Recon	Opus 4.7	Map the repo, emit narrowly-scoped Hunt tasks
2	Hunt	Sonnet 4.6	One attack class per agent; compile/run PoCs
3	Validate	Opus 4.7	Adversarial re-read; tries to disprove (different model from Hunt)
4	Gapfill	Sonnet 4.6	Re-queue under-covered areas
5	Dedupe	Sonnet 4.6	Cluster findings by root cause
6	Trace	Opus 4.7	Prove attacker-controlled input reaches the sink
7	Feedback	Sonnet 4.6	Turn reachable traces into new Hunt tasks
8	Report	Sonnet 4.6	Schema-validated structured report

Each stage is one markdown prompt in prompts/ + one JSON Schema in schemas/. The orchestrator passes the schema into the system prompt so every output is shape-stable on the first try.

Quickstart

# 1. Install
python -m venv .venv && source .venv/bin/activate
pip install -e .

# 2. Auth (pick one)
#    (a) Already logged in via claude login? You're done.
#    (b) Or generate a 1-year OAuth token for CI / non-interactive use:
claude setup-token
echo "CLAUDE_CODE_OAUTH_TOKEN=<paste>" > .env

# 3. Verify
audit auth-check

# 4. Run
audit run --repo /path/to/target --run-id my-run
audit status --run-id my-run
audit report --run-id my-run --format md > report.md

By default the agent uses subscription billing via your Claude.ai login — it does not call the metered Anthropic API. The on-disk auth module scrubs ANTHROPIC_API_KEY from the environment so it can't silently route around the OAuth flow.

Using a different model / provider

The auth module picks one of three modes, in this order:

LLM gateway (OpenRouter, custom proxy, etc.) — when ANTHROPIC_BASE_URL points away from anthropic.com AND ANTHROPIC_AUTH_TOKEN is set. The gateway env is left intact; only ANTHROPIC_API_KEY is scrubbed (it would otherwise outrank the gateway token).
Subscription OAuth (headless) — CLAUDE_CODE_OAUTH_TOKEN from claude setup-token. Best for CI.
Subscription OAuth (interactive) — ~/.claude/.credentials.json from claude login. Best for local dev.

OpenRouter

OpenRouter exposes Claude-compatible Anthropic-API endpoints behind its own credit system; that lets you spend OpenRouter credits instead of an Anthropic subscription, and gives you access to Sonnet/Opus and other models through the same SDK path. See OpenRouter's Agent SDK guide.

export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"
export ANTHROPIC_API_KEY=""           # must be explicitly empty / unset
# optional: pick a non-Anthropic model
export ANTHROPIC_MODEL="anthropic/claude-sonnet-4-6"
# or e.g.: ANTHROPIC_MODEL="openai/gpt-5"
#         ANTHROPIC_MODEL="google/gemini-2.5-pro"
#         ANTHROPIC_MODEL="qwen/qwen3-coder-480b"

audit auth-check                       # confirms "using LLM gateway at https://openrouter.ai/api"
audit run --repo /path/to/target --run-id orun --max-cost-usd 30

Caveats:

Per-stage model overrides in config/stages.yaml are model names (e.g. claude-opus-4-7); OpenRouter accepts slash-prefixed forms like anthropic/claude-opus-4-7. Edit the YAML if you want different providers per stage. Otherwise ANTHROPIC_MODEL forces every stage onto one model.
Non-Claude models may not produce schema-compliant JSON as reliably. The runner's schema-validation + repair turn still applies; quality varies by model.
Tool-use semantics (Read/Grep/Glob/Bash) are part of the Claude Code CLI, not the model — they work as long as the gateway speaks the Anthropic Messages API.

Other gateways / cloud providers

Same recipe — anything that exposes the Anthropic Messages API at a URL

a bearer token works:

export ANTHROPIC_BASE_URL="https://your-proxy.example.com"
export ANTHROPIC_AUTH_TOKEN="$YOUR_TOKEN"
unset ANTHROPIC_API_KEY

For Amazon Bedrock / Google Vertex / Microsoft Foundry, Claude Code has first-class env-var flags (CLAUDE_CODE_USE_BEDROCK=1 etc.) that outrank everything else. See the Claude Code auth docs.

Cost containment

A real production codebase can produce 15-50 Hunt tasks and 25+ findings to validate. At default concurrency this gets expensive. Flags to keep it sane:

audit run --repo /path/to/target \
  --max-concurrency 1 \           # one claude subprocess at a time
  --max-recon-tasks 15 \          # cap initial Hunt fanout
  --max-cost-usd 30               # abort cleanly if exceeded

The budget guard fires between and within stages — a per-task check in Hunt cooperatively aborts rather than running 30 more tasks past the cap.

Live-target reproduction (optional)

If the target has a running deployment, point the agents at it. Hunt now reproduces each finding against the live service instead of compiling a local PoC, Validate rejects findings that don't reproduce, and Trace confirms reachability with real HTTP round-trips. The static path remains available — these flags are opt-in.

audit run --repo /path/to/target --run-id live \
  --max-concurrency 1 --max-cost-usd 30 \
  --target-url http://server.local:8888 \
  --target-creds email=admin@system.com \
  --target-creds password=changechangeme

Rules the agents follow when --target-url is set:

Network egress is restricted to that host + 127.0.0.1. No other external hosts.
A finding that doesn't reproduce against the live target is dropped or rejected (depending on stage) — "no fabrication".
Credentials flow into every relevant stage's user_input as a dict.

Scope notes (optional)

Targets often have intentionally-loose-by-design surfaces that aren't bugs (e.g. plaintext API keys when that's a feature, test-only Mailpit endpoints, anonymous-analytics ingest). Drop them in a text file and pass it in — the notes are appended verbatim to every stage's user_input, and Recon / Hunt / Validate honor exclusions you list.

audit run --repo /path/to/target --scope-notes target_scope.md

Example target_scope.md:

- Mailpit (port 1025) is test-only; ignore.
- Plaintext API keys in the database are a required feature.
- Don't flag rate-limit absence on anonymous /ping endpoints.
- Only consider critical/high severity.

Recon mines git history

Recon greps the git history for past security patches (CVE, sec:, fix.*auth, sanitize, …) — patched files are hardened, but sibling files with the same idiom often aren't. Findings get seeded against the unpatched copies. Adds zero cost on repos without that pattern; catches real cross-component bugs on repos that have it.

Logic chains

The pipeline's default is one-attack-class-per-task (the Cloudflare paper's narrow-scope rule). Recon can also emit logic_chain tasks for high-impact multi-component paths (auth-bypass + IDOR + path-traversal that compose into RCE, etc.) — one chain per task, with the scope_hint naming the specific chain. This is the one allowed exception to single-attack-class scoping.

Layout

prompts/        8 stage prompts (markdown, loaded as system prompts)
schemas/        9 JSON schemas — every agent output is validated
config/         stages.yaml — model + concurrency + tool allowlist per stage
audit/          Python package
  auth.py       OAuth check + ANTHROPIC_API_KEY scrubbing
  state.py      SQLite DAO (runs, tasks, findings, traces, dedupe, costs)
  runner.py     claude-agent-sdk wrapper with schema validation + repair turn
  orchestrator.py pipeline driver
  stages/       one module per stage
work/           per-Hunt-task scratch dirs (sandbox for PoC compile/run)
results/        JSONL artifacts per stage + final report.json
state.db        SQLite (gitignored)

Safety

Hunt agents have Bash and run inside per-task scratch dirs. They are not sandboxed at the OS level. Run the audit inside a disposable VM or container when you don't trust the target source — a target with malicious build scripts could otherwise execute on your host during PoC compilation.

The agent reads everything you --add-dir, including any .env or secrets/ directories in the target. Outputs land in results/<run-id>/ which is .gitignored but not scrubbed of those reads.

License

MIT. Reuse freely. No warranty.

Acknowledgements

The pipeline design is from Cloudflare's Project Glasswing blog post. The credit for the architecture goes there.
Built on the official Claude Code Agent SDK.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
audit		audit
config		config
docs		docs
prompts		prompts
schemas		schemas
tests		tests
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

audit

Origin

The 8 stages

Quickstart

Using a different model / provider

OpenRouter

Other gateways / cloud providers

Cost containment

Live-target reproduction (optional)

Scope notes (optional)

Recon mines git history

Logic chains

Layout

Safety

License

Acknowledgements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

audit

Origin

The 8 stages

Quickstart

Using a different model / provider

OpenRouter

Other gateways / cloud providers

Cost containment

Live-target reproduction (optional)

Scope notes (optional)

Recon mines git history

Logic chains

Layout

Safety

License

Acknowledgements

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages