cordon

Self-hostable, OpenAI-compatible AI gateway with policy-driven PII and secret guardrails. Sits between your apps and the LLM provider. Inspects every prompt; redacts, blocks, or transparently reroutes sensitive traffic to a local model.

┌──────────┐    /v1/chat/completions   ┌────────────────────────┐    ┌──────────┐
│   app    │ ────────────────────────▶ │       cordon           │ ─▶ │  openai  │
└──────────┘                           │  ┌──────────────────┐  │    └──────────┘
                                       │  │ 1. inspect       │  │    ┌──────────┐
                                       │  │ 2. policy match  │  │ ─▶ │ anthropic│
                                       │  │ 3. route         │  │    └──────────┘
                                       │  │ 4. log + audit   │  │    ┌──────────┐
                                       │  └──────────────────┘  │ ─▶ │  ollama  │  ← local
                                       └────────────────────────┘    └──────────┘
                                            postgres + redis

Why

Most "AI gateway" projects ship a proxy with allow/deny knobs and call it done. Cordon goes further on the four things that actually matter when you're trying to ship LLM features past a security review:

A real policy engine. YAML rules with allow | block | redact | route_local | route_provider actions, evaluated in priority order. The novel action is route_local — sensitive prompts are transparently rerouted to your local Ollama instead of being blocked outright. The default policy ships in src/cordon/policies/default.yaml.
Deterministic inspection. Regex + Shannon entropy + Luhn validation. Not an LLM-as-classifier. Runs in single-digit milliseconds, easy to reason about, and easy to extend — see src/cordon/inspection.py.
Graceful degradation. Rate limits and counters fall back from Redis to Postgres automatically — the gateway keeps working when Redis goes down. See src/cordon/cache.py.
Audit logs you can query. Every request lands a structured row in gateway_request_logs with provider, model, action, matched rule, findings, blocked/redacted flags, token estimate, cost estimate, cache hit. CSV export endpoints are built in.
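
The deterministic inspection described above can be sketched in a few dozen lines. This is a minimal illustration of combining regex matching, Shannon entropy, and Luhn validation, not the contents of src/cordon/inspection.py. The patterns, the finding names (`ssn`, `aws_access_key`, `credit_card`, `high_entropy_token`), the 24-character token length, and the 4.0-bit entropy threshold are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical patterns for illustration; the real rules may differ.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
TOKEN_RE = re.compile(r"[A-Za-z0-9+/_=-]{24,}")


def shannon_entropy(text: str) -> float:
    """Bits per character of the empirical character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def luhn_valid(digits: str) -> bool:
    """Luhn checksum over a non-empty string of decimal digits."""
    if not digits:
        return False
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def inspect(prompt: str, entropy_threshold: float = 4.0) -> list[str]:
    """Return the finding names triggered by a prompt, in a fixed order."""
    findings = []
    if SSN_RE.search(prompt):
        findings.append("ssn")
    if AWS_KEY_RE.search(prompt):
        findings.append("aws_access_key")
    for match in CARD_RE.finditer(prompt):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            findings.append("credit_card")
            break
    for match in TOKEN_RE.finditer(prompt):
        if shannon_entropy(match.group()) >= entropy_threshold:
            findings.append("high_entropy_token")
            break
    return findings
```

Because every check is a pure function of the prompt text, the same input always yields the same findings, which is what makes the behavior easy to unit-test and audit.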

Quickstart

docker compose up

Then point any OpenAI SDK at http://localhost:8080/v1 using the dev-seeded API key (default pk_live_dev_changeme — override via AI_GATEWAY_DEV_API_KEY):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="pk_live_dev_changeme")
client.chat.completions.create(
    model="anthropic/claude-3-sonnet",
    messages=[{"role": "user", "content": "hello"}],
)

The admin dashboard lives at http://localhost:8080/admin. Paste the same API key to see request volume, blocked counts, action breakdown, and CSV exports.

The killer demo: `route_local`

Full transcript with response bodies for all three policy outcomes lives at docs/walkthrough.md.

Start the stack with the local Llama profile:

docker compose --profile ollama up
docker compose exec ollama ollama pull llama3.2

Case 1 — normal prompt routes to OpenAI.

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_live_dev_changeme" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Summarize the CAP theorem in one sentence."}]}'

Response includes "gateway": {"provider": "openai", "action": "allow", "policy_rule": "default_allow", ...}. Nothing surprising.

Case 2 — same call, but the prompt contains PII. Cordon transparently reroutes to local Llama.

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_live_dev_changeme" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Draft a reply for customer Jane Doe, SSN 123-45-6789."}]}'

Response now includes "gateway": {"provider": "ollama", "model": "llama3.2", "action": "route_local", "policy_rule": "pii_local_only", ...} — the SSN never left the box. The application code didn't change. The audit log records the decision.

Case 3 — same call, but with an obvious secret. Cordon blocks.

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_live_dev_changeme" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Use AKIAIOSFODNN7EXAMPLE to deploy."}]}'

Returns 403 with "error": {"type": "policy_violation", "rule": "block_secrets"}. The secret never reaches the upstream. The full inspection findings are written to gateway_violations.
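
On the client side, the three outcomes can be told apart from the status code and response body alone. The sketch below assumes only the response fragments quoted in the cases above (`gateway.provider`, `gateway.action`, `gateway.policy_rule`, and `error.type` / `error.rule`); `describe_outcome` is a hypothetical helper, not part of Cordon, and any other body fields are ignored.

```python
def describe_outcome(status_code: int, body: dict) -> str:
    """Summarize a gateway response using only the fields quoted in this README."""
    if status_code == 403:
        error = body.get("error", {})
        if error.get("type") == "policy_violation":
            return f"blocked by {error.get('rule', 'unknown rule')}"
        return "forbidden"
    gateway = body.get("gateway", {})
    action = gateway.get("action", "unknown")
    provider = gateway.get("provider", "unknown")
    rule = gateway.get("policy_rule", "unknown")
    return f"{action} via {provider} ({rule})"


# Bodies shaped like the three cases above, trimmed to the quoted fields.
print(describe_outcome(200, {"gateway": {
    "provider": "ollama", "action": "route_local", "policy_rule": "pii_local_only"}}))
print(describe_outcome(403, {"error": {
    "type": "policy_violation", "rule": "block_secrets"}}))
```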

The end-to-end behavior of all three cases is covered by tests/test_proxy.py — it boots the gateway against SQLite, swaps in fake adapters, and asserts the routing decisions.
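
The routing decisions in the three cases are consistent with a first-match scan over priority-sorted rules. The sketch below is one way to model that, not the code in src/cordon/policy.py. The rule names come from the responses above; the `Rule` fields, the numeric priorities, the "lower number wins" ordering, and the finding names are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int  # assumption: lower number is evaluated first
    action: str    # allow | block | redact | route_local | route_provider
    # Finding names that trigger the rule; an empty set matches every request.
    match_any: frozenset = field(default_factory=frozenset)


# Hypothetical rule set, deliberately listed out of order.
RULES = [
    Rule("default_allow", 1000, "allow"),
    Rule("pii_local_only", 20, "route_local", frozenset({"ssn", "credit_card"})),
    Rule("block_secrets", 10, "block", frozenset({"aws_access_key"})),
]


def decide(findings: set[str], rules: list[Rule] = RULES) -> Rule:
    """Return the first rule, in priority order, that matches the findings."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if not rule.match_any or rule.match_any & findings:
            return rule
    raise LookupError("no rule matched; a catch-all rule is expected")


print(decide(set()).name)                      # default_allow
print(decide({"ssn"}).action)                  # route_local
print(decide({"ssn", "aws_access_key"}).name)  # block_secrets
```

Placing the blocking rule ahead of the rerouting rule means a prompt that contains both a secret and PII is blocked rather than sent to the local model.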

What's in the box

| Layer | Where |
| --- | --- |
| OpenAI-compatible HTTP surface | `src/cordon/main.py` |
| Request pipeline (inspect → policy → route → log) | `src/cordon/service.py` |
| Inspection (regex, entropy, Luhn, prompt-injection heuristics) | `src/cordon/inspection.py` |
| Policy engine (YAML rules, priority-sorted) | `src/cordon/policy.py` |
| Provider routing | `src/cordon/routing.py` |
| Provider adapters (OpenAI, Anthropic, Ollama) | `src/cordon/adapters/` |
| Redis cache + rate-limit + Postgres fallback | `src/cordon/cache.py` |
| API-key auth (`pk_live_` / `pk_demo_`, PBKDF2, HMAC sessions) | `src/cordon/auth.py` |
| Audit log + admin endpoints + CSV exports | `src/cordon/main.py` |
| Single-file admin dashboard | `src/cordon/static/admin_dashboard.html` |
| Alembic migrations | `migrations/` |

Configuration

Cordon reads everything from environment variables; see .env.example. The notable ones:

| Variable | Default | Notes |
| --- | --- | --- |
| `DATABASE_URL` | postgres in compose | Any SQLAlchemy URL. SQLite works for local dev. |
| `REDIS_URL` | redis in compose | Optional; rate-limit + cache degrade to Postgres if absent. |
| `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` | (none) | Required for the relevant adapters. |
| `OLLAMA_BASE_URL` | http://ollama:11434 | For the `route_local` action. |
| `AI_GATEWAY_POLICY_PATH` | default.yaml | Override the policy file. |
| `AI_GATEWAY_DEV_API_KEY` | `pk_live_dev_changeme` | Bootstrap key created on first boot. |
| `AI_GATEWAY_RATE_LIMIT_RPM` | 120 | Per-key requests per minute. 0 disables. |
| `AI_GATEWAY_STORE_RAW_PROMPTS` | false | When false, only a 500-char preview is persisted. |
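
To make the defaults in the table concrete, here is a small sketch of reading a few of these variables and applying the prompt-preview rule. The variable names and default values are taken from the table; the `Settings` class, `load_settings`, `prompt_for_storage`, and the set of strings treated as "true" are assumptions for illustration and may not match how Cordon parses its environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    ollama_base_url: str
    dev_api_key: str
    rate_limit_rpm: int
    store_raw_prompts: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read a subset of the variables above, falling back to the documented defaults."""
    raw_flag = env.get("AI_GATEWAY_STORE_RAW_PROMPTS", "false")
    return Settings(
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://ollama:11434"),
        dev_api_key=env.get("AI_GATEWAY_DEV_API_KEY", "pk_live_dev_changeme"),
        rate_limit_rpm=int(env.get("AI_GATEWAY_RATE_LIMIT_RPM", "120")),
        # Assumption: these spellings count as true; everything else is false.
        store_raw_prompts=raw_flag.strip().lower() in {"1", "true", "yes"},
    )


def prompt_for_storage(prompt: str, settings: Settings, preview_chars: int = 500) -> str:
    """Return the full prompt, or only a preview when raw storage is off."""
    return prompt if settings.store_raw_prompts else prompt[:preview_chars]


settings = load_settings({})  # empty mapping: every default applies
print(settings.rate_limit_rpm, len(prompt_for_storage("x" * 600, settings)))
```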

Roadmap

This is v0.1.0a. The proxy, policy engine, inspection, audit log, and admin surface are real. Known gaps:

v0.1.1 — full polish of the admin dashboard (port the rich panels currently archived at src/cordon/static/admin_dashboard_legacy.html).
v0.2.0 — MCP (Model Context Protocol) tool-use, currently stubbed at src/cordon/mcp_runtime.py.
v0.3.0 — streaming (stream: true) responses. Non-streaming works today.

Development

pip install -e ".[dev]"
pytest -q              # 23 tests, runs against in-process SQLite
ruff check src

CI runs lint, the test matrix on py3.11/3.12 against a real Postgres, and a Docker build smoke test on every push — see .github/workflows/ci.yml.

License

MIT — see LICENSE.
