Self-hostable, OpenAI-compatible AI gateway with policy-driven PII and secret guardrails. Sits between your apps and the LLM provider. Inspects every prompt; redacts, blocks, or transparently reroutes sensitive traffic to a local model.

    ┌──────────┐   /v1/chat/completions    ┌────────────────────────┐    ┌──────────┐
    │   app    │ ────────────────────────▶ │         cordon         │ ─▶ │  openai  │
    └──────────┘                           │  ┌──────────────────┐  │    └──────────┘
                                           │  │ 1. inspect       │  │    ┌──────────┐
                                           │  │ 2. policy match  │  │ ─▶ │ anthropic│
                                           │  │ 3. route         │  │    └──────────┘
                                           │  │ 4. log + audit   │  │    ┌──────────┐
                                           │  └──────────────────┘  │ ─▶ │  ollama  │ ← local
                                           └────────────────────────┘    └──────────┘
                                                postgres + redis

Most "AI gateway" projects ship a proxy with allow/deny knobs and call it done. Cordon goes further on the four things that actually matter when you're trying to ship LLM features past a security review:
- A real policy engine. YAML rules with `allow | block | redact | route_local | route_provider` actions, evaluated in priority order. The novel action is `route_local`: sensitive prompts are transparently rerouted to your local Ollama instead of being blocked outright. The default policy ships in `src/cordon/policies/default.yaml`.
- Deterministic inspection. Regex + Shannon entropy + Luhn validation. Not an LLM-as-classifier. Runs in single-digit milliseconds, is easy to reason about, and is easy to extend; see `src/cordon/inspection.py`.
- Graceful degradation. Rate limits and counters fall back from Redis to Postgres automatically, so the gateway keeps working when Redis goes down. See `src/cordon/cache.py`.
- Audit logs you can query. Every request lands a structured row in `gateway_request_logs` with provider, model, action, matched rule, findings, blocked/redacted flags, token estimate, cost estimate, and cache hit. CSV export endpoints are built in.
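
To make the "regex + Shannon entropy + Luhn" combination concrete, here is a minimal sketch of deterministic inspection. It is not the code in `src/cordon/inspection.py`: the patterns, the `inspect_prompt` name, the finding shape, and the 4.0 bits-per-character threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical patterns for illustration; the project's real patterns may differ.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
TOKEN_RE = re.compile(r"\b[A-Za-z0-9+/_=-]{20,}\b")


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character, computed from character frequencies."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits in `number`; separators are ignored."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def inspect_prompt(prompt: str, entropy_threshold: float = 4.0) -> list:
    """Return a list of findings, each a dict with a type and a character span."""
    findings = []
    for m in SSN_RE.finditer(prompt):
        findings.append({"type": "ssn", "span": m.span()})
    for m in AWS_KEY_RE.finditer(prompt):
        findings.append({"type": "aws_access_key", "span": m.span()})
    for m in CARD_RE.finditer(prompt):
        if luhn_valid(m.group()):  # the regex proposes, Luhn confirms
            findings.append({"type": "credit_card", "span": m.span()})
    for m in TOKEN_RE.finditer(prompt):
        if shannon_entropy(m.group()) >= entropy_threshold:
            findings.append({"type": "high_entropy_token", "span": m.span()})
    return findings
```

Because every check is a pure function of the prompt text, the same input always yields the same findings, which is what makes the decisions easy to audit and to unit-test.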
Start the stack:

    docker compose up

Then point any OpenAI SDK at http://localhost:8080/v1 using the dev-seeded API key (default `pk_live_dev_changeme`; override via `AI_GATEWAY_DEV_API_KEY`):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="pk_live_dev_changeme")
    client.chat.completions.create(
        model="anthropic/claude-3-sonnet",
        messages=[{"role": "user", "content": "hello"}],
    )

The admin dashboard lives at http://localhost:8080/admin. Paste the same API key to see request volume, blocked counts, action breakdown, and CSV exports.
A full transcript with response bodies for all three policy outcomes lives at `docs/walkthrough.md`.
Start the stack with the local Llama profile:

    docker compose --profile ollama up
    docker compose exec ollama ollama pull llama3.2

Case 1: a normal prompt routes to OpenAI.

    curl http://localhost:8080/v1/chat/completions \
      -H "Authorization: Bearer pk_live_dev_changeme" \
      -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Summarize the CAP theorem in one sentence."}]}'

The response includes `"gateway": {"provider": "openai", "action": "allow", "policy_rule": "default_allow", ...}`. Nothing surprising.
Case 2: the same call, but the prompt contains PII. Cordon transparently reroutes to local Llama.

    curl http://localhost:8080/v1/chat/completions \
      -H "Authorization: Bearer pk_live_dev_changeme" \
      -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Draft a reply for customer Jane Doe, SSN 123-45-6789."}]}'

The response now includes `"gateway": {"provider": "ollama", "model": "llama3.2", "action": "route_local", "policy_rule": "pii_local_only", ...}`: the SSN never left the box. The application code didn't change. The audit log records the decision.
Case 3: the same call, but with an obvious secret. Cordon blocks.

    curl http://localhost:8080/v1/chat/completions \
      -H "Authorization: Bearer pk_live_dev_changeme" \
      -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Use AKIAIOSFODNN7EXAMPLE to deploy."}]}'

Returns 403 with `"error": {"type": "policy_violation", "rule": "block_secrets"}`. The secret never reaches the upstream. The full inspection findings are written to `gateway_violations`.
The end-to-end behavior of all three cases is covered by tests/test_proxy.py — it boots the gateway against SQLite, swaps in fake adapters, and asserts the routing decisions.
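
The three cases can be reproduced in miniature with a priority-sorted rule list and fake adapters. This is a sketch under stated assumptions, not the code in `src/cordon/policy.py` or `tests/test_proxy.py`: the `Rule` fields, the "lower number wins" ordering, and the `FakeAdapter`, `decide`, and `handle` names are hypothetical; only the rule names and response fields come from the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int           # assumption: lower number is evaluated first
    match_types: frozenset  # finding types that trigger the rule; empty matches anything
    action: str             # e.g. "allow", "block", "route_local"


# Hypothetical rule set; the names mirror the rule names in the responses above.
RULES = [
    Rule("default_allow", 1000, frozenset(), "allow"),
    Rule("pii_local_only", 20, frozenset({"ssn"}), "route_local"),
    Rule("block_secrets", 10, frozenset({"aws_access_key"}), "block"),
]


def decide(finding_types, rules=RULES):
    """Return the first rule, in priority order, that matches the findings."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if not rule.match_types or rule.match_types & finding_types:
            return rule
    raise LookupError("no rule matched; a catch-all rule is expected")


class FakeAdapter:
    """Records calls instead of contacting a real provider."""

    def __init__(self, provider):
        self.provider = provider
        self.calls = []

    def complete(self, messages):
        self.calls.append(messages)
        return {"content": "stub"}


def handle(messages, finding_types, upstream, local):
    """Apply the policy decision and return (status, body)."""
    rule = decide(finding_types)
    if rule.action == "block":
        return 403, {"error": {"type": "policy_violation", "rule": rule.name}}
    adapter = local if rule.action == "route_local" else upstream
    body = adapter.complete(messages)
    body["gateway"] = {
        "provider": adapter.provider,
        "action": rule.action,
        "policy_rule": rule.name,
    }
    return 200, body
```

With this shape, a routing test only needs to check the returned `gateway` metadata and which fake adapter recorded a call; a blocked prompt should leave both adapters untouched.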
| Layer | Where |
|---|---|
| OpenAI-compatible HTTP surface | src/cordon/main.py |
| Request pipeline (inspect → policy → route → log) | src/cordon/service.py |
| Inspection (regex, entropy, Luhn, prompt-injection heuristics) | src/cordon/inspection.py |
| Policy engine (YAML rules, priority-sorted) | src/cordon/policy.py |
| Provider routing | src/cordon/routing.py |
| Provider adapters (OpenAI, Anthropic, Ollama) | src/cordon/adapters/ |
| Redis cache + rate-limit + Postgres fallback | src/cordon/cache.py |
| API-key auth (pk_live_ / pk_demo_, PBKDF2, HMAC sessions) | src/cordon/auth.py |
| Audit log + admin endpoints + CSV exports | src/cordon/main.py |
| Single-file admin dashboard | src/cordon/static/admin_dashboard.html |
| Alembic migrations | migrations/ |
Cordon reads everything from environment variables; see .env.example. The notable ones:
| Variable | Default | Notes |
|---|---|---|
| `DATABASE_URL` | postgres in compose | Any SQLAlchemy URL. SQLite works for local dev. |
| `REDIS_URL` | redis in compose | Optional; rate-limit + cache degrade to Postgres if absent. |
| `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` | (unset) | Required for the relevant adapters. |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | For the `route_local` action. |
| `AI_GATEWAY_POLICY_PATH` | `default.yaml` | Override the policy file. |
| `AI_GATEWAY_DEV_API_KEY` | `pk_live_dev_changeme` | Bootstrap key created on first boot. |
| `AI_GATEWAY_RATE_LIMIT_RPM` | `120` | Per-key requests per minute. `0` disables. |
| `AI_GATEWAY_STORE_RAW_PROMPTS` | `false` | When false, only a 500-char preview is persisted. |
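
As an illustration of how `AI_GATEWAY_RATE_LIMIT_RPM` and the optional `REDIS_URL` could interact, the sketch below implements a per-key, fixed-window limiter that falls back to a second counter store when the first is unreachable. Everything here is an assumption: the fixed-window algorithm, the `RateLimiter`, `InMemoryCounterStore`, and `StoreUnavailable` names, and the in-memory stores standing in for Redis and Postgres. The real logic lives in `src/cordon/cache.py` and may work differently.

```python
import time


class StoreUnavailable(Exception):
    """Raised by a counter store that cannot be reached."""


class InMemoryCounterStore:
    """Stand-in for Redis or Postgres; counts hits per bucket."""

    def __init__(self, available=True):
        self.available = available
        self.counts = {}

    def incr(self, bucket):
        if not self.available:
            raise StoreUnavailable(bucket)
        self.counts[bucket] = self.counts.get(bucket, 0) + 1
        return self.counts[bucket]


class RateLimiter:
    """Fixed-window limiter: at most `rpm` requests per key per minute."""

    def __init__(self, primary, fallback, rpm=120):
        self.primary = primary
        self.fallback = fallback
        self.rpm = rpm

    def allow(self, api_key, now=None):
        if self.rpm == 0:  # 0 disables rate limiting
            return True
        now = time.time() if now is None else now
        bucket = f"{api_key}:{int(now // 60)}"  # one bucket per key per minute
        try:
            count = self.primary.incr(bucket)
        except StoreUnavailable:
            # Degrade instead of failing the request.
            count = self.fallback.incr(bucket)
        return count <= self.rpm
```

One consequence of this simple design is that counts are not merged across stores, so a key can briefly exceed its limit during the minute in which a failover happens. That trade-off favors availability over strict enforcement.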
This is v0.1.0a. The proxy, policy engine, inspection, audit log, and admin surface are real. Known gaps:
- v0.1.1: full polish of the admin dashboard (port the rich panels currently archived at `src/cordon/static/admin_dashboard_legacy.html`).
- v0.2.0: MCP (Model Context Protocol) tool-use, currently stubbed at `src/cordon/mcp_runtime.py`.
- v0.3.0: streaming (`stream: true`) responses. Non-streaming works today.
Install the dev extras, then run the tests and the linter:

    pip install -e ".[dev]"
    pytest -q        # 23 tests, runs against in-process SQLite
    ruff check src

CI runs lint, the test matrix on py3.11/3.12 against a real Postgres, and a Docker build smoke test on every push; see `.github/workflows/ci.yml`.
MIT — see LICENSE.