batbrainy/cordon

cordon

Self-hostable, OpenAI-compatible AI gateway with policy-driven PII and secret guardrails. Sits between your apps and the LLM provider. Inspects every prompt; redacts, blocks, or transparently reroutes sensitive traffic to a local model.

┌──────────┐    /v1/chat/completions   ┌────────────────────────┐    ┌──────────┐
│   app    │ ────────────────────────▶ │       cordon           │ ─▶ │  openai  │
└──────────┘                           │  ┌──────────────────┐  │    └──────────┘
                                       │  │ 1. inspect       │  │    ┌──────────┐
                                       │  │ 2. policy match  │  │ ─▶ │ anthropic│
                                       │  │ 3. route         │  │    └──────────┘
                                       │  │ 4. log + audit   │  │    ┌──────────┐
                                       │  └──────────────────┘  │ ─▶ │  ollama  │  ← local
                                       └────────────────────────┘    └──────────┘
                                            postgres + redis

Why

Most "AI gateway" projects ship a proxy with allow/deny knobs and call it done. Cordon goes further on the four things that actually matter when you're trying to ship LLM features past a security review:

  1. A real policy engine. YAML rules with allow | block | redact | route_local | route_provider actions, evaluated in priority order. The novel action is route_local — sensitive prompts are transparently rerouted to your local Ollama instead of being blocked outright. The default policy ships in src/cordon/policies/default.yaml.

  2. Deterministic inspection. Regex + Shannon entropy + Luhn validation. Not an LLM-as-classifier. Runs in single-digit milliseconds, easy to reason about, and easy to extend — see src/cordon/inspection.py.

  3. Graceful degradation. Rate limits and counters fall back from Redis to Postgres automatically — the gateway keeps working when Redis goes down. See src/cordon/cache.py.

  4. Audit logs you can query. Every request lands a structured row in gateway_request_logs with provider, model, action, matched rule, findings, blocked/redacted flags, token estimate, cost estimate, cache hit. CSV export endpoints are built in.
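To make item 2 concrete, here is a minimal sketch of deterministic inspection that combines regex matching, Shannon entropy, and Luhn validation. The patterns, the finding names, the `inspect` function, and the 4.0 bits-per-character threshold are illustrative assumptions, not the contents of src/cordon/inspection.py.

```python
import math
import re
from collections import Counter

# Hypothetical patterns for illustration; the project's real rules may differ.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_-]{24,}")


def shannon_entropy(text: str) -> float:
    """Bits per character of the empirical character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits in `number`; separators are ignored."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def inspect(prompt: str, entropy_threshold: float = 4.0) -> list[str]:
    """Return the finding types detected in `prompt` (assumed names)."""
    findings = []
    if SSN_RE.search(prompt):
        findings.append("ssn")
    if AWS_KEY_RE.search(prompt):
        findings.append("aws_access_key")
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(prompt)):
        findings.append("credit_card")
    if any(shannon_entropy(m.group()) >= entropy_threshold
           for m in TOKEN_RE.finditer(prompt)):
        findings.append("high_entropy_token")
    return findings
```

Because every check is a pure function of the prompt text, the same input always yields the same findings, which is what makes this style of inspection easy to unit-test and reason about.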

Quickstart

docker compose up

Then point any OpenAI SDK at http://localhost:8080/v1 using the dev-seeded API key (default pk_live_dev_changeme — override via AI_GATEWAY_DEV_API_KEY):

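from openai import OpenAI

# Point the stock OpenAI client at the gateway instead of api.openai.com.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="pk_live_dev_changeme")
response = client.chat.completions.create(
    model="anthropic/claude-3-sonnet",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)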

The admin dashboard lives at http://localhost:8080/admin. Paste the same API key to see request volume, blocked counts, action breakdown, and CSV exports.

The killer demo: route_local

Full transcript with response bodies for all three policy outcomes lives at docs/walkthrough.md.

Start the stack with the ollama profile enabled, then pull the local Llama model:

docker compose --profile ollama up
docker compose exec ollama ollama pull llama3.2

Case 1 — normal prompt routes to OpenAI.

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_live_dev_changeme" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Summarize the CAP theorem in one sentence."}]}'

Response includes "gateway": {"provider": "openai", "action": "allow", "policy_rule": "default_allow", ...}. Nothing surprising.

Case 2 — same call, but the prompt contains PII. Cordon transparently reroutes to local Llama.

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_live_dev_changeme" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Draft a reply for customer Jane Doe, SSN 123-45-6789."}]}'

Response now includes "gateway": {"provider": "ollama", "model": "llama3.2", "action": "route_local", "policy_rule": "pii_local_only", ...} — the SSN never left the box. The application code didn't change. The audit log records the decision.

Case 3 — same call, but with an obvious secret. Cordon blocks.

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_live_dev_changeme" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [
        {"role": "user", "content": "Use AKIAIOSFODNN7EXAMPLE to deploy."}]}'

Returns 403 with "error": {"type": "policy_violation", "rule": "block_secrets"}. The secret never reaches the upstream. The full inspection findings are written to gateway_violations.
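On the client side, the three outcomes can be told apart from the status code and the JSON body alone. The helper below is a hypothetical sketch (the name `summarize_gateway_result` is not part of Cordon) and assumes only the fields shown in the responses above: `gateway.provider`, `gateway.action`, `gateway.policy_rule`, and `error.type` / `error.rule`.

```python
def summarize_gateway_result(status_code: int, body: dict) -> str:
    """Describe what the gateway did, using only fields shown in this README."""
    error = body.get("error") or {}
    if status_code == 403 and error.get("type") == "policy_violation":
        return f"blocked by rule {error.get('rule', 'unknown')}"

    gateway = body.get("gateway") or {}
    action = gateway.get("action", "unknown")
    provider = gateway.get("provider", "unknown")
    rule = gateway.get("policy_rule", "unknown")
    return f"{action} via {provider} (rule {rule})"


# Example bodies trimmed to the fields quoted in the three cases above.
print(summarize_gateway_result(
    200, {"gateway": {"provider": "ollama", "action": "route_local",
                      "policy_rule": "pii_local_only"}}))
```

A helper like this is mostly useful in smoke tests, where you want to assert that a given prompt was routed locally rather than sent upstream.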

The end-to-end behavior of all three cases is covered by tests/test_proxy.py — it boots the gateway against SQLite, swaps in fake adapters, and asserts the routing decisions.
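The routing decisions in the three cases follow from priority-ordered rule evaluation, which can be sketched as below. The rule names come from the responses above; the `Rule` shape, the priority numbers, the "lower number is evaluated first" convention, and the finding names are assumptions made for illustration. The real schema is whatever src/cordon/policies/default.yaml defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int                         # assumption: lower value wins
    action: str                           # allow | block | redact | route_local | route_provider
    match_any: frozenset = frozenset()    # finding types that trigger the rule; empty = always


# Hypothetical rule set mirroring the three cases in this README.
RULES = [
    Rule("default_allow", 1000, "allow"),
    Rule("pii_local_only", 20, "route_local", frozenset({"ssn", "credit_card"})),
    Rule("block_secrets", 10, "block", frozenset({"aws_access_key", "high_entropy_token"})),
]


def decide(findings, rules=RULES) -> Rule:
    """Return the first rule, in priority order, that matches the findings."""
    found = set(findings)
    for rule in sorted(rules, key=lambda r: r.priority):
        if not rule.match_any or rule.match_any & found:
            return rule
    raise LookupError("no rule matched; add a catch-all rule")


print(decide(["ssn"]).action)
```

Under these assumed priorities, a prompt that contains both an SSN and an access key is blocked rather than rerouted, because the secrets rule is evaluated first.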

What's in the box

| Layer | Where |
| --- | --- |
| OpenAI-compatible HTTP surface | src/cordon/main.py |
| Request pipeline (inspect → policy → route → log) | src/cordon/service.py |
| Inspection (regex, entropy, Luhn, prompt-injection heuristics) | src/cordon/inspection.py |
| Policy engine (YAML rules, priority-sorted) | src/cordon/policy.py |
| Provider routing | src/cordon/routing.py |
| Provider adapters (OpenAI, Anthropic, Ollama) | src/cordon/adapters/ |
| Redis cache + rate-limit + Postgres fallback | src/cordon/cache.py |
| API-key auth (pk_live_ / pk_demo_, PBKDF2, HMAC sessions) | src/cordon/auth.py |
| Audit log + admin endpoints + CSV exports | src/cordon/main.py |
| Single-file admin dashboard | src/cordon/static/admin_dashboard.html |
| Alembic migrations | migrations/ |

Configuration

Cordon reads everything from environment variables; see .env.example. The notable ones:

| Variable | Default | Notes |
| --- | --- | --- |
| DATABASE_URL | postgres in compose | Any SQLAlchemy URL. SQLite works for local dev. |
| REDIS_URL | redis in compose | Optional; rate-limit + cache degrade to Postgres if absent. |
| OPENAI_API_KEY / ANTHROPIC_API_KEY | | Required for the relevant adapters. |
| OLLAMA_BASE_URL | http://ollama:11434 | For the route_local action. |
| AI_GATEWAY_POLICY_PATH | default.yaml | Override the policy file. |
| AI_GATEWAY_DEV_API_KEY | pk_live_dev_changeme | Bootstrap key created on first boot. |
| AI_GATEWAY_RATE_LIMIT_RPM | 120 | Per-key requests per minute. 0 disables. |
| AI_GATEWAY_STORE_RAW_PROMPTS | false | When false, only a 500-char preview is persisted. |

Roadmap

This is v0.1.0a. The proxy, policy engine, inspection, audit log, and admin surface are real. Known gaps:

Development

pip install -e ".[dev]"
pytest -q              # 23 tests, runs against in-process SQLite
ruff check src

CI runs lint, the test matrix on py3.11/3.12 against a real Postgres, and a Docker build smoke test on every push — see .github/workflows/ci.yml.

License

MIT — see LICENSE.
