cordon-rag

Cited RAG chat over a local corpus, routed through Cordon so PII and secret guardrails apply to every LLM call.

This is the companion app for cordon. It demonstrates a real, production-shaped pattern: a RAG service where every retrieval-augmented LLM call passes through a policy-aware gateway. The pitch in one sentence: your RAG output is only as safe as the prompts you feed the model, and "the retrieved context" is a prompt.

What's in the box

A 220-line lexical retriever (regex tokenization + bigram phrase bonus + path/title bonus, no embeddings, no vector DB). See src/cordon_rag/retrieval.py.
A seeded corpus of 45 Wikipedia-derived CC BY-SA documents across three categories: car manufacturing & repair, medical, tech/IT.
A FastAPI service that retrieves, asks Cordon to answer with [1]/[2]/... citations, persists the conversation, and surfaces the gateway's routing decision in the response.
A single-file HTML console at / so you can drive it without a frontend project.

┌──────────┐  POST /api/query   ┌───────────────────┐  retrieve   ┌──────────────────┐
│ console  │ ─────────────────▶ │   cordon-rag      │ ──────────▶ │  local corpus    │
│  / SDK   │                    │                   │             │  45 markdown     │
└──────────┘                    │  ┌─────────────┐  │             └──────────────────┘
                                │  │ build msgs  │  │
                                │  │ persist     │  │             ┌──────────────────┐
                                │  └─────────────┘  │ ─POST /v1─▶ │     cordon       │
                                └───────────────────┘             │  policy engine   │
                                       postgres                   │  PII/secret      │
                                                                  │  guardrails      │
                                                                  └──────────────────┘
                                                                       │
                                                                       ▼
                                                            openai / anthropic / ollama

Why this matters

Most RAG demos call the LLM provider directly. That's fine until your retrieved context happens to contain a customer email, an API key in a chat-log snippet, or a regulated identifier you didn't realize you had in your knowledge base. Once it's in the prompt, it's leaving your network.

Sending the LLM call through cordon means:

Sensitive context auto-reroutes to local Llama instead of being shipped to OpenAI/Anthropic.
Obvious secrets in retrieved chunks (API keys, credit cards) get blocked before they hit any provider.
Every retrieval-augmented response lands in cordon's gateway_request_logs with the matched rule, redactions, cost, and latency — auditable from one place.

The same compliance story that justifies cordon for chat APIs justifies cordon-rag for AI search.

Quickstart

cordon-rag needs cordon running. Easiest path: clone both and use the bundled compose stack.

git clone https://github.com/batbrainy/cordon.git
git clone https://github.com/batbrainy/cordon-rag.git
cd cordon-rag
docker compose up

The compose file starts cordon, postgres, redis, and cordon-rag together. Visit http://localhost:8090 for the console.

Try it (curl)

# Pick a category and ask a question — gets cited answer + gateway routing decision
curl -X POST http://localhost:8090/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How do regenerative braking systems recover energy?",
    "category": "car_manufacturing_repair"
  }'

Response shape:

{
  "answer": "Regenerative braking captures kinetic energy that would otherwise be lost as heat [1], using the electric motor in reverse as a generator [2]...",
  "sources": [
    {"index": 1, "doc_id": "car_manufacturing_repair-014", "title": "Regenerative braking", "path": "library/docs/...", "snippet": "...", "score": 12.4},
    {"index": 2, "doc_id": "car_manufacturing_repair-022", "title": "Hybrid drivetrains", "path": "library/docs/...", "snippet": "...", "score": 8.1}
  ],
  "gateway": {
    "provider": "openai",
    "model": "gpt-4o-mini",
    "action": "allow",
    "matched_rule": "default_allow",
    "request_id": "req-abc123"
  },
  "conversation_id": "0190a8d4-..."
}

The gateway block is the cordon audit trail — provider chosen, policy rule that matched, request id you can grep in gateway_request_logs. The same response shape applies whether cordon routed to OpenAI, Anthropic, or local Llama.

API

Method	Path	Notes
`POST`	`/api/query`	Run a RAG query. Pass `conversation_id` to continue a thread.
`GET`	`/api/conversations`	List recent conversations.
`GET`	`/api/conversations/{id}`	Fetch a conversation with all messages and citations.
`GET`	`/api/docs`	List the seeded corpus. Filter with `?category=`.
`GET`	`/`	Single-file HTML console.
`GET`	`/health`	Liveness probe.

Configuration

Env	Default	Notes
`DATABASE_URL`	postgres in compose	Any SQLAlchemy URL. SQLite works for dev.
`CORDON_BASE_URL`	`http://gateway:8080`	Where cordon lives.
`CORDON_API_KEY`	`pk_live_dev_changeme`	Match the key issued by your cordon instance.
`CORDON_DEFAULT_MODEL`	`gpt-4o-mini`	Forwarded to cordon; cordon picks the provider per policy.
`RETRIEVAL_TOP_K`	5	Number of chunks to include per query.
`ALLOWED_ORIGINS`	`http://localhost:8090`	CORS allowlist.

Development

pip install -e ".[dev]"
ruff check src

License

MIT for the code. The bundled corpus under src/cordon_rag/seeded_data/library/ is CC BY-SA 4.0 (Wikipedia); see per-file frontmatter and LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github/workflows		.github/workflows
migrations		migrations
scripts		scripts
src/cordon_rag		src/cordon_rag
tests		tests
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
alembic.ini		alembic.ini
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

cordon-rag

What's in the box

Why this matters

Quickstart

Try it (curl)

API

Configuration

Development

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

cordon-rag

What's in the box

Why this matters

Quickstart

Try it (curl)

API

Configuration

Development

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages