A multi-agent narrative engine where each AI character can only retrieve facts it personally witnessed — information-asymmetry enforced in the retrieval layer, not by prompt instructions, and proven leak-proof by an automated eval suite.
Play it in your terminal or browser in under 60 seconds — no API keys required.
▶ Play it live (no install, no key): arrya5-katha.hf.space
Most multi-agent demos share a single global context, meaning any agent can be prompted to surface any fact — leakage is a social contract enforced only by the system prompt, which any jailbreak or paraphrase can break. Katha takes a different approach: if the secret never enters an agent's context window, no decoding path can emit it.
The core mechanism is a witness gate applied at retrieval time. Every canon fact in the knowledge base is tagged with characters_present (the agents who witnessed that event). When an NPC queries the knowledge state, L1 retrieval filters on that metadata — the secret is structurally absent from the context, not merely discouraged. The OMNISCIENT narrator Betaal bypasses the gate and can see all facts, which is both the story mechanic and a clean test oracle for the guarantee.
This makes leakage provably impossible at the retrieval layer rather than probabilistically suppressed at the prompt layer — a meaningful engineering distinction in any system where agents must hold asymmetric information.
Zero installations beyond Python 3.12. Zero API keys.
git clone https://github.com/arrya5/KATHA.git
cd KATHA/backend
# 1. Run the full self-test (knowledge-leak guarantees, moderation evals, full story arc)
python -m app.selftest
# 2. Play the complete story arc in your terminal
python -m app.demo
# 3. Play in your browser (interactive visual novel UI)
python -m app.webserver
# Open http://127.0.0.1:8000To verify the leak-proofness metric directly:
python -m app.eval_leak
# Prints the adversarial probe results and exits 0 if the invariant holdsMobile (optional): cd frontend && npm install && npm start — scan the QR code with Expo Go on a real device. Your phone and computer need to be on the same Wi-Fi network.
flowchart LR
Player([Player input]) --> Mod[Moderation]
Mod -- block --> Defl["Deflection<br/>authored fallback"]
Mod -- allow --> NR[Narrator / Router]
NR --> Agent["Agent node<br/>retrieves bounded context<br/>from Knowledge-State engine"]
Agent --> WS["World-State<br/>writes known_to"]
WS --> Val[Validator]
Val -- fail --> AF[Authored fallback]
Val -- pass --> Synth[Synthesizer]
Synth --> SR(["SceneRender<br/>to client"])
Runs on a stdlib runner (zero extra installs) or LangGraph (KATHA_ORCHESTRATOR=langgraph) — same node functions, swappable wiring.
flowchart TD
Q[Player query] --> Gate{"Witness gate<br/>L1 canon filter"}
subgraph L1 ["L1 canon (characters_present tag)"]
F1["Fact A — characters_present: [Betaal, Vikram]"]
F2["Secret B — characters_present: [Betaal]"]
end
Gate -- "NPC: Vikram's wife<br/>not in characters_present for Secret B" --> Miss["Secret B absent<br/>from context window"]
Gate -- "Teller: Betaal<br/>omniscient — bypasses gate" --> Hit["Secret B present<br/>in context window"]
Miss --> NPC["NPC response:<br/>'I did not witness that.'"]
Hit --> Teller["Betaal response:<br/>correct, grounded answer"]
The engine uses three retrieval layers:
| Layer | What it indexes | Gate |
|---|---|---|
| L1 — witnessed canon | Hard facts from the tales (story beats, secrets) | characters_present — only agents who were there get the fact |
| L2 — world events | Events that happened during gameplay | known_to — written at event time; only witnesses get it |
| L3 — per-agent memory | Each agent's own conversational history | No cross-agent access by construction |
This is enforced in code, not in a prompt. The gate lives in backend/rag/knowledge_state.py. An agent cannot retrieve a fact that was never added to its retrieval set, regardless of what the LLM is told or how the user phrases the question.
Proof from python -m app.selftest:
Knowledge-leak (L1 witnessed canon):
[PASS] Betaal (teller) can access the secret fact
[PASS] The wife CANNOT access the secret she didn't witness
Running python -m app.eval_leak probes the full adversarial set:
LEAK-PROOFNESS (security invariant -- gates this run)
forbidden facts withheld from non-witnesses : 28/28
information leaks : 0
RESULT: 28/28 secrets withheld across the probe set -- 0 leaks. Leak-proof by construction.
0 information leaks across the full 28-probe adversarial set — the guarantee is enforced in the retrieval layer, not requested in a prompt.
| Layer | Choice |
|---|---|
| Language | Python 3.12 |
| Agent orchestration | 6-node turn graph; stdlib runner (default) + LangGraph (production) |
| RAG | 3-layer (L1 witnessed canon / L2 world-events / L3 memory); lexical default, Ollama semantic optional |
| LLM providers | Mock (offline, deterministic, zero keys) / Ollama (local) / Gemini (cloud) — provider-swappable via one env var |
| Moderation | 3-layer: input classifier → output validator → authored-fallback safety net |
| Persistence | In-memory (default) / SQLite (DATABASE_URL=sqlite:///katha.db) |
| Client | Browser (stdlib HTTP + web/index.html) / Expo + React Native (mobile) |
| Voice | Sarvam TTS (key-gated, plug-and-play); browser TTS fallback offline |
| Command | What it covers |
|---|---|
python -m app.selftest |
Knowledge-leak invariants (witnessed-canon gate, world-event gate); moderation red-team inputs (40/40 caught); moderation false-positive set (20/20 allowed); full story arc end-to-end |
pytest backend/tests/ |
Unit and integration tests for engine nodes, RAG layers, and persistence |
python -m app.eval_leak |
Adversarial probe suite (28 probes) — systematically attempts to extract secrets through every NPC; reports leak count and exits 0 if 0 leaks (currently 28/28 withheld, 0 leaks) |
All three run offline with the mock LLM provider. The eval suite is the canonical proof of the knowledge-isolation guarantee.
Katha is a cultural-preservation project as much as a game. The goal is to bring Indian mythology to life in a form that respects the source texts and is rigorous enough to survive scrutiny from scholars and the community alike. Phase 1 (shipping now) is Vikram aur Betaal — low-risk folklore whose riddle-and-moral structure maps cleanly onto the investigation gameplay loop. Phase 2 is the Mahabharata, gated behind a cultural-review checklist, an advisory board, and proven retention from Phase 1. The engine is content-agnostic; Phase 2 design is preserved, not wasted.
Leak-proof AI agents: information asymmetry in a multi-agent RAG game — why prompt-level "don't reveal X" fails, and how moving the boundary into the retrieval layer makes leakage structurally impossible (with the invariant test that proves it).
Arrya Thakur — SRM Chennai CS '27
MIT License — see LICENSE for details.
Contributions welcome — see CONTRIBUTING.md for guidelines.