A 3-agent AI pipeline that transforms university curriculum materials into market-enriched learning guides. Feed in lecture PDFs or PPTXs, get out structured PDFs with industry context, gap analysis PowerPoints, video scripts, and rendered videos with AI voiceover.
```
Curriculum PDF/PPTX ──> [Ingest] ──> [Research] ──> [Generate] ──> Learning Guide PDF
                         Agent 1      Agent 2       Agent 3        + PPT + Scripts + Videos (MP4)
```
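The chain above can be sketched as three plain functions threading a shared state dict. This is a toy illustration only (all function and key names here are hypothetical); the real project orchestrates the agents with LangGraph:

```python
# Toy sketch of the 3-agent chain: each agent reads the shared state and
# adds its own output, mirroring Ingest -> Research -> Generate.
# Function and key names are illustrative, not the project's actual API.
def ingest(state: dict) -> dict:
    state["curriculum"] = f"parsed content of {state['source_path']}"
    return state

def research(state: dict) -> dict:
    state["market_context"] = "industry context for: " + state["curriculum"]
    return state

def generate(state: dict) -> dict:
    state["guide"] = "learning guide built from: " + state["market_context"]
    return state

state = {"source_path": "lecture.pdf"}
for agent in (ingest, research, generate):  # same order as the diagram
    state = agent(state)
print(state["guide"])
```

Each agent only consumes what the previous one wrote, which is why the outputs can chain (PDF → PPT → Scripts → Videos) without the later stages re-reading the source file.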
Built with LangGraph, OpenAI, ChromaDB, Tavily, fpdf2, Kokoro TTS, and FastAPI.
Full Documentation | Architecture | API Reference
```shell
# Setup (requires uv — https://docs.astral.sh/uv/)
make install          # uv sync --all-extras

# Configure
cp .env.example .env  # Add your API keys

# Run the pipeline
python -m backend.run_pipeline path/to/lecture.pdf

# Or start the web UI
make dev              # http://localhost:8080
```

Output is saved to `outputs/<timestamp>/`.
```shell
make test        # 1,063 pytest tests (backend + frontend + GPU/CPU video services)
make e2e         # 17 Playwright E2E tests
npx vitest run   # 124 Vitest component tests (run from frontend/react-app/)
```

- 3-agent pipeline — Ingest, Research, Generate, orchestrated by LangGraph
- Multi-model routing — GPT-5-nano (extraction), GPT-5-mini (analysis), GPT-5.1 (generation) with severity-based routing
- Chained outputs — PDF → PPT → Scripts → Videos (MP4), each building on the previous
- Kokoro TTS video pipeline — Open-source voiceover + slide backgrounds → rendered MP4 videos at zero API cost
- 2-tier video fallback — GPU Primary (europe-west4) → CPU Video (europe-west2)
- React SPA frontend — React 19 + Vite 7 + Tailwind v4 + TanStack Query, all pages wired to the real API
- Content viewers — Inline PDF iframe, PPT slide carousel with keyboard navigation, HTML5 video player
- Quiz platform — On-demand MCQ generation (Bloom's taxonomy, difficulty distribution, gap targeting), one-attempt scoring with per-question feedback
- JWT + session dual auth — JWT Bearer tokens for the SPA + legacy session cookie for backward compatibility
- PDF and PPTX upload — Drag-drop with magic byte validation
- Vector-backed context — ChromaDB stores curriculum and research for semantic retrieval
- Evaluation framework — L1 structural checks (free) + L2 DeepSeek-V3 judge (~$0.02/run)
- 1,204-test suite — 1,063 pytest + 124 Vitest + 17 Playwright E2E, zero real API calls
Full documentation lives in `docs/`:
| Section | Description |
|---|---|
| Getting Started | Installation, quickstart, configuration, web UI |
| Architecture | Pipeline design, data flow, model routing, output chain |
| Agents | Deep dive into Ingest, Research, and Generate agents |
| Services | LLM wrapper, ChromaDB, PDF/PPT/Video builders |
| Evaluation | Two-layer eval system, CLI, prompt registry, A/B testing |
| Deployment | Docker and GCP Cloud Run deployment |
| Design Decisions | Research documents and architectural rationale |
| API Reference | Auto-generated API documentation |
| Contributing | Dev setup, testing, updating docs |
```shell
make install     # uv sync --all-extras (includes docs deps)
make docs-serve  # http://localhost:8000
```

| Command | Description |
|---|---|
| `make install` | Install package in editable mode with dev deps |
| `make test` | Run all tests |
| `make dev` | Start web UI (hot-reload) |
| `make run ARGS="file.pdf"` | Run pipeline via CLI |
| `make clean` | Remove `chroma_db/`, `outputs/`, caches |
| `make docs-serve` | Preview documentation site locally |
| `make docs-build` | Build documentation with strict checks |
```
Software/
├── backend/
│   ├── config.py          # Pydantic-settings configuration
│   ├── run_pipeline.py    # CLI entry point
│   ├── pipeline/          # LangGraph agents (ingest, research, generate, quiz)
│   ├── services/          # LLM, ChromaDB, file parser, builders, TTS, GPU client
│   ├── prompts/           # All prompt templates (pipeline + quiz)
│   ├── evals/             # Evaluation framework
│   └── tests/             # Backend tests
├── frontend/
│   ├── app.py             # FastAPI server + ProgressCapture
│   ├── templates/         # Web UI
│   └── tests/             # Frontend tests
├── gpu_service/           # GPU microservice (Kokoro TTS + ffmpeg on NVIDIA L4)
├── docs/                  # Documentation (MkDocs + Material)
├── mkdocs.yml             # Documentation config
├── Makefile               # Dev shortcuts
└── pyproject.toml         # Python project config
```