---
title: Document AI Analyst
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
license: mit
short_description: Enterprise Agentic RAG — upload PDFs and chat with AI
---
```text
██████╗ ██████╗ ███████╗ █████╗ ███████╗███████╗██╗███████╗████████╗ █████╗ ███╗ ██╗████████╗
██╔══██╗██╔══██╗██╔════╝ ██╔══██╗██╔════╝██╔════╝██║██╔════╝╚══██╔══╝██╔══██╗████╗ ██║╚══██╔══╝
██████╔╝██║ ██║█████╗ ███████║███████╗███████╗██║███████╗ ██║ ███████║██╔██╗ ██║ ██║
██╔═══╝ ██║ ██║██╔══╝ ██╔══██║╚════██║╚════██║██║╚════██║ ██║ ██╔══██║██║╚██╗██║ ██║
██║ ██████╔╝██║ ██║ ██║███████║███████║██║███████║ ██║ ██║ ██║██║ ╚████║ ██║
╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝╚══════╝╚══════╝╚═╝╚══════╝ ╚═╝ ╚═╝ ╚═╝╚═╝ ╚═══╝ ╚═╝
██████╗ █████╗ ██████╗
██╔══██╗██╔══██╗██╔════╝
██████╔╝███████║██║ ███╗
██╔══██╗██╔══██║██║ ██║
██║ ██║██║ ██║╚██████╔╝
╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝
```
Upload · Embed · Retrieve · Chat — A production-grade AI document assistant built end-to-end with an agentic RAG pipeline, streaming responses, and per-user data isolation.
Features · Tech Stack · Getting Started · Architecture · RAG Pipeline · API Reference · Deployment · Contributing
Thanks to all the amazing people who have contributed to PDF-Assistant-RAG! 🎉
🌟 GSSOC Contributors — This project is open for GirlScript Summer of Code. Check out our CONTRIBUTING.md to get started and browse open issues tagged `good first issue`.
PDF-Assistant-RAG is a complete, production-ready AI document assistant that lets users upload complex PDFs, financial reports, legal contracts, and research papers — then chat with an AI that provides accurate, cited answers powered by a multi-stage Retrieval-Augmented Generation pipeline.
The system uses semantic search + cross-encoder reranking to find the most relevant document chunks, streams AI-generated answers token-by-token, and highlights exact source citations with page numbers — all inside a sleek Next.js UI with JWT-secured per-user data isolation.
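Answers reach the browser as Server-Sent Events. Below is a minimal FastAPI sketch of what such a streaming endpoint can look like; the `answer_stream` generator and its canned tokens are illustrative stand-ins, not the repository's actual handler.

```python
# Minimal SSE streaming sketch (illustrative only, not the repo's real handler).
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def answer_stream(question: str):
    # The real pipeline would yield LLM tokens; here we fake a short answer.
    for token in ["Revenue ", "grew ", "12% ", "YoY."]:
        yield f"data: {token}\n\n"  # each SSE frame is "data: <payload>\n\n"

@app.post("/api/v1/chat/ask/stream")
async def ask_stream(question: str):
    # text/event-stream lets the frontend render tokens as they arrive
    return StreamingResponse(answer_stream(question), media_type="text/event-stream")
```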
| Technology | Purpose |
|---|---|
| all-MiniLM-L6-v2 | Local sentence embeddings |
| ms-marco-MiniLM-L-6-v2 | Cross-encoder reranker |
| Qwen2.5-72B-Instruct | LLM (HuggingFace Inference API) |
| PyMuPDF + python-docx | Document parsing |
| Technology | Purpose |
|---|---|
| Docker Multi-Stage | Containerized deployment |
| GitHub Actions | CI pipeline (dev branch) |
| Git LFS | Binary asset management |
| HuggingFace Spaces | Production deployment |
```text
PDF-Assistant-RAG/
│
├── backend/ # FastAPI + RAG server
│ ├── app/
│ │ ├── main.py # App entrypoint, middleware, static files
│ │ ├── config.py # Pydantic settings (env vars)
│ │ ├── database.py # SQLAlchemy async engine
│ │ ├── models.py # ORM models (User, Document, Message)
│ │ ├── schemas.py # Pydantic request/response schemas
│ │ ├── auth.py # JWT creation & verification
│ │ │
│ │ ├── routes/
│ │ │ ├── auth.py # POST /register, /login, /me
│ │ │ ├── documents.py # Upload, list, delete, retrieve
│ │ │ └── chat.py # Streaming chat + history
│ │ │
│ │ └── rag/
│ │ ├── agent.py # Main RAG orchestrator
│ │ ├── chunker.py # Recursive text splitter
│ │ ├── embeddings.py # SentenceTransformer wrapper
│ │ ├── vectorstore.py # ChromaDB collection manager
│ │ ├── retriever.py # Semantic search + reranking
│ │ └── prompts.py # System & user prompt templates
│ │
│ ├── requirements.txt
│ └── .env # Local env (never committed)
│
├── frontend/ # Next.js 16 App Router
│ └── src/
│ ├── app/
│ │ ├── layout.tsx # Root layout + fonts
│ │ ├── page.tsx # Landing / redirect
│ │ ├── login/ # Auth pages
│ │ ├── register/
│ │ └── dashboard/ # Main app page
│ │
│ ├── components/
│ │ ├── chat/
│ │ │ ├── ChatPanel.tsx # Chat UI + SSE streaming
│ │ │ ├── MessageBubble.tsx # User / assistant message
│ │ │ └── SourceCard.tsx # Citation cards
│ │ ├── document/ # Upload + sidebar components
│ │ └── layout/ # Navbar, sidebar shell
│ │
│ └── lib/
│ └── api.ts # Typed API client + SSE stream helper
│
├── .github/
│ ├── workflows/
│ │ ├── ci.yml # CI — runs on dev branch only
│ │ ├── deploy.yml # Docker build — main branch only
│ │ └── devsecops.yml # Security scans — main branch only
│ ├── ISSUE_TEMPLATE/ # Bug report & feature request forms
│ ├── pull_request_template.md # PR checklist
│ └── CODEOWNERS # Auto-review assignment
│
├── Dockerfile # Multi-stage: Node build → Python serve
├── docker-compose.yml # Local Docker stack
├── CONTRIBUTING.md # GSSOC contributor guide
└── .env.example                # Template for environment variables
```
```bash
git clone https://github.com/param20h/PDF-Assistant-RAG.git
cd PDF-Assistant-RAG
cp .env.example backend/.env
```

Edit `backend/.env`:

```env
SECRET_KEY=your-strong-random-secret
DATABASE_URL=sqlite:///./data/app.db
HF_TOKEN=hf_your_huggingface_token_here
UPLOAD_DIR=./data/uploads
CHROMA_PERSIST_DIR=./data/chroma_db
```

Get your free HuggingFace token at huggingface.co/settings/tokens.
Open two terminals:
```bash
# Terminal A — Backend
cd backend
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000
# → API running at http://localhost:8000
# → Swagger docs at http://localhost:8000/docs
```

```bash
# Terminal B — Frontend
cd frontend
npm install
npm run dev
# → App running at http://localhost:3000
```

Or run the whole stack with Docker:

```bash
docker compose up --build
# → Full stack at http://localhost:7860
```

```text
┌─────────────────────────────────────────────┐
│ PDF / DOCX Upload │
└───────────────────┬─────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ PyMuPDF / python-docx Parser │
│ (text extraction per page) │
└───────────────────┬─────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ Recursive Character Text Splitter │
│ chunk_size=1000 | overlap=200 │
└───────────────────┬─────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ all-MiniLM-L6-v2 (local embeddings) │
│ 384-dim dense vectors │
└───────────────────┬─────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ ChromaDB — per-user persistent collection │
└─────────────────────────────────────────────┘
```
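In code, the indexing path can be approximated in a few lines. The sketch below assumes the LangChain `RecursiveCharacterTextSplitter` and a `user_{id}` collection naming scheme; both are guesses at what `chunker.py` and `vectorstore.py` actually do, while the chunk sizes and model names come from this README.

```python
# Indexing-path sketch: parse → chunk → embed → persist (names are assumptions).
import fitz  # PyMuPDF
from langchain_text_splitters import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer
import chromadb

def index_pdf(path: str, user_id: str) -> None:
    # 1. Extract text page by page (PyMuPDF)
    doc = fitz.open(path)
    pages = [(i + 1, page.get_text()) for i, page in enumerate(doc)]

    # 2. Split into overlapping chunks (chunk_size=1000, overlap=200)
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks, metas = [], []
    for page_no, text in pages:
        for piece in splitter.split_text(text):
            chunks.append(piece)
            metas.append({"page": page_no})  # page numbers power the citations

    # 3. Embed locally with all-MiniLM-L6-v2 (384-dim dense vectors)
    model = SentenceTransformer("all-MiniLM-L6-v2")
    vectors = model.encode(chunks).tolist()

    # 4. Store in a per-user persistent ChromaDB collection
    client = chromadb.PersistentClient(path="./data/chroma_db")
    collection = client.get_or_create_collection(f"user_{user_id}")
    collection.add(
        ids=[f"{path}-{i}" for i in range(len(chunks))],
        embeddings=vectors,
        documents=chunks,
        metadatas=metas,
    )
```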
```text
── At Query Time ──
User Question ──▶ Embed ──▶ Semantic Search (Top-K=10)
│
▼
Cross-Encoder Reranker (Top-K=5)
ms-marco-MiniLM-L-6-v2
│
▼
Prompt Assembly (system + context + question)
│
▼
Qwen2.5-72B-Instruct (HF Inference API)
│
▼
Streamed SSE tokens ──▶ Frontend ChatPanel
```
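A rough Python sketch of this query path follows. The model IDs and Top-K values come from this README; the function shapes, prompt wording, and per-user collection name are assumptions rather than the repo's actual `agent.py`.

```python
# Query-time sketch: embed → top-10 search → rerank to top-5 → stream LLM tokens.
from huggingface_hub import InferenceClient
from sentence_transformers import SentenceTransformer, CrossEncoder
import chromadb

embedder = SentenceTransformer("all-MiniLM-L6-v2")
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
chroma = chromadb.PersistentClient(path="./data/chroma_db")

def retrieve(question: str, user_id: str, top_k: int = 10, top_n: int = 5) -> list[str]:
    collection = chroma.get_or_create_collection(f"user_{user_id}")
    hits = collection.query(
        query_embeddings=[embedder.encode(question).tolist()],
        n_results=top_k,
    )
    candidates = hits["documents"][0]
    # The cross-encoder scores each (question, chunk) pair jointly,
    # which is slower but more accurate than bi-encoder similarity alone.
    scores = reranker.predict([(question, c) for c in candidates])
    ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_n]]

def ask(question: str, user_id: str, hf_token: str):
    context = "\n\n".join(retrieve(question, user_id))
    llm = InferenceClient(model="Qwen/Qwen2.5-72B-Instruct", token=hf_token)
    stream = llm.chat_completion(
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        max_tokens=1024,
        temperature=0.3,
        stream=True,
    )
    for chunk in stream:
        yield chunk.choices[0].delta.content or ""
```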
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| `POST` | `/api/v1/auth/register` | ❌ | Create a new user account |
| `POST` | `/api/v1/auth/login` | ❌ | Login and receive JWT token |
| `GET` | `/api/v1/auth/me` | ✅ | Get current user profile |
| `POST` | `/api/v1/documents/upload` | ✅ | Upload PDF/DOCX and trigger indexing |
| `GET` | `/api/v1/documents` | ✅ | List all documents for current user |
| `DELETE` | `/api/v1/documents/{id}` | ✅ | Delete a document and its vector data |
| `POST` | `/api/v1/chat/ask/stream` | ✅ | Ask a question (SSE streaming response) |
| `GET` | `/api/v1/chat/history/{doc_id}` | ✅ | Get chat history for a document |
| `DELETE` | `/api/v1/chat/history/{doc_id}` | ✅ | Clear chat history for a document |
| `GET` | `/health` | ❌ | Health check (db + chroma status) |
Full interactive docs are available at `/docs` (Swagger UI) when running locally.
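For a quick end-to-end illustration (login, upload, stream an answer), a hypothetical Python client might look like this; the JSON field names (`access_token`, `document_id`, the upload `file` field) are assumptions, so check the Swagger docs for the actual schemas.

```python
# Hypothetical client walkthrough; field names are guesses, verify against /docs.
import requests

BASE = "http://localhost:8000"

# 1. Login and grab the JWT
resp = requests.post(f"{BASE}/api/v1/auth/login",
                     json={"email": "me@example.com", "password": "secret"})
headers = {"Authorization": f"Bearer {resp.json()['access_token']}"}

# 2. Upload a PDF and trigger indexing
with open("report.pdf", "rb") as f:
    doc = requests.post(f"{BASE}/api/v1/documents/upload",
                        headers=headers, files={"file": f}).json()

# 3. Ask a question and print tokens as they stream in
with requests.post(f"{BASE}/api/v1/chat/ask/stream", headers=headers,
                   json={"document_id": doc["id"], "question": "Summarize page 1"},
                   stream=True) as stream:
    for line in stream.iter_lines(decode_unicode=True):
        if line.startswith("data: "):
            print(line[len("data: "):], end="", flush=True)
```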
| Variable | Required | Default | Description |
|---|---|---|---|
| `HF_TOKEN` | ✅ | — | HuggingFace API token for LLM inference |
| `SECRET_KEY` | ✅ | — | JWT signing secret (use a strong random string) |
| `DATABASE_URL` | ❌ | `sqlite:///./data/app.db` | SQLAlchemy database URL |
| `UPLOAD_DIR` | ❌ | `./data/uploads` | Directory for uploaded files |
| `CHROMA_PERSIST_DIR` | ❌ | `./data/chroma_db` | ChromaDB persistence path |
| `LLM_MODEL` | ❌ | `Qwen/Qwen2.5-72B-Instruct` | HuggingFace model ID |
| `LLM_TEMPERATURE` | ❌ | `0.3` | LLM sampling temperature |
| `LLM_MAX_NEW_TOKENS` | ❌ | `1024` | Max tokens per response |
| `EMBEDDING_MODEL` | ❌ | `all-MiniLM-L6-v2` | SentenceTransformer model |
| `CHUNK_SIZE` | ❌ | `1000` | Document chunk size (characters) |
| `CHUNK_OVERLAP` | ❌ | `200` | Overlap between chunks |
| `TOP_K_RETRIEVAL` | ❌ | `10` | Candidates retrieved from vector store |
| `TOP_K_RERANK` | ❌ | `5` | Final chunks passed to LLM after reranking |
| `MAX_FILE_SIZE_MB` | ❌ | `50` | Maximum upload file size |
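Since `config.py` is described as Pydantic settings, the table above presumably maps onto a `BaseSettings` class along these lines (a sketch using `pydantic-settings`; the actual field names in the repo may differ):

```python
# Sketch of a pydantic-settings class mirroring the table above (names assumed).
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Env var matching is case-insensitive, so HF_TOKEN fills hf_token.
    model_config = SettingsConfigDict(env_file=".env")

    hf_token: str        # required: no default
    secret_key: str      # required: JWT signing secret
    database_url: str = "sqlite:///./data/app.db"
    upload_dir: str = "./data/uploads"
    chroma_persist_dir: str = "./data/chroma_db"
    llm_model: str = "Qwen/Qwen2.5-72B-Instruct"
    llm_temperature: float = 0.3
    llm_max_new_tokens: int = 1024
    embedding_model: str = "all-MiniLM-L6-v2"
    chunk_size: int = 1000
    chunk_overlap: int = 200
    top_k_retrieval: int = 10
    top_k_rerank: int = 5
    max_file_size_mb: int = 50

settings = Settings()  # raises a validation error if HF_TOKEN or SECRET_KEY is missing
```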
| Command | Description |
|---|---|
| `uvicorn app.main:app --reload` | Start FastAPI with hot reload |
| `uvicorn app.main:app --port 8000` | Start FastAPI on port 8000 |
| Command | Description |
|---|---|
| `npm run dev` | Start Next.js dev server |
| `npm run build` | Production build → `out/` (static export) |
| `npm run lint` | Run ESLint |
| Command | Description |
|---|---|
| `docker compose up --build` | Build and start the full stack |
| `docker compose down` | Stop all containers |
This project is deployed on HuggingFace Spaces using Docker.
- Fork this repo and create a new Space at huggingface.co/new-space (SDK: Docker)
- Set the following Space secrets:
  - `HF_TOKEN` — your HuggingFace API token
  - `SECRET_KEY` — a strong random string
- Push to the `hf` remote — the Space will auto-build:

```bash
git remote add hf https://<username>:<HF_TOKEN>@huggingface.co/spaces/<username>/<space-name>
git push hf main
```

Or self-host with Docker:

```bash
docker compose up -d --build
# App available at http://your-server:7860
```

This project is participating in GirlScript Summer of Code! We welcome contributors of all skill levels.
Branch Strategy:
| Branch | Purpose |
|---|---|
| `main` | Production — HuggingFace deployed (admin only) |
| `dev` | All contributor PRs target here |
| `feature/*` / `fix/*` / `docs/*` | Your working branches |
```bash
# Always branch from dev
git checkout -b feature/my-feature upstream/dev
```

Quick links:
Distributed under the MIT License. See LICENSE for more information.
Built with 💙 as a flagship AI engineering project
If you found this project helpful, please give it a ⭐ — it helps GSSOC contributors discover it!