param20h/PDF-Assistant-RAG

---
title: Document AI Analyst
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
license: mit
short_description: Enterprise Agentic RAG — upload PDFs and chat with AI
---

██████╗ ██████╗ ███████╗     █████╗ ███████╗███████╗██╗███████╗████████╗ █████╗ ███╗   ██╗████████╗
██╔══██╗██╔══██╗██╔════╝    ██╔══██╗██╔════╝██╔════╝██║██╔════╝╚══██╔══╝██╔══██╗████╗  ██║╚══██╔══╝
██████╔╝██║  ██║█████╗      ███████║███████╗███████╗██║███████╗   ██║   ███████║██╔██╗ ██║   ██║
██╔═══╝ ██║  ██║██╔══╝      ██╔══██║╚════██║╚════██║██║╚════██║   ██║   ██╔══██║██║╚██╗██║   ██║
██║     ██████╔╝██║         ██║  ██║███████║███████║██║███████║   ██║   ██║  ██║██║ ╚████║   ██║
╚═╝     ╚═════╝ ╚═╝         ╚═╝  ╚═╝╚══════╝╚══════╝╚═╝╚══════╝   ╚═╝   ╚═╝  ╚═╝╚═╝  ╚═══╝   ╚═╝
                                                                                                    
                        ██████╗  █████╗  ██████╗
                        ██╔══██╗██╔══██╗██╔════╝
                        ██████╔╝███████║██║  ███╗
                        ██╔══██╗██╔══██║██║   ██║
                        ██║  ██║██║  ██║╚██████╔╝
                        ╚═╝  ╚═╝╚═╝  ╚═╝ ╚═════╝

Enterprise Agentic Retrieval-Augmented Generation System


FastAPI Next.js Python LangChain ChromaDB HuggingFace Docker License: MIT


Upload · Embed · Retrieve · Chat — A production-grade AI document assistant built end-to-end with an agentic RAG pipeline, streaming responses, and per-user data isolation.


Features · Tech Stack · Getting Started · Architecture · RAG Pipeline · API Reference · Deployment · Contributing


🤝 Contributors

Thanks to all the amazing people who have contributed to PDF-Assistant-RAG! 🎉



🌟 GSSOC Contributors — This project is open for GirlScript Summer of Code. Check out our CONTRIBUTING.md to get started and browse open issues tagged good first issue.



🌟 Overview

PDF-Assistant-RAG is a complete, production-ready AI document assistant that lets users upload complex PDFs, financial reports, legal contracts, and research papers — then chat with an AI that provides accurate, cited answers powered by a multi-stage Retrieval-Augmented Generation pipeline.

The system uses semantic search + cross-encoder reranking to find the most relevant document chunks, streams AI-generated answers token-by-token, and highlights exact source citations with page numbers — all inside a sleek Next.js UI with JWT-secured per-user data isolation.


🛠 Tech Stack

Backend

| Technology | Purpose |
|---|---|
| FastAPI 0.115+ | Async REST API framework |
| Python 3.11 | Runtime environment |
| SQLite + SQLAlchemy | User & document metadata storage |
| JWT + Passlib | Authentication & authorization |
| LangChain | RAG orchestration |
| ChromaDB | Persistent vector store (per-user) |
| HuggingFace Hub | LLM inference API |

Frontend

| Technology | Purpose |
|---|---|
| Next.js 16 | React framework (App Router) |
| Tailwind CSS v4 | Utility-first styling |
| shadcn/ui | Accessible component library |
| TypeScript | Type-safe frontend |
| react-pdf | In-browser PDF viewer |
| react-markdown + GFM | Markdown-rendered AI responses |

AI / ML Pipeline

| Technology | Purpose |
|---|---|
| all-MiniLM-L6-v2 | Local sentence embeddings |
| ms-marco-MiniLM-L-6-v2 | Cross-encoder reranker |
| Qwen2.5-72B-Instruct | LLM (HuggingFace Inference API) |
| PyMuPDF + python-docx | Document parsing |

DevOps & Tooling

| Technology | Purpose |
|---|---|
| Docker Multi-Stage | Containerized deployment |
| GitHub Actions | CI pipeline (dev branch) |
| Git LFS | Binary asset management |
| HuggingFace Spaces | Production deployment |

✨ Key Features

👤 Users

  • 🔐 JWT-secured register & login
  • 📄 Upload PDF and DOCX documents
  • 💬 Ask questions in natural language
  • 🌊 Streaming AI responses token-by-token
  • 📚 Inline source citations with page numbers
  • 🗂️ Per-user complete data isolation

🤖 RAG Pipeline

  • 🔪 Smart recursive text chunking (configurable size & overlap)
  • 🧠 Local embeddings — no data leaves your machine
  • 🔍 Two-stage retrieval — semantic search → cross-encoder rerank
  • ✂️ Top-K filtering for precision answers
  • 📝 Custom system prompts with citation instructions
  • 🧾 Source scoring with confidence levels
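
The chunking step can be pictured as a sliding window with overlap. The following is an illustrative sketch using the documented defaults (chunk_size=1000, overlap=200) — a simplification, not the project's actual `chunker.py`:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Sliding-window splitter: a simplified stand-in for the recursive
    character splitter (real splitters prefer paragraph and sentence
    boundaries before falling back to raw character offsets)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

# Consecutive chunks share `overlap` characters, so a sentence cut at a
# chunk boundary still appears with context in the neighboring chunk.
demo = chunk_text("".join(str(i % 10) for i in range(2500)))
```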

⚙️ Engineering

  • 🚀 Async FastAPI with Server-Sent Events streaming
  • 🗄️ ChromaDB with persistent per-user collections
  • 🐳 Multi-stage Docker build (Node → Python)
  • 🔄 GitHub Actions CI on dev branch
  • 🛡️ CORS, file validation, JWT expiry
  • 📊 Chat history persistence per document
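
Server-Sent Events streaming reduces to writing `data:`-prefixed frames separated by blank lines. Here is a framework-agnostic sketch of the framing; the project's real handler lives in the chat route, and the `[DONE]` sentinel shown is an assumed convention, not confirmed from the source:

```python
from typing import Iterable, Iterator

def sse_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each generated token in the Server-Sent Events wire format:
    a 'data:' field terminated by a blank line. The browser (EventSource
    or a fetch-based reader) splits the stream on the blank lines."""
    for token in tokens:
        yield f"data: {token}\n\n"
    yield "data: [DONE]\n\n"  # end-of-stream sentinel (assumed convention)

stream = "".join(sse_frames(["Hel", "lo"]))
```

In FastAPI, a generator like this is served with `StreamingResponse(gen, media_type="text/event-stream")`.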

📁 Project Structure

PDF-Assistant-RAG/
│
├── backend/                          # FastAPI + RAG server
│   ├── app/
│   │   ├── main.py                   # App entrypoint, middleware, static files
│   │   ├── config.py                 # Pydantic settings (env vars)
│   │   ├── database.py               # SQLAlchemy async engine
│   │   ├── models.py                 # ORM models (User, Document, Message)
│   │   ├── schemas.py                # Pydantic request/response schemas
│   │   ├── auth.py                   # JWT creation & verification
│   │   │
│   │   ├── routes/
│   │   │   ├── auth.py               # POST /register, /login, /me
│   │   │   ├── documents.py          # Upload, list, delete, retrieve
│   │   │   └── chat.py               # Streaming chat + history
│   │   │
│   │   └── rag/
│   │       ├── agent.py              # Main RAG orchestrator
│   │       ├── chunker.py            # Recursive text splitter
│   │       ├── embeddings.py         # SentenceTransformer wrapper
│   │       ├── vectorstore.py        # ChromaDB collection manager
│   │       ├── retriever.py          # Semantic search + reranking
│   │       └── prompts.py            # System & user prompt templates
│   │
│   ├── requirements.txt
│   └── .env                          # Local env (never committed)
│
├── frontend/                         # Next.js 16 App Router
│   └── src/
│       ├── app/
│       │   ├── layout.tsx            # Root layout + fonts
│       │   ├── page.tsx              # Landing / redirect
│       │   ├── login/                # Auth pages
│       │   ├── register/
│       │   └── dashboard/            # Main app page
│       │
│       ├── components/
│       │   ├── chat/
│       │   │   ├── ChatPanel.tsx     # Chat UI + SSE streaming
│       │   │   ├── MessageBubble.tsx # User / assistant message
│       │   │   └── SourceCard.tsx    # Citation cards
│       │   ├── document/             # Upload + sidebar components
│       │   └── layout/               # Navbar, sidebar shell
│       │
│       └── lib/
│           └── api.ts                # Typed API client + SSE stream helper
│
├── .github/
│   ├── workflows/
│   │   ├── ci.yml                    # CI — runs on dev branch only
│   │   ├── deploy.yml                # Docker build — main branch only
│   │   └── devsecops.yml             # Security scans — main branch only
│   ├── ISSUE_TEMPLATE/               # Bug report & feature request forms
│   ├── pull_request_template.md      # PR checklist
│   └── CODEOWNERS                    # Auto-review assignment
│
├── Dockerfile                        # Multi-stage: Node build → Python serve
├── docker-compose.yml                # Local Docker stack
├── CONTRIBUTING.md                   # GSSOC contributor guide
└── .env.example                      # Template for environment variables

🚀 Getting Started

Prerequisites

  • Python 3.11+
  • Node.js 20+
  • HuggingFace account (free) for LLM inference

1. Clone the Repository

git clone https://github.com/param20h/PDF-Assistant-RAG.git
cd PDF-Assistant-RAG

2. Configure Environment

cp .env.example backend/.env

Edit backend/.env:

SECRET_KEY=your-strong-random-secret
DATABASE_URL=sqlite:///./data/app.db
HF_TOKEN=hf_your_huggingface_token_here
UPLOAD_DIR=./data/uploads
CHROMA_PERSIST_DIR=./data/chroma_db

Get your free HuggingFace token at huggingface.co/settings/tokens
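
The backend loads these values through Pydantic settings in `config.py`. As a rough illustration of the required-vs-defaulted behavior, here is a stdlib-only sketch — field names mirror the env file above, but the class itself is hypothetical:

```python
import os

class Settings:
    """Stdlib-only stand-in for the Pydantic settings in backend/app/config.py:
    required values must come from the environment (or backend/.env),
    optional ones fall back to the documented defaults."""
    def __init__(self) -> None:
        self.secret_key = os.environ["SECRET_KEY"]  # required: raises KeyError if unset
        self.hf_token = os.environ["HF_TOKEN"]      # required
        self.database_url = os.environ.get("DATABASE_URL", "sqlite:///./data/app.db")
        self.chunk_size = int(os.environ.get("CHUNK_SIZE", "1000"))
        self.chunk_overlap = int(os.environ.get("CHUNK_OVERLAP", "200"))

# Demo only: supply the required values so construction succeeds.
os.environ.setdefault("SECRET_KEY", "dev-only-secret")
os.environ.setdefault("HF_TOKEN", "hf_dummy")
settings = Settings()
```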

3. Run Locally

Open two terminals:

# Terminal A — Backend
cd backend
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000
# → API running at http://localhost:8000
# → Swagger docs at http://localhost:8000/docs
# Terminal B — Frontend
cd frontend
npm install
npm run dev
# → App running at http://localhost:3000

4. Run with Docker

docker compose up --build
# → Full stack at http://localhost:7860

🧠 RAG Pipeline

                    ┌─────────────────────────────────────────────┐
                    │              PDF / DOCX Upload               │
                    └───────────────────┬─────────────────────────┘
                                        │
                                        ▼
                    ┌─────────────────────────────────────────────┐
                    │         PyMuPDF / python-docx Parser         │
                    │         (text extraction per page)           │
                    └───────────────────┬─────────────────────────┘
                                        │
                                        ▼
                    ┌─────────────────────────────────────────────┐
                    │      Recursive Character Text Splitter       │
                    │   chunk_size=1000  |  overlap=200            │
                    └───────────────────┬─────────────────────────┘
                                        │
                                        ▼
                    ┌─────────────────────────────────────────────┐
                    │    all-MiniLM-L6-v2  (local embeddings)      │
                    │    384-dim dense vectors                      │
                    └───────────────────┬─────────────────────────┘
                                        │
                                        ▼
                    ┌─────────────────────────────────────────────┐
                    │   ChromaDB  — per-user persistent collection │
                    └─────────────────────────────────────────────┘

                              ── At Query Time ──

  User Question ──▶ Embed ──▶ Semantic Search (Top-K=10)
                                        │
                                        ▼
                         Cross-Encoder Reranker (Top-K=5)
                         ms-marco-MiniLM-L-6-v2
                                        │
                                        ▼
                    Prompt Assembly (system + context + question)
                                        │
                                        ▼
                    Qwen2.5-72B-Instruct (HF Inference API)
                                        │
                                        ▼
                    Streamed SSE tokens ──▶ Frontend ChatPanel
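
The prompt-assembly step above is plain string composition over the reranked chunks. A hedged sketch follows; the project's actual templates live in `backend/app/rag/prompts.py`, and the exact format here is illustrative:

```python
def build_prompt(system: str, chunks: list[tuple[str, int]], question: str) -> str:
    """Assemble the LLM prompt from reranked (text, page) chunks.
    Numbering the chunks lets the model cite sources as [n], which the
    frontend can map back to page numbers."""
    context = "\n".join(
        f"[{i}] (page {page}) {text}" for i, (text, page) in enumerate(chunks, 1)
    )
    return f"{system}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt(
    "Answer only from the context and cite sources as [n].",
    [("Revenue grew 12% in Q3.", 4), ("Net income fell 2%.", 5)],
    "How did revenue change?",
)
```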

📡 API Reference

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | /api/v1/auth/register | — | Create a new user account |
| POST | /api/v1/auth/login | — | Login and receive a JWT token |
| GET | /api/v1/auth/me | JWT | Get current user profile |
| POST | /api/v1/documents/upload | JWT | Upload PDF/DOCX and trigger indexing |
| GET | /api/v1/documents | JWT | List all documents for the current user |
| DELETE | /api/v1/documents/{id} | JWT | Delete a document and its vector data |
| POST | /api/v1/chat/ask/stream | JWT | Ask a question (SSE streaming response) |
| GET | /api/v1/chat/history/{doc_id} | JWT | Get chat history for a document |
| DELETE | /api/v1/chat/history/{doc_id} | JWT | Clear chat history for a document |
| GET | /health | — | Health check (db + chroma status) |

Full interactive docs available at /docs (Swagger UI) when running locally.
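
On the client side, consuming the streaming endpoint means reading the response body line by line and collecting `data:` payloads. A stdlib-only sketch of that parsing; the `[DONE]` sentinel and exact payload shape are assumptions, not confirmed from the source:

```python
from typing import Iterable, Iterator

def parse_sse(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of every SSE 'data:' line, stopping at the
    [DONE] sentinel. In a real client the lines come from the HTTP
    response body of POST /api/v1/chat/ask/stream (with a Bearer token)."""
    for line in lines:
        if line.startswith("data: "):
            payload = line[len("data: "):]
            if payload == "[DONE]":
                return
            yield payload

tokens = list(parse_sse(["data: Hel", "", "data: lo", "", "data: [DONE]"]))
```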


📦 Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| HF_TOKEN | ✅ | — | HuggingFace API token for LLM inference |
| SECRET_KEY | ✅ | — | JWT signing secret (use a strong random string) |
| DATABASE_URL | — | sqlite:///./data/app.db | SQLAlchemy database URL |
| UPLOAD_DIR | — | ./data/uploads | Directory for uploaded files |
| CHROMA_PERSIST_DIR | — | ./data/chroma_db | ChromaDB persistence path |
| LLM_MODEL | — | Qwen/Qwen2.5-72B-Instruct | HuggingFace model ID |
| LLM_TEMPERATURE | — | 0.3 | LLM sampling temperature |
| LLM_MAX_NEW_TOKENS | — | 1024 | Max tokens per response |
| EMBEDDING_MODEL | — | all-MiniLM-L6-v2 | SentenceTransformer model |
| CHUNK_SIZE | — | 1000 | Document chunk size (characters) |
| CHUNK_OVERLAP | — | 200 | Overlap between chunks |
| TOP_K_RETRIEVAL | — | 10 | Candidates retrieved from vector store |
| TOP_K_RERANK | — | 5 | Final chunks passed to LLM after reranking |
| MAX_FILE_SIZE_MB | — | 50 | Maximum upload file size |

📜 Scripts

Backend (backend/)

| Command | Description |
|---|---|
| uvicorn app.main:app --reload | Start FastAPI with hot reload |
| uvicorn app.main:app --port 8000 | Start FastAPI on port 8000 |

Frontend (frontend/)

| Command | Description |
|---|---|
| npm run dev | Start Next.js dev server |
| npm run build | Production build → out/ (static export) |
| npm run lint | Run ESLint |

Docker

| Command | Description |
|---|---|
| docker compose up --build | Build and start the full stack |
| docker compose down | Stop all containers |

🌐 Deployment

This project is deployed on HuggingFace Spaces using Docker.

HuggingFace Spaces

  1. Fork this repo and create a new Space at huggingface.co/new-space (SDK: Docker)
  2. Set the following Space secrets:
    • HF_TOKEN — your HuggingFace API token
    • SECRET_KEY — a strong random string
  3. Push to the hf remote — the Space will auto-build:

git remote add hf https://<username>:<HF_TOKEN>@huggingface.co/spaces/<username>/<space-name>
git push hf main

Self-Hosted / VPS

docker compose up -d --build
# App available at http://your-server:7860

🤝 Contributing — GSSOC

This project is participating in GirlScript Summer of Code! We welcome contributors of all skill levels.

Branch Strategy:

| Branch | Purpose |
|---|---|
| main | Production — HuggingFace deployed (admin only) |
| dev | All contributor PRs target here |
| feature/* / fix/* / docs/* | Your working branches |

# Always branch from dev
git checkout -b feature/my-feature upstream/dev


📄 License

Distributed under the MIT License. See LICENSE for more information.



Built with 💙 as a flagship AI engineering project

If you found this project helpful, please give it a ⭐ — it helps GSSOC contributors discover it!

