M.Sc. Computer Science Final Year Project — Ramniranjan Jhunjhunwala College, Mumbai (2025–2026)
A fully offline, CPU-friendly RAG (Retrieval-Augmented Generation) system that reads PDF documents, answers questions using a local LLM, and supports voice interaction — no GPU, no cloud, no paid API.
Screenshots: Dark Theme · Light Theme · Answer with Citations · Split View (PDF + Chat)
- 📄 Multi-Document Support — Upload multiple PDFs per session, query across all of them
- 🧠 RAG Pipeline — FAISS vector search + phi3:mini for document-grounded answers
- 🚫 Anti-Hallucination — Strict 4-rule prompt: LLM answers ONLY from uploaded documents
- 🎙️ Voice In + Voice Out — Speak questions, hear answers via Web Speech API
- ⚡ 3 Speed Modes — Quick (150 tokens), Standard (350), Deep (700)
- 📑 Split View — View PDF and chat side by side
- 💾 Persistent Sessions — Sessions saved to disk, resume after restart
- 📤 Export to PDF — Download full Q&A conversation as PDF
- 🌙 Dark/Light Theme — Toggle anytime
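
As a rough illustration of how the speed modes and the document-only prompt could fit together, the sketch below builds a request body for Ollama's `/api/generate` endpoint. It assumes the token counts in the list above cap generation length through Ollama's `num_predict` option. The four rules shown are hypothetical stand-ins, since the project's actual rule text is not reproduced here, and the function and constant names are illustrative.

```python
# Token budgets per speed mode, taken from the feature list above.
SPEED_MODES = {"quick": 150, "standard": 350, "deep": 700}

# Hypothetical grounding rules; the project's real 4-rule prompt may differ.
GROUNDING_RULES = (
    "Answer only from the context below.",
    "If the context does not contain the answer, say so.",
    "Do not use outside knowledge.",
    "Cite the source document for each claim.",
)


def build_generate_payload(question, chunks, mode="standard"):
    """Build a JSON-serializable body for Ollama's /api/generate.

    `chunks` is a list of (source_label, text) pairs from retrieval.
    """
    if mode not in SPEED_MODES:
        raise ValueError(f"unknown speed mode: {mode!r}")
    rules = "\n".join(f"{i}. {rule}" for i, rule in enumerate(GROUNDING_RULES, 1))
    context = "\n\n".join(f"[{label}]\n{text}" for label, text in chunks)
    prompt = (
        f"Rules:\n{rules}\n\nContext:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return {
        "model": "phi3:mini",
        "prompt": prompt,
        "stream": True,  # ask Ollama to return tokens incrementally
        "options": {"num_predict": SPEED_MODES[mode]},  # cap on generated tokens
    }
```

Labelling each chunk with its source in the prompt is one simple way to let the model refer back to a document when it answers.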
PDF Upload → PyMuPDF text extraction → 500-word chunks (50-word overlap)
→ all-MiniLM-L6-v2 embeddings (384-dim) → FAISS IndexFlatL2
User Question → same embedding model → FAISS Top-K retrieval
→ Strict anti-hallucination prompt → phi3:mini via Ollama (offline)
→ SSE token streaming → React frontend → Source citations
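
The chunking step in the pipeline above (500-word chunks with a 50-word overlap) can be sketched as a sliding window over whitespace-separated words. This is a minimal sketch, not the project's code: the function name is illustrative, and splitting on whitespace is an assumption about how words are counted.

```python
def chunk_words(text, chunk_size=500, overlap=50):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words, so a sentence cut at a
    chunk boundary still appears whole in one of the two chunks.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

With the default settings, a 1,000-word document yields three chunks starting at words 0, 450, and 900.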
| Layer | Technology | Purpose |
|---|---|---|
| LLM | phi3:mini (3.8B) via Ollama | Local offline inference |
| Embeddings | all-MiniLM-L6-v2 | 384-dim semantic vectors |
| Vector Search | FAISS IndexFlatL2 | CPU-based similarity search |
| PDF Parsing | PyMuPDF (fitz) | Page-by-page text extraction |
| Backend | FastAPI + Python | REST API + SSE streaming |
| Frontend | React 18.3 + Vite 5.4 | Single-page application |
| Voice | Web Speech API + pyttsx3 | Speech input and output |
| Export | jsPDF 4.2 | Chat-to-PDF export |
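
To make the vector search row concrete: `IndexFlatL2` performs an exact, exhaustive comparison of the query against every stored vector and ranks by squared Euclidean distance. The pure-Python sketch below mirrors that idea on toy data. It is an illustration of the technique, not the FAISS API, and the function names are made up for this example.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query, vectors, k=3):
    """Return the k nearest (distance, index) pairs, closest first.

    Every stored vector is compared with the query (brute force),
    which is what makes the search exact rather than approximate.
    """
    scored = ((squared_l2(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)
```

In the real pipeline the vectors would be the 384-dimensional embeddings, and the returned indices would map back to the stored text chunks.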
Hardware: Intel Core i3+ · 8GB RAM · No GPU required · Fully offline after setup
- Python 3.10+
- Node.js 18+ (Vite 5 does not support Node.js 16)
- Ollama installed
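
Before installing, the prerequisites above can be checked with a short script. This is a sketch under assumptions: it expects the executables to be on `PATH` under the names `node` and `ollama`, and it only checks that they are present, not which version is installed.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("node", "ollama")):
    """Return a dict mapping each prerequisite to True if it looks available."""
    report = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```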
git clone https://github.com/Harh2646/TellMe.ai.git
cd TellMe.ai
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
ollama pull phi3:mini
Download the embedding model once for offline use (run in Python):
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
model.save("./models/all-MiniLM-L6-v2")
cd frontend
npm install
Terminal 1 — Start Ollama:
ollama run phi3:mini
Terminal 2 — Start Backend:
python backend/main.py
Terminal 3 — Start Frontend:
cd frontend
npm run dev
Open browser → http://localhost:3000
TellMe.ai/
├── backend/
│ └── main.py
├── frontend/
│ ├── src/
│ │ ├── App.jsx
│ │ └── main.jsx
│ ├── index.html
│ ├── package.json
│ └── vite.config.js
├── docs/
│ └── screenshots/
├── requirements.txt
└── README.md
| Method | Endpoint | Description |
|---|---|---|
| POST | /sessions | Create new session |
| GET | /sessions | List all sessions |
| DELETE | /sessions/{id} | Delete session |
| POST | /sessions/{id}/upload | Upload PDF |
| POST | /sessions/{id}/query_stream | Ask question (SSE streaming) |
| GET | /sessions/{id}/summary | Summarize document |
| GET | /sessions/{id}/history | Get chat history |
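
A client consuming `/sessions/{id}/query_stream` has to parse the Server-Sent Events framing. The sketch below extracts the `data:` payload of each event from an iterable of text lines. The payload contents of this endpoint are not documented here, so the code assumes standard SSE framing and leaves the interpretation of each payload (plain token, JSON, and so on) to the caller. The function name is illustrative.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # drop the single optional space after the colon
            buffer.append(value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.
    if buffer:
        # Lenient choice: flush a trailing event that lacks its closing blank
        # line. A strict SSE parser would discard it instead.
        yield "\n".join(buffer)
```

Feeding it the decoded lines of a streaming HTTP response yields one string per event, which a frontend or test script can append to the answer as it arrives.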
Harsh Sanjay Singh, M.Sc. Computer Science, Ramniranjan Jhunjhunwala College, Mumbai
MIT License — see LICENSE file for details.



