An AI-powered document question-answering app built with Retrieval-Augmented Generation (RAG).
Upload any PDF and ask natural-language questions: the app finds the most relevant sections and answers using LLaMA 3.3.
Try it here
- LangChain: RAG pipeline orchestration
- FAISS: Vector database for semantic search
- Groq (LLaMA 3.3 70B): LLM for answer generation
- HuggingFace Embeddings: Local text embeddings (all-MiniLM-L6-v2)
- Streamlit: Frontend UI
- PyMuPDF: PDF text extraction
- Upload a PDF; its text is extracted using PyMuPDF
- Text is split into 300-character chunks with 80-character overlap
- Chunks are embedded and stored in a FAISS vector index
- The user asks a question; the 6 most relevant chunks are retrieved
- The chunks and the question are sent to LLaMA 3.3 via the Groq API
- The answer is displayed alongside its source chunks for transparency
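The chunking and retrieval steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in `app.py`: `split_text`, `embed`, `cosine`, and `retrieve` are hypothetical names, the bag-of-words `embed` is a toy stand-in for the all-MiniLM-L6-v2 embeddings, and the sorted scan stands in for the FAISS index. A real LangChain splitter may also prefer natural separators over fixed character windows.

```python
import math
from collections import Counter

CHUNK_SIZE = 300     # characters per chunk (from the steps above)
CHUNK_OVERLAP = 80   # characters shared by consecutive chunks
TOP_K = 6            # chunks retrieved per question


def split_text(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):  # this window already reached the end
            break
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=TOP_K):
    """Return the k chunks most similar to the question (brute-force scan)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

The 80-character overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.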
git clone https://github.com/deekshith-8/rag-doc-qa.git
cd rag-doc-qa
pip install -r requirements.txt
Create a .env file:
Run the app:
streamlit run app.py