GitHub - Jayanth710/Document-QnA

DocuMind is an intelligent Q&A platform designed to solve a critical problem with modern Large Language Models (LLMs): their inability to answer questions about private, specific documents. The project addresses the tendency of general-purpose LLMs to "hallucinate" or provide incorrect answers when queried on information they haven't been trained on.

By allowing users to upload their own files (PDF, DOCX, TXT), DocuMind creates a secure, personalized knowledge base. It then uses a Retrieval-Augmented Generation (RAG) architecture to provide accurate, context-aware answers drawn directly from the user's content, effectively turning a general LLM into a specialized expert on those documents.

Project Overview:

Advanced RAG Pipeline: Engineered an end-to-end RAG system that improved response relevance by 30%. The pipeline processes user documents through:
• Semantic Chunking: Intelligently breaking down documents into meaningful, context-rich chunks.
• Vectorization & Storage: Converting text chunks into vector embeddings and storing them in a Weaviate vector database for efficient searching.
• Retrieval & Re-ranking: Upon a user query, the system retrieves the most relevant chunks using semantic search and re-ranks them to prioritize the best context for the LLM.
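The three pipeline stages can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's code: sentence grouping stands in for semantic chunking, a bag-of-words `Counter` stands in for a real embedding model, and the hypothetical `InMemoryStore` stands in for Weaviate. All function and class names are invented for this example.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Group whole sentences into chunks of at most max_words words.

    Assumption: a simple stand-in for semantic chunking.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: replaces a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Hypothetical stand-in for the vector database."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 3) -> list[str]:
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def rerank(query: str, candidates: list[str], top_n: int = 1) -> list[str]:
    """Re-order candidates by the fraction of query terms each one contains."""
    q_terms = set(embed(query))

    def overlap(chunk: str) -> float:
        return len(q_terms & set(embed(chunk))) / len(q_terms) if q_terms else 0.0

    return sorted(candidates, key=overlap, reverse=True)[:top_n]
```

In a real deployment the re-ranker would typically be a stronger model than the first-stage retriever; the term-overlap score here only shows where that second pass sits in the flow.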

Flexible LLM Integration: Built a modular backend with an Ollama adapter, allowing for "plug-and-play" support of various powerful, open-source LLMs like Llama 3 and Mistral.
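One way such a "plug-and-play" adapter could look is sketched below. It assumes a local Ollama server at its default address and uses Ollama's `/api/generate` endpoint with streaming disabled. The `LLMAdapter` protocol, the `OllamaAdapter` class, and the `build_prompt` and `answer` helpers are hypothetical names for this example; the repository's actual adapter may be structured differently.

```python
import json
import urllib.request
from typing import Protocol


class LLMAdapter(Protocol):
    """Anything that can turn a prompt into a completion."""

    def generate(self, prompt: str) -> str: ...


class OllamaAdapter:
    """Hypothetical adapter for a local Ollama server."""

    def __init__(self, model: str = "llama3",
                 base_url: str = "http://localhost:11434") -> None:
        self.model = model  # swap the model by changing this tag, e.g. "mistral"
        self.base_url = base_url.rstrip("/")

    def build_request(self, prompt: str) -> urllib.request.Request:
        """Build the HTTP request without sending it."""
        payload = {"model": self.model, "prompt": prompt, "stream": False}
        return urllib.request.Request(
            f"{self.base_url}/api/generate",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    def generate(self, prompt: str) -> str:
        """Send the request and return the generated text."""
        with urllib.request.urlopen(self.build_request(prompt)) as resp:
            return json.loads(resp.read())["response"]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a grounded prompt from retrieved chunks (illustrative wording)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(adapter: LLMAdapter, question: str, context_chunks: list[str]) -> str:
    """Works with any adapter that satisfies LLMAdapter."""
    return adapter.generate(build_prompt(question, context_chunks))
```

Because `answer` depends only on the protocol, switching models is a matter of constructing a different adapter, for example `OllamaAdapter(model="mistral")`.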

Engineered a dynamic and responsive frontend using Next.js and TailwindCSS, creating a seamless user experience inspired by modern AI assistants. The interface features a real-time, conversational chat with streaming responses for fluid interaction. It also includes a multi-document, drag-and-drop upload system with progress indicators, allowing users to effortlessly create and query their personalized knowledge base.
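The source does not say how streaming is wired between the backend and the chat interface. As a hedged sketch of the backend half, the code below assumes the model server streams newline-delimited JSON objects with `response` and `done` fields (the shape Ollama uses when streaming is enabled) and re-emits each text fragment as a server-sent event. The names `iter_fragments` and `to_sse` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_fragments(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a newline-delimited JSON stream.

    Stops at the first object whose "done" field is true.
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive lines
        event = json.loads(line)
        if event.get("response"):
            yield event["response"]
        if event.get("done"):
            break


def to_sse(fragment: str) -> str:
    """Format one fragment as a server-sent event.

    JSON-encoding the fragment keeps embedded newlines on a single data line.
    """
    return f"data: {json.dumps(fragment)}\n\n"
```

A web framework's streaming response could then send `to_sse(f)` for each `f` in `iter_fragments(...)`, letting the chat UI append text as it arrives.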
Skills: Python · Flask · FastAPI · Ollama · Google Cloud Platform (GCP) · Docker · Large Language Models (LLM) · Next.js · React.js

Folders and files (2 commits):
app
bin
data
databases
models-download
uploads
.dockerignore
.env
.env.example
.gitignore
.python-version
Dockerfile
README.md
docker-compose.yml
main.py
pyproject.toml
requirements-1.txt
requirements.txt
uv.lock
