📚 Ask My Docs

A production-grade Retrieval-Augmented Generation (RAG) application that lets you chat with your documents using AI. Upload PDFs, Markdown, or text files and ask questions - the system will search through your documents and provide accurate answers with source citations.

What This Project Does

This is an intelligent document Q&A system that:

📄 Ingests your documents - Upload PDFs, Markdown (.md), or text files
🔍 Smart search - Uses hybrid retrieval combining keyword (BM25) and semantic (vector) search
🎯 Accurate answers - Reranks results for precision and generates answers using AI
📎 Source citations - Every answer includes references to specific document chunks
💬 Chat interface - Clean, modern UI with conversation history

Technology Stack

FREE APIs (no credit card required!):

Groq API - Ultra-fast LLM inference with Llama 3.3 70B (completely free!)
Sentence-Transformers - Local embeddings, runs on your machine (no API needed!)
Cohere - Cross-encoder reranking (has free tier)

Framework & Libraries:

LangChain - RAG pipeline orchestration
FastAPI - Backend REST API
Streamlit - Interactive web UI
ChromaDB - Vector database for semantic search
BM25 - Keyword search algorithm
Python 3.13 - Core language

How It Works

User Question
    │
    ▼
┌──────────────────────────┐
│     Hybrid Retrieval     │
│  ┌─────────┬───────────┐ │
│  │  BM25   │  Vector   │ │
│  │(keyword)│(semantic) │ │
│  └────┬────┴─────┬─────┘ │
│       └────┬─────┘       │
│    Reciprocal Rank       │
│       Fusion (RRF)       │
└───────────┬──────────────┘
            ▼
┌──────────────────────────┐
│  Cross-Encoder Reranker  │
│  (Cohere rerank-v3.0)    │
└───────────┬──────────────┘
            ▼
┌──────────────────────────┐
│   LLM Generation with    │
│  Citation Enforcement    │
│   (Groq Llama 3.3 70B)   │
└───────────┬──────────────┘
            ▼
  Answer + [Source: file, Chunk N]

1. Document Ingestion - Your documents are split into chunks and indexed
2. Hybrid Search - When you ask a question, both keyword and semantic search run in parallel
3. Fusion - Results are merged using Reciprocal Rank Fusion for better relevance
4. Reranking - A cross-encoder model reranks the top results for maximum precision
5. Generation - The LLM generates an answer based on the most relevant chunks
6. Citations - Every answer includes source references so you can verify the information
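
The fusion step can be illustrated with a minimal sketch of weighted Reciprocal Rank Fusion. This is not the code in retriever.py; the function name, the chunk IDs, and the constant k=60 (a commonly used RRF default) are assumptions made for illustration. The weights mirror the BM25_WEIGHT and VECTOR_WEIGHT settings.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(weight / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed default, not a project setting.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["c3", "c1", "c7"]
vector_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits], weights=[0.5, 0.5])
print(fused)
```

Chunks that rank well in both lists rise to the top, which is why a chunk found by both keyword and semantic search tends to beat one found by only a single retriever.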

Features

Hybrid Retrieval — Combines BM25 keyword search with dense vector search for best results
Cross-Encoder Reranking — Uses Cohere's reranker to boost precision
Citation Enforcement — Every answer includes traceable [Source: file, Chunk N] references
Chat History — Save and load your conversations
Modern UI — Clean Streamlit interface with expandable citation cards
REST API — FastAPI backend with /ask, /ingest, and /health endpoints
100% FREE — Uses Groq API (free) and local embeddings (no API costs!)
Easy Setup — One-click run with RUN_PROJECT.bat
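
Because citations follow the fixed [Source: file, Chunk N] pattern, they can be pulled out of an answer with a regular expression. The sketch below is an assumption about how a client might parse them, not the project's own parser; the function name and sample answer are hypothetical, and file names containing a comma or closing bracket would not match.

```python
import re

# Matches markers such as "[Source: rag_overview.md, Chunk 2]".
CITATION_RE = re.compile(r"\[Source:\s*([^,\]]+),\s*Chunk\s+(\d+)\]")


def extract_citations(answer):
    """Return unique (file, chunk_number) pairs in order of first appearance."""
    seen = []
    for name, chunk in CITATION_RE.findall(answer):
        pair = (name.strip(), int(chunk))
        if pair not in seen:
            seen.append(pair)
    return seen


# Hypothetical answer text for illustration.
sample = (
    "RAG combines retrieval with generation [Source: rag_overview.md, Chunk 2]. "
    "Ragas can score it [Source: evaluation_metrics.md, Chunk 0] "
    "[Source: rag_overview.md, Chunk 2]."
)
citations = extract_citations(sample)
print(citations)
```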

Here's a quick view of how my project works

Quick Start (Windows)

Prerequisites

Python 3.13 (or 3.10+)
Git (optional, for cloning)

1. Clone or Download the Project

git clone https://github.com/2024yuva/AskMyDocs.git
cd AskMyDocs

2. Create Virtual Environment

python -m venv venv
venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

This will install all necessary packages including:

LangChain and Groq integration
Sentence-transformers for embeddings
FastAPI and Streamlit
ChromaDB and other dependencies

4. Configure API Keys

Copy .env.example to .env in the project root (if .env does not exist yet), then edit it:

# Get your FREE Groq API key at: https://console.groq.com/keys
GROQ_API_KEY=your_groq_api_key_here

# Optional: Get Cohere API key at: https://dashboard.cohere.com/api-keys
COHERE_API_KEY=your_cohere_api_key_here

Note: Groq API is completely free! Just sign up and get your key.

5. Run the Project

Double-click RUN_PROJECT.bat or run it from a terminal:

RUN_PROJECT.bat

This will:

Stop any running servers
Start the FastAPI backend on port 8000
Start the Streamlit UI on port 8501
Open two terminal windows (one for API, one for UI)

6. Use the Application

Open your browser to http://localhost:8501
Upload documents using the sidebar (PDF, Markdown, or text files)
Click "🔄 Ingest Documents" to process them
Start asking questions in the chat!

Stopping the Application

Press any key in the main terminal window, or close both terminal windows.

Project Structure

AskMyDocs/
├── RUN_PROJECT.bat          # ⭐ Main entry point - run this!
├── README.md                # This file
├── .env                     # Your API keys (create from .env.example)
├── .env.example             # Template for environment variables
├── requirements.txt         # Python dependencies
│
├── app/                     # Main application code
│   ├── config.py           # Configuration and environment variables
│   ├── ingest.py           # Document loading and chunking
│   ├── retriever.py        # Hybrid retrieval (BM25 + Vector) + reranking
│   ├── chain.py            # RAG pipeline orchestration
│   ├── prompts.py          # Prompt templates with citation enforcement
│   ├── chat_history.py     # Chat history management
│   │
│   ├── api/                # FastAPI backend
│   │   ├── main.py         # API endpoints
│   │   └── schemas.py      # Request/response models
│   │
│   └── ui/                 # Streamlit frontend
│       └── app.py          # Web interface
│
├── docs/                   # 📁 Put your documents here!
│   ├── rag_overview.md     # Sample documents
│   ├── langchain_guide.md
│   └── evaluation_metrics.md
│
├── tests/                  # Test suite
│   ├── test_ingest.py
│   ├── test_retriever.py
│   ├── test_chain.py
│   └── test_api.py
│
├── eval/                   # Evaluation pipeline
│   ├── golden_qa.json      # Test Q&A dataset
│   └── evaluate.py         # Ragas evaluation
│
├── chroma_db/              # Vector database (auto-generated)
├── chat_history/           # Saved conversations (auto-generated)
└── bm25_index.pkl          # BM25 index (auto-generated)

Usage Guide

Adding Documents

1. Place your documents in the docs/ folder
   - Supported formats: PDF (.pdf), Markdown (.md), Text (.txt)
   - Files can be organized in subfolders
2. Open the Streamlit UI (http://localhost:8501)
3. Use the sidebar to upload files or click "🔄 Ingest Documents"
4. Wait for ingestion to complete (you'll see a success message)
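
To preview which files ingestion is likely to pick up, the docs/ folder can be scanned for the supported extensions. This is a minimal sketch under the assumption that discovery is recursive and case-insensitive; the function name is hypothetical and the real logic lives in app/ingest.py.

```python
from pathlib import Path

# Extensions listed as supported in this README.
SUPPORTED_SUFFIXES = {".pdf", ".md", ".txt"}


def find_documents(root):
    """Return supported files under root, including subfolders, sorted by path."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    for path in find_documents("docs"):
        print(path)
```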

Asking Questions

1. Type your question in the chat input at the bottom
2. The system will:
   - Search through your documents
   - Find the most relevant chunks
   - Generate an answer with citations
3. Click on "📎 Citations" to see which documents were used
4. Click on "📄 Source Documents" to see the actual text chunks

Saving Conversations

Click "💾 Save Conversation" in the sidebar
Your chat history is saved to the chat_history/ folder
Click "📜 View History" to see past conversations
Load previous conversations by clicking "Load"

API Usage

You can also use the REST API directly:

# Ask a question
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "What is RAG?"}'

# Trigger ingestion
curl -X POST http://localhost:8000/ingest

# Health check
curl http://localhost:8000/health

Interactive API documentation is available at: http://localhost:8000/docs
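
The same /ask call can be made from Python using only the standard library. This is a sketch that assumes the request body shown in the curl example; the helper names are hypothetical, and since the response schema is not documented here, the parsed JSON is returned as-is.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # backend port used by RUN_PROJECT.bat


def build_ask_request(question, base_url=API_BASE):
    """Build the POST request for /ask without sending it."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, base_url=API_BASE, timeout=60):
    """Send the question and return the decoded JSON response."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# With the API server running:
#   print(ask("What is RAG?"))
```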

Advanced Configuration

All settings can be customized via environment variables in .env:

| Variable | Default | Description |
| --- | --- | --- |
| `GROQ_API_KEY` | – | Groq API key (FREE at console.groq.com) |
| `COHERE_API_KEY` | – | Cohere API key for reranking (optional) |
| `LLM_MODEL` | `llama-3.3-70b-versatile` | Groq chat model |
| `EMBEDDING_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Local embedding model |
| `CHUNK_SIZE` | `1000` | Chunk size in characters |
| `CHUNK_OVERLAP` | `200` | Overlap between chunks |
| `RETRIEVER_K` | `10` | Number of documents to retrieve |
| `RERANK_TOP_N` | `5` | Number of documents after reranking |
| `BM25_WEIGHT` | `0.5` | Weight for BM25 in hybrid search |
| `VECTOR_WEIGHT` | `0.5` | Weight for vector search in hybrid search |
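
To show how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal sketch that reads the numeric settings with the defaults from the table and splits text into fixed-size character windows. It is an assumption for illustration only: the function names are hypothetical, and the project's actual splitter in app/ingest.py may break text differently (for example, on paragraph boundaries).

```python
import os


def load_settings(env=None):
    """Read numeric settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "chunk_size": int(env.get("CHUNK_SIZE", "1000")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "200")),
        "retriever_k": int(env.get("RETRIEVER_K", "10")),
        "rerank_top_n": int(env.get("RERANK_TOP_N", "5")),
        "bm25_weight": float(env.get("BM25_WEIGHT", "0.5")),
        "vector_weight": float(env.get("VECTOR_WEIGHT", "0.5")),
    }


def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


settings = load_settings({})  # empty mapping, so every default applies
chunks = chunk_text("a" * 2500, settings["chunk_size"], settings["chunk_overlap"])
print([len(c) for c in chunks])
```

With the defaults, each new chunk starts 800 characters after the previous one, so a smaller CHUNK_SIZE or a larger CHUNK_OVERLAP produces more chunks for the same document.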

Testing & Evaluation

Run Tests

pytest tests/ -v

Run Evaluation

python eval/evaluate.py

This evaluates the RAG pipeline using Ragas metrics:

Faithfulness (whether the answer is supported by the retrieved context)
Answer relevancy
Context precision
Context recall

Troubleshooting

API Server Won't Start

Check if port 8000 is already in use
Verify virtual environment is activated: venv\Scripts\activate
Check for errors in the API terminal window
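
A quick way to check whether something is already listening on the backend or UI port is to attempt a local TCP connection. This is a small standalone sketch, not part of the project; the function name is hypothetical and the ports are the ones named in this README.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (8000, 8501):  # FastAPI backend and Streamlit UI
        state = "in use" if port_in_use(port) else "free"
        print(f"Port {port}: {state}")
```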

Ingestion Fails

Ensure documents are in the docs/ folder
Verify your Groq API key is valid in .env
Check that sentence-transformers is installed: pip install sentence-transformers

Queries Return Errors

Make sure you've ingested documents first (click "🔄 Ingest Documents")
Verify both API server and UI are running
Check your Groq API key is correct

Out of Memory

Reduce CHUNK_SIZE in .env
Reduce RETRIEVER_K to retrieve fewer documents
Process fewer documents at once

Getting API Keys

Groq (Required - FREE!)

Go to https://console.groq.com/keys
Sign up for a free account
Create an API key
Copy to .env as GROQ_API_KEY

Cohere (Optional - FREE tier available)

Go to https://dashboard.cohere.com/api-keys
Sign up for a free account
Create an API key
Copy to .env as COHERE_API_KEY

Contributing

Contributions are welcome! Feel free to:

Report bugs
Suggest features
Submit pull requests

License

MIT

Acknowledgments

Built with:

LangChain - RAG framework
Groq - Ultra-fast LLM inference
Cohere - Cross-encoder reranking
Sentence-Transformers - Local embeddings
FastAPI - Backend framework
Streamlit - UI framework
ChromaDB - Vector database

Made with ❤️ by 2024yuva

Repository: https://github.com/2024yuva/AskMyDocs
