econexpert/LLMwithRAG

PDF Files RAG Chatbot

A Streamlit web app that enables Retrieval-Augmented Generation (RAG) on PDF documents using:

  • PDF text extraction
  • Sentence embeddings with SentenceTransformers
  • Large Language Model (LLM) for answer generation (HuggingFace transformers)
  • Vector similarity search on MongoDB Atlas
  • Chunking of long documents for efficient retrieval

Features

  • Upload one or more PDF files and index their contents by splitting into chunks and storing embeddings in MongoDB.
  • Perform semantic search over PDF content using vector similarity search.
  • Ask natural language questions and receive context-aware answers generated by a pretrained LLM.
  • Delete all indexed PDFs and embeddings from the database.
  • View already indexed PDF files in the app.
  • Supports GPU acceleration (if available) for faster model inference.
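
The chunking step in the first feature can be illustrated with a dependency-free sliding window. This is only a sketch: the app lists langchain_text_splitters in its requirements and presumably uses a splitter from that package, so the function name `chunk_text` and the default sizes here are assumptions chosen for illustration.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows (illustrative stand-in)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that contain only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, which tends to help retrieval.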


System Workflow (RAG Pipeline)

flowchart TD
    A[Upload PDF] --> B[Extract Text]
    B --> C[Chunk Text]
    C --> D[Create Embeddings]
    D --> E[Store in MongoDB Vector Search]

    F[Ask Question] --> G[Embed Query]
    G --> H[Vector Search]
    H --> I[Top-k Chunks]
    I --> J[LLM Answer]
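
The query half of the flowchart (embed query, vector search, top-k chunks) could be expressed as a MongoDB Atlas `$vectorSearch` aggregation stage. The sketch below only builds the pipeline; it assumes the index is called vector_index and the vectors live in an `embedding` field, as in the index setup described in this README. The helper name and the `text` field in the projection are hypothetical.

```python
def build_vector_search_pipeline(query_vector, k=5, index_name="vector_index",
                                 path="embedding", num_candidates=None):
    """Return an aggregation pipeline that fetches the k nearest chunks."""
    if k <= 0:
        raise ValueError("k must be positive")
    if num_candidates is None:
        # Consider more candidates than results so the approximate search
        # has room to find good matches.
        num_candidates = max(k * 20, 100)
    return [
        {"$vectorSearch": {
            "index": index_name,
            "path": path,
            "queryVector": list(query_vector),
            "numCandidates": num_candidates,
            "limit": k,
        }},
        # "text" is an assumed field name for the stored chunk contents.
        {"$project": {"_id": 0, "text": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]
```

With a PyMongo collection in hand, the result would be passed to `collection.aggregate(pipeline)`, and the returned chunks joined into the prompt for the LLM.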

Installation

  1. Clone the repo:

    git clone https://github.com/econexpert/LLMwithRAG.git
    cd LLMwithRAG
    
  2. Add vector search index:

  • Open MongoDB Atlas Console and navigate to your cluster.

  • Go to your database and then your collection (e.g., vectors).

  • Click the Indexes tab.

  • Click Create Index.

  • Choose JSON Editor mode.

  • Paste the JSON configuration below.

  • Give the index a name, for example, "vector_index".

  • Click Create to build the index.

{
  "fields": [
    {
      "numDimensions": 1024,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}
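
As an alternative to the Atlas console, the same index could probably be created from Python. The sketch below assumes a recent PyMongo release in which `SearchIndexModel` accepts a `type` argument; the helper names are hypothetical. It also includes a small check, because `numDimensions` has to equal the length of the vectors your embedding model produces, otherwise the stored vectors will not match the index.

```python
def build_index_definition(num_dimensions=1024, path="embedding",
                           similarity="cosine"):
    """Build the same definition as the JSON shown in this step."""
    return {"fields": [{
        "type": "vector",
        "path": path,
        "numDimensions": num_dimensions,
        "similarity": similarity,
    }]}


def check_embedding_size(embedding, definition):
    """Raise if a vector's length differs from the index's numDimensions."""
    expected = definition["fields"][0]["numDimensions"]
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, index expects {expected}")


def create_vector_index(collection, name="vector_index", num_dimensions=1024):
    """Create the index on an existing PyMongo collection (needs Atlas)."""
    from pymongo.operations import SearchIndexModel  # imported lazily
    model = SearchIndexModel(
        definition=build_index_definition(num_dimensions),
        name=name,
        type="vectorSearch",
    )
    return collection.create_search_index(model)
```

Calling `check_embedding_size(model.encode("test"), build_index_definition())` once before indexing is a cheap way to catch a dimension mismatch early.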

  3. Create a Virtual Environment

It’s best practice to create a virtual environment to isolate your project dependencies.

On macOS/Linux:

python3 -m venv venv
source venv/bin/activate

On Windows (Command Prompt):

python -m venv venv
venv\Scripts\activate

On Windows (PowerShell):

python -m venv venv
.\venv\Scripts\Activate.ps1

  4. Install Dependencies

Make sure you have a requirements.txt file in your project folder listing the required packages. Then run:

pip install -r requirements.txt

  5. Run the Streamlit App

With the environment activated, start your app by running:

streamlit run app.py

This will launch the Streamlit server and open your app in a browser window (usually at http://localhost:8501).

Requirements

  • Python 3.9+
  • MongoDB Atlas account (free tier works)
  • CUDA-enabled GPU (optional, for faster inference)
  • Python packages listed in requirements.txt including:
    • streamlit
    • pypdf
    • sentence-transformers
    • transformers
    • torch
    • pymongo
    • langchain_text_splitters
    • numpy
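
Because the GPU is optional, the app needs some way to fall back to the CPU. One common pattern is sketched below; the function name `pick_device` is hypothetical and this is not necessarily how app.py does it.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable through PyTorch, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; nothing can run on a GPU anyway.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The returned string can be passed as the `device` argument when constructing a `SentenceTransformer`, or to `model.to(...)` for a transformers model.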

Installation Flow Chart

flowchart TD
    A[Start Installation] --> B[Install Python 3.9+]
    B --> C[Create Virtual Environment]
    C --> D[Activate Virtual Environment]
    D --> E[Install Python Dependencies]
    E --> F[Install PyTorch]
    F --> G[Install Sentence-Transformers]
    G --> H[Install Streamlit]
    H --> I[Create MongoDB Atlas Account]
    I --> J[Create MongoDB Cluster]
    J --> K[Create Database and Collection]
    K --> L[Create Vector Search Index<br/>Dimensions must match the embedding model]
    L --> M[Generate MongoDB Connection String]
    M --> N[Export MONGO_URI Environment Variable]
    N --> O[First Run Downloads Embedding Model]
    O --> P[First Run Downloads LLM Model]
    P --> Q[Installation Complete]
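
The flowchart mentions exporting a MONGO_URI environment variable. A minimal sketch of how an app might read and sanity-check it follows; the helper name `get_mongo_uri` and the error messages are assumptions, not taken from app.py.

```python
import os


def get_mongo_uri(env=None):
    """Read the connection string from MONGO_URI and fail early if unusable."""
    env = os.environ if env is None else env
    uri = env.get("MONGO_URI", "").strip()
    if not uri:
        raise RuntimeError("MONGO_URI is not set; export it before starting the app")
    if not uri.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError("MONGO_URI should start with mongodb:// or mongodb+srv://")
    return uri
```

Failing at startup with a clear message is usually friendlier than a connection timeout the first time a PDF is uploaded.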

About google/flan-t5-base

google/flan-t5-base is a fine-tuned variant of the T5 (Text-to-Text Transfer Transformer) model developed by Google, part of the FLAN family designed for improved instruction-following capabilities.

Key Features

  • Text-to-Text Framework: All tasks are framed as text input to text output.
  • Instruction Tuning: Fine-tuned on a diverse set of instructions to better understand and follow prompts.
  • Base Model Size: At roughly 250 million parameters, it is small enough to run on a CPU while still following instructions reasonably well.
  • Versatile Applications: Suitable for text generation, summarization, translation, question answering, and more.
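
To make the text-to-text framing concrete, here is a sketch of how retrieved chunks and a question might be turned into a prompt and passed to google/flan-t5-base through the transformers library. The prompt wording, the helper names, and the length limits are assumptions for illustration; the app's actual prompt may differ.

```python
def build_prompt(question, chunks, max_chars=2000):
    """Join retrieved chunks into a context block and append the question."""
    context = "\n\n".join(chunks)[:max_chars]  # crude cap on context length
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def generate_answer(prompt, model_name="google/flan-t5-base", max_new_tokens=128):
    """Run the prompt through a seq2seq model (downloads weights on first use)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer  # lazy import
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Truncate long prompts to 512 tokens, the length T5 models were trained with.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

In a real app the tokenizer and model would be loaded once and cached rather than on every call.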
