AI Agent - RAG (Retrieval-Augmented Generation) System

A comprehensive RAG implementation using LangChain, featuring document processing, vector search, and intelligent question-answering capabilities with multi-provider support.

🚀 Features

Document Processing: PDF document loading and text chunking
Vector Search: FAISS-based similarity search with multiple embedding models
Multi-Provider Support: Groq, Anthropic (Claude), and OpenAI integration
Reranking: Advanced document reranking using cross-encoders
RAG Evaluation: Comprehensive evaluation using Ragas metrics
Streamlit Interface: Web-based user interface for easy interaction
Cloud Deployment: Ready for Streamlit Cloud deployment

📋 Overview

This project implements a complete RAG pipeline with the following components:

Document Loader - PDF document processing
Text Splitter - Intelligent text chunking with overlap
Embedding Model - Multiple embedding options (HuggingFace models)
Vector Store - FAISS for efficient similarity search
Retriever - Dense and sparse retrieval methods
Reranker - Cross-encoder based document reranking
Prompt Template - Customizable prompt engineering
LLM Integration - Multiple language model providers
Chain - End-to-end RAG pipeline
Evaluator - Performance assessment using Ragas

🛠️ Installation

Local Development

Clone the repository:

git clone <repository-url>
cd AI-Agent

Install dependencies:

pip install -r requirements.txt

Set up environment variables (optional for local development): Create a .env file with the following variables:

# API Keys (optional - can be entered in the UI)
GROQ_API_KEY=your_groq_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
OPENAI_API_KEY=your_openai_api_key

Streamlit Cloud Deployment

Fork this repository to your GitHub account
Go to Streamlit Cloud
Deploy your app:
- Connect your GitHub account
- Select this repository
- Set the main file path to rag_agent_streamlit.py

Configure Secrets in Streamlit Cloud:

Go to your app's settings
Add the following secrets:

groq_api_key = "your_groq_api_key_here"
anthropic_api_key = "your_anthropic_api_key_here"
openai_api_key = "your_openai_api_key_here"

Deploy! Your app will be available at https://your-app-name.streamlit.app
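The two configurations above use different names for the same keys: uppercase in `.env`, lowercase in Streamlit secrets. A minimal sketch of resolving a key from either source might look like the following. The helper name `resolve_api_key`, the provider labels, and the lookup order (UI input, then secrets, then environment) are assumptions for illustration, not taken from `rag_agent_streamlit.py`; `secrets` can be any mapping, such as `st.secrets`.

```python
import os
from typing import Mapping, Optional

# Provider labels are illustrative; the key names match the examples above.
ENV_NAMES = {
    "groq": "GROQ_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def resolve_api_key(
    provider: str,
    ui_value: Optional[str] = None,
    secrets: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first non-empty key from: UI input, secrets, environment."""
    env_name = ENV_NAMES[provider]       # e.g. "GROQ_API_KEY"
    secret_name = env_name.lower()       # e.g. "groq_api_key"
    env = os.environ if env is None else env
    candidates = [
        ui_value,
        (secrets or {}).get(secret_name),
        env.get(env_name),
    ]
    for value in candidates:
        if value:
            return value
    return None  # caller decides how to prompt the user
```

Returning `None` instead of raising lets the UI fall back to asking the user for a key.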

📚 Usage

Basic RAG Pipeline

# Load and process documents
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Load PDF document
loader = PyPDFLoader("path/to/document.pdf")
docs = loader.load()

# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, 
    chunk_overlap=200
)
texts = text_splitter.split_documents(docs)

# Create embeddings
embeddings = HuggingFaceEmbeddings(
    model_name='all-MiniLM-L6-v2',
    model_kwargs={'device': 'cpu'},
    encode_kwargs={'normalize_embeddings': True}
)

# Create vector store
vectorstore = FAISS.from_documents(texts, embeddings)

# Create retriever
retriever = vectorstore.as_retriever()
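Conceptually, the retriever embeds the query and returns the chunks whose vectors are closest to it. The sketch below illustrates that idea in plain Python with toy 2-D vectors; it is not how FAISS is implemented, and `normalize`, `top_k`, and the sample vectors are hypothetical. Because the example above sets `normalize_embeddings=True`, all vectors have unit length, so ranking by dot product gives the same order as ranking by Euclidean distance.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (index, cosine similarity) for the k vectors closest to the query."""
    q = normalize(query)
    scored = [
        (i, sum(a * b for a, b in zip(q, normalize(v))))
        for i, v in enumerate(vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "chunk embeddings" (made-up values, not real model output)
chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = top_k([0.9, 0.1], chunk_vectors, k=2)
```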

Advanced RAG with Reranking

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# Initialize reranker
model = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-v2-m3")
compressor = CrossEncoderReranker(model=model, top_n=3)

# Create compression retriever
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, 
    base_retriever=retriever
)

RAG Chain Implementation

from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

# Define prompt template
template = """<|system|>
You are an assistant for question-answering tasks.
Using the information contained in the context,
give a comprehensive answer to the question.
Respond only to the question asked; the response should be concise and relevant to the question.
Provide the number of the source document when relevant.
If you don't know the answer, just say that you don't know.
Answer in Korean. <|end|>

<|user|>
Context:
{context}

Question: {question}<|end|>
<|assistant|>"""

prompt = PromptTemplate.from_template(template)

# Join the retrieved documents into a single context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Create RAG chain (llm is one of the chat models from the Configuration section)
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Query the system
result = chain.invoke("Your question here")
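The prompt asks the model to cite the number of the source document, which only works if the context passed to the model carries those numbers. A minimal sketch of a numbering variant of `format_docs` follows; `Doc` is a hypothetical stand-in for LangChain's `Document` (only `page_content` is used), and the `[n]` label format is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Doc:
    """Stand-in for a retrieved document; only page_content is needed here."""
    page_content: str


def format_docs_numbered(docs: Iterable[Doc]) -> str:
    """Join documents into one context string, prefixing each with [n]."""
    return "\n\n".join(
        f"[{i}] {doc.page_content}" for i, doc in enumerate(docs, start=1)
    )


# Placeholder texts, not taken from the sample report
context = format_docs_numbered([Doc("Revenue grew."), Doc("Costs fell.")])
```

Such a function could be used in place of `format_docs` when building the chain.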

🔧 Configuration

Embedding Models

The system supports multiple embedding models:

all-MiniLM-L6-v2 - Fast and efficient (default)
jhgan/ko-sroberta-multitask - Korean language optimized

Language Models

Multiple LLM providers are supported:

Groq (Fast & Free)

import os

from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama3-8b-8192",  # Llama 3 8B; small and fast
    temperature=0,
    max_tokens=1024,
    groq_api_key=os.environ["GROQ_API_KEY"]
)

Anthropic (Claude - High Quality)

import os

from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(
    model="claude-3-5-sonnet-20241022",  # Claude 3.5 Sonnet snapshot
    temperature=0,
    max_tokens=1024,
    api_key=os.environ["ANTHROPIC_API_KEY"]
)

OpenAI (GPT Models)

import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",  # GPT-4o
    temperature=0,
    max_tokens=1024,
    api_key=os.environ["OPENAI_API_KEY"]
)
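Since the three providers differ only in class, model name, and key, the selection can be driven by a small table. The sketch below is an assumption about how such a switch might look, not the code in `rag_agent_streamlit.py`; `PROVIDERS` and `build_llm` are hypothetical names, while the model names and keyword arguments are copied from the snippets above. The provider package is imported lazily, so only the selected one needs to be installed.

```python
import importlib
import os

# One row per provider; values mirror the configuration snippets above.
PROVIDERS = {
    "groq": {
        "module": "langchain_groq", "cls": "ChatGroq",
        "model": "llama3-8b-8192",
        "env": "GROQ_API_KEY", "key_arg": "groq_api_key",
    },
    "anthropic": {
        "module": "langchain_anthropic", "cls": "ChatAnthropic",
        "model": "claude-3-5-sonnet-20241022",
        "env": "ANTHROPIC_API_KEY", "key_arg": "api_key",
    },
    "openai": {
        "module": "langchain_openai", "cls": "ChatOpenAI",
        "model": "gpt-4o",
        "env": "OPENAI_API_KEY", "key_arg": "api_key",
    },
}


def build_llm(provider: str, temperature: float = 0, max_tokens: int = 1024):
    """Instantiate the chat model for the given provider name."""
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {provider!r}")
    cfg = PROVIDERS[provider]
    api_key = os.environ.get(cfg["env"])
    if not api_key:
        raise RuntimeError(f"Set {cfg['env']} before using {provider}")
    module = importlib.import_module(cfg["module"])  # imported only when needed
    chat_cls = getattr(module, cfg["cls"])
    return chat_cls(
        model=cfg["model"],
        temperature=temperature,
        max_tokens=max_tokens,
        **{cfg["key_arg"]: api_key},
    )
```

With the matching key set, `llm = build_llm("groq")` would yield an object that can be used as `llm` in the chain.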

📊 Evaluation

The system includes comprehensive evaluation using Ragas metrics:

Faithfulness - How well the answer is grounded in retrieved context
Answer Relevancy - How pertinent the answer is to the query
Context Precision - Relevance of retrieved contexts
Context Recall - How much relevant information is captured

from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
)

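# Evaluate RAG performance.
# `dataset` is an evaluation set you prepare beforehand (questions, generated answers,
# retrieved contexts, and reference answers); `llm` and `embeddings` are the objects defined above.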
result = evaluate(
    dataset=dataset,
    metrics=[
        faithfulness,
        answer_relevancy,
        context_recall,
        context_precision
    ],
    llm=llm,
    embeddings=embeddings
)

🎯 Performance Metrics

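The bands below are a rule of thumb for reading each Ragas score, which ranges from 0 to 1 (higher is better):

Excellent/Production-Ready: >0.85
Good/Acceptable: 0.7–0.85
Needs Improvement: 0.5–0.7
Poor: <0.5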
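The bands above can be applied programmatically to a set of metric scores. The following is a minimal sketch; `rate_score` is a hypothetical helper, and because the listed ranges share their endpoints, it assumes each lower bound is inclusive (so 0.7 counts as Good and 0.85 is still Good, not Excellent). The example scores are placeholders, not measured results.

```python
def rate_score(score: float) -> str:
    """Map a score in [0, 1] to the rating bands listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score > 0.85:
        return "Excellent/Production-Ready"
    if score >= 0.7:
        return "Good/Acceptable"
    if score >= 0.5:
        return "Needs Improvement"
    return "Poor"


# Placeholder values for illustration only
example_scores = {"faithfulness": 0.9, "context_recall": 0.6}
summary = {name: rate_score(value) for name, value in example_scores.items()}
```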

📁 Project Structure

AI-Agent/
├── agent.ipynb                 # Main RAG implementation notebook
├── rag_agent_streamlit.py     # Streamlit web interface
├── rag_with_rerank.py         # Reranking implementation
├── rerank_module.py           # Reranking utilities
├── requirements.txt           # Python dependencies
├── README.md                  # This file
├── RERANKING_GUIDE.md         # Reranking documentation
└── Amazon-2024-Annual-Report.pdf  # Sample document

🚀 Quick Start

Local Development

Launch Streamlit app:
```
streamlit run rag_agent_streamlit.py
```
Open your browser to http://localhost:8501
Select AI Provider and enter your API key
Upload a PDF and start asking questions!

Streamlit Cloud Deployment

Fork this repository on GitHub
Deploy on Streamlit Cloud
Configure API keys in the app settings
Share your app with others!

Test with Sample Document

The system comes with Amazon's 2024 Annual Report for testing.

🔍 Key Features Explained

Text Chunking Strategy

Chunk Size: 1000 characters (large enough to keep related sentences together)
Overlap: 200 characters (reduces information loss at chunk boundaries)
Method: Recursive character splitting (respects natural boundaries)
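To make the size and overlap settings concrete, here is a simplified sliding-window splitter in plain Python. It is a sketch of the overlap idea only, not the algorithm used by `RecursiveCharacterTextSplitter`, which additionally tries to break on separators such as paragraph and line breaks; the function name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts 800 chars after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks
```

With the defaults, a 2,600-character text yields three chunks starting at offsets 0, 800, and 1600, and the last 200 characters of each chunk reappear at the start of the next.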

Retrieval Methods

Dense Retrieval: Semantic similarity using embeddings
Sparse Retrieval: Keyword-based search (BM25)
Hybrid: Combines both methods for optimal results
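One common way to combine a dense ranking and a sparse ranking is reciprocal rank fusion (RRF). The sketch below is an illustration of that technique under stated assumptions: the source does not say which fusion method the project uses, the constant `k=60` is a conventional default rather than a project setting, and the document IDs are made up.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns more credit the higher it appears in each list
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_ranking = ["a", "b", "c"]   # e.g. from embedding similarity
sparse_ranking = ["b", "d", "a"]  # e.g. from BM25
fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
```

Documents that appear near the top of both lists, such as `b` here, end up ahead of documents that only one method found.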

Reranking Benefits

Improves retrieval quality by 15-20%
Reduces noise in retrieved contexts
Better semantic understanding of query-document relationships
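The reranking step itself is simple: score every retrieved candidate against the query, sort, and keep the best few. The sketch below shows that flow with a toy word-overlap scorer standing in for the cross-encoder; `overlap_score` and `rerank` are hypothetical names, the sample sentences are made up, and in the pipeline above the scoring role is played by `HuggingFaceCrossEncoder` with `top_n=3`.

```python
from typing import Callable, List, Sequence


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank(
    query: str,
    docs: Sequence[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: int = 3,
) -> List[str]:
    """Sort candidates by score_fn(query, doc), best first, and keep top_n."""
    ranked = sorted(docs, key=lambda doc: score_fn(query, doc), reverse=True)
    return ranked[:top_n]


candidates = [
    "the weather was mild this year",
    "net sales increased in 2024",
    "net sales and operating income",
    "shareholders met in may",
]
best = rerank("net sales increased", candidates, top_n=2)
```

Swapping `score_fn` for a model-based scorer changes the quality of the ordering but not the shape of the step.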

☁️ Deployment Guide

Streamlit Cloud (Recommended)

Prepare your repository:
- Ensure all files are committed to GitHub
- Verify requirements.txt is up to date
- Test locally first
Deploy on Streamlit Cloud:
- Go to share.streamlit.io
- Sign in with GitHub
- Click "New app"
- Select your repository and branch
- Set main file path to rag_agent_streamlit.py

Configure secrets:

In your app's settings, add these secrets:

groq_api_key = "your_groq_api_key"
anthropic_api_key = "your_anthropic_api_key"  
openai_api_key = "your_openai_api_key"

Advanced settings (optional):
- Python version: 3.8
- Memory: 1GB (default)
- Timeout: 30 seconds

Other Deployment Options

Heroku

# Add Procfile (already included)
web: streamlit run rag_agent_streamlit.py --server.port=$PORT --server.address=0.0.0.0

# Deploy
git add .
git commit -m "Deploy to Heroku"
git push heroku main

Docker

FROM python:3.8-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
EXPOSE 8501

CMD ["streamlit", "run", "rag_agent_streamlit.py", "--server.port=8501", "--server.address=0.0.0.0"]

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

📄 License

This project is open source and available under the MIT License.

🙏 Acknowledgments

LangChain for the RAG framework
HuggingFace for embedding models
FAISS for vector search
Ragas for evaluation metrics
