rag-kit is a simple, modular Python library for building PDF-based RAG applications with conversational memory and flexible LLM provider support.
It is designed to hide most of the LangChain complexity behind a clean API:
    from ragkit import PDFRAG

    rag = PDFRAG("data/sample.pdf")
    print(rag.ask("What is LangChain?"))

- PDF-based RAG
- Conversational chat with session memory
- Follow-up handling for queries like:
  - "hindi m batao" (romanized Hindi for "tell me in Hindi")
  - "tell me in english"
  - "what did I ask earlier?"
- Query rewriting for better retrieval
- Source return support
- Configurable chunking and retrieval
- Multiple LLM provider support:
  - Sarvam (default)
  - OpenAI
  - Anthropic / Claude
  - Custom LangChain-compatible chat models
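Install the base package:

    pip install rag-kit

Install with optional provider extras:

    pip install "rag-kit[openai]"
    pip install "rag-kit[anthropic]"
    pip install "rag-kit[all]"

For local development, install in editable mode from a local clone:

    pip install -e .

Create a .env file in your project root:

    SARVAM_API_KEY=
    OPENAI_API_KEY=
    ANTHROPIC_API_KEY=

An example template is provided in .env.example.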
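Before constructing a PDFRAG instance, it can help to confirm that the key for your chosen provider is actually filled in. The sketch below is only an illustration: `parse_env` and `missing_keys` are hypothetical helpers, not part of rag-kit, and the README does not say whether rag-kit loads .env on its own. In a real project, the python-dotenv package is the usual way to load such a file.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and simple quoting around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env: dict, required: list) -> list:
    """Return the required keys that are absent or left empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    # Hypothetical file contents, shaped like the template above.
    sample = "SARVAM_API_KEY=demo-key\nOPENAI_API_KEY=\nANTHROPIC_API_KEY=\n"
    env = parse_env(sample)
    print(missing_keys(env, ["SARVAM_API_KEY", "OPENAI_API_KEY"]))
```

Only the keys for the providers you actually use need values; the others can stay empty.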
Ask a one-off question:

    from ragkit import PDFRAG

    rag = PDFRAG("data/sample.pdf")
    answer = rag.ask("What is memory?")
    print(answer)

Chat with session memory:

    from ragkit import PDFRAG

    rag = PDFRAG("data/sample.pdf")
    session_id = "user1"

    # Follow-ups in the same session can refer back to earlier turns.
    print(rag.chat("What is memory?", session_id=session_id))
    print(rag.chat("hindi m batao", session_id=session_id))
    print(rag.chat("tell me in english", session_id=session_id))

Return sources along with the answer:

    from ragkit import PDFRAG

    rag = PDFRAG("data/sample.pdf")
    result = rag.ask("What is memory?", return_sources=True)
    print(result["answer"])
    print(result["sources"])

Example shape:
    {
      "answer": "Memory in LangChain stores previous conversation turns...",
      "sources": [
        {
          "content": "Memory in chat applications is created by storing earlier conversation turns...",
          "page": 2,
          "source": "data/sample.pdf",
          "metadata": {
            "page": 2,
            "source": "data/sample.pdf"
          }
        }
      ]
    }

| Method | Purpose |
|---|---|
| ask() | Stateless document Q&A |
| chat() | History-aware conversational interaction |

Use ask() when you want a direct answer from the document.
Use chat() when you want:
- follow-up questions
- translation of the previous answer
- history-based conversation
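To make the difference concrete, here is a toy model of stateless versus session-scoped calls. It is a sketch under stated assumptions: `ToySessionMemory` is a hypothetical class that only mimics the shape of ask(), chat(), and reset_chat(); it does no retrieval and is not how rag-kit is implemented.

```python
from collections import defaultdict


class ToySessionMemory:
    """Hypothetical stand-in: keeps per-session history, no retrieval."""

    def __init__(self):
        # Maps session_id -> list of (question, reply) turns.
        self._history = defaultdict(list)

    def ask(self, question: str) -> str:
        # Stateless: nothing is stored between calls.
        return f"answer to: {question}"

    def chat(self, question: str, session_id: str) -> str:
        history = self._history[session_id]
        reply = f"answer to: {question}"
        if history:
            # A follow-up can see the previous turn of the same session.
            reply += f" (previous question: {history[-1][0]})"
        history.append((question, reply))
        return reply

    def reset_chat(self, session_id: str) -> None:
        # Forget everything stored for this session.
        self._history.pop(session_id, None)


if __name__ == "__main__":
    memory = ToySessionMemory()
    print(memory.chat("What is memory?", session_id="user1"))
    print(memory.chat("tell me in english", session_id="user1"))
    print(memory.chat("What is memory?", session_id="user2"))
```

The point of the sketch is that history is keyed by session_id, so two sessions never see each other's turns, and ask() never touches the history at all.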
Default provider (Sarvam):

    from ragkit import PDFRAG

    rag = PDFRAG("file.pdf")

OpenAI:

    from ragkit import PDFRAG

    rag = PDFRAG(
        "file.pdf",
        llm_provider="openai",
        llm_config={
            "model": "gpt-4o-mini",
            "temperature": 0.1,
        },
    )

Anthropic / Claude:

    from ragkit import PDFRAG

    rag = PDFRAG(
        "file.pdf",
        llm_provider="claude",
        llm_config={
            "model": "claude-3-5-haiku-latest",
            "temperature": 0.2,
        },
    )

Custom LangChain-compatible chat model:

    from langchain_openai import ChatOpenAI
    from ragkit import PDFRAG

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    rag = PDFRAG("file.pdf", llm=llm)

Custom chunking and retrieval settings:

    from ragkit import PDFRAG, RAGConfig

    config = RAGConfig(
        chunk_size=800,
        chunk_overlap=150,
        top_k=5,
        use_multi_query=True,
        enable_query_rewrite=True,
    )
    rag = PDFRAG("file.pdf", config=config)

Configurable options currently include:

- persist_directory
- chunk_size
- chunk_overlap
- top_k
- use_multi_query
- enable_query_rewrite
- collection_name
- verbose
- llm_provider
- llm_model
- llm_temperature
- llm_kwargs
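
The chunk_size and chunk_overlap options control how document text is split before indexing. As a rough illustration of how the two interact, the sketch below uses a plain character-based sliding window. `chunk_text` is a hypothetical helper; the README does not describe rag-kit's actual splitter, which may well differ (for example, by preferring paragraph or sentence boundaries).

```python
def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 150) -> list:
    """Split text into windows of chunk_size characters that share
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(piece)
```

A larger overlap lowers the chance that a sentence is cut across two chunks, at the cost of indexing more duplicated text.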
Add another PDF to an existing instance:

    rag.add_documents("data/another.pdf")

Reset the chat history for a session:

    rag.reset_chat("user1")

Project layout:

    rag-kit/
    ├── .env.example
    ├── .gitignore
    ├── README.md
    ├── pyproject.toml
    ├── examples/
    ├── data/
    ├── src/
    │   └── ragkit/
    └── third_party/

A requirements.txt file is not strictly necessary.
For modern Python packaging, pyproject.toml is enough and should be the main source of dependencies.
Use requirements.txt only if you want one of these:
- easier local setup for teammates
- a pinned development environment
- a quick install for people who do not use packaging workflows

Keep:

- pyproject.toml as the main dependency file

Optional:

- requirements-dev.txt for local development and testing

Example requirements-dev.txt:

    pytest
    black
    ruff
    build
    twine

If you want, you can also generate a plain requirements.txt, but it should not replace pyproject.toml.

Current limitations:

- Primarily optimized for PDF-based RAG
- Sarvam support may depend on vendored or local integration setup
- No streaming support yet
- No FastAPI server or UI layer yet
- Agent support is planned, but not included in the current public API

Planned improvements:

- Better source citations
- Improved multi-file indexing isolation
- Streaming responses
- FastAPI server mode
- Playground / UI
- Agent support via ragkit.agent

Check the examples/ folder for runnable examples such as:

- basic_ask.py
- chat_example.py
- provider_openai.py

MIT License
Current version: 0.1.0-beta
APIs may evolve in future releases.