QuantMinds is a financial question-answering system that combines Retrieval-Augmented Generation (RAG), agentic orchestration, human-in-the-loop review, and Signal messaging integration.
It ingests PDFs, extracts text page by page, chunks and embeds content, indexes vectors with FAISS, serves a modern Gradio chatbot UI with source citations, and offers an advanced agentic mode with 6 specialized agents for hybrid internal/external research and visualization.
- PDF ingestion with incremental sync (10-50x faster updates via MD5 hash tracking)
- Hybrid retrieval (FAISS dense + BM25 lexical with Reciprocal Rank Fusion)
- Classic RAG mode for fast, grounded answers with 83% factual accuracy
- Agentic mode with 6 specialist agents: Router, Internal, External, Synthesizer, Visualizer, Orchestrator
- Human-in-the-loop review (approve/rewrite/research actions)
- Live source panel and session history in UI
- Signal bot integration for messaging
- Real-time analytics dashboard and cost tracking
- Full execution tracing and report generation
QuantMinds/
app/
app.py # Gradio UI
evaluate.py # Evaluation runner
data/
pdfs/ # Input PDFs
corpus.json # Extracted pages
chunks.json # Chunk metadata used by retrieval
my_index.faiss # Vector index
pipeline_state.json # Change-detection state (auto-generated)
scripts/
extract.py # PDF to corpus extractor
rag_pipeline.py # Main smart-sync pipeline entrypoint
rag/
chunking.py
embedding.py
indexing.py
retrieval.py
generation.py
pipeline.py # Incremental sync + full build logic
Install dependencies from requirements.txt:
pip install -r requirements.txtRequired environment variable:
OPENAI_API_KEY=your_api_key_hereYou can place this in a .env file at project root.
The pipeline uses change detection on data/pdfs/.
- If PDFs are added, removed, or modified: rebuild extraction + chunks + embeddings + index
- If pipeline config changes: rebuild
- If nothing changed: skip rebuild and reuse existing artifacts
This is tracked in data/pipeline_state.json.
Recommended command:
python scripts/rag_pipeline.pyThis runs smart sync automatically and then performs a retrieval sanity check.
Only sync/rebuild logic (no retrieval sanity check):
python scripts/rag/pipeline.py --syncForce rebuild:
python scripts/rag/pipeline.py --sync --forceStart chatbot UI:
python app/app.pyOn startup, app.py runs smart pipeline sync first.
- PDFs changed -> rebuild pipeline
- No changes -> skip rebuild
Then the UI opens.
- Two-pane interface:
- Left: chat interaction
- Right: source PDF pages for latest answer
- Query result caching in memory for repeated questions during the same run
- Index/chunks are loaded once per server process and reused
Run evaluation suite:
python app/evaluate.pyCategories covered:
- factual
- cross-reference
- out-of-scope
- ambiguous
- no-answer
- prompt-injection
The answer generation prompt is tuned to:
- use only provided context
- consider all provided sources
- stay concise (2-3 sentences)
- refuse when context is insufficient
- ignore malicious instructions in user input/context
- cite sources
- Major cost comes from embeddings and generation calls
- Rebuilds are skipped when PDFs/config are unchanged
- Repeated identical chat queries in one app session are served from in-memory cache
- Run commands from project root (
QuantMinds/) - Use the provided entrypoints as documented
- Ensure
data/pdfs/contains PDFs - Run:
python scripts/rag_pipeline.py- Ensure
OPENAI_API_KEYis set in environment or.env
- Python standard library imports (such as
os,sys,json,argparse) are not listed inrequirements.txt - Only third-party packages belong in
requirements.txt