Feature/faiss index caching #5

Open
ahuou wants to merge 2 commits into eric11eca:main from ahuou:feature/faiss-index-caching

ahuou commented Jun 8, 2025

🚀 Add Optional FAISS Index Caching to RAG Evaluations

📌 Summary

This PR adds the ability to reuse or save FAISS indexes in RAG mode, reducing repeated embedding and chunking overhead when running multiple evaluations on the same dataset.

✅ Key Features

  • New config option: faiss_index_path under rag_params
  • When a path is set, the index is handled automatically:
    • Loads the existing index from that path if one is available
    • Otherwise builds the index and saves it to that path for reuse
  • Fully backward compatible — if no path is provided, behavior remains unchanged
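
The load-or-build behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in embed_model.py: the helper name `load_or_build` and its `build`, `load`, and `save` callables are hypothetical, and the sketch assumes the index is stored at a single filesystem path.

```python
import os
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def load_or_build(
    path: Optional[str],
    build: Callable[[], T],
    load: Callable[[str], T],
    save: Callable[[T, str], None],
) -> T:
    """Hypothetical helper: reuse a cached index at `path`, or build and save one."""
    if path is None:
        # No faiss_index_path configured: build every time, as before this PR.
        return build()
    if os.path.exists(path):
        # A previous run already saved an index here: skip the rebuild.
        return load(path)
    index = build()
    parent = os.path.dirname(path)
    if parent:
        # Create e.g. "indexes/" so the first save does not fail.
        os.makedirs(parent, exist_ok=True)
    save(index, path)
    return index
```

For a raw FAISS index, `faiss.read_index` and `faiss.write_index` would be plausible choices for `load` and `save`; whether this PR uses those calls or a wrapper library's own persistence methods is not stated here.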

🔧 Modified Files

  • main_accelerate.py: Passes faiss_index_path from YAML
  • embed_model.py: Builds or loads FAISS index based on the given path

🧪 Tested

  • Run once: builds & saves FAISS index
  • Run again: loads index successfully and skips rebuild

📁 Example rag_model.yaml

rag_params:
  embedding_model: "BAAI/bge-base-en"
  docs_name_or_path: "your_dataset"
  top_k: 5
  num_chunks: 100000
  similarity_fn: "cosine"
  faiss_index_path: "indexes/my_index"
