Task
Add a benchmark evaluation script that runs VORTEXRAG against the standard BEIR benchmark suite.
What's needed:
- Script at benchmarks/eval_beir.py
- Load BEIR datasets via the beir library
- Run VORTEXRAG on each dataset
- Output a results table: dataset → NDCG@10, Recall@100, MAP
Relevant BEIR datasets: MSMARCO, NQ, HotpotQA, FiQA, SCIDOCS, FEVER, SciFact
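A rough sketch of how the script could be shaped, as a starting point rather than a spec. It assumes the usual beir loading and evaluation calls (`util.download_and_unzip`, `GenericDataLoader`, `EvaluateRetrieval.evaluate`) and the public BEIR download URL from the beir README. The `retrieve` callable is a hypothetical adapter around VORTEXRAG, since its retrieval API isn't described here. It is assumed to return `{query_id: {doc_id: score}}`. Reporting MAP as MAP@100 and using the dev split for MSMARCO are also assumptions.

```python
from typing import Callable, Dict

# beir's dataset identifiers for the datasets listed in this issue.
DATASETS = ["msmarco", "nq", "hotpotqa", "fiqa", "scidocs", "fever", "scifact"]

# Assumption: the public download location documented in the beir README.
BEIR_URL = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip"

Results = Dict[str, Dict[str, float]]  # query_id -> {doc_id: score}
METRICS = ["NDCG@10", "Recall@100", "MAP"]


def evaluate_dataset(name: str,
                     retrieve: Callable[[dict, dict], Results],
                     out_dir: str = "datasets") -> Dict[str, float]:
    """Download one BEIR dataset, run `retrieve` on it, return the three metrics."""
    # Imported lazily so format_table() below works without beir installed.
    from beir import util
    from beir.datasets.data_loader import GenericDataLoader
    from beir.retrieval.evaluation import EvaluateRetrieval

    data_path = util.download_and_unzip(BEIR_URL.format(name), out_dir)
    # Assumption: MSMARCO is scored on its dev split, everything else on test.
    split = "dev" if name == "msmarco" else "test"
    corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split=split)

    # Hypothetical VORTEXRAG adapter: (corpus, queries) -> Results.
    results = retrieve(corpus, queries)

    ndcg, _map, recall, _precision = EvaluateRetrieval.evaluate(qrels, results, [10, 100])
    return {
        "NDCG@10": ndcg["NDCG@10"],
        "Recall@100": recall["Recall@100"],
        "MAP": _map["MAP@100"],  # assumption: report MAP at cutoff 100
    }


def format_table(rows: Dict[str, Dict[str, float]]) -> str:
    """Render {dataset: {metric: value}} as a Markdown results table."""
    lines = [
        "| Dataset | " + " | ".join(METRICS) + " |",
        "|" + "---|" * (len(METRICS) + 1),
    ]
    for name, metrics in rows.items():
        cells = " | ".join(f"{metrics[m]:.4f}" for m in METRICS)
        lines.append(f"| {name} | {cells} |")
    return "\n".join(lines)
```

With a real adapter in place, the main loop would be roughly `print(format_table({d: evaluate_dataset(d, retrieve) for d in DATASETS}))`. MSMARCO, NQ, HotpotQA, and FEVER are large, so a `--datasets` flag to run a subset (for example SciFact only) would likely be worth adding.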
Skills needed: Python, familiarity with BEIR
Reference: https://arxiv.org/abs/2104.08663
Comment below if you'd like this assigned to you!