Add BEIR benchmark evaluation script #3

## Task

Add a benchmark evaluation script that runs VORTEXRAG against the standard BEIR benchmark suite.

**What's needed:**
- Script at `benchmarks/eval_beir.py`
- Load BEIR datasets via the `beir` library
- Run VORTEXRAG on each dataset
- Output a results table: dataset → NDCG@10, Recall@100, MAP
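The steps above could be sketched roughly as follows. This is a minimal sketch, not the intended implementation: the `retrieve` callable is a hypothetical stand-in for VORTEXRAG (its real API is not described in this issue), the MAP cutoff of 100 is an assumption because the issue does not specify one, and the download URL is the one used in the BEIR README, which should be verified before use.

```python
from typing import Callable, Dict

# Hypothetical signature: VORTEXRAG's real retrieval API is not described here.
# Takes (corpus, queries, top_k) and returns {query_id: {doc_id: score}}.
RetrieveFn = Callable[[dict, dict, int], Dict[str, Dict[str, float]]]

# Download location used in the BEIR README (assumed to still be current).
BEIR_URL = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip"

METRICS = ["NDCG@10", "Recall@100", "MAP"]


def evaluate_dataset(name: str, retrieve: RetrieveFn, split: str = "test",
                     out_dir: str = "datasets") -> Dict[str, float]:
    """Download one BEIR dataset, run the retriever, and return the three metrics."""
    # Imported lazily so the table helper below works without beir installed.
    from beir import util
    from beir.datasets.data_loader import GenericDataLoader
    from beir.retrieval.evaluation import EvaluateRetrieval

    data_path = util.download_and_unzip(BEIR_URL.format(name), out_dir)
    corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split=split)

    # Retrieve 100 documents per query so Recall@100 is meaningful.
    results = retrieve(corpus, queries, 100)

    ndcg, _map, recall, _precision = EvaluateRetrieval.evaluate(qrels, results, [10, 100])
    return {
        "NDCG@10": ndcg["NDCG@10"],
        "Recall@100": recall["Recall@100"],
        "MAP": _map["MAP@100"],  # assumed cutoff; the issue only says "MAP"
    }


def format_results_table(rows: Dict[str, Dict[str, float]]) -> str:
    """Render {dataset: {metric: value}} as a Markdown table."""
    lines = [
        "| Dataset | " + " | ".join(METRICS) + " |",
        "|---|" + "---|" * len(METRICS),
    ]
    for name, scores in rows.items():
        cells = " | ".join(f"{scores[m]:.4f}" for m in METRICS)
        lines.append(f"| {name} | {cells} |")
    return "\n".join(lines)
```

Keeping the table formatting separate from the evaluation loop means the output format can be unit-tested without downloading any dataset.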

**Relevant BEIR datasets:** MS MARCO, NQ, HotpotQA, FiQA, SCIDOCS, FEVER, SciFact
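One way to drive the script is a small lookup from BEIR dataset identifier to evaluation split. The sketch below assumes the lowercase identifiers used by the BEIR download archives, and that MS MARCO is evaluated on its `dev` split while the others use `test`, as in the BEIR paper. The `select_datasets` helper is hypothetical and only illustrates how a command-line dataset filter might work.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# BEIR dataset id -> evaluation split (assumed from the BEIR paper's setup).
BEIR_SPLITS: Dict[str, str] = {
    "msmarco": "dev",
    "nq": "test",
    "hotpotqa": "test",
    "fiqa": "test",
    "scidocs": "test",
    "fever": "test",
    "scifact": "test",
}


def select_datasets(requested: Optional[Iterable[str]] = None) -> List[Tuple[str, str]]:
    """Return [(dataset_id, split)] for the requested names, or all seven if None."""
    if requested is None:
        return list(BEIR_SPLITS.items())
    plan = []
    for name in requested:
        key = name.lower()  # accept "SciFact", "scifact", "SCIFACT", ...
        if key not in BEIR_SPLITS:
            raise ValueError(f"Unknown BEIR dataset: {name!r}")
        plan.append((key, BEIR_SPLITS[key]))
    return plan
```

Starting with a small dataset such as SciFact keeps iteration fast before running the larger ones like MS MARCO or FEVER.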

**Skills needed:** Python, familiarity with BEIR

Reference: BEIR paper (Thakur et al., 2021), https://arxiv.org/abs/2104.08663

**Comment below if you'd like this assigned to you!**
