feat: add async evaluation and BM25 baseline to eval_beir.py #10
Open
vaishnavidesai09 wants to merge 1 commit into
- --async-eval flag: evaluates datasets concurrently via asyncio + ThreadPoolExecutor
- --bm25 flag: adds BM25 baseline columns to output table (requires rank-bm25)
- --max-workers param: controls thread-pool size for async mode
- print_table() extended to show side-by-side VORTEXRAG vs BM25 columns
- save_csv() handles variable columns gracefully

Closes vignesh2027#3

Signed-off-by: Vaishnavi Desai <vaishnavidesai957@gmail.com>
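For reviewers, one way save_csv() could handle variable columns is sketched below. This is a minimal illustration, not the code in this PR: the function name write_rows, the column names, and the placeholder values are all hypothetical. It relies on csv.DictWriter, which can ignore extra keys and fill missing ones.

```python
import csv
import io


def write_rows(rows, out, include_bm25):
    """Write result rows; BM25 columns appear only when requested.

    Hypothetical helper: column names are illustrative, not the PR's.
    """
    fieldnames = ["dataset", "vortexrag_ndcg@10"]
    if include_bm25:
        fieldnames.append("bm25_ndcg@10")
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        extrasaction="ignore",  # drop keys that are not in fieldnames
        restval="",             # fill keys that a row is missing
    )
    writer.writeheader()
    writer.writerows(rows)


# Placeholder rows (values are not real benchmark results).
rows = [
    {"dataset": "scifact", "vortexrag_ndcg@10": 0.0, "bm25_ndcg@10": 0.0},
    {"dataset": "nfcorpus", "vortexrag_ndcg@10": 0.0},  # no BM25 value
]

with_bm25 = io.StringIO()
write_rows(rows, with_bm25, include_bm25=True)

without_bm25 = io.StringIO()
write_rows(rows, without_bm25, include_bm25=False)
```

With this approach the same row dictionaries can be written whether or not the BM25 comparison is enabled, so the caller does not need two code paths.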
Hi @vignesh2027! I've submitted the PR for #3. Implemented async dataset evaluation ( Validated with:
The output shows the expected side-by-side VORTEXRAG vs BM25 metrics. Looking forward to your feedback!
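For context on what a BM25 baseline column measures, here is a self-contained Okapi BM25 scorer in plain Python. It is not the rank-bm25 package this PR depends on, and the function name bm25_scores, the parameter defaults, and the toy corpus are assumptions made for illustration. It uses the non-negative IDF variant, log(1 + (N - df + 0.5) / (df + 0.5)).

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every tokenized document in `corpus` against a tokenized query."""
    n_docs = len(corpus)
    avgdl = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in corpus for term in set(doc))
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus, whitespace-tokenized by hand.
corpus = [
    ["async", "evaluation", "thread"],
    ["bm25", "baseline", "ranking"],
    ["csv", "export"],
]
scores = bm25_scores(["bm25", "baseline"], corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
```

Only the second document shares terms with the query, so it receives the only non-zero score.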
Closes #3
Extends benchmarks/eval_beir.py as suggested in #3.

What's added
- --async-eval flag: evaluates multiple datasets concurrently via asyncio + ThreadPoolExecutor instead of sequential execution
- --bm25 flag: runs BM25 alongside VORTEXRAG and adds side-by-side comparison columns to the output table (requires pip install rank-bm25)
- --max-workers parameter: controls thread-pool size for async mode (default: 4)
- print_table() extended to display VORTEXRAG vs BM25 metrics side-by-side
- save_csv() updated to handle variable columns gracefully when BM25 comparison is enabled or disabled

Validation
Executed:
Results:
The benchmark completed successfully in async mode and produced side-by-side VORTEXRAG vs BM25 metrics as expected.
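The async mode described in this PR combines asyncio with a ThreadPoolExecutor. A minimal sketch of that general pattern follows. It is not the code in eval_beir.py: evaluate_dataset and evaluate_all are hypothetical names, the sleep stands in for the real blocking evaluation, and the returned metric is a placeholder rather than a result.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def evaluate_dataset(name: str) -> dict:
    """Hypothetical stand-in for a blocking per-dataset evaluation."""
    time.sleep(0.05)  # simulates retrieval and metric computation
    return {"dataset": name, "ndcg@10": 0.0}  # placeholder, not a real score


async def evaluate_all(datasets: list[str], max_workers: int = 4) -> list[dict]:
    """Run the blocking evaluations concurrently on a thread pool."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            loop.run_in_executor(pool, evaluate_dataset, name)
            for name in datasets
        ]
        # gather returns results in the same order as `datasets`.
        return await asyncio.gather(*futures)


# Example dataset names; any list of names works here.
results = asyncio.run(evaluate_all(["scifact", "nfcorpus", "fiqa"], max_workers=4))
```

Because asyncio.gather preserves input order, the output table keeps the order in which datasets were requested even when they finish at different times.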
Example usage
Async + BM25 comparison
Specific datasets with CSV export