feat: wire BM25F cross-field aggregation with configurable per-field weights #9
Merged
…weights

Introduce QueryNode::TermExpansion to preserve the full searched default field set separately from resolved term children. When execution sees a TermExpansion under a BM25F scorer it aggregates per-document field stats (including zero-frequency entries for searched fields absent from the posting) into a single BM25F scoring call. Explicit boolean OR remains a sum-of-child-scores operator.

Add configurable per-field weights via PlanningContext::with_field_weights. Weights flow through TermExpansion into FieldStats.weight for the BM25F scorer. Fields not explicitly weighted default to 1.0.

Also adds integration tests covering aggregation correctness, boost propagation, duplicate-field fallback, explicit cross-field OR, absent-field length inclusion, field weight ranking effects, and exact-score verification against hand-computed Bm25FScorer output.
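To make the aggregation concrete, here is a minimal sketch of one common BM25F formulation in which weighted term frequencies and weighted field lengths are combined before a single saturation step. It is an illustration under stated assumptions, not the code in this PR: `SketchFieldStats` and `bm25f_score` are hypothetical names, the parameters `idf`, `k1`, and `b` are passed in directly, and the real `Bm25FScorer` may normalize differently. It shows why a searched field with zero term frequency still matters: its length enters the normalization.

```rust
/// Hypothetical per-field statistics for one term in one document.
/// This is NOT the crate's `FieldStats`; it only mirrors the idea.
#[derive(Debug, Clone, Copy)]
struct SketchFieldStats {
    term_freq: f64,     // occurrences of the term in this field (0.0 if absent)
    field_len: f64,     // length of this field in this document
    avg_field_len: f64, // average length of this field across the index
    weight: f64,        // per-field weight; 1.0 when not configured
}

/// Score one term against one document with a single aggregate call.
/// Assumed formulation: combine weighted tf and weighted lengths first,
/// then apply BM25 saturation once.
fn bm25f_score(fields: &[SketchFieldStats], idf: f64, k1: f64, b: f64) -> f64 {
    let tf: f64 = fields.iter().map(|f| f.weight * f.term_freq).sum();
    let len: f64 = fields.iter().map(|f| f.weight * f.field_len).sum();
    let avg_len: f64 = fields.iter().map(|f| f.weight * f.avg_field_len).sum();
    if tf == 0.0 || avg_len == 0.0 {
        return 0.0;
    }
    // Length normalization uses every searched field, including zero-tf ones.
    let norm = 1.0 - b + b * (len / avg_len);
    idf * tf * (k1 + 1.0) / (tf + k1 * norm)
}

fn main() {
    let title = SketchFieldStats { term_freq: 1.0, field_len: 4.0, avg_field_len: 5.0, weight: 2.0 };
    // The term does not occur in the body, but the body was searched.
    let body = SketchFieldStats { term_freq: 0.0, field_len: 300.0, avg_field_len: 100.0, weight: 1.0 };

    let with_absent = bm25f_score(&[title, body], 1.5, 1.2, 0.75);
    let without_absent = bm25f_score(&[title], 1.5, 1.2, 0.75);
    println!("with absent field: {with_absent:.4}, without: {without_absent:.4}");
}
```

In this sketch, dropping the zero-frequency body entry changes the score because the long body no longer contributes to the combined length, which is the effect the absent-field length inclusion test is meant to guard. A sum-of-child-scores `OR` would instead score each field independently and add the results.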
Summary
- Introduce `QueryNode::TermExpansion` to preserve the full searched default field set separately from resolved term children, enabling aggregate scoring
- When execution sees a `TermExpansion` under a BM25F scorer, it collects per-document field stats (including zero-frequency entries for absent fields) into a single `Bm25FScorer::score` call
- Add configurable per-field weights via `PlanningContext::with_field_weights`; weights flow through `TermExpansion` into `FieldStats.weight`. Fields not explicitly weighted default to `1.0`
- Explicit boolean `OR` remains a sum-of-child-scores operator; only planner-generated default-field expansions use aggregate scoring

Test plan
- … (`cargo test --workspace`)
- … `search_behavior.rs` verify:
  - … `Bm25FScorer::score` output
  - … `Bm25FScorer::score` output
  - … `TermExpansion` carries field weights

🤖 Generated with Claude Code