Add a rerank stage to search: ~half of LoCoMo benchmark misses are ranking failures, and there is ~20x latency headroom

## Context: fresh benchmark run (2026-06-10)

Full LoCoMo retrieval run via `basic-memory-benchmarks` (run `3e11241b9d56`, 1,986 queries): Basic Memory main @ 0.21.6 (commit de53e0ec) vs mem0ai 2.0.5, both at their latest versions with out-of-the-box defaults. Reproducible from the benchmarks repo with basic-memory-benchmarks#13 and basic-memory-benchmarks#14 applied.

Headline (LoCoMo categories 1–4):

| | recall@5 | recall@10 | MRR | content-hit | mean latency | p95 |
|---|---|---|---|---|---|---|
| bm-local | 0.733 | 0.839 | 0.619 | 0.277 | **45ms** | 53ms |
| mem0-local | 0.791 | 0.891 | 0.648 | 0.344 | 882ms | 1,603ms |
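For readers less familiar with the ranking columns, here is a minimal sketch of recall@k and MRR. It assumes the common textbook definitions (fraction of gold docs in the top k, and reciprocal rank of the first gold doc); the benchmarks repo may define them differently, especially for multi-doc answers, and content-hit is not modeled here. The function names and doc IDs are hypothetical.

```python
from typing import Sequence


def recall_at_k(ranked: Sequence[str], gold: set[str], k: int) -> float:
    """Fraction of gold documents that appear in the top-k results."""
    if not gold:
        return 0.0
    return len(gold.intersection(ranked[:k])) / len(gold)


def reciprocal_rank(ranked: Sequence[str], gold: set[str]) -> float:
    """1/rank of the first gold document, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in gold:
            return 1.0 / rank
    return 0.0


# Hypothetical query whose only gold doc sits at rank 6:
# it counts for recall@10 but not for recall@5.
ranked = ["d7", "d2", "d9", "d4", "d1", "d3"]
gold = {"d3"}
print(recall_at_k(ranked, gold, 5), recall_at_k(ranked, gold, 10))
print(reciprocal_rank(ranked, gold))
```

MRR is the mean of `reciprocal_rank` over all queries, and the recall columns are the mean of `recall_at_k` at the given cutoff.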

(Good news vs. the earlier benchmark issue basic-memory-benchmarks#2: content-hit went from 15.5% to ~30% overall since February.)

## The finding

Comparing the two systems query by query at recall@5, BM missed 281 queries that mem0 hit, and mem0 missed 160 that BM hit. Decomposing BM's 281 misses:

- **135 (~48%) are ranking failures, not retrieval failures** — the gold doc IS in BM's results, just at rank 6–10 (clustered at 6–8), or partially retrieved for multi-doc answers. recall@10 is already 0.843; the problem is converting it to recall@5.
- 146 are true top-10 misses (separate issue on entity boosting).
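The split above can be sketched as a small classifier over per-query results. This is an illustration under assumptions, not the script that produced the numbers: it treats a query as a hit only when every gold doc is in the top 5, and as a ranking failure when at least one gold doc is in the top 10. The function name and doc IDs are hypothetical.

```python
from collections import Counter
from typing import Sequence


def classify_miss(ranked: Sequence[str], gold: set[str], k: int = 5, n: int = 10) -> str:
    """Label a query as 'hit', 'ranking_failure', or 'true_miss'."""
    top_k, top_n = set(ranked[:k]), set(ranked[:n])
    if gold <= top_k:
        return "hit"  # every gold doc is already in the top k
    if gold & top_n:
        return "ranking_failure"  # some gold doc was retrieved, but ranked too low
    return "true_miss"  # no gold doc anywhere in the top n


ranked = [f"d{i}" for i in range(1, 11)]  # hypothetical result list d1..d10
queries = {
    "gold at rank 7": {"d7"},
    "gold not retrieved": {"d99"},
    "multi-doc, partially retrieved": {"d1", "d8"},
    "gold at rank 2": {"d2"},
}
counts = Counter(classify_miss(ranked, gold) for gold in queries.values())
print(dict(counts))

# Sanity check on the reported buckets: 135 + 146 = 281, and 135/281 is about 48%.
assert 135 + 146 == 281
print(round(135 / 281, 3))
```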

## Proposal

Add an optional rerank stage over the top-N (e.g. 20) hybrid candidates before final ordering. BM answers in 45ms mean vs mem0's 882ms. A cross-encoder reranker costs roughly 50–150ms on this corpus size (fastembed ships bge/jina reranker models, consistent with the existing fastembed dependency), which puts BM at roughly 95–195ms total: still ~5–10x faster than mem0, while directly targeting a bucket that accounts for nearly half of BM's unique misses. Converting even most of the 135 ranking failures flips the headline recall@5 comparison.
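A minimal sketch of what such a stage could look like, assuming the hybrid search returns scored candidates with their text. The `Candidate` type, the `rerank` function, and the `Scorer` signature are hypothetical and do not reflect Basic Memory's actual internals. The token-overlap scorer is a stand-in so the example runs without a model download; in practice the scorer would wrap a cross-encoder.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    text: str
    hybrid_score: float  # score from the existing hybrid search


# A scorer maps (query, texts) to one relevance score per text.
Scorer = Callable[[str, Sequence[str]], Sequence[float]]


def rerank(query: str, candidates: Sequence[Candidate], scorer: Scorer,
           top_n: int = 20, limit: int = 5) -> list[Candidate]:
    """Rescore the top_n hybrid candidates and return the best `limit`."""
    head = sorted(candidates, key=lambda c: c.hybrid_score, reverse=True)[:top_n]
    if not head:
        return []
    scores = scorer(query, [c.text for c in head])
    # sorted() is stable, so tied scores keep their hybrid order.
    order = sorted(range(len(head)), key=lambda i: scores[i], reverse=True)
    return [head[i] for i in order][:limit]


def overlap_scorer(query: str, texts: Sequence[str]) -> list[float]:
    """Toy stand-in for a cross-encoder: share of query tokens found in the text."""
    q = set(query.lower().split())
    return [len(q & set(t.lower().split())) / (len(q) or 1) for t in texts]


candidates = [
    Candidate("a", "Notes about the weekend hiking trip", 0.90),
    Candidate("b", "Grocery list and meal plan", 0.85),
    Candidate("c", "Alice adopted a dog named Rex in May", 0.60),
]
top = rerank("when did alice adopt a dog", candidates, overlap_scorer, top_n=20, limit=2)
print([c.doc_id for c in top])
```

Keeping the scorer pluggable means the stage can stay optional: with no scorer configured, search returns the hybrid order unchanged and pays no extra latency.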

A diagnostic pass on the hybrid fusion weights is worth doing at the same time: near-misses clustered just below the cutoff suggest the semantic and keyword legs may fuse suboptimally for short queries.
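One way to run that diagnostic is to sweep the fusion weight and watch where the gold doc lands. The sketch below assumes weighted reciprocal rank fusion (RRF) with the conventional constant k=60; the issue does not say which fusion method BM uses, so treat the formula, function names, and doc IDs as hypothetical.

```python
from collections import defaultdict
from typing import Optional, Sequence


def weighted_rrf(semantic: Sequence[str], keyword: Sequence[str],
                 w_semantic: float, k: int = 60) -> list[str]:
    """Fuse two ranked lists; each leg contributes weight / (k + rank)."""
    w_keyword = 1.0 - w_semantic
    scores: dict[str, float] = defaultdict(float)
    for weight, leg in ((w_semantic, semantic), (w_keyword, keyword)):
        for rank, doc_id in enumerate(leg, start=1):
            scores[doc_id] += weight / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


def gold_rank(fused: Sequence[str], gold: str) -> Optional[int]:
    """1-based rank of the gold doc in the fused list, or None if absent."""
    return fused.index(gold) + 1 if gold in fused else None


# Hypothetical short query: the semantic leg ranks the gold doc "g" first,
# while the keyword leg does not return it at all.
semantic = ["g", "a", "b", "c"]
keyword = ["a", "b", "c", "d"]

sweep = {w: gold_rank(weighted_rrf(semantic, keyword, w), "g")
         for w in (0.0, 0.25, 0.5, 0.75, 1.0)}
for weight, rank in sweep.items():
    print(f"w_semantic={weight:.2f} -> gold rank {rank}")
```

In this toy case, docs that appear in both legs outrank a gold doc found by only one leg until the weight is pushed all the way to that leg. Running the same sweep over the near-miss queries would show whether the real fusion behaves similarly.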

The effect is measurable commit-to-commit with the benchmarks repo's worktree workflow (`--bm-local-path` + deterministic run IDs).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
