fix(core): repair FTS half of hybrid search for natural-language queries #994

Open
groksrc wants to merge 2 commits into main from fix/hybrid-fts-natural-language

groksrc (Member) commented Jun 13, 2026

Summary

Hybrid search has been silently running vector-only on natural-language queries: the full-text (FTS) branch contributed zero candidates. This PR restores the FTS branch, which raises overall recall@5 on LoCoMo from 0.745 to 0.823 with no category regressing.

Two causes in FTS query preparation (SQLite, with parallel Postgres fixes):

  1. Sentence punctuation forced exact-phrase matching. A question like "When did Melanie paint a sunrise?" reached FTS5 as the phrase "When did Melanie paint a sunrise?"*, which matches no document. The FTS5 tokenizer ignores that punctuation in the index anyway, so stripping it from word edges loses nothing, but leaving it in place disabled the entire FTS contribution. _prepare_single_term now strips ?!.,;: from word edges in multi-word queries (interior characters such as hyphens and slashes in permalinks/paths are untouched).

  2. No relaxation when strict all-terms-AND matched nothing. Questions rarely have every word in one document, so even after (1) the strict AND returned zero rows. The hybrid path now retries once with an OR-joined, stopword-filtered, content-term query when the strict query is empty. bm25/ts_rank still rank multi-term matches first, and fusion with the vector branch keeps relaxed lexical candidates from dominating precision.
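
The two preparation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the PR's code: the helper names `strip_edge_punctuation` and `build_relaxed_query`, the small stopword set, and the two-term minimum are all assumptions made for the example.

```python
from typing import List, Optional

EDGE_PUNCTUATION = "?!.,;:"

# Hypothetical, deliberately tiny stopword list; the PR's real list is not shown.
STOPWORDS = {"a", "an", "the", "when", "did", "do", "does", "is", "was",
             "what", "who", "where", "how", "of", "in", "on", "to"}


def strip_edge_punctuation(query: str) -> str:
    """Strip sentence punctuation from the edges of each word only.

    Interior characters (hyphens, slashes, dots inside a path) are kept.
    """
    words = (word.strip(EDGE_PUNCTUATION) for word in query.split())
    return " ".join(word for word in words if word)


def build_relaxed_query(query: str) -> Optional[str]:
    """OR-join the de-duplicated content terms of a query.

    Returns None when fewer than two content terms remain, since there is
    nothing meaningful to relax in that case (an assumed rule).
    """
    terms: List[str] = []
    for word in strip_edge_punctuation(query).split():
        lowered = word.lower()
        if lowered not in STOPWORDS and lowered not in terms:
            terms.append(lowered)
    if len(terms) < 2:
        return None
    return " OR ".join(terms)


if __name__ == "__main__":
    question = "When did Melanie paint a sunrise?"
    print(strip_edge_punctuation(question))
    print(build_relaxed_query(question))
```

A real implementation would also need to quote or escape terms for the target backend (FTS5 or tsquery); that step is omitted here.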

Scope / safety

  • The relaxation is gated behind a new allow_relaxed=False parameter on SearchRepositoryBase.search; only _search_hybrid opts in.
  • Strict FTS paths (search_type=text, title, permalink, link resolution) are unchanged — the service layer keeps its own conservative fallback.
  • No config flag, default-safe. The punctuation fix is a pure improvement on every path (a phrase-quoted question returned nothing before).
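
The opt-in gating described above could look roughly like the sketch below. Only the `allow_relaxed` parameter name and its `False` default come from this PR; the function `fts_search`, the `run_query` callable, and the toy backend are hypothetical stand-ins for the repository layer.

```python
from typing import Callable, Dict, List, Optional

QueryRunner = Callable[[str], List[str]]


def fts_search(
    run_query: QueryRunner,
    strict_query: str,
    relaxed_query: Optional[str],
    allow_relaxed: bool = False,
) -> List[str]:
    """Run the strict query; retry once with the relaxed query only if opted in."""
    rows = run_query(strict_query)
    if rows or not allow_relaxed or relaxed_query is None:
        # Strict hit, caller did not opt in, or nothing to relax: return as-is.
        return rows
    return run_query(relaxed_query)


# Toy backend (assumption): maps a prepared query string to matching note ids.
TOY_INDEX: Dict[str, List[str]] = {
    "melanie AND paint AND hiking": [],
    "melanie OR paint OR hiking": ["note-1"],
}


def toy_backend(query: str) -> List[str]:
    return list(TOY_INDEX.get(query, []))


if __name__ == "__main__":
    strict = "melanie AND paint AND hiking"
    relaxed = "melanie OR paint OR hiking"
    print(fts_search(toy_backend, strict, relaxed))                      # strict callers
    print(fts_search(toy_backend, strict, relaxed, allow_relaxed=True))  # hybrid caller
```

Keeping the default at `False` means existing callers see no behavior change unless they pass the flag explicitly.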

How it was found

The benchmark harness produced two byte-identical rankings across 1,986 queries for two different fusion algorithms — impossible with two live retrieval sources. Instrumentation then confirmed fts=0 candidates on 40/40 sampled LoCoMo queries.
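
To see why identical rankings point to a dead branch, consider the sketch below. The PR does not say which two fusion algorithms the harness compared, so reciprocal rank fusion and a max-normalized weighted sum are assumed here purely for illustration, and the scores are assumed to be positive with higher meaning better. With an empty FTS list both methods collapse to the vector order; with a live FTS list they can disagree.

```python
from typing import Dict, List


def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion over ranked id lists (ties broken by id)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


def weighted_score_fusion(
    scored_lists: List[Dict[str, float]], weights: List[float]
) -> List[str]:
    """Weighted sum of per-source scores, each normalized by its source maximum."""
    fused: Dict[str, float] = {}
    for scored, weight in zip(scored_lists, weights):
        if not scored:
            continue  # an empty source contributes nothing
        top = max(scored.values())
        for doc_id, score in scored.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * (score / top)
    return sorted(fused, key=lambda d: (-fused[d], d))


def ranked_ids(scored: Dict[str, float]) -> List[str]:
    return sorted(scored, key=lambda d: (-scored[d], d))


if __name__ == "__main__":
    vector = {"a": 0.9, "b": 0.85, "c": 0.2}
    for fts in ({}, {"c": 5.0, "b": 4.9}):
        by_rank = rrf([ranked_ids(vector), ranked_ids(fts)])
        by_score = weighted_score_fusion([vector, fts], [0.5, 0.5])
        print(fts, by_rank, by_score, by_rank == by_score)
```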

Benchmark impact

Corrected LoCoMo (1,986 queries, same index, deterministic retrieval metrics). Every category improves; no regression:

| metric | before | after | Δ (points) |
|---|---|---|---|
| recall@5 (overall) | 0.745 | 0.823 | +7.9 |
| MRR (overall) | 0.618 | 0.718 | +10.0 |
| recall@5 (headline, excl. adversarial) | 0.734 | 0.801 | +6.7 |
| MRR (headline) | 0.621 | 0.706 | +8.5 |

Per category (recall@5): open_domain +0.10, adversarial +0.12, multi_hop +0.05, single_hop +0.03, temporal +0.003. (Retrieval metrics only; QA-accuracy re-run is queued separately.)
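
For readers unfamiliar with the two metrics, here is a minimal sketch of how recall@5 and MRR are commonly computed. The benchmark harness's exact definitions are not shown in this PR, so the per-query recall formula (fraction of relevant ids found in the top k) and the function names are assumptions.

```python
from typing import Dict, List, Set, Tuple


def recall_at_k(ranked: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the relevant ids that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: Dict[str, Tuple[List[str], Set[str]]], k: int = 5) -> Tuple[float, float]:
    """Return (mean recall@k, MRR) over all queries in `runs`."""
    recalls = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs.values()]
    rranks = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs.values()]
    return sum(recalls) / len(recalls), sum(rranks) / len(rranks)


if __name__ == "__main__":
    runs = {
        "q1": (["a", "b", "c"], {"b"}),  # relevant doc at rank 2
        "q2": (["x", "y"], {"z"}),       # relevant doc never retrieved
    }
    print(evaluate(runs))
```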

Tests

  • Punctuation no longer phrase-quotes question-form queries.
  • Relaxation builds the expected OR query, drops stopwords, and respects boolean / quoted / short-query intent.
  • The hybrid opt-in surfaces a partial-overlap document while the default strict path still returns empty (isolation preserved).
  • Parallel coverage for Postgres (_relaxed_tsquery_text, punctuation, relaxed retry).
  • Full SQLite unit suite green (2968 passed, 35 skipped); ty + ruff clean.

🤖 Generated with Claude Code

Hybrid search was silently running vector-only on natural-language
queries — the FTS branch contributed zero candidates. Two causes in the
SQLite (and parallel Postgres) FTS query preparation:

1. Sentence punctuation forced phrase matching. A question like
   "When did Melanie paint a sunrise?" reached FTS5 as the exact phrase
   '"When did Melanie paint a sunrise?"*', which matches no document.
   The FTS5 tokenizer ignores this punctuation in the index, so
   stripping it from word edges loses nothing — but leaving it disabled
   the entire FTS contribution. _prepare_single_term now strips
   ?!.,;: from word edges of multi-word queries (interior characters —
   hyphens, slashes in permalinks/paths — untouched).

2. No relaxation when strict all-terms-AND matched nothing. Questions
   rarely have every word in one document, so even after (1) the
   strict AND returned zero rows. The hybrid path now retries once with
   an OR-joined, stopword-filtered, content-term query when the strict
   query is empty. bm25/ts_rank still rank multi-term matches first, and
   fusion with the vector branch keeps relaxed lexical candidates from
   dominating precision.

The relaxation is gated behind a new allow_relaxed=False parameter on
SearchRepositoryBase.search; only _search_hybrid opts in. Strict FTS
behavior (search_type=text, title, permalink, link resolution) is
unchanged — the service layer keeps its own conservative fallback.
No config flag, default-safe.

Discovered via the benchmark harness: two different fusion algorithms
produced byte-identical rankings across 1,986 queries (impossible with
two live sources), and instrumentation confirmed fts=0 on 40/40
sampled LoCoMo queries.

Benchmark impact (corrected LoCoMo, 1,986 queries, same index,
retrieval metrics — every category improves, no regression):
  recall@5  0.745 -> 0.823  (+7.9)
  MRR       0.618 -> 0.718  (+10.0)
  headline  r5 0.734 -> 0.801, MRR 0.621 -> 0.706
Largest gains on open_domain (+0.10 r5) and adversarial (+0.12 r5);
smallest on temporal (+0.003 r5 / +0.02 MRR).

Tests: punctuation no longer phrase-quotes; relaxation builds the
expected OR query and respects boolean/quoted/short-query intent; the
hybrid opt-in surfaces a partial-overlap document while the default
strict path still returns empty. Parallel coverage for Postgres. Full
SQLite unit suite green (2968 passed); ty + ruff clean.

Signed-off-by: Drew Cain <groksrc@gmail.com>

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2462844beb

retrieval_mode=SearchRetrievalMode.FTS,
limit=candidate_limit,
offset=0,
allow_relaxed=True,

[P2] Gate relaxed hybrid FTS with existing eligibility

When a HYBRID query has a strict FTS miss, this now enables OR-relaxation for every query shape, including short titles and numeric identifiers such as SPEC 16 or root note 1. The service-level relaxed FTS path explicitly rejects those cases in SearchService._is_relaxed_fts_fallback_eligible because OR-relaxing them over-broadens results; in hybrid, the relaxed FTS-only rows are then normalized up to 1.0 and can outrank the vector result the user actually needed. Please apply the same eligibility constraints before opting the hybrid FTS branch into relaxation.
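
One way to picture the requested gate is sketched below. The actual rules inside SearchService._is_relaxed_fts_fallback_eligible are not shown in this conversation, so the function name `is_relaxation_eligible`, the three-token minimum, and the numeric-token check are hypothetical choices that merely reproduce the two rejected examples from the comment.

```python
import re

# Hypothetical threshold: very short queries look like titles, not questions.
MIN_TOKENS = 3


def is_relaxation_eligible(query: str) -> bool:
    """Decide whether a strict-miss query may be retried with OR-relaxation.

    Assumed rules (not taken from the real service):
      * too few tokens -> likely a short title, keep it strict
      * any purely numeric token -> likely an identifier, keep it strict
    """
    tokens = re.findall(r"\w+", query)
    if len(tokens) < MIN_TOKENS:
        return False
    if any(token.isdigit() for token in tokens):
        return False
    return True


if __name__ == "__main__":
    for query in ("SPEC 16", "root note 1", "When did Melanie paint a sunrise?"):
        print(query, "->", is_relaxation_eligible(query))
```

Under these assumed rules, the hybrid branch would pass `allow_relaxed=True` only when the check returns true, so identifier-like queries keep their strict behavior.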

CI Postgres shard caught two issues invisible to the local SQLite suite:

1. Postgres _prepare_single_term regression: the new edge-punctuation
   strip ran after special-character cleaning, so an all-special-char
   term ("()&!:") collapsed to empty and skipped the existing
   NOSPECIALCHARS:* guard, emitting a malformed ":*". Folded the strip
   into the word handlers so every guard survives, and added a
   single-word empty guard.

2. Backend-specific test assumptions. Four tests in
   test_search_repository.py (run under both backends via the
   search_repository fixture) asserted SQLite FTS5 syntax and
   SQLite-only strict-miss behavior. Postgres to_tsquery('english', ...)
   auto-strips stopwords, so "When did Melanie paint a sunrise?" already
   matches under strict AND. Made the four tests backend-aware via the
   existing is_postgres_backend() helper, and switched the relaxation
   integration test to a query with a word absent from the doc
   ("hiking") so the strict miss holds on both backends.

Reproduced and fixed against real Postgres (testcontainers): full
search test surface green on both backends (53 passed Postgres,
2968 SQLite), ruff + ty clean.

Signed-off-by: Drew Cain <groksrc@gmail.com>

groksrc force-pushed the fix/hybrid-fts-natural-language branch from df79850 to 8d4d1f1 on June 13, 2026 04:07