
feat(search): rank results by relevance, surfacing name/title matches first #209

Open
SimplyThomas wants to merge 1 commit into main from worktree-search-ranking-weights

@SimplyThomas

Owner

What this changes

Adds relevance ranking to the finder's free-text search. Previously search was a pure boolean substring filter and matching saints were ordered by feast day, so searching a name or title surfaced whoever had the earliest feast — e.g. "Theotokos" ranked the Virgin Mary #2 and "Virgin Mary" ranked her #5, behind saints who merely mention the words in their notes or carry "Virgin" in a rank. (The search haystack mixes name + Also-Known-As + every facet value + brief + notes into one string, so any of them matched equally.)
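For reviewers who want the old behaviour in concrete terms, here is a minimal sketch of a boolean substring filter followed by a feast-day sort. It is an illustration only, not the code being replaced: the `Row` shape, the `legacySearch` name, and the numeric `feastKey` sort key are all assumptions, and the sample rows are made up.

```typescript
// Hypothetical row shape; the real record type in src/lib/filter.ts may differ.
interface Row {
  name: string;
  feastKey: number; // assumed numeric sort key for the feast day
  haystack: string; // name + AKA + facets + brief + notes, joined into one string
}

// Keep rows whose haystack contains every query token, then order by feast day.
// Nothing here looks at WHERE a token matched, which is the problem described above.
function legacySearch(rows: readonly Row[], query: string): Row[] {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);
  return rows
    .filter((r) => tokens.every((t) => r.haystack.toLowerCase().includes(t)))
    .sort((a, b) => a.feastKey - b.feastKey);
}

// Made-up sample data: one name match with a late feast, one notes-only mention
// with an early feast.
const legacyRows: Row[] = [
  { name: "Theotokos Example", feastKey: 250, haystack: "theotokos example" },
  { name: "Other Example", feastKey: 3, haystack: "other example notes mention the theotokos" },
];

const legacyOrder = legacySearch(legacyRows, "Theotokos").map((r) => r.name);
console.log(legacyOrder); // the notes-only mention comes first because its feast is earlier
```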

New scoreMatch() grades where the query matched — display name (exact > prefix > whole-word > substring) far above Also-Known-As, above name variants, above a deep-haystack-only hit. sortByRelevance() orders by that score with the reader's chosen Sort dropdown as the tiebreak. The finder uses it whenever a query is present; with the box empty, the Sort dropdown drives order exactly as before.

| Match location                           | Score        |
|------------------------------------------|--------------|
| Exact name                               | 100          |
| Name prefix / whole-word / substring     | 90 / 80 / 70 |
| All query words in name (not contiguous) | 65           |
| Also Known As (exact→substring)          | 50–35        |
| Name variant only                        | 20           |
| Facets / brief / notes only              | 1            |
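The tiers in the table can be sketched roughly as follows. This is not the implementation in src/lib/filter.ts: the `SaintLike` shape, the `grade` helper, the 5-point steps inside the Also-Known-As range (the table only gives 50–35), and the score of 0 for no match are assumptions made for illustration, and the sample record is made up.

```typescript
// Hypothetical record shape; field names are assumptions, not the project's schema.
interface SaintLike {
  name: string;
  alsoKnownAs: string[];
  nameVariants: string[];
  haystack: string; // name + AKA + facets + brief + notes
}

const norm = (s: string): string => s.trim().toLowerCase();
const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Grade one field against a normalized query:
// 3 = exact, 2 = prefix, 1 = whole-word, 0 = substring, -1 = no match.
function grade(field: string, q: string): number {
  const f = norm(field);
  if (f === q) return 3;
  if (f.startsWith(q)) return 2;
  // Whole-word: query bounded by string edges or non-letter/non-digit characters.
  const wholeWord = new RegExp(`(^|[^\\p{L}\\p{N}])${escapeRegExp(q)}([^\\p{L}\\p{N}]|$)`, "u");
  if (wholeWord.test(f)) return 1;
  return f.includes(q) ? 0 : -1;
}

function scoreMatch(saint: SaintLike, query: string): number {
  const q = norm(query);
  if (q === "") return 0;
  const nameGrade = grade(saint.name, q);
  if (nameGrade >= 0) return 70 + 10 * nameGrade; // 100 / 90 / 80 / 70
  const words = q.split(/\s+/);
  const name = norm(saint.name);
  if (words.every((w) => name.includes(w))) return 65; // all words, not contiguous
  const akaGrade = Math.max(-1, ...saint.alsoKnownAs.map((a) => grade(a, q)));
  if (akaGrade >= 0) return 35 + 5 * akaGrade; // assumed steps: 50 / 45 / 40 / 35
  if (saint.nameVariants.some((v) => norm(v).includes(q))) return 20;
  const hay = norm(saint.haystack);
  return words.every((w) => hay.includes(w)) ? 1 : 0; // deep-haystack-only hit
}

// Made-up sample record for illustration.
const sample: SaintLike = {
  name: "Nicholas of Myra",
  alsoKnownAs: ["Nicholas the Wonderworker"],
  nameVariants: ["Nikolaos"],
  haystack: "nicholas of myra nicholas the wonderworker nikolaos bishop patron of sailors",
};

console.log(scoreMatch(sample, "Nicholas of Myra")); // 100 (exact name)
console.log(scoreMatch(sample, "wonderworker")); // 40 (whole word in Also Known As)
console.log(scoreMatch(sample, "sailors")); // 1 (haystack only)
```

Returning a single number keeps the comparison in the sort trivial; the gaps between tiers leave room to add finer grades later without renumbering.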

Result, verified against the real 2740-saint dataset: the Virgin Mary (OS-0001) now ranks #1 for both "Theotokos" (was #2) and "Virgin Mary" (was #5). Runners-up are sensible name matches (e.g. "Righteous Mary, grandmother of the Theotokos").

This directly supports the "find a saint they share a name with" discovery path (CLAUDE.md §1).

Preview

🔎 Preview: (paste the Cloudflare Pages URL once its check is green)

Checklist

  • Edited only source (src/); no generated public/ or dist/ files.
  • New controlled-vocabulary terms were added to data/vocabulary.csv first. (n/a — no vocab change)
  • make validate is CLEAN — zero violations. (n/a — no data/build.py change; data.json unchanged)
  • make test passes (if build.py changed). (n/a — build.py untouched)
  • New saints were added with blank Saint IDs. (n/a — no new saints)
  • Sources filled; no fabricated facts; no copyrighted hymns/images. (n/a — no data change)
  • I dedicate my data contributions to the public domain under CC0 1.0. (n/a — code-only)

Notes for review

  • Frontend-only change (src/lib/filter.ts, src/islands/finder.client.ts, + new src/lib/filter.test.ts). No data, no build.py, no schema changes.
  • TDD: 8 new Vitest ranking tests (written failing first); full unit suite 17/17 green; ESLint + Prettier clean; astro build succeeds (2768 pages).
  • Playwright e2e was not run locally (slow, needs a served build); the change doesn't alter DOM structure, so CI's frontend gate will cover it.
  • Design choice: when a query is present, relevance overrides the Sort dropdown (the dropdown then acts as the tiebreak). With the box empty, the dropdown behaves as before.
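The design choice in the last note (relevance first, Sort dropdown as tiebreak) can be sketched as below. The `sortByRelevance` name comes from this PR, but the signature shown here is an assumption: the scoring function and the dropdown's comparator are passed in as parameters so the sketch stays self-contained. `toyScore`, `byFeast`, and the sample items are hypothetical.

```typescript
type Comparator<T> = (a: T, b: T) => number;

// Order by descending score; fall back to the reader's chosen sort on ties.
function sortByRelevance<T>(
  items: readonly T[],
  query: string,
  score: (item: T, query: string) => number,
  tiebreak: Comparator<T>,
): T[] {
  return items
    .map((item) => ({ item, s: score(item, query) })) // score each item once
    .sort((a, b) => b.s - a.s || tiebreak(a.item, b.item))
    .map(({ item }) => item);
}

// Hypothetical stand-ins for the real scorer and the dropdown's "feast day" sort.
interface Item {
  name: string;
  feastKey: number;
}
const toyScore = (it: Item, q: string): number =>
  it.name === q ? 100 : it.name.includes(q) ? 70 : 1;
const byFeast: Comparator<Item> = (a, b) => a.feastKey - b.feastKey;

const items: Item[] = [
  { name: "b mary", feastKey: 10 },
  { name: "mary", feastKey: 200 },
  { name: "a mary", feastKey: 5 },
  { name: "other", feastKey: 1 },
];

// In the finder, this branch would run only when the query is non-empty;
// with an empty box the dropdown comparator alone decides the order.
const query = "mary";
const ranked = query.trim() === "" ? [...items].sort(byFeast) : sortByRelevance(items, query, toyScore, byFeast);
console.log(ranked.map((it) => it.name)); // ["mary", "a mary", "b mary", "other"]
```

The exact match wins despite its late feast, and the two equally scored substring matches fall back to feast order.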

🤖 Generated with Claude Code

… first

The finder's free-text search was a pure boolean substring filter: a saint
either contained all query tokens somewhere in its haystack or it didn't, with
no notion of where the match landed. Matching saints were then ordered by feast
day, so searching a name or title surfaced whoever had the earliest feast — e.g.
'Theotokos' ranked the Virgin Mary #2 and 'Virgin Mary' ranked her #5, behind
saints who merely mention the words in their notes or carry 'Virgin' in a rank.

Add scoreMatch(): a tiered relevance score grading WHERE the query matched —
display name (exact > prefix > whole-word > substring) far above Also-Known-As,
above name variants, above a deep-haystack-only hit (the haystack mixes in every
facet value, brief, and notes). sortByRelevance() orders by that score with the
reader's chosen Sort mode as the tiebreak. The finder uses it whenever a query
is present; with the box empty, the Sort dropdown drives order as before.

The Virgin Mary now ranks #1 for both 'Theotokos' and 'Virgin Mary'.
@cloudflare-workers-and-pages


Deploying orthodox-saints with Cloudflare Pages

Latest commit: fbfdf71
Status: ✅  Deploy successful!
Preview URL: https://09f6b2a4.orthodox-saints.pages.dev
Branch Preview URL: https://worktree-search-ranking-weig.orthodox-saints.pages.dev
