Prioritize exact substring matches over fuzzy-only matches #484

Open
tavian-dev wants to merge 1 commit into cantino:master from tavian-dev:exact-match-priority

Conversation

@tavian-dev

Summary

When fuzzy matching is enabled (MCFLY_FUZZY), commands containing the exact search string as a substring are now always ranked above fuzzy-only matches. This directly addresses the core issue: exact matches getting buried by fuzzy results.

How it works

The fuzzy re-ranking comparator now has two tiers:

  1. Exact matches first: If command A contains the search string as an exact substring and command B doesn't, A always ranks higher — regardless of the fuzzy position/length scoring.

  2. Existing logic within tiers: When both commands are exact matches, or both are fuzzy-only, the existing ranking applies (shorter matches, earlier position, neural network rank, fuzzy factor weighting).

Case sensitivity follows existing behavior: if the search input contains uppercase characters, exact matching is case-sensitive; otherwise case-insensitive.
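The two tiers and the case rule described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `Candidate`, `score`, `is_exact_match`, `compare`, and `rank` are hypothetical names, and the single `score` field stands in for the existing combined ranking (neural network rank plus fuzzy weighting), which is not reproduced here.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a ranked history entry; not McFly's real type.
#[derive(Debug, Clone)]
struct Candidate {
    cmd: String,
    /// Assumed placeholder for the existing combined score (higher is better).
    score: f64,
}

/// True if `cmd` contains `query` as a contiguous substring.
/// Case-sensitive only when the query contains an uppercase character.
fn is_exact_match(cmd: &str, query: &str) -> bool {
    if query.chars().any(char::is_uppercase) {
        cmd.contains(query)
    } else {
        cmd.to_lowercase().contains(&query.to_lowercase())
    }
}

/// Two-tier comparison: exact matches first, then the existing score.
fn compare(a: &Candidate, b: &Candidate, query: &str) -> Ordering {
    let a_exact = is_exact_match(&a.cmd, query);
    let b_exact = is_exact_match(&b.cmd, query);
    // `false < true`, so comparing b to a puts exact matches first.
    b_exact
        .cmp(&a_exact)
        // Within a tier, fall back to the (assumed) existing score, best first.
        .then_with(|| b.score.total_cmp(&a.score))
}

fn rank(mut items: Vec<Candidate>, query: &str) -> Vec<Candidate> {
    items.sort_by(|a, b| compare(a, b, query));
    items
}

fn main() {
    // Scores are made up so that the fuzzy-only match would win on score alone.
    let items = vec![
        Candidate { cmd: "git pull --set-upstream-history".into(), score: 0.9 },
        Candidate { cmd: "git push origin main".into(), score: 0.4 },
    ];
    for c in rank(items, "git push") {
        println!("{}", c.cmd);
    }
}
```

With these made-up scores, `git push origin main` is printed first even though its score is lower, because the tier check is applied before the score is consulted.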

Example

Searching for git push with MCFLY_FUZZY=2:

Before: git pull --set-upstream-history could rank above git push origin main because the fuzzy position/length heuristic favored it.

After: git push origin main (exact substring match) always ranks above git pull --set-upstream-history (fuzzy-only match). Within exact matches, the neural network rank + fuzzy heuristics still determine order.

Testing

  • All 21 existing tests pass
  • No clippy warnings
  • Builds cleanly on stable Rust

Fixes #183

When fuzzy matching is enabled, commands containing the exact search
string as a substring are now always ranked above fuzzy-only matches.
This preserves the benefit of fuzzy matching (finding commands you
don't exactly remember) while ensuring that the most relevant results
aren't buried.

Within each tier (exact or fuzzy), the existing ranking logic is
preserved: shorter and earlier matches are preferred, weighted by
the configurable fuzzy factor on top of the neural network rank.

Case sensitivity follows the existing behavior: if the search input
contains uppercase characters, exact match checking is case-sensitive;
otherwise it is case-insensitive.

Fixes cantino#183
@cantino
Owner

cantino commented May 9, 2026

Thanks @tavian-dev!

What do you think of this @dmfay?

@dmfay
Contributor

dmfay commented May 9, 2026

I like the idea! The comment it blew away describing the weighting in detail is important though -- overeager LLM?

@cantino
Owner

cantino commented May 13, 2026

Any chance you could test it @dmfay and make sure it still works for your mcfly use? I don't use fuzzy match.

Comment thread src/history/history.rs
    names = names
        .into_iter()
        .sorted_unstable_by(|a, b| {
            // Fuzzy matches impose new ordering criteria on top of the
Owner

Can you please add this comment back?

@dmfay
Contributor

dmfay commented May 14, 2026

having taken it for a spin:

  • the exact matching is checked after the query is limited to MCFLY_RESULTS. In other words, it ranks exact matches within the current search results instead of prioritizing exact matches in the search itself. If your search has more exact matches than MCFLY_RESULTS allows, it's difficult to distinguish the effect from that of cranking MCFLY_FUZZY up. I'm not sure there's a way around this outside making both an exact and a fuzzy query and combining the results, but even that runs into the issue of figuring out how much to take from each bucket, because:
  • "all exact and fuzzies fill room left over" makes it much harder to formulate fuzzy queries! This goes as well for this branch; it's great for searching up vi src, but if you're trying to skip to a file you know, vi h returns all the completely different files starting with h available (within the MCFLY_RESULTS limit here, but in a dual-query world would be all on the system) before it gets to src/history/history.rs. Any solution where exact matches always and automatically win opens up this problem where in order to make a fuzzy search you have to twist yourself in knots to avoid exact matches.
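To make the first point concrete, here is a small sketch of why the order of "limit" and "re-rank" matters. It does not use McFly's actual query code: the function names, the candidate list, and the limit of 2 are all hypothetical, and the input slice is assumed to already be in base-rank order (best first).

```rust
/// Simplified exact check (case-sensitive), for illustration only.
fn is_exact(cmd: &str, query: &str) -> bool {
    cmd.contains(query)
}

/// Truncate to `limit` first, then move exact matches to the front.
/// Exact matches that fell outside the limit can never surface.
fn limit_then_rerank<'a>(base: &[&'a str], query: &str, limit: usize) -> Vec<&'a str> {
    let mut page: Vec<&str> = base.iter().copied().take(limit).collect();
    // `false` sorts before `true`, so exact matches come first; the sort is stable.
    page.sort_by_key(|cmd| !is_exact(cmd, query));
    page
}

/// Re-rank the full candidate list first, then truncate to `limit`.
fn rerank_then_limit<'a>(base: &[&'a str], query: &str, limit: usize) -> Vec<&'a str> {
    let mut all: Vec<&str> = base.to_vec();
    all.sort_by_key(|cmd| !is_exact(cmd, query));
    all.truncate(limit);
    all
}

fn main() {
    // Hypothetical candidates in base-rank order; the first two only match "vi src" fuzzily.
    let base = [
        "vim scripts/run.c",
        "vi lib/source.rs",
        "vi src/history/history.rs",
        "vi src/main.rs",
    ];
    println!("{:?}", limit_then_rerank(&base, "vi src", 2));
    println!("{:?}", rerank_then_limit(&base, "vi src", 2));
}
```

In this made-up case the first function returns only the two fuzzy-only entries, because both exact matches sat below the cutoff before the re-rank ran, while the second returns the two exact matches.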

I still like the idea of thumbing the scale harder for exact matches above and beyond MCFLY_FUZZY but it's a longer way off than I thought coming into this fresh.
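One way to picture "thumbing the scale" rather than a hard tier is an additive bonus for exact matches. The sketch below is purely illustrative and is not something proposed in this PR: `exact_bonus`, `adjusted`, `rank_soft`, and all scores are hypothetical, and `score` again stands in for whatever the existing ranking produces.

```rust
/// Hypothetical candidate with an assumed base score (higher is better).
struct Candidate {
    cmd: &'static str,
    score: f64,
}

/// Base score plus a bonus when the command contains the query verbatim.
fn adjusted(c: &Candidate, query: &str, exact_bonus: f64) -> f64 {
    if c.cmd.contains(query) {
        c.score + exact_bonus
    } else {
        c.score
    }
}

/// Sort best-first by adjusted score; exact matches are favored but not guaranteed to win.
fn rank_soft(mut items: Vec<Candidate>, query: &str, exact_bonus: f64) -> Vec<Candidate> {
    items.sort_by(|a, b| {
        adjusted(b, query, exact_bonus).total_cmp(&adjusted(a, query, exact_bonus))
    });
    items
}

fn candidates() -> Vec<Candidate> {
    // Made-up scores: the fuzzy-only match is the one the user actually wants.
    vec![
        Candidate { cmd: "vi hosts", score: 0.2 },
        Candidate { cmd: "vi hello.txt", score: 0.3 },
        Candidate { cmd: "vi src/history/history.rs", score: 0.8 },
    ]
}

fn main() {
    for bonus in [0.25, 1.0] {
        let top = &rank_soft(candidates(), "vi h", bonus)[0];
        println!("bonus {bonus}: top result is {}", top.cmd);
    }
}
```

With the small bonus, the strong fuzzy match `vi src/history/history.rs` stays on top for `vi h`; with the large bonus the behavior collapses into "exact always wins" and `vi hello.txt` takes first place. Picking a bonus that behaves well across real histories is exactly the hard part described above.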

Development

Successfully merging this pull request may close these issues.

Option to priorize exact matches over fuzzy ones

3 participants