fix: support MatchPhrase filter in local mode by ATOM00blue · Pull Request #1213 · qdrant/qdrant-client

ATOM00blue · 2026-05-22T02:30:22Z

Problem

Using a MatchPhrase condition in a filter against a local-mode client
(QdrantClient(":memory:") or local persistence) raises:

ValueError: Unknown match condition: phrase='...'

Phrase matching is supported by the server and is already handled by the
gRPC/REST converters (RestToGrpc/GrpcToRest), but check_match in
qdrant_client/local/payload_filters.py does not handle models.MatchPhrase,
so it falls through to the "Unknown match condition" error. This makes local
mode diverge from server behavior for any filter that uses a phrase match.

Minimal reproduction:

from qdrant_client import QdrantClient, models

client = QdrantClient(":memory:")
client.create_collection(
    "t", vectors_config=models.VectorParams(size=2, distance=models.Distance.COSINE)
)
client.upsert("t", [
    models.PointStruct(id=1, vector=[0.1, 0.2], payload={"text": "quick brown fox"}),
])

client.scroll(
    "t",
    scroll_filter=models.Filter(
        must=[models.FieldCondition(key="text", match=models.MatchPhrase(phrase="brown fox"))]
    ),
)  # -> ValueError: Unknown match condition

Fix

Handle MatchPhrase in check_match. The phrase is matched as a contiguous,
order-preserving sub-sequence of the field value's tokens (whitespace
tokenization, consistent with the existing MatchTextAny handling). This
matches the documented phrase semantics, e.g. "quick brown fox" is matched
by "brown fox" but not by "fox brown".
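
The matching rule described above can be sketched as follows. This is an illustrative, self-contained version and not the code from the PR: the function name `phrase_matches_tokens` is hypothetical, and treating an empty phrase as a match follows the behavior described in the review walkthrough rather than anything verified against the diff.

```python
def phrase_matches_tokens(phrase: str, value: str) -> bool:
    """Return True if the phrase's tokens occur contiguously, in order, in value."""
    phrase_tokens = phrase.split()  # whitespace tokenization
    value_tokens = value.split()
    if not phrase_tokens:
        return True  # assumption: an empty phrase matches anything
    n = len(phrase_tokens)
    # Slide a window of n tokens over the value and compare token lists.
    # If the phrase is longer than the value, the range is empty and any() is False.
    return any(
        value_tokens[i : i + n] == phrase_tokens
        for i in range(len(value_tokens) - n + 1)
    )


print(phrase_matches_tokens("brown fox", "quick brown fox"))  # True
print(phrase_matches_tokens("fox brown", "quick brown fox"))  # False (wrong order)
print(phrase_matches_tokens("quick fox", "quick brown fox"))  # False (not contiguous)
print(phrase_matches_tokens("brown fo", "quick brown fox"))   # False (partial token)
```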

Tests

Added test_match_phrase_filter_query in
qdrant_client/local/tests/test_payload_filters.py covering contiguous
matches, wrong order, non-contiguous tokens, partial tokens, list-valued
fields and missing fields. The test fails before the change (with the
ValueError) and passes after.

Checklist

Targets the dev branch.
Added a test for the change.
pre-commit (ruff-format) and mypy pass locally; local-mode test suite passes.

Local mode raised "Unknown match condition" when a filter used MatchPhrase, even though phrase matching is supported by the server and by the gRPC/REST converters.

Handle MatchPhrase in check_match by matching the phrase tokens as a contiguous, ordered sub-sequence of the field value tokens, and add tests covering it.

Signed-off-by: ATOM00blue <219721791+ATOM00blue@users.noreply.github.com>

netlify · 2026-05-22T02:30:27Z

✅ Deploy Preview for poetic-froyo-8baba7 ready!

Name	Link
🔨 Latest commit	`d123317`
🔍 Latest deploy log	https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/6a0fbfc1fce5b80008fe0d61
😎 Deploy Preview	https://deploy-preview-1213--poetic-froyo-8baba7.netlify.app

coderabbitai · 2026-05-22T02:31:54Z

Walkthrough

This PR introduces phrase-based text matching to payload filter evaluation in the qdrant-client. A new check_phrase_match helper function tokenizes both the search phrase and the candidate value, returns True for empty phrases, rejects cases where the phrase has more tokens than the value, and checks whether the phrase token sequence occurs contiguously within the value tokens. The existing check_match function is extended to recognize models.MatchPhrase and evaluate it using the new helper, while preserving all existing match behavior. Test coverage validates correct matching of contiguous sub-phrases, rejection of wrong token order and non-contiguous tokens, handling of list-valued fields, and missing fields.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and concisely summarizes the main change: adding support for MatchPhrase filtering in local mode, which directly addresses the bug described in the pull request.
Description check	✅ Passed	The description thoroughly explains the problem (ValueError when using MatchPhrase in local mode), provides a minimal reproduction case, details the fix implementation, and documents comprehensive test coverage.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

qdrant_client/local/tests/test_payload_filters.py (1)

192-225: ⚡ Quick win

Add a non-string payload regression case for MatchPhrase.

This test is solid, but it currently doesn’t assert behavior when text is non-string (e.g., 123), which is a realistic payload shape and guards against crashes.

Proposed test addition

 def test_match_phrase_filter_query():
@@
     # missing field does not match
     assert matches("brown fox", {"other": "value"}) is False
+
+    # non-string field does not match
+    assert matches("brown fox", {"text": 123}) is False

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@qdrant_client/local/tests/test_payload_filters.py` around lines 192 - 225,
Test test_match_phrase_filter_query lacks a regression case for non-string
payloads and may not cover crashes when the field is an int; update the test (in
test_match_phrase_filter_query and its nested matches helper usage) to include
at least one assertion where payload's "text" is a non-string (e.g., 123) and
assert that matches("brown fox", {"text": 123}) returns False (and does not
raise), ensuring MatchPhrase handling of non-string/list values is validated.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@qdrant_client/local/payload_filters.py`:
- Around line 171-172: The MatchPhrase branch currently calls
check_phrase_match(condition.phrase, value) for any non-None payload value which
will raise if value is not a str; update the MatchPhrase handler (the branch
that checks isinstance(condition, models.MatchPhrase)) to first ensure value is
an instance of str (e.g., isinstance(value, str)) and only call
check_phrase_match when it is, returning False for non-string payloads so
non-string values are safely treated as no match.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2afcab61-352e-4904-b58e-ec0702ff02db

📥 Commits

Reviewing files that changed from the base of the PR and between 790328b and d123317.

📒 Files selected for processing (2)

qdrant_client/local/payload_filters.py
qdrant_client/local/tests/test_payload_filters.py

coderabbitai · 2026-05-22T02:31:57Z

+    if isinstance(condition, models.MatchPhrase):
+        return value is not None and check_phrase_match(condition.phrase, value)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard MatchPhrase against non-string payload values to avoid runtime errors.

At Line 172, check_phrase_match(..., value) is called for any non-None value, but check_phrase_match expects str and calls .split(). Non-string payloads will raise at runtime instead of evaluating to False.

Proposed fix

     if isinstance(condition, models.MatchPhrase):
-        return value is not None and check_phrase_match(condition.phrase, value)
+        return isinstance(value, str) and check_phrase_match(condition.phrase, value)
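
To illustrate why the guard matters, here is a small stand-alone sketch. `tokens_contain_phrase` is a hypothetical stand-in for the PR's helper (its actual body is not shown in this thread); the only property relied on is that it calls `.split()` on the value, as the comment above states.

```python
def tokens_contain_phrase(phrase: str, value: str) -> bool:
    # Hypothetical stand-in for the helper: it calls .split() on both arguments.
    p, v = phrase.split(), value.split()
    return any(v[i : i + len(p)] == p for i in range(len(v) - len(p) + 1))


def unguarded(phrase, value):
    # Mirrors the "value is not None" check: any non-None value reaches .split().
    return value is not None and tokens_contain_phrase(phrase, value)


def guarded(phrase, value):
    # Mirrors the suggested isinstance(value, str) check.
    return isinstance(value, str) and tokens_contain_phrase(phrase, value)


try:
    unguarded("brown fox", 123)
except AttributeError as exc:
    print("unguarded raised:", exc)

print(guarded("brown fox", 123))                # False
print(guarded("brown fox", "quick brown fox"))  # True
```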

Suggested change

-    if isinstance(condition, models.MatchPhrase):
-        return value is not None and check_phrase_match(condition.phrase, value)
+    if isinstance(condition, models.MatchPhrase):
+        return isinstance(value, str) and check_phrase_match(condition.phrase, value)

Copilot

Pull request overview

This PR brings local-mode payload filtering in line with server behavior by adding support for models.MatchPhrase in the local filter evaluation logic.

Changes:

Add a check_phrase_match helper and handle models.MatchPhrase in check_match.
Add a dedicated local-mode test covering phrase matching semantics and edge cases.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`qdrant_client/local/payload_filters.py`	Implements local evaluation for `MatchPhrase` via whitespace tokenization and contiguous, order-preserving matching.
`qdrant_client/local/tests/test_payload_filters.py`	Adds coverage for phrase matching behavior (contiguous matches, wrong order, non-contiguous tokens, list fields, missing fields).

        return value is not None and condition.text in value
    if isinstance(condition, models.MatchTextAny):
        return value is not None and any(word in value for word in condition.text_any.split())
+    if isinstance(condition, models.MatchPhrase):
+        return value is not None and check_phrase_match(condition.phrase, value)


joein · 2026-05-22T12:06:59Z

Hey @ATOM00blue

Thank you for pointing it out!
According to the docs, when there is no full-text index, phrase matching is supposed to work as an exact substring match.

The current implementation is a bit different from the docs, so we'd need to update the PR to match Qdrant server's behaviour.
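
The difference between the two interpretations can be seen in a small sketch. Both functions are hypothetical illustrations, not Qdrant code: `token_phrase_match` mirrors the whitespace-token approach described in this PR, and `substring_match` is a plain Python `in` check standing in for the exact substring behaviour mentioned above. Whether the server applies any normalization (for example case folding) is not covered here.

```python
def token_phrase_match(phrase: str, value: str) -> bool:
    # Whitespace tokens of the phrase must appear contiguously and in order.
    p, v = phrase.split(), value.split()
    return any(v[i : i + len(p)] == p for i in range(len(v) - len(p) + 1))


def substring_match(phrase: str, value: str) -> bool:
    # Exact substring check on the raw strings, no tokenization.
    return phrase in value


cases = [
    ("brown fox", "quick brown fox"),   # both approaches match
    ("brown fo", "quick brown fox"),    # partial token: only the substring check matches
    ("brown fox", "quick brown  fox"),  # double space in value: only the token check matches
]
for phrase, value in cases:
    print(
        repr(phrase),
        repr(value),
        "token:", token_phrase_match(phrase, value),
        "substring:", substring_match(phrase, value),
    )
```

The second and third cases are where the two approaches disagree, which is the kind of divergence from server behaviour that would need to be resolved.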

fix: support MatchPhrase filter in local mode #1213
ATOM00blue wants to merge 1 commit into qdrant:dev from ATOM00blue:fix-matchphrase-local
