Fix/validate nan query vectors#1156
Conversation
✅ Deploy Preview for poetic-froyo-8baba7 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Pull request overview
This PR prevents NaN values in query vectors from being sent via the gRPC client, aligning gRPC behavior with the local and HTTP clients to avoid returning results with nan scores.
Changes:
- Add NaN validation to REST→gRPC vector conversions and raise an error before issuing gRPC requests.
- Add/extend tests to ensure gRPC queries (and conversion helpers) reject NaN vectors.
- Minor formatting/whitespace cleanups in tests, docs tooling, proto comments, and docs assets.
Reviewed changes
Copilot reviewed 6 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
qdrant_client/conversions/conversion.py |
Adds NaN validation for dense/sparse vectors during REST→gRPC conversion. |
tests/congruence_tests/test_query.py |
Updates congruence test to assert gRPC also rejects NaN queries. |
tests/conversions/test_validate_conversions.py |
Adds unit tests ensuring conversion helpers raise on NaN vectors. |
tests/test_migrate.py |
Minor formatting fix (trailing comma / compact init). |
tests/congruence_tests/test_sparse_updates.py |
Minor formatting simplification. |
tests/congruence_tests/test_multivector_updates.py |
Minor formatting simplification. |
tools/generate_docs.sh |
Minor formatting/line normalization. |
qdrant_client/proto/points_service.proto |
Whitespace cleanup in comments. |
docs/images/api-icon.svg |
Line normalization (no functional change). |
docs/.gitignore |
Line normalization (no functional change). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughThis pull request introduces NaN validation for vector conversions in the REST-to-gRPC conversion layer. A new static helper method Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 6/8 reviews remaining, refill in 11 minutes and 21 seconds.Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/congruence_tests/test_query.py (1)
1550-1551: Consider adding sparse and multi-dense NaN coverage to this congruence test.The congruence test
test_query_with_nancurrently exercises only the dense vector NaN path for gRPC. Sparse and multi-dense NaN rejections are only covered at the unit level (test_validate_conversions.py). Adding end-to-end congruence assertions for those paths here would provide stronger integration confidence.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/congruence_tests/test_query.py` around lines 1550 - 1551, The congruence test only covers the dense-vector NaN path; extend test_query_with_nan to also exercise sparse and multi-dense NaN rejection by invoking grpc_client.query_points with queries that include NaN in a sparse vector and in a multi-dense payload and asserting they raise UnexpectedResponse (same as the existing dense case using COLLECTION_NAME and query), i.e., add additional pytest.raises(UnexpectedResponse) calls around grpc_client.query_points for the sparse and multi-dense query variants so the end-to-end gRPC behavior matches the unit-level tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@qdrant_client/conversions/conversion.py`:
- Around line 3601-3619: _in _validate_no_nan replace the NaN-only check with a
finite check by using math.isfinite (i.e., raise when not math.isfinite(val)) so
±Inf are rejected too, and move the lazy imports (json, Headers from httpx, and
UnexpectedResponse from qdrant_client.http.exceptions) to the module top-level
instead of importing them inside the exception branch; keep the same raised
UnexpectedResponse construction and message but use the module-level names.
---
Nitpick comments:
In `@tests/congruence_tests/test_query.py`:
- Around line 1550-1551: The congruence test only covers the dense-vector NaN
path; extend test_query_with_nan to also exercise sparse and multi-dense NaN
rejection by invoking grpc_client.query_points with queries that include NaN in
a sparse vector and in a multi-dense payload and asserting they raise
UnexpectedResponse (same as the existing dense case using COLLECTION_NAME and
query), i.e., add additional pytest.raises(UnexpectedResponse) calls around
grpc_client.query_points for the sparse and multi-dense query variants so the
end-to-end gRPC behavior matches the unit-level tests.
3aea359 to
7f42dc8
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/congruence_tests/test_multivector_updates.py (1)
56-63: Inconsistent subscript formatting betweenlocal_old_pointandremote_old_point.Line 60 now uses compact
)[0][0], butremote_old_point(lines 61–63) still uses the split)[0][\n 0\n]style. The same asymmetry exists forremote_new_point(lines 82–84).♻️ Proposed consistency fix
- remote_old_point = remote_client.scroll(COLLECTION_NAME, scroll_filter=id_filter, limit=1)[0][ - 0 - ] + remote_old_point = remote_client.scroll(COLLECTION_NAME, scroll_filter=id_filter, limit=1)[0][0]Apply the same change to
remote_new_pointat lines 82–84.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/congruence_tests/test_multivector_updates.py` around lines 56 - 63, The subscripting style is inconsistent: change the split multi-line subscripts for remote_old_point and remote_new_point to the compact form used for local_old_point (i.e., replace the current ")[0][\n 0\n]" style with ")[0][0]") so both local_* and remote_* assignments use the same compact indexing; update the occurrences for remote_old_point and remote_new_point to match local_old_point's ")[0][0]" pattern.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/congruence_tests/test_multivector_updates.py`:
- Around line 56-63: The subscripting style is inconsistent: change the split
multi-line subscripts for remote_old_point and remote_new_point to the compact
form used for local_old_point (i.e., replace the current ")[0][\n 0\n]" style
with ")[0][0]") so both local_* and remote_* assignments use the same compact
indexing; update the occurrences for remote_old_point and remote_new_point to
match local_old_point's ")[0][0]" pattern.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
98f2b54 to
e226991
Compare
|
Hey @Goodnight77 Thanks for the attempt to fix it, but I'd rather find some other solution than make a check on each and every vector in python client since it might quickly cause a performance degradation :( |
|
@joein thnx for your return, that makes sense, i will drop this approach now and think of another solution that avoids the performance cost |
description
validate query vectors for nan values in grpc client before sending to the server
before : grpc client would return points with nan scores when given vector that contains nan, while local and http clients properly rejected it
those nan values in query vectors would propagate through distance calculations, this can causes confusion for users who receives invalid results without clear error message
All Submissions:
devbranch. Did you create your branch fromdev?New Feature Submissions:
pre-commitwithpip3 install pre-commitand set up hooks withpre-commit install?Changes to Core Features: