fix: reject non-finite numeric metadata supplied as strings #405
Open
Mubashirrrr wants to merge 1 commit into
_coerce_number rejects NaN and infinite values for numeric inputs because they break JSON serialization and Postgres double precision storage, but the string branch returned float(text) directly. Strings like "inf", "-inf", "Infinity", "nan", and overflowing literals such as "1e400" (which Python parses to inf) slipped through type coercion when a field was declared as a number, corrupting the stored document.

Apply the same finite-value check to the parsed string result.

Extends the existing rejection test to cover string inputs; the new cases fail before and pass after the fix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
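A minimal sketch of the guarded coercion described in the commit message, under stated assumptions: `coerce_number` and the `TypedMetadataError` class below are simplified stand-ins written for illustration, not the project's actual `_coerce_number` or error type, whose signatures and messages may differ.

```python
import math


class TypedMetadataError(ValueError):
    """Stand-in for the project's error type (assumed, not the real class)."""


def coerce_number(value):
    """Hypothetical coercion helper: parse first, then apply one finite check."""
    if isinstance(value, bool) or not isinstance(value, (int, float, str)):
        raise TypedMetadataError(f"unsupported numeric input: {value!r}")
    try:
        # float() accepts "inf", "Infinity", "nan", and turns "1e400" into inf.
        number = float(value.strip()) if isinstance(value, str) else float(value)
    except (ValueError, OverflowError) as exc:
        raise TypedMetadataError(f"cannot parse {value!r} as a number") from exc
    # The same guard now covers both the numeric path and the string path.
    if not math.isfinite(number):
        raise TypedMetadataError(f"non-finite number is not allowed: {value!r}")
    return number


print(coerce_number("1e5"))  # valid scientific notation still parses
for bad in ["inf", "-inf", "Infinity", "nan", "1e400"]:
    try:
        coerce_number(bad)
    except TypedMetadataError as exc:
        print("rejected:", exc)
```

Running the finite check after parsing, rather than matching on the input text, also catches overflowing literals such as "1e400" that do not look non-finite as strings.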
Bug
`typed_metadata._coerce_number` is used by `normalize_metadata` / `merge_metadata` to coerce user-supplied document metadata according to a declared type. For numeric inputs it explicitly rejects NaN/infinite values:

…

because those values break JSON serialization (`json.dumps(float('inf'))` emits `Infinity`, which is invalid JSON) and Postgres `double precision` storage. But the string branch did not apply the same guard.

So when a field is declared as `number` (or an alias like `int` / `float`) and the value arrives as a string, "inf", "-inf", "Infinity", "nan", and overflowing literals such as "1e400" (Python parses this to `inf`) were coerced into non-finite floats and stored, corrupting the document row at write time. Metadata and type hints both come straight from the ingestion request body (see `core/services/v2_document_service.py` and `core/services/ingestion_service.py`), so this is reachable from a normal API call.

Reproduction
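The behavior can be checked with a minimal sketch along these lines. `coerce_number_unguarded` is a hypothetical stand-in for the pre-fix string branch (it returns the parsed float directly), not the project's actual `_coerce_number`; only standard-library calls are used.

```python
import json
import math


def coerce_number_unguarded(value):
    """Hypothetical pre-fix behavior: strings skip the finite-value check."""
    if isinstance(value, str):
        return float(value.strip())  # no guard on the parsed result
    number = float(value)
    if not math.isfinite(number):
        raise ValueError("non-finite number is not allowed")
    return number


# Every one of these strings parses to a non-finite float.
for text in ["inf", "-inf", "Infinity", "nan", "1e400"]:
    result = coerce_number_unguarded(text)
    print(repr(text), "->", result, "| finite:", math.isfinite(result))

# By default json.dumps emits a token that is not valid JSON.
print(json.dumps({"score": coerce_number_unguarded("1e400")}))  # {"score": Infinity}

# With allow_nan=False the serializer refuses the value instead.
try:
    json.dumps({"score": coerce_number_unguarded("nan")}, allow_nan=False)
except ValueError as exc:
    print("strict JSON serialization fails:", exc)
```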
Fix
Apply the existing finite-value check to the parsed string result, raising the same `TypedMetadataError` the numeric path uses. Four lines added, one changed; valid numeric strings (including scientific notation like "1e5") are unaffected.

Regression test
Extended `test_number_coercion_rejects_nan_and_infinity` with a string-input case (`test_number_coercion_rejects_nan_and_infinity_strings`) covering `inf`, `-inf`, `Infinity`, `nan`, and `1e400`. It fails before the fix (DID NOT RAISE) and passes after. The full `test_typed_metadata.py` suite stays green (47 passed).

🤖 Generated with Claude Code