Skip to content

More lenient float multipleOf validation#878

Closed
tdamsma wants to merge 1 commit intopython-jsonschema:mainfrom
tdamsma:lenient-multipleOf-float-validation
Closed

More lenient float multipleOf validation#878
tdamsma wants to merge 1 commit intopython-jsonschema:mainfrom
tdamsma:lenient-multipleOf-float-validation

Conversation

@tdamsma
Copy link
Copy Markdown

@tdamsma tdamsma commented Nov 5, 2021

Many user have ran into floating point precision errors when validation multipleOf. E.g. #818, #810, #687, #185, #320, #247. Technically it can be argued that it is silly to check if any number is an integer multiple of e.g. 0.1. Ask silly questions, get silly answers. This approach is a bit unhelpful though.

Looking at other implementations in javascript and python, other libraries struggle with this too

Checking if 10.1 is a multiple of 0.1 yields mixed results:

>>> import fastjsonschema
>>> fastjsonschema.validate({"multipleOf": 0.1}, 10.1)
10.1

>>> import jsonschema_rs
>>> jsonschema_rs.JSONSchema.from_str('{"multipleOf": 0.1}').is_valid(10.1)
False
>>> jsonschema_rs.JSONSchema({"multipleOf": 0.1}).is_valid(10.1)

https://jsonschema.dev/ fails validation
https://www.jsonschemavalidator.net/ accepts it

And react-jsonschema-form accepts it.

So there is no real consensus amongst the libraries. Still I find the number of issues a strong indication that the current behaviour of this library is unexpected. This PR's modifies the multipleOf behaviour to allow for float tolerance (epsilon) to be taken into account.

@Julian
Copy link
Copy Markdown
Member

Julian commented Nov 5, 2021

Hi. Thanks for this but I'm going to politely decline.

As I mentioned in the other issues, if someone wants non-float behavior, there is already a way to get it via Decimals.

Otherwise, this change just moves which floats are incorrect multiples from some to others.

If you indeed are interested in this, I'd point you to this PR which claims there is a better algorithm entirely for us to use.

I don't really think that looking at the number of tickets filed is a good metric for surprisability (and I'm not sure I believe surprise is the right metric at all in this case, this library implements a spec, so for better or worse it's compliance with the spec that trumps).

But just looking at issues filed is going to be biased against the non-current behavior -- people who expect this behavior aren't filing tickets saying things worked as expected :)

But do appreciate the PR nonetheless.

@Julian Julian closed this Nov 5, 2021
@tdamsma
Copy link
Copy Markdown
Author

tdamsma commented Nov 8, 2021

@Julian, I understand and sympathize with your points, but am going to try to convince you otherwise nonetheless.

I see now that you have been involved in discussions on the jsonschema spec itself, without reaching a meaningful course of action. I just see many valid points raised, linked issues, calls for clarification of the spec, other languages sturggling with this etc.

My take is that it is up to the language to interpret/implement this as they want (correct me if I'm wrong here). The spec offers no real guidance here (and whilest I think that should be improved, that is another discussion), for the python implementation I think python's best practices should be guiding.

And that is where it becomes a bit unclear I guess. On first sight, python works like this:

>>> 0.1*round(10.1/0.1)
10.100000000000001
>>> 0.1*(10.1/0.1)
10.1

And it seems fair to reason that 10.1 is indeed not an integer multiple of 0.1.

But digging a bit deeper, both python itself and the native json library already do some implicit rounding:

>>> import json
>>> v = json.loads("0.100000000000000001")
>>> print(v)
0.1
>>> print(f"{v:.20g}")
0.10000000000000000555
>>> print(json.dumps(v))
0.1

There are three occurences of implicit rounding here:

  • the input is rounded to the nearest float.
  • then printing the float, the value is by default represented as 0.1. This behaviour was explcitily introduced in python https://docs.python.org/3/tutorial/floatingpoint.html
  • when converting it back into json, the number is also represented as 0.1

So the underlying philosophy seems (to a certain degree) to be to treat float(0.1) as exactly 1/10.

So even though the behaviour below is strict/correct/explainable:

 >>> 1/10 == 0.1
True
>>> 10.1 / 101 == 0.1
False

This is also the case:

jsonschema.validate(instance=json.loads("10.00000000000000005"), schema={"multipleOf": 1})

If I really cared, I should have used a Decimal representation:

>>> import simplejson 
>>> import jsonschema
>>> jsonschema.validate(instance=simplejson.loads("10.00000000000000005",use_decimal=True), schema={"multipleOf": 1})

Failed validating 'multipleOf' in schema:
    {'multipleOf': 1}

On instance:
    Decimal('10.00000000000000005')

So whilest you have argued that is you want more correct behaviour of multipleOf one should use Decimal and not float, I woul like to argue that I think it is more pythonic to silently add some leniency to multipleOf validation (equal to the float representation error) just like python displays float(0.1) differently from Decimal('0.1'). If you really care about exactness here use Decimal, otherwise do some rounding and get on with it.

this change just moves which floats are incorrect multiples from some to others.

Not really, it just checks (or should check) if the number is equal to the nearest representable float.

Now there are many different ways to implement this, and perhaps one based on Decimal would feel less arbitrary:

instance = 10.1
dB = 0.1
instance_d = Decimal(repr(instance))
dB_d = Decimal(repr(dB))
quotient = instance_d / dB_d
failed = int(quotient) != quotient

If the users intention actually is to use arbitrary precision decimal representaion they can do so by using the Decimal package explicitly, something they should be doing anyways as this holds:

>>> Decimal(1) / Decimal(0.1)
Decimal('9.999999999999999444888487687')
>>> 1 / 0.1
10.0

Which in the current implementation actually leads to the arbitrary behaviour where some numbers are multiples of some irrepresentable numbers and others are not.

@Julian
Copy link
Copy Markdown
Member

Julian commented Nov 12, 2021

Apologies for not having more time to respond in detail, but there's no implicit rounding in your examples -- what Python does in recent versions is a very targeted, very specific display change -- https://bugs.python.org/issue1580

It reprs floats by using the shortest input that produces the given float. But no rounding is happening, 0.1 isn't a representable float, so you always have gotten the same float, before or after the repr change. And absolutely no tolerance is added -- it's strictly a display change.

There's as far as I know no precedence anywhere in Python for introducing some implicit tolerance when doing float operations unless the user explicitly asks for it, and what jsonschema does matches that -- it does float operations and gives the same answers Python does if you do them yourself:

>>> v = 10.00000000000000005 / 1
>>> print(f"{v:.50g}")
10

Julian added a commit that referenced this pull request May 3, 2026
c7257e92 refOfUnknownKeyword.json rm unused $id (#890)
bf54c9bd support negative multiples (#888)
e819f329 Fix broken test in v1/ref.json from PR #873
29965c01 Merge pull request #873 from jdesrosiers/fix-v1-meta-schema-ref
3b08f79b test: add some missing edge cases to json-pointer format tests (#877)
a1de34ae test(format): allow consecutive hyphens in hostname (RFC1123) (#874)
aac8a1bd chore: remove duplicate idn-email test and fix ambiguous test descriptions (#881)
78048ead format: add RFC 3339 unknown local offset (-00:00) test for time (#878)
ee3fc3d8 test(draft2020-12): add boundary coverage for maxContains = 0 (#857)
54ed4d1f Add additional URI edge cases and include them in v1 (#850)
87f09d65 Add missing email format edge cases across drafts (#849)
f2720d32 more tests for the duration format: missing units (#875)
e391238a Draft-06/7 don't allow empty enums (#876)
c7fc5095 Add tests for empty enum validation (#870)
20f4897c docs: clarify intent of format-assertion vocabulary test cases (#854)
eb018552 Add annotation tests for $dynamicRef and $dynamicAnchor (#862)
0a1dc5e9 fix: remove duplicate allOf and maxItems entries in draft2020-12 KNOWN set (#865)
06e3102d docs: fix typos and wording in suite documentation (#867)
031e9e9a test(duration): expand RFC 3339 grammar coverage for duration format (#868)
3181cefa Fix issues with v1 tests
06481b14 test: add unicode pattern and patternProperties tests for draft2020-12 (#837)
601aa708 Add exhaustive ABNF-driven tests with RFC 2673 compliance for ipv4 format (#840)
583d7c62 Merge pull request #851 from Anshikakalpana/fix/maximum-test-description-mismatch
61bfe27a fix: apply description fix to v1 maximum test
516d71f7 fix: align test description with actual valid field in maximum.json across drafts
4253477c Merge pull request #836 from VIDIT45AGARWAL/date-time
9e787e38 Add tests for out-of-range time components in date-time format
bce6a47c Merge pull request #793 from 414owen/os/fix-unevaluatedproperties-test-case
a17eb1f6 Merge pull request #833 from JulesGosnell/add-m3-validator
648811b2 Add M3 JSON Schema validator to Who Uses section
32fa6e28 Merge pull request #827 from Shristibot/required-null
086bf9a1 Revert unintended .gitignore change
909b58d7 Remove unrelated files from PR
c957e914 Add boolean and null cases to draft-04 , draft-06 and v1
0650027c Add boolean case to required non-object validation
d8d15620 Reset README to upstream to avoid unrelated changes
f6523449 Add required + null test case for issue #821
75995a1c Merge pull request #818 from fperrad/lua-schema
7e6b3a15 Add lua-schema in README.md
8b826d6b Merge pull request #816 from kpavlov/kotlinx-schema-link
b3ae21c2 Update README.md
646f11da Merge pull request #812 from Shristibot/clarification-on-test-suite
ba3f9505 Update README.md
3fc54179 Merge pull request #808 from json-schema-org/ether/more-fix-remotes
3550ca73 clarify optional tests and document integer example
fac4f6f6 Clarify that the test-suite is not  astyle guide
ece5bd10 Merge pull request #804 from Shristibot/add-if-then-else-tests
525adc90 move aside remotes that are not valid in earlier drafts
bbe22f88 remove redundant remotes that also exist in draft-specific directories
ea0baf8d these files are no longer used, and contain an unresolveable $ref on earlier drafts
bef1e2ed Remove redundant then/else false test cases
910af732 Merge pull request #803 from json-schema-org/gregsdennis/proposals
23ff2750 Fix incorrect test condition, simplify test, and copy tests to all drafts as requested
ccfb91af Fix description style and add false-handling tests for issue #767
bc1763ed Fix description formatting in false-handling tests
bd13fcf7 Add if/then/else test cases for issue #767
dc659514 added documentation about folder in root readme and folder readme
c4a99e00 move propertyDependencies tests to proposals folder
d69537ac Merge pull request #802 from json-schema-org/ether/draft4-boolean-schema
79ee846d Merge pull request #801 from json-schema-org/ether/missing-not-tests
2da8f9a6 remove unused remotes files
15cc161f add missing not tests from #714
47ff61c2 Merge pull request #799 from frawa/frawa-patch-1
4a640206 Merge pull request #800 from frawa/frawa-patch-2
fdabd3f9 Merge pull request #795 from anmolbhadoriya5849/fix/issue-768-additionalItems-case
cd02dece Add Unison section to README
4da52d58 Remove 'typed-json' link from README
232f0795 Revert removal of additionalProperties applicator tests in v1
b4032b96 Revert removal of additionalProperties applicator tests in draft2020-12
df833be4 Remove test case for additionalProperties behavior
0d54c828 Remove test for additionalProperties behavior
756eebd6 Remove valid case for additionalItems validation
a3feff79 Remove valid case for additionalItems validation
8ba96359 Remove test for additionalItems applicators
f3488882 Remove valid test case for additionalItems
da4a5684 Remove valid case for additionalItems validation
a247442b Merge pull request #794 from json-schema-org/ether/dynamic-scopes-subschemas
19810ec1 Add a test for respecting dynamic scopes while avoiding the root of each schema
e0e431f8 Fix unevaluatedProperties & additionalProperties test case
8da0a7bc Merge pull request #792 from jdesrosiers/more-v1-transistion
94c0b0e1 Fix more "draft/next" occurances that we missed
8fd4fcec Merge pull request #786 from jdesrosiers/a-label-tests
be58fa98 Merge pull request #791 from json-schema-org/ether/fix-draft-next-removal
acaece38 stop using the term "draft" to mean "version", and fix the remaining mentions of draft-next (now v1)
980e1052 Merge pull request #790 from dylankerr-bis/add-test-sibling-nested-ref-id
b5037ed7 Add test for sibling $ref and $id in nested schema to v1 suite
b8bb8585 Add more comprehensive tessts for A-labels in the "hostname" format
9b8ef536 Cleanup hostname tests (remove dups, fix descriptions, reorder)
4cf55996 Merge pull request #776 from davishmcclurg/idn-hostname-separators
a930db43 Shorten test descriptions to appease ci
aab08752 Updates form PR feedback
081a16a7 Test IDN label separators separate labels
658c8cf9 Test IDN label separator in `hostname` format
38c04b83 Test label separator position in hostname formats
cabbd65c Merge pull request #735 from zaplapl/main
ed74ee74 Add unicode tests to all the dialects
fd0df6cf Update unicode tests per PR feedback
60098220 base test implementation for const equality for strings based on evalutaion by individual codepoint
ef78a0eb Merge pull request #683 from mwadams/main
db365735 Added tests to encoded refs to unknown keywords.
59944b7e Add test for sibling $ref and $id in nested schema to 2019 suite
2196a531 Add test for sibling $ref and $id in nested schema to 2020 suite
e99b24c9 Merge pull request #788 from jdesrosiers/next-to-v1
7eac79be Bring tests up-to-date with format changes
ae27be13 Bring dynamic reference tests up to date with spec changes
79dc2fd3 Unknown optional vocabs can't be ignored anymore
51f84640 Change draft-next to v1
f4e41b0c Merge pull request #785 from jdesrosiers/uneval-nested-conflicts
60a2633f Merge pull request #778 from koplas/extend-uri-validation-tests
bf65ef20 Copy uri tests to other drafts
02a0d89f Merge pull request #743 from Vinit-Pandit/MeastroZI-patch-1
c1545e6c Update unevaluatedProperties.json
7e970016 Merge pull request #764 from Dragonsangel/date_time_validation
8970b1ca Add tests for evaluated property/item nesting conflicts
677a9399 Merge pull request #784 from sangamon/781-date-time-format-extended-year
9672a5b6 Merge pull request #783 from sangamon/780-uuid-format-shifted-dashes
28e49b14 add test for invalid "shifted" dashes in uuid format (#780)
aecf038c add test for invalid extended year in date-time format (#781)
15e4505b Merge pull request #765 from JDepooter/unevaluated_items_with_mincontains_0
b8611c40 Extend URI and URI-reference tests
48461fc3 Merge pull request #775 from jviotti/annotations-content-id
9f7dce32 Merge pull request #766 from json-schema-org/ether/bare-email
a1e077d4 ensure we do not parse email strings permissively
72c670b0 Delete identifiers in Content annotation tests
675186ba Take into account `$id` for annotation schema location assertions
100e9823 Add tests for unevaluatedItems with minContains = 0
0be5549b Added tests to ensure date and time formats don't accept date-time formats

git-subtree-dir: json
git-subtree-split: c7257e92580678a086f0b9243a1903ed88bd27f7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants