Replace Ryu with a C99 port of fast_float #1006

Merged: byroot merged 2 commits into ruby:master from byroot:eisel-lemire-float on Jun 16, 2026

byroot (Member) commented Jun 15, 2026

Unlike Ryu, it remains correct up to 18 mantissa digits,
meaning we don't have to fall back to Ruby's much slower
`rb_cstr_to_dbl` as often.
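
To make the 18-digit limit concrete, here is a minimal sketch of a scanner that accumulates the mantissa and flags when a number has too many significant digits for the fast decoder. It is an illustration only, not the code in this PR: `scan_number`, `scanned_number` and `MAX_SAFE_MANTISSA_DIGITS` are hypothetical names, and the input is assumed to be an already validated, NUL-terminated JSON number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit, taken from the description above. */
#define MAX_SAFE_MANTISSA_DIGITS 18

/* Hypothetical result of scanning a decimal literal. */
typedef struct {
    uint64_t mantissa;       /* significant digits as an integer */
    int      exponent10;     /* value == mantissa * 10^exponent10 */
    int      digits;         /* significant digits accumulated */
    bool     negative;
    bool     needs_fallback; /* too many digits: use the slow path */
} scanned_number;

static scanned_number scan_number(const char *s)
{
    scanned_number n = {0, 0, 0, false, false};
    bool after_point = false;

    if (*s == '-') { n.negative = true; s++; }

    for (;; s++) {
        if (*s == '.' && !after_point) { after_point = true; continue; }
        if (*s < '0' || *s > '9') break;

        if (n.digits == 0 && *s == '0') {
            /* Leading zeros are not significant digits. */
            if (after_point) n.exponent10--;
            continue;
        }
        if (n.digits < MAX_SAFE_MANTISSA_DIGITS) {
            /* 18 decimal digits always fit in a uint64_t. */
            n.mantissa = n.mantissa * 10 + (uint64_t)(*s - '0');
            n.digits++;
            if (after_point) n.exponent10--;
        } else {
            /* Conservative: any digit past the limit forces the fallback. */
            n.needs_fallback = true;
            if (!after_point) n.exponent10++;
        }
    }

    if (*s == 'e' || *s == 'E') {
        int sign = 1, e = 0;
        s++;
        if (*s == '+') s++;
        else if (*s == '-') { sign = -1; s++; }
        for (; *s >= '0' && *s <= '9'; s++) {
            if (e < 10000) e = e * 10 + (*s - '0'); /* clamp, avoid overflow */
        }
        n.exponent10 += sign * e;
    }
    return n;
}
```

With this shape, a caller would hand `mantissa` and `exponent10` to the fast decoder when `needs_fallback` is false, and pass the original string to `rb_cstr_to_dbl` otherwise.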

On `canada.json`, the most number-heavy benchmark, `ffp_s2d` is less
than 5% of total runtime. Most of the time is actually spent parsing,
not decoding.
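
For readers unfamiliar with the split, "decoding" here means turning an already scanned decimal mantissa and power-of-ten exponent into a `double`. The sketch below shows only the classic exact fast path (small mantissa, small exponent), not the Eisel-Lemire algorithm that `fast_float` is known for and not the actual `ffp_s2d` code. `decode_fast_path` is a hypothetical name, and the sketch assumes IEEE 754 doubles evaluated without excess precision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Powers of ten that are exactly representable as a double. */
static const double POW10[] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
    1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
    1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

/*
 * Hypothetical helper: returns true and stores the result in *out when
 * the exact fast path applies, false when a general algorithm is needed.
 */
static bool decode_fast_path(uint64_t mantissa, int exponent10,
                             bool negative, double *out)
{
    double d;

    /* The mantissa must convert to a double without rounding. */
    if (mantissa > ((uint64_t)1 << 53)) return false;
    /* The power of ten must be exactly representable too. */
    if (exponent10 < -22 || exponent10 > 22) return false;

    /* One correctly rounded operation on two exact operands. */
    d = (double)mantissa;
    if (exponent10 < 0) d /= POW10[-exponent10];
    else                d *= POW10[exponent10];

    *out = negative ? -d : d;
    return true;
}
```

The decode step is a handful of branches and one floating-point operation in the common case, while the scanner has to walk every byte of the number.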

There is a more complete C99 port of `fast_float` at
https://github.com/kolemannix/ffc.h which also does parsing and
could be worth considering. However, it might make it much harder
to detect when we're outside the safe range and need to fall back,
so it's unclear whether it would be a win.

Credit to Tilo Sloboda (@tilo) for the C99 port (very slightly
reworked by myself).

```
== Parsing citm_catalog.json (500124 bytes)
ruby 4.0.5 (2026-05-20 revision 64336ffd0e) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
               after    73.000 i/100ms
Calculating -------------------------------------
               after    733.301 (± 1.0%) i/s    (1.36 ms/i) -      3.723k in   5.077044s

Comparison:
before:      716.8 i/s
 after:      733.3 i/s - 1.02x  faster

== Parsing float parsing (2090303 bytes)
ruby 4.0.5 (2026-05-20 revision 64336ffd0e) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
               after    35.000 i/100ms
Calculating -------------------------------------
               after    353.130 (± 0.3%) i/s    (2.83 ms/i) -      1.785k in   5.054791s

Comparison:
before:      323.1 i/s
 after:      353.1 i/s - 1.09x  faster
```

byroot (Member, Author) commented Jun 15, 2026

Needs a fix for Windows support: `error C4013: '__builtin_clzll' undefined; assuming extern returning int`
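
C4013 is an MSVC diagnostic, and `__builtin_clzll` is a GCC/Clang builtin that MSVC does not provide. One common way to handle this, sketched below, is a small wrapper that uses `_BitScanReverse64` from `<intrin.h>` on 64-bit MSVC targets and a plain loop elsewhere. This is not necessarily the fix that was applied in this PR, and `portable_clz64` is a hypothetical name.

```c
#include <stdint.h>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_ARM64))
#include <intrin.h>
#endif

/*
 * Hypothetical helper: number of leading zero bits in x.
 * Precondition: x != 0 (__builtin_clzll is undefined for 0 as well).
 */
static inline int portable_clz64(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_clzll(x);
#elif defined(_MSC_VER) && (defined(_M_X64) || defined(_M_ARM64))
    unsigned long index;
    /* index receives the position of the highest set bit (0..63). */
    _BitScanReverse64(&index, x);
    return 63 - (int)index;
#else
    /* Generic fallback: shift left until the top bit is set. */
    int n = 0;
    while (!(x & ((uint64_t)1 << 63))) {
        x <<= 1;
        n++;
    }
    return n;
#endif
}
```

`_BitScanReverse64` is only available on 64-bit targets, which is why 32-bit MSVC builds take the generic loop in this sketch.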

byroot merged commit c171e34 into ruby:master on Jun 16, 2026
42 checks passed