Make ffc.h the default string to double parser#318
Conversation
|
Couple things we need to decide
Edit: Now the only question is whether we want to universally make a copy and null terminate the string before passing it to |
Local tests show a ~24% performance increase on float heavy payloads.
```bash
❯ hyperfine './fast-float' './main'
Benchmark 1: ./fast-float
Time (mean ± σ): 772.5 ms ± 12.9 ms [User: 368.0 ms, System: 23.6 ms]
Range (min … max): 757.2 ms … 793.6 ms 10 runs
Benchmark 2: ./main
Time (mean ± σ): 975.0 ms ± 14.3 ms [User: 569.3 ms, System: 22.9 ms]
Range (min … max): 954.8 ms … 990.5 ms 10 runs
Summary
./fast-float ran
1.26 ± 0.03 times faster than ./main
```
I also added a define so package managers can disable the custom perser
if they have rules around such things.
This is the ffc.h sha vendored:
kolemannix/ffc.h@b1894aa
Signed-off-by: michael-grunder <michael.grunder@gmail.com>
980a058 to
c230563
Compare
zuiderkwast
left a comment
There was a problem hiding this comment.
It's a minified version of ffc.h, right? We should write down in some file (README or elsewhere) exactly which version and commit hash of ffc we're vendoring. This is important for users that need to provide exact SBOMs of their software (for example Ericsson).
I don't know if we need to allow building without ffc. If we do, then we should avoid the duplicate code for various error logging, etc. A wrapper function could solve it.
| #ifndef VALKEY_USE_STRTOD | ||
| /* For backward compatibility define the above macro to still use `strtod` for double parsing */ | ||
| #define FFC_IMPL |
There was a problem hiding this comment.
Backward compatibility? Are you saying ffc is not behaving exactly the same?
I don't know if we need to keep both variants. If we can go with only ffc, it'd be less conditional code and fewer combinations to test.
I valkey's valkey_strtod.c we have a few more defines, like FFC_DEBUG 0. Not sure if it's needed but maybe it makes it faster. And some options like allowing leading plus. The RESP specification seems to allow leading plus: https://valkey.io/topics/protocol/#doubles
#define FFC_IMPL
#define FFC_DEBUG 0
#include "ffc.h"
const ffc_parse_options valkey_strtod_options = {
FFC_PRESET_GENERAL | FFC_FORMAT_FLAG_ALLOW_LEADING_PLUS,
'.'};There was a problem hiding this comment.
My preference is only using ffc.h as well but was just being conservative.
| (len == 4 && strncasecmp(p, "-nan", 4) == 0)) { | ||
| d = NAN; /* nan. */ | ||
| } else { | ||
| #ifdef VALKEY_USE_STRTOD |
There was a problem hiding this comment.
I see there are multiple ifdef blocks like this.
In valkey, I think we use ffc unconditionally. If we would want conditional strtod/ffc, we shouldn't have the ifdefs in multiple places. We could instead use a wrapper, like valkey_strtod and the variants in https://github.com/valkey-io/valkey/pull/3329/changes#diff-6e5008f926da2db1650da5f500b32c47894cba6b5ef4930aacd883c3b7660738
We probably should since we should get a clash at linktime when building libvalkey as a dependency in valkey.
We also use different versions of ffc, 26.03.02 in valkey vs our 26.04.01 |
I have only seen that libvalkey-py uses its own implementation, but it only uses the double from the callback. |
|
If I'm reading this correctly
It's just a historical artifact going way back probably so clients can use Edit:
I think we're ok because we use hidden visibility? ❯ nm lib/libvalkey.so|grep ffc_
000000000001f570 t ffc_digit_comp.isra.0
0000000000025230 t ffc_from_chars_double
0000000000023eb0 t ffc_from_chars_double_options
0000000000022c70 t ffc_from_chars_double_options.constprop.0.isra.0
0000000000026560 t ffc_from_chars_float
0000000000025280 t ffc_from_chars_float_options
0000000000021aa0 t ffc_from_chars_float_options.constprop.0.isra.0
000000000000cb00 r ffc_large_power_of_5
0000000000025240 t ffc_parse_double
0000000000025250 t ffc_parse_double_simple
0000000000026570 t ffc_parse_float
0000000000026580 t ffc_parse_float_simple
0000000000026b80 t ffc_parse_i32
0000000000027410 t ffc_parse_i32_simple
00000000000265b0 t ffc_parse_i64
0000000000027150 t ffc_parse_i64_simple
00000000000276d0 t ffc_parse_json_number
0000000000023ea0 t ffc_parse_options_default
0000000000026e90 t ffc_parse_u32
0000000000027440 t ffc_parse_u32_simple
00000000000268d0 t ffc_parse_u64
0000000000027180 t ffc_parse_u64_simple
000000000000ca60 r ffc_powers_of_ten_uint64 |
Signed-off-by: michael-grunder <michael.grunder@gmail.com>
ce5ba11 to
0e8fbb6
Compare
|
Added a commit using it universally. I perfer that as long as we can get away with it. I kept the "no infnan" in the options because we specail case them. |
I think we don't need to fix all of these points. If there is no symbol collision, then why should we keep the fallback to strtod? Is there another reason? If we keep the fallback, then we can use it when building libvalkey as part of valkey. Then we don't get symbol collisions, thus no need to hide the symbols and/or use same version as in valkey. But I guess it's still a good idea to hide the symbols for other projects that include libvalkey and also use ffc separately. |
|
Agreed, my last commit just uses ffc.h without a fallback. Much simpler change. |
Local tests show a ~26% performance increase on float heavy payloads.
I also added a define so package managers can disable the custom perser if they have rules around such things.
This is the ffc.h sha vendored:
kolemannix/ffc.h@b1894aa