math: add the flt (binary32) engine (bnd::math::flt::) #7
Merged
A third math engine alongside cordic (integer) and dbl (binary64): a small reproducible libm in `float`, for single-precision-only FPUs (Cortex-M4F etc.) and size/speed where double-grade precision isn't needed.

    bnd::math::flt::fn    binary32 compute, present unless BND_MATH_NO_FP

Same recipe as the double engine but in single precision (new cmath_float.hpp): own fixed polynomials evaluated with the correctly-rounded std::fma(float), and Cody-Waite range-reduction constants DERIVED AT COMPILE TIME (mask the low mantissa bits of the float-rounded constant; no external codegen), plus the correctly-rounded std::sqrt(float). So flt is bit-identical on every IEEE-754 binary32 platform: the same determinism contract as dbl.

Design notes:
- Empirically validated the float cores against libm: trig ~0.6 ULP, atan ~0.8, log ~1, exp a few ULP, holding across the SHARED ±2^20 domain (the constexpr split keeps float reduction float-grade even at the edge). So flt reuses the same input envelopes: the same programs compile on every engine. (cos uses proper quadrant reduction, not sin(x+pi/2): shifting the float input loses bits.)
- A third value set: float != double != cordic; snapped results differ by up to a few notches on fine grids, coincide on coarse grids and exact inputs.
- flt:: has the full public-shaped API mirroring dbl:: (auto-deduced grids, domain static_asserts, expected-returning tan/sqrt-signed/pow). Gated behind !BND_MATH_NO_FP; a flt:: call under the FP-free build is a compile error.
- flt::store widens float->double for `real` (f64-backed) bounds or snaps via the rational path otherwise (an f32-backed storage fast path comes with Phase 4b).

Tests: test_math_engines.cpp now instantiates cordic+dbl+flt in one TU, checks they agree on exact inputs, and pins 10 float-engine golden values (determinism). Docs updated (three-engine table + flt section). Single header regenerated.
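The golden-value pinning mentioned under Tests can be sketched as below. This is an illustration only, not the contents of test_math_engines.cpp: `bits_of`, `Golden`, and `goldens_hold` are hypothetical names, and the two pinned operations are the standard `std::sqrt(float)` and `std::fma(float)` rather than any `flt::` function, whose signatures are not shown here. The point is that a determinism test compares exact bit patterns, not values within a tolerance.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Bit pattern of a binary32 value, read via memcpy (well-defined type punning).
inline std::uint32_t bits_of(float v) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "binary32 expected");
    std::uint32_t u;
    std::memcpy(&u, &v, sizeof u);
    return u;
}

// One pinned result: the computed value and the bit pattern it must have.
struct Golden {
    float got;
    std::uint32_t want;
};

// Returns true when every computed value matches its pinned bit pattern.
inline bool goldens_hold() {
    // Volatile reads keep the inputs opaque so the calls run at run time.
    volatile float two_v = 2.0f;
    volatile float tenth_v = 0.1f;
    const float two = two_v;
    const float tenth = tenth_v;

    const Golden table[] = {
        // Correctly rounded sqrt(2) in binary32.
        {std::sqrt(two), 0x3FB504F3u},
        // 0.1f * 10 - 1 with a single rounding: exactly 2^-26.
        {std::fma(tenth, 10.0f, -1.0f), 0x32800000u},
    };
    for (const Golden& g : table) {
        if (bits_of(g.got) != g.want) return false;
    }
    return true;
}
```

Both pinned operations are correctly rounded under IEEE 754, so their patterns are the same on every conforming binary32 platform. Goldens for polynomial cores would need to be recorded from a reference build instead.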
Deferred to Phase 4b: BND_MATH_FLOAT to make flt the unqualified default, and the f32 storage flag + real->f64 rename.

Verified: default 404/404, CORDIC 442/442, all single-header smokes build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
## Phase 4a of the math-engine plan: the `flt` (binary32) engine

A third math engine alongside `cordic` (integer) and `dbl` (binary64): a small, reproducible libm in `float`, callable as `bnd::math::flt::fn` side-by-side with the others in one binary. For single-precision-only FPUs (Cortex-M4F etc.) and size/speed where double-grade precision isn't needed.

### Same determinism contract as `dbl`

New `cmath_float.hpp` mirrors the double engine in single precision: own fixed polynomials evaluated with the correctly-rounded `std::fma(float)`, Cody-Waite range-reduction constants derived at compile time (mask the low mantissa bits of the float-rounded constant; no external codegen), and correctly-rounded `std::sqrt(float)`. So `flt` is bit-identical on every IEEE-754 binary32 platform compiled without `-ffast-math`.

### Validated empirically before wiring

Measured the float cores against libm: trig ≈ 0.6 ULP, `atan` ≈ 0.8, `log` ≈ 1, `exp` a few ULP, holding across the shared ±2²⁰ domain (the constexpr split keeps float reduction float-grade even at the edge). So `flt` reuses the same input envelopes, and the same programs compile on every engine. (`cos` uses proper quadrant reduction, not `sin(x+π/2)`: shifting the float input would lose bits.)

### Details

- float ≠ double ≠ cordic; snapped results differ by up to a few notches on fine grids, coincide on coarse grids and algebraically-exact inputs.
- `flt::` has the full public-shaped API mirroring `dbl::` (auto-deduced grids, domain `static_assert`s, expected-returning `tan`/signed-`sqrt`/`pow`). Gated behind `!BND_MATH_NO_FP`; a `flt::` call in the FP-free build is a compile error.
- `flt::store` widens float→double for `real` (f64-backed) bounds or snaps via the rational path otherwise.

### Tests / docs

`test_math_engines.cpp` now instantiates cordic + dbl + flt in one TU, checks they agree on exact inputs, and pins 10 float-engine golden values (determinism). Three-engine docs table + `flt` section. Single header regenerated.

### Deferred to Phase 4b

`BND_MATH_FLOAT` (make `flt` the unqualified default) and the `f32` storage flag + `real`→`f64` rename.

Verified: default 404/404, CORDIC 442/442, all single-header smokes build.
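For readers unfamiliar with the compile-time Cody-Waite split described above, here is a minimal sketch of the idea. It is not the code in `cmath_float.hpp`. The `sketch` namespace, constant names, the 12-bit split width, and the use of a `double` literal to derive the low part are all assumptions made for illustration. It clears the low bits arithmetically so it compiles as plain C++14; an integer mask applied through `std::bit_cast` is the C++20 spelling of the same operation. The two-part split is only adequate for small `|x|` and does not reproduce the ±2²⁰ envelope.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

namespace sketch {  // hypothetical namespace, not the library's

// pi/2 as a double literal, then rounded to float.
constexpr double kHalfPiD = 1.57079632679489661923;
constexpr float kHalfPiF = static_cast<float>(kHalfPiD);
constexpr float kTwoOverPi = static_cast<float>(0.63661977236758134308);

// High part: keep only the top 12 significand bits of the float-rounded
// constant. For a value in [1, 2) this is the same as masking the low 12
// stored mantissa bits, i.e. truncating to a multiple of 2^-11.
constexpr float kHi =
    static_cast<float>(static_cast<std::int32_t>(kHalfPiF * 2048.0f)) / 2048.0f;

// Low part: what the high part left out, rounded to float.
// Assumption: derived from the double literal at compile time.
constexpr float kLo = static_cast<float>(kHalfPiD - static_cast<double>(kHi));

static_assert(kHi == 1.5703125f, "high part should be 201/128");

struct Reduced {
    float r;       // remainder, roughly in [-pi/4, pi/4]
    int quadrant;  // k mod 4, selects the sin/cos core and the sign
};

// Two-step reduction x - k*(hi + lo). The product k*hi is exactly
// representable for |k| < 2^12, and each fma rounds only once.
inline Reduced reduce(float x) {
    const float k = std::nearbyint(x * kTwoOverPi);
    float r = std::fma(-k, kHi, x);
    r = std::fma(-k, kLo, r);
    return {r, static_cast<int>(k) & 3};
}

}  // namespace sketch
```

A cosine built on such a reduction would pick its polynomial and sign from `quadrant` rather than adding π/2 to `x` first, which is the point of the quadrant-reduction remark: the addition would round `x` before any reduction happens.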
🤖 Generated with Claude Code