math: add the flt (binary32) engine (bnd::math::flt::)#7

Merged: NiceAndPeter merged 1 commit into main from math-flt-engine on Jun 22, 2026

@NiceAndPeter (Owner)

Phase 4a of the math-engine plan — the flt (binary32) engine

A third math engine alongside cordic (integer) and dbl (binary64): a small, reproducible libm in float, callable as bnd::math::flt::fn side-by-side with the others in one binary. For single-precision-only FPUs (Cortex-M4F etc.) and size/speed where double-grade precision isn't needed.

Same determinism contract as dbl

New cmath_float.hpp mirrors the double engine in single precision: own fixed polynomials evaluated with the correctly-rounded std::fma(float), Cody-Waite range-reduction constants derived at compile time (mask the low mantissa bits of the float-rounded constant — no external codegen), and correctly-rounded std::sqrt(float). So flt is bit-identical on every IEEE-754 binary32 platform compiled without -ffast-math.

Validated empirically before wiring

Measured the float cores against libm: trig ≈ 0.6 ULP, atan ≈ 0.8 ULP, log ≈ 1 ULP, exp a few ULP, holding across the shared ±2²⁰ domain (the constexpr split keeps float reduction float-grade even at the edge). So flt reuses the same input envelopes, and the same programs compile on every engine. (cos uses proper quadrant reduction, not sin(x+π/2): shifting the float input would lose bits.)
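To illustrate the cos remark, here is a rough comparison of the two approaches. This is a sketch under assumptions: the constants and function names are hypothetical, and `std::sin`/`std::cos` on float stand in for the engine's own polynomial kernels, which are not reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical constants for this sketch (not taken from cmath_float.hpp).
constexpr float kTwoOverPi = 0.636619772f;
constexpr float kHalfPi    = 1.57079633f;
constexpr float kHalfPiHi  = 1.5703125f;       // pi/2 with low mantissa bits cleared
constexpr float kHalfPiLo  = 4.838267949e-4f;  // pi/2 - kHalfPiHi, rounded to float

// Shortcut: the float add x + pi/2 rounds, so low bits of the argument are lost.
inline float cos_by_shift(float x) { return std::sin(x + kHalfPi); }

// Quadrant reduction: x = r + k*(pi/2); pick the kernel from k mod 4.
inline float cos_by_quadrant(float x) {
    const float k = std::nearbyint(x * kTwoOverPi);
    float r = std::fma(-k, kHalfPiHi, x);
    r = std::fma(-k, kHalfPiLo, r);
    switch (static_cast<long>(k) & 3) {
        case 0:  return  std::cos(r);
        case 1:  return -std::sin(r);
        case 2:  return -std::cos(r);
        default: return  std::sin(r);
    }
}

// Absolute error against a double-precision reference for the same input.
inline double abs_err(float got, float x) {
    return std::fabs(static_cast<double>(got) - std::cos(static_cast<double>(x)));
}
```

At `x = 1000.0f` the float spacing is 2⁻¹⁴, so the shifted argument can be off by a few 1e-6 before the kernel even runs, while the reduced remainder stays accurate to roughly float precision.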

Details

  • A third value set: float ≠ double ≠ cordic; snapped results differ by up to a few notches on fine grids, coincide on coarse grids and algebraically-exact inputs.
  • flt:: has the full public-shaped API mirroring dbl:: (auto-deduced grids, domain static_asserts, expected-returning tan/signed-sqrt/pow). Gated behind !BND_MATH_NO_FP; a flt:: call in the FP-free build is a compile error.
  • flt::store widens float→double for real (f64-backed) bounds or snaps via the rational path otherwise.
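The "third value set" point can be sketched as follows. Everything here is an assumption for illustration: `snap` is a hypothetical nearest-notch rule on a uniform grid, not the library's store path, and `std::sin` stands in for the flt and dbl kernels.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical snapping rule: index of the nearest notch on a uniform grid.
inline long long snap(double value, double step) {
    return std::llround(value / step);
}

// Compute in float, widen to double (widening is exact), then snap.
inline long long notch_via_float(float x, double step) {
    return snap(static_cast<double>(std::sin(x)), step);
}

// Compute in double from the same input, then snap.
inline long long notch_via_double(float x, double step) {
    return snap(std::sin(static_cast<double>(x)), step);
}
```

With a coarse step such as `1e-3` both paths land on the same notch for `x = 1.0f`; with a step of 2⁻³⁰, finer than float resolution near 0.84, the two results sit some tens of notches apart.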

Tests / docs

test_math_engines.cpp now instantiates cordic + dbl + flt in one TU, checks they agree on exact inputs, and pins 10 float-engine golden values (determinism). Three-engine docs table + flt section. Single header regenerated.

Deferred to Phase 4b

BND_MATH_FLOAT (make flt the unqualified default) and the f32 storage flag + real→f64 rename.

Note on the earlier design question: the prototype showed native float can range-reduce over the full ±2²⁰ domain (float-grade), so I kept the shared domain rather than imposing tighter flt-only limits — simpler and preserves "the same program compiles on every engine." Happy to add artificial tighter limits if preferred.

Verified: default 404/404, CORDIC 442/442, all single-header smokes build.

🤖 Generated with Claude Code

A third math engine alongside cordic (integer) and dbl (binary64): a small
reproducible libm in `float`, for single-precision-only FPUs (Cortex-M4F etc.)
and size/speed where double-grade precision isn't needed.

  bnd::math::flt::fn   binary32 compute, present unless BND_MATH_NO_FP

Same recipe as the double engine but in single precision (new cmath_float.hpp):
own fixed polynomials evaluated with the correctly-rounded std::fma(float), and
Cody-Waite range-reduction constants DERIVED AT COMPILE TIME (mask the low
mantissa bits of the float-rounded constant — no external codegen), plus the
correctly-rounded std::sqrt(float). So flt is bit-identical on every IEEE-754
binary32 platform — the same determinism contract as dbl.

Design notes:
- Empirically validated the float cores against libm: trig ~0.6 ULP, atan ~0.8,
  log ~1, exp a few ULP, holding across the SHARED +-2^20 domain (the constexpr
  split keeps float reduction float-grade even at the edge). So flt reuses the
  same input envelopes — the same programs compile on every engine. (cos uses
  proper quadrant reduction, not sin(x+pi/2): shifting the float input loses bits.)
- A third value set: float != double != cordic; snapped results differ by up to
  a few notches on fine grids, coincide on coarse grids and exact inputs.
- flt:: has the full public-shaped API mirroring dbl:: (auto-deduced grids,
  domain static_asserts, expected-returning tan/sqrt-signed/pow). Gated behind
  !BND_MATH_NO_FP; a flt:: call under the FP-free build is a compile error.
- flt::store widens float->double for `real` (f64-backed) bounds or snaps via the
  rational path otherwise (an f32-backed storage fast path comes with Phase 4b).

Tests: test_math_engines.cpp now instantiates cordic+dbl+flt in one TU, checks
they agree on exact inputs, and pins 10 float-engine golden values (determinism).
Docs updated (three-engine table + flt section). Single header regenerated.

Deferred to Phase 4b: BND_MATH_FLOAT to make flt the unqualified default, and
the f32 storage flag + real->f64 rename.

Verified: default 404/404, CORDIC 442/442, all single-header smokes build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@NiceAndPeter NiceAndPeter merged commit a2c1ba5 into main Jun 22, 2026
16 checks passed
@NiceAndPeter NiceAndPeter deleted the math-flt-engine branch June 22, 2026 18:56