math: add f32 (binary32) storage — the float sibling of f64 #10

Merged: NiceAndPeter merged 2 commits into main from math-storage-f32 on Jun 22, 2026

@NiceAndPeter

Phase 4b (part 2b) — f32 (binary32) storage · the final plan item

A bound carrying f32 holds its value in a binary32 raw (float) — the single-precision sibling of f64, for single-precision FPUs and the flt engine. The natural pairing flt compute + f32 storage keeps the whole path in hardware float with no double round-trip at the I/O boundary (this closes the soft-double cost discussed during review).

Approach: fp_raw unification (not a parallel float path)

fp_raw = f64_raw || f32_raw. Read/store/compare/arithmetic compute in double and narrow to the raw type on store — lossless because the grid is fp-exact. So f32 rides the proven f64 machinery instead of duplicating it.

  • policy_flag: new f32 (bundles round_nearest); widest-wins exact > f64 > f32 > direct > indexed.
  • grid.hpp: float_exact<G> (24-bit significand, exp ≥ −126); storage_pick returns float{} for an f32 dyadic float-exact grid (static_asserts on direct misuse, like f64).
  • generic.hpp: f32_raw/fp_raw; value_raw/index_raw exclude fp_raw; as_double/operator rational/sentinel_raw were already fp-generic.
  • core.hpp: store_f64 → store_fp (narrows the double snap to the raw type); value-path branches keyed on fp_raw; operator double implicit for f32; f32 dyadic guard.
  • arithmetic (add/mul/div): rep propagation gains f32 with demotion — f32 stays only when both operands are f32-only and the result fits float, else widens to f64, else drops to exact; bodies snap-and-narrow via fp_raw.
  • cmath: dbl::store/flt::store store straight into any fp-backed Out (f32 results no longer detour through the rational path); fast-path guards exclude fp_raw.
  • io/assignment: f64_raw → fp_raw so f32 prints/assigns like f64.
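To make the `float_exact<G>` condition from the grid.hpp item concrete, here is a rough sketch of a check over a dyadic grid. The `dyadic_grid` struct and the function form of `float_exact` are hypothetical (the real trait is a template over the library's grid type), and the upper exponent bound is an assumption added here so the sketch also rejects overflow; the PR text only states the 24-bit significand and the exponent ≥ −126 conditions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical dyadic grid: the values k * 2^exp for every integer k in [lo, hi].
struct dyadic_grid {
    std::int64_t lo;
    std::int64_t hi;
    int exp;
};

// Sketch: every grid point is exactly representable as a normal binary32 value.
constexpr bool float_exact(dyadic_grid g) {
    constexpr std::int64_t max_index = std::int64_t{1} << 24;
    return g.lo >= -max_index && g.hi <= max_index  // index fits a 24-bit significand
        && g.exp >= -126                            // smallest nonzero point stays normal
        && g.exp <= 103;                            // assumption: 2^24 * 2^103 = 2^127 is finite
}

static_assert(float_exact({-1000, 1000, -10}));
static_assert(!float_exact({0, (std::int64_t{1} << 24) + 1, 0}));  // needs 25 bits
static_assert(!float_exact({0, 8, -127}));                         // step below the normal range
```

A grid that passes this check could be stored as `float` without rounding; one that fails would need `double` or exact storage instead.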

Tests / docs

test_storage_flags.cpp: f32 storage selection, lossless construct/read, arithmetic stays f32 then demotes f32→f64 when a result grid outgrows float, widest-wins, and math output landing in f32. Docs: policies.md (f32 row + widest-wins) and math.md (the flt+f32 pairing + the 24-bit overflow note).
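The demotion behavior the tests exercise can be illustrated with a small sketch. Everything here is hypothetical naming (`rep`, `grid_bits`, `mul_grid`, `result_rep`), and the double limits (53 bits, exponent ≥ −1022) and the "any exact operand gives exact" rule are assumptions drawn from the widest-wins order rather than from the library source.

```cpp
#include <cassert>

enum class rep { f32, f64, exact };

// Hypothetical grid summary: bits needed for the widest index, and the step exponent.
struct grid_bits {
    int sig_bits;
    int exp;
};

// Product of two dyadic grids: index widths add (upper bound), step exponents add.
constexpr grid_bits mul_grid(grid_bits a, grid_bits b) {
    return {a.sig_bits + b.sig_bits, a.exp + b.exp};
}

constexpr bool fits_float(grid_bits g)  { return g.sig_bits <= 24 && g.exp >= -126; }
constexpr bool fits_double(grid_bits g) { return g.sig_bits <= 53 && g.exp >= -1022; }

// f32 survives only if both operands are f32 and the result grid fits float;
// otherwise widen to f64 if it fits double, else fall back to exact.
constexpr rep result_rep(rep a, rep b, grid_bits result) {
    if (a == rep::exact || b == rep::exact) return rep::exact;
    if (a == rep::f32 && b == rep::f32 && fits_float(result)) return rep::f32;
    if (fits_double(result)) return rep::f64;
    return rep::exact;
}

static_assert(result_rep(rep::f32, rep::f32, mul_grid({12, -4}, {12, -4})) == rep::f32);
static_assert(result_rep(rep::f32, rep::f32, mul_grid({13, -4}, {13, -4})) == rep::f64);
static_assert(result_rep(rep::f64, rep::f64, mul_grid({30, 0}, {30, 0})) == rep::exact);
```

As a numeric check of why the 13-bit case must demote: 4097 × 4097 = 16785409 exceeds 2^24 and is odd, so it has no exact binary32 representation, while double holds it exactly.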

(First commit is the mechanical real_raw → f64_raw internal rename; second adds f32.)

Verification

default 407/407, CORDIC 443/443 (f32 elided → integer under FIXED), FLOAT 398/398, all single-header smokes build.

🤖 Generated with Claude Code

Peter Neiss and others added 2 commits June 22, 2026 21:28
…havior change)

Prep for the f32 storage sibling: the double-backed raw predicate/helper read
'real' but the flag is now f64, and a float-backed f32_raw is coming. Pure
mechanical rename across include/ + tests; behavior-identical.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A bound carrying `f32` holds its value in a binary32 raw (float), the
single-precision sibling of `f64`, for single-precision FPUs and the `flt`
engine. The natural pairing `flt` compute + `f32` storage keeps the whole path
in hardware float — no double round-trip at the I/O boundary.

Implemented via an `fp_raw` (= f64_raw || f32_raw) unification rather than a
parallel float path: read/store/compare/arithmetic compute in double and narrow
to the raw type on store, which is lossless because the grid is fp-exact.

- policy_flag: new `f32` flag (bundles round_nearest), widest-wins order
  exact > f64 > f32 > direct > indexed.
- grid.hpp: `float_exact<G>` (24-bit significand, exponent ≥ -126), the binary32
  analogue of double_exact; storage_pick returns `float{}` for an f32 dyadic
  float-exact grid (static_asserts on direct misuse, like f64).
- generic.hpp: `f32_raw`/`fp_raw` predicates; value_raw/index_raw exclude fp_raw;
  as_double / operator rational / sentinel_raw already fp-generic.
- core.hpp: `store_f64`→`store_fp` (narrows the double snap to the raw type);
  value-path branches keyed on fp_raw; operator double implicit for f32 too;
  the f32 dyadic guard.
- addition/multiplication/division: rep propagation gains f32 with **demotion** —
  f32 stays only when both operands are f32-only and the result fits float, else
  widens to f64, else drops to exact; bodies snap-and-narrow via fp_raw.
- cmath: dbl::store / flt::store store straight into any fp-backed Out (f32 result
  no longer detours through the rational path); fast-path guards exclude fp_raw.
- io.hpp / assignment.hpp: f64_raw→fp_raw so f32 prints and assigns like f64.

Tests: f32 storage selection, lossless construct/read, arithmetic stays f32 and
demotes f32→f64 when a result grid outgrows float, widest-wins, and math output
landing in f32 (test_storage_flags.cpp). Docs: policies.md + math.md.

Verified: default 407/407, CORDIC 443/443 (f32 elided → integer), FLOAT 398/398,
all single-header smokes build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@NiceAndPeter NiceAndPeter merged commit 0a9ae10 into main Jun 22, 2026
17 checks passed
@NiceAndPeter NiceAndPeter deleted the math-storage-f32 branch June 22, 2026 19:37