math: flt reads an f32-backed input directly (skip the double hop) #12
Merged
The float engine narrowed every input via bound→double→float. For an f32-backed operand the raw is already a binary32 value, so `flt::to_float` now reads `x.raw()` directly; any other storage still decodes via double, then narrows.

float→double→float round-trips to the same float, so this is a pure optimization with no value change (the flt golden pins are unchanged). It keeps the whole flt+f32 path in hardware float on a single-precision FPU, with no soft-double at the input boundary.

The 18 cores and the inline sqrt/pow/tan flt branches route through `to_float`; the Lower/Upper<Out> overflow thresholds stay compile-time rational→double constants.

Verified: default 407/407, CORDIC 443/443, FLOAT 398/398.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up: close the last soft-`double` hop on `flt` input

The float engine narrowed every input via `bound → double → float`. For an f32-backed operand the raw is already a binary32 value, so `flt::to_float` now reads `x.raw()` directly; any other storage still decodes via `double`, then narrows.

- Pure optimization, no value change: `float → double → float` round-trips to the same `float`, so the flt golden pins are unchanged.
- Keeps the whole `flt`+`f32` path in hardware `float` on a single-precision FPU: no soft-`double` at the input boundary (the output side already stored directly). This is the input half of the perf question from earlier.
- The 18 cores and the inline `sqrt`/`pow`/`tan` flt branches route through `to_float`; the `Lower/Upper<Out>` overflow thresholds stay compile-time `rational→double` constants.

(Caught and fixed a self-recursion the helper picked up from an over-broad sed during development; the segfault is gone, confirmed by the full suite.)
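For readers without the diff open, the input dispatch described above could look roughly like the sketch below. Everything except the `to_float` and `raw()` names is an assumption made for illustration: `F32Bound`, `FixedBound`, the Q16.16 scaling, and `to_double()` are hypothetical stand-ins, not the types or decode path used in this repository.

```cpp
#include <cstdint>

// Hypothetical f32-backed bound: the raw storage is already a binary32 value.
struct F32Bound {
    float raw_;
    float raw() const { return raw_; }
};

// Hypothetical fixed-point bound (Q16.16 is an assumption for this sketch).
struct FixedBound {
    std::int32_t raw_;
    std::int32_t raw() const { return raw_; }
    // Assumed decode step: the raw integer scaled into a double.
    double to_double() const { return static_cast<double>(raw_) / 65536.0; }
};

// Fast path: read the raw float directly, with no double in between.
inline float to_float(const F32Bound& x) {
    return x.raw();
}

// Any other storage: decode via double, then narrow to float.
inline float to_float(const FixedBound& x) {
    return static_cast<float>(x.to_double());
}
```

With this shape, a call such as `to_float(F32Bound{1.5f})` never touches `double`, which is what matters on a target whose FPU is single-precision only.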
Verified: default 407/407, CORDIC 443/443, FLOAT 398/398.
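The "no value change" claim rests on `float → double → float` being exact: every binary32 value is representable in binary64, so widening loses nothing and narrowing back returns the same bits. A minimal standalone check of that property (not part of this PR's test suite; `via_double` and `same_bits` are hypothetical helper names) might look like this. NaN inputs are left out because signaling-NaN payloads can change across a conversion.

```cpp
#include <cstdint>
#include <cstring>

// The old input path in miniature: widen to double, then narrow back.
inline float via_double(float f) {
    double d = static_cast<double>(f);  // exact: binary64 holds every binary32 value
    return static_cast<float>(d);       // exact: d is itself a binary32 value
}

// Bitwise comparison, so that +0.0f and -0.0f are told apart.
inline bool same_bits(float a, float b) {
    std::uint32_t ua = 0;
    std::uint32_t ub = 0;
    std::memcpy(&ua, &a, sizeof ua);
    std::memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}
```

Checking `same_bits(via_double(v), v)` over a set of representative values (zeros of both signs, subnormals, the largest finite float) is one way to confirm that the direct read and the old double hop agree.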
🤖 Generated with Claude Code