Perf #859: elide per-char box round-trip in promoted-string charCodeAt (strings → parity) #873
Merged
`EmitPromotedStringCharCodeAt` boxed its `double` result (both the in-range `get_Chars` value and the OOB `NaN`), so a numeric consumer like `sum + s.charCodeAt(i)` immediately unboxed it via `$Runtime::ConvertToNumber`: a dead box→unbox round-trip, and the `box Double` is a heap allocation *per character* in a scan loop.

Both branches already push a raw `double`, so the merge is type-consistent without boxing: drop the two `box Double` and set `StackType.Double`. The consumer's `EnsureDouble` then no-ops (it already checks StackType), and a boxed-object consumer re-boxes once via `EmitBoxIfNeeded`.

This is the safe case for box-elision because the promoted charCodeAt is a self-contained typed path with no dynamic-fallback merge, unlike the general `charCodeAt` (whose box is required to converge with the `InvokeMethodValue` fallback) and the count-primes array read (whose box merges the in-range value with the OOB `$Undefined` branch); those are structural, not dead, and are left untouched.

**strings @10k: ~0.20ms → ~0.12ms, now ~parity with Node (~0.11ms)**, was ~2.4×. Eliminating the per-char box allocation removed the dominant residual GC cost.

No IL-buffering layer exists, so a general peephole-rewrite pass isn't feasible without refactoring; this is the targeted source-level (StackType-propagation) form. Broader box/unbox elision (a `valueNeeded` discard flag) and loop-invariant `get_Length` hoisting are deferred: lower value now that the typed work removed most round-trips, higher risk.

Green on dotnet test (14004; existing StringAccumulatorPromotionTests cover the charCodeAt scan + OOB→NaN paths in both modes) except the pre-existing stale/flaky Test262 baselines. IL-verified.
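The StackType-propagation idea described in this commit message can be modelled with a toy emitter. This is only a sketch under stated assumptions: `ToyEmitter`, its method names, and the pseudo-instruction strings are hypothetical and do not reproduce the real emitter's code or its exact IL. It shows why the numeric consumer's conversion disappears once the producer records a `Double` stack type, and why an object consumer still boxes exactly once.

```javascript
// Toy model only: every class, method, and pseudo-instruction string here is
// hypothetical and merely stands in for the real emitter.
const StackType = Object.freeze({ Double: "Double", Object: "Object" });

class ToyEmitter {
  constructor() {
    this.ops = [];                     // pseudo-instructions, in emission order
    this.stackType = StackType.Object; // static type of the top-of-stack value
  }

  // Before the change: the double result is boxed, so the stack holds an object.
  emitCharCodeAtBoxed() {
    this.ops.push("push double (get_Chars or NaN)", "box Double");
    this.stackType = StackType.Object;
  }

  // After the change: the raw double stays on the stack and its type is recorded.
  emitCharCodeAtTyped() {
    this.ops.push("push double (get_Chars or NaN)");
    this.stackType = StackType.Double;
  }

  // Numeric consumer: converts only when the top of stack is not already a double.
  ensureDouble() {
    if (this.stackType === StackType.Double) return; // no-op on the typed path
    this.ops.push("call ConvertToNumber");
    this.stackType = StackType.Double;
  }

  // Object consumer: boxes once, and only if a raw double is on the stack.
  boxIfNeeded() {
    if (this.stackType !== StackType.Double) return;
    this.ops.push("box Double");
    this.stackType = StackType.Object;
  }
}

// Emits the producer followed by a numeric consumer such as `sum + s.charCodeAt(i)`.
function emitNumericUse(typed) {
  const e = new ToyEmitter();
  if (typed) e.emitCharCodeAtTyped();
  else e.emitCharCodeAtBoxed();
  e.ensureDouble();
  return e.ops;
}

// Emits the typed producer followed by a consumer that needs a boxed object.
function emitObjectUse() {
  const e = new ToyEmitter();
  e.emitCharCodeAtTyped();
  e.boxIfNeeded();
  return e.ops;
}

console.log(emitNumericUse(false).length); // 3: push, box, convert
console.log(emitNumericUse(true).length);  // 1: push only
console.log(emitObjectUse().length);       // 2: push, then a single box
```

In this model the boxed producer costs two extra pseudo-instructions per character on the numeric path, while the typed producer costs none, and the object path pays for one box only when a consumer actually needs it.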
This was referenced Jun 21, 2026
Part of #856 (perf epic). The targeted, low-risk slice of #859 (dead box/unbox elision).
Result
`stringWork`'s scan loop boxed the `charCodeAt` `double` result every character, then the `sum + …` consumer immediately unboxed it via `$Runtime::ConvertToNumber`: a dead round-trip whose `box Double` is a heap allocation per character. Eliminating it removed the dominant residual GC cost and brings strings to ≈Node.
Change
`EmitPromotedStringCharCodeAt`: both branches (in-range `get_Chars` and OOB `NaN`) already push a raw `double`, so the merge is type-consistent without boxing. Drop the two `box Double`, set `StackType.Double`. The consumer's `EnsureDouble` then no-ops (it already gates on StackType), and any boxed-object consumer re-boxes once via `EmitBoxIfNeeded`.
Why only this site
The map of the emitter (no IL-buffering layer exists → a general peephole-rewrite pass isn't feasible without refactoring) showed the clean dead round-trips are narrow:
- `charCodeAt` (`StringMethods`): its fast path must converge at `doneLabel` with a dynamic-fallback branch returning a boxed `object` (`InvokeMethodValue`). The box is required for the merge; removing it breaks IL verification.
- Count-primes array read: `box Boolean → IsTruthy` merges the in-range value with the OOB `$Undefined` branch, so the box is structural, not dead.

Only the promoted `charCodeAt` is a self-contained typed path (both branches `double`, no fallback merge), so it's the safe site. Broader box/unbox elision (a `valueNeeded` discard flag) and loop-invariant `get_Length` hoisting are deferred: lower value now that the typed work removed most round-trips, higher risk.
Validation
`dotnet test`: 13999 passed, 0 real regressions (only a flaky .NET-interop test that passes in isolation + the documented stale Test262 baselines: `Array.isArray`/proxy, present in interpreter mode too). `StringAccumulatorPromotionTests` cover the changed path in both modes (`Promoted_BuildThenScan_CharCodeSum` = the scan loop; `Promoted_CharCodeAt_OutOfRange_IsNaN` = the OOB→NaN branch). `--verify` passes; IL confirms the scan loop has no `box`/`ConvertToNumber`.
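For reference, the two behaviours those tests pin down can be illustrated in plain JavaScript. This is a sketch of the language semantics only, not the tests' actual source: `charCodeSum` is a hypothetical stand-in for the benchmark's scan loop, and the input string is made up.

```javascript
// Hypothetical stand-in for the scan loop: sums the UTF-16 code units of a string.
function charCodeSum(s) {
  let sum = 0;
  for (let i = 0; i < s.length; i++) {
    // Numeric consumer of charCodeAt: the shape that previously paid one box per character.
    sum = sum + s.charCodeAt(i);
  }
  return sum;
}

console.log(charCodeSum("abc"));                // 294 (97 + 98 + 99)
console.log(Number.isNaN("abc".charCodeAt(3))); // true: an out-of-range index yields NaN
```

Both results are numbers on every path (a code unit in range, `NaN` out of range), which is why the promoted path can keep a raw `double` on the stack without a type-merging box.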