Perf #859: elide per-char box round-trip in promoted-string charCodeAt (strings → parity) #873

Merged: nickna merged 1 commit into main from wrk/issue-859-peephole, Jun 21, 2026

nickna (Owner) commented Jun 21, 2026

Part of #856 (perf epic). The targeted, low-risk slice of #859 (dead box/unbox elision).

Result

| strings @10k (warm) | Before | After | Node |
|---|---|---|---|
| compiled | ~0.20–0.29 ms (~2.4×) | ~0.12 ms (~parity) | ~0.11 ms |

stringWork's scan loop boxed the double result of charCodeAt on every character, and the sum + … consumer immediately unboxed it via $Runtime::ConvertToNumber. That round-trip is dead, and each box Double is a heap allocation per character. Eliminating it removed the dominant residual GC cost and brings strings to roughly Node parity.
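For orientation, here is a minimal sketch of the kind of source loop this affects. The body of `stringWork` is not shown in this PR, so `buildString` and `charCodeSum` below are hypothetical stand-ins, assumed only to share its build-then-scan shape.

```typescript
// Hypothetical stand-in for the benchmark's build phase: produce an
// n-character string by repeated concatenation.
function buildString(n: number): string {
  let s = "";
  for (let i = 0; i < n; i++) {
    s += String.fromCharCode(97 + (i % 26)); // cycles 'a'..'z'
  }
  return s;
}

// Hypothetical stand-in for the scan phase. The consumer of charCodeAt is
// a numeric add, so a boxed result would be unboxed again straight away.
function charCodeSum(s: string): number {
  let sum = 0;
  for (let i = 0; i < s.length; i++) {
    sum = sum + s.charCodeAt(i);
  }
  return sum;
}
```

Per the ECMAScript specification, `charCodeAt` returns `NaN` for an out-of-range index, which is why the emitted code has a second (OOB) branch that also produces a double.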

Change

EmitPromotedStringCharCodeAt: both branches (in-range get_Chars and OOB NaN) already push a raw double, so the merge is type-consistent without boxing. Drop the two box Double, set StackType.Double. The consumer's EnsureDouble then no-ops (it already gates on StackType), and any boxed-object consumer re-boxes once via EmitBoxIfNeeded.
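The StackType propagation described above can be modelled with a toy emitter. This is a sketch under stated assumptions, not the project's code: `ToyEmitter`, its method names, and the op strings are hypothetical, and the real emitter writes IL rather than strings.

```typescript
// Toy model: track what kind of value sits on top of the evaluation stack.
type ToyStackType = "Double" | "Object";

class ToyEmitter {
  readonly ops: string[] = [];
  stackType: ToyStackType = "Object";

  // Models the promoted charCodeAt site. boxResult=true is the old
  // behaviour; boxResult=false is the behaviour after this change.
  emitPromotedCharCodeAt(boxResult: boolean): void {
    this.ops.push("push raw double"); // placeholder for either branch
    if (boxResult) {
      this.ops.push("box Double");
      this.stackType = "Object";
    } else {
      this.stackType = "Double";
    }
  }

  // Numeric consumer: converts only when the stack does not hold a double.
  ensureDouble(): void {
    if (this.stackType === "Double") return; // no-op on the typed path
    this.ops.push("call ConvertToNumber");
    this.stackType = "Double";
  }

  // Boxed-object consumer: boxes once, only when a raw double is on the stack.
  emitBoxIfNeeded(): void {
    if (this.stackType !== "Double") return;
    this.ops.push("box Double");
    this.stackType = "Object";
  }
}
```

In this model the old path records three ops for a numeric consumer (push, box, convert) and the new path records one. A consumer that needs an object still gets exactly one box.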

Why only this site

Mapping the emitter showed that clean dead round-trips are rare. No IL-buffering layer exists, so a general peephole-rewrite pass isn't feasible without refactoring, and the other candidate sites need their boxes:

  • General charCodeAt (StringMethods): its fast path must converge at doneLabel with a dynamic-fallback branch returning a boxed object (InvokeMethodValue) — the box is required for the merge; removing it breaks IL verification.
  • count-primes array read: box Boolean → IsTruthy merges the in-range value with the OOB $Undefined branch — structural, not dead.

Only the promoted charCodeAt is a self-contained typed path (both branches double, no fallback merge), so it's the safe site. Broader box/unbox elision (a valueNeeded discard flag) and loop-invariant get_Length hoisting are deferred — lower value now that the typed work removed most round-trips, higher risk.
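The merge constraint behind this distinction can be sketched as a small rule: every branch arriving at a label must leave the same type on the stack, so mixed branches have to converge on a boxed object. `mergedStackType` and `BranchType` are hypothetical names for illustration; the actual emitter and IL verifier are more involved.

```typescript
// What each incoming branch leaves on the stack at a merge label.
type BranchType = "Double" | "Boolean" | "Object";

// If all branches agree, the merged value can stay unboxed. Otherwise the
// primitive branches must be boxed so the label sees a single Object type.
function mergedStackType(branches: readonly BranchType[]): BranchType {
  const [first, ...rest] = branches;
  if (first === undefined) {
    throw new Error("a merge needs at least one incoming branch");
  }
  return rest.every((t) => t === first) ? first : "Object";
}
```

Under this rule the promoted charCodeAt (double and double) merges as a raw double, while the general charCodeAt (double plus an object fallback) and the count-primes read (boolean plus `$Undefined`) both merge as objects, so their boxes stay.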

Validation

  • dotnet test: 13999 passed, 0 real regressions. The only failures are a flaky .NET-interop test that passes in isolation and the documented stale Test262 baselines (Array.isArray/proxy), which are also present in interpreter mode.
  • Existing StringAccumulatorPromotionTests cover the changed path in both modes (Promoted_BuildThenScan_CharCodeSum = the scan loop; Promoted_CharCodeAt_OutOfRange_IsNaN = the OOB→NaN branch).
  • --verify passes; IL confirms the scan loop has no box/ConvertToNumber.

`EmitPromotedStringCharCodeAt` boxed its `double` result (both the in-range
`get_Chars` value and the OOB `NaN`), so a numeric consumer like `sum + s.charCodeAt(i)`
immediately unboxed it via `$Runtime::ConvertToNumber` — a dead box→unbox round-trip,
and the `box Double` is a heap allocation *per character* in a scan loop.

Both branches already push a raw `double`, so the merge is type-consistent without
boxing: drop the two `box Double` and set `StackType.Double`. The consumer's
`EnsureDouble` then no-ops (it already checks StackType), and a boxed-object consumer
re-boxes once via `EmitBoxIfNeeded`. This is the safe case for box-elision because the
promoted charCodeAt is a self-contained typed path with no dynamic-fallback merge —
unlike the general `charCodeAt` (whose box is required to converge with the
`InvokeMethodValue` fallback) and the count-primes array read (whose box merges the
in-range value with the OOB `$Undefined` branch); those are structural, not dead, and
are left untouched.

**strings @10k: ~0.20ms → ~0.12ms — now ~parity with Node (~0.11ms)**, was ~2.4×.
Eliminating the per-char box allocation removed the dominant residual GC cost.

No IL-buffering layer exists, so a general peephole-rewrite pass isn't feasible without
refactoring; this is the targeted source-level (StackType-propagation) form. Broader
box/unbox elision (a `valueNeeded` discard flag) and loop-invariant `get_Length` hoisting
are deferred — lower value now that the typed work removed most round-trips, higher risk.

Green on dotnet test (14004; existing StringAccumulatorPromotionTests cover the
charCodeAt scan + OOB→NaN paths in both modes) except the pre-existing stale/flaky
Test262 baselines. IL-verified.