
Perf #856: compiled Float64Array/TypedArray element access is boxed + virtually dispatched (no typed fast path) — 24–101× slower than Node #878

Description

@nickna

Part of #856 (perf epic). Surfaced by the new cross-runtime typed-arrays benchmark (benchmarks/scripts/typed-arrays.ts): a Float64Array fill + 3-point stencil sweep.
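For readers without the repository at hand, the workload can be approximated as below. This is a minimal sketch, not the contents of benchmarks/scripts/typed-arrays.ts: the function name `fillAndStencil`, the fill values, and the averaging stencil are assumptions chosen only to show how many indexed reads and writes the loop performs per element.

```typescript
// Hypothetical stand-in for the benchmark kernel; not copied from typed-arrays.ts.
function fillAndStencil(n: number): number {
  const src = new Float64Array(n);
  const dst = new Float64Array(n);

  // Fill: one indexed write per element.
  for (let i = 0; i < n; i++) {
    src[i] = i * 0.5;
  }

  // 3-point stencil: three indexed reads and one indexed write per interior element.
  for (let i = 1; i < n - 1; i++) {
    dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3;
  }

  // Reduce to a checksum so the result depends on every element.
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += dst[i];
  }
  return sum;
}
```

Every iteration is dominated by element access, so any fixed per-access cost in the compiled output is multiplied by roughly 5n across the fill and stencil passes.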

Symptom

Per-call mean (ms):

| n | Compiled | Node | Bun | Compiled / Node |
|---|---|---|---|---|
| 1000 | 0.366 | 0.0036 | 0.0022 | 102× |
| 100000 | 10.42 | 0.310 | 0.213 | 34× |
| 1000000 | 100.3 | 4.18 | 2.23 | 24× |

A real typed buffer + tight arithmetic loop should be near-native; instead it is 24–101× slower than V8/JSC. Notably, a plain number[] is faster than a Float64Array here, because number[] gets the #857/#872 List<double> typed path and the typed array does not.

Root cause

a[i] lowers to a call to the boxed central dispatcher, Runtime.GetIndex(object, object) → object (Compilation/ILEmitter.Properties.cs:734-756). There is a typed fast path in EmitGetIndex, the #857 "promoted typed-array local" branch (ILEmitter.Properties.cs:820-833), but it only matches number[]/boolean[] promoted to List<double>/List<bool>. A real Float64Array is an emitted $TypedArray object, so it does not match that branch and falls through to the boxed path.

For typed arrays, GetIndex routes to $Runtime.GetTypedArrayElement(object, int) → object (Compilation/RuntimeEmitter.Worker.cs:725-752), which per access:

  • does isinst + castclass against the $TypedArray base type,
  • callvirts a per-type element accessor (TypedArrayElementGet, RuntimeEmitter.TSTypedArray.cs:62),
  • returns object, boxing the double on every read; writes take a boxed object (SetTypedArrayElement, RuntimeEmitter.Worker.cs:758).

So every element read/write pays box/unbox + a type check + a virtual dispatch, versus V8/JSC compiling typed-array access to a direct memory load. Over 1M elements (fill + stencil) that overhead dominates. The backing storage itself is fine — it is the access wrapper that is slow.
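The shape of that overhead can be modelled in TypeScript. This is an illustrative model only: `TypedArrayBase`, `Float64Backed`, `getIndexSlow`, and `getIndexFast` are hypothetical names, and the real path is emitted IL rather than TypeScript. The `unknown` return type stands in for the boxed object, `instanceof` for the isinst/castclass pair, and the overridden method for the callvirt.

```typescript
// Illustrative model only: these names are hypothetical and do not mirror
// the emitted $TypedArray / $Runtime members.
class TypedArrayBase {
  // Overridable accessor returning an untyped value (stand-in for a boxed object).
  getElement(_index: number): unknown {
    throw new Error("not implemented");
  }
}

class Float64Backed extends TypedArrayBase {
  buffer: Float64Array;

  constructor(buffer: Float64Array) {
    super();
    this.buffer = buffer;
  }

  getElement(index: number): unknown {
    return this.buffer[index];
  }
}

// Slow path: type check, dynamically dispatched call, then a checked
// conversion of the untyped result back to a number.
function getIndexSlow(target: unknown, index: number): number {
  if (!(target instanceof TypedArrayBase)) {
    throw new TypeError("not a typed array");
  }
  const boxed = target.getElement(index);
  if (typeof boxed !== "number") {
    throw new TypeError("expected a number");
  }
  return boxed;
}

// Fast path: the element type is known statically, so the read is a direct load.
function getIndexFast(buffer: Float64Array, index: number): number {
  return buffer[index];
}
```

Both functions return the same value for an in-range index; the difference is only in how much work surrounds the load, which is the point the issue makes about the access wrapper.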

Suggested fix

Add a typed-array-aware fast path analogous to #857/#872: when the static type of the indexed object is a known TypedArray (Float64Array/Int32Array/…), emit a direct unboxed access to the backing buffer instead of GetIndex → GetTypedArrayElement. Options:

  • expose an unboxed typed accessor on $TypedArray (e.g. double GetF64(int) / void SetF64(int, double)) and bind it directly when the element type is statically known; or
  • expose the backing buffer so the arithmetic loop can ldelem.r8 / stelem.r8 directly.

Either removes the per-element boxing + virtual dispatch on the hot path. Out-of-range and detached-buffer semantics must be preserved (typed arrays read OOB as undefined).
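A minimal sketch of the first option, again as a TypeScript model rather than the emitted runtime. `getF64`/`setF64` echo the accessor names floated above; the `Float64View` class, `inRange`, and `getGeneric` are hypothetical. The sketch assumes integer indices and keeps the out-of-range result as undefined by routing those reads through a generic path. In JavaScript a typed array over a detached buffer reports length 0, so in this model the same range check also covers the detached case.

```typescript
// Hypothetical accessor shape for illustration; not the emitted $TypedArray API.
class Float64View {
  private data: Float64Array;

  constructor(data: Float64Array) {
    this.data = data;
  }

  // True when index addresses an existing element (index assumed to be an integer).
  inRange(index: number): boolean {
    return index >= 0 && index < this.data.length;
  }

  // Unboxed accessors: valid only after inRange(index) has been checked.
  getF64(index: number): number {
    return this.data[index];
  }

  setF64(index: number, value: number): void {
    this.data[index] = value;
  }

  // Generic read that preserves typed-array semantics: out of range yields undefined.
  getGeneric(index: number): number | undefined {
    return this.inRange(index) ? this.data[index] : undefined;
  }
}

// Stencil loop using the unboxed accessors; the loop bounds keep every index in range.
function stencil(src: Float64View, dst: Float64View, n: number): void {
  for (let i = 1; i < n - 1; i++) {
    dst.setF64(i, (src.getF64(i - 1) + src.getF64(i) + src.getF64(i + 1)) / 3);
  }
}
```

Splitting the accessor this way keeps the hot loop free of untyped values while leaving one place, `getGeneric` here, responsible for the out-of-range behaviour.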

Repro

benchmarks/scripts/typed-arrays.ts via benchmarks/run-benchmarks.ps1. All four runtimes compute identical results — equal work — so the gap is pure codegen.
