Perf #856: compiled `Float64Array`/TypedArray element access is boxed + virtually dispatched (no typed fast path), 24–102× slower than Node

Part of #856 (perf epic). Surfaced by the new cross-runtime `typed-arrays` benchmark (`benchmarks/scripts/typed-arrays.ts`): a `Float64Array` fill + 3-point stencil sweep.

## Symptom

Per-call mean (ms):

| n | Compiled | Node | Bun | Compiled / Node |
|--:|--:|--:|--:|--:|
| 1000 | 0.366 | 0.0036 | 0.0022 | 102× |
| 100000 | 10.42 | 0.310 | 0.213 | 34× |
| 1000000 | 100.3 | 4.18 | 2.23 | 24× |

A real typed buffer + tight arithmetic loop should be near-native; instead it is 24–102× slower than Node (V8) and, from the same table, 45–166× slower than Bun (JSC). Notably, a **plain `number[]` is *faster*** than a `Float64Array` here, because `number[]` gets the #857/#872 `List<double>` typed path and the typed array does not.

## Root cause

`a[i]` lowers to the boxed central dispatcher `Call Runtime.GetIndex(object, object) → object` (`Compilation/ILEmitter.Properties.cs:734-756`). There **is** a typed fast path in `EmitGetIndex` — the #857 "promoted typed-array local" branch (`ILEmitter.Properties.cs:820-833`) — but it only matches `number[]`/`boolean[]` promoted to `List<double>`/`List<bool>`. A real `Float64Array` is an emitted `$TypedArray` object, doesn't match that branch, and falls through to the boxed path.

For typed arrays, `GetIndex` routes to `$Runtime.GetTypedArrayElement(object, int) → object` (`Compilation/RuntimeEmitter.Worker.cs:725-752`), which **per access**:
- does `isinst` + `castclass` against the `$TypedArray` base type,
- `callvirt`s a per-type element accessor (`TypedArrayElementGet`, `RuntimeEmitter.TSTypedArray.cs:62`),
- returns `object` — **boxing the `double` on every read**; writes take a boxed `object` (`SetTypedArrayElement`, `RuntimeEmitter.Worker.cs:758`).

So every element read/write pays box/unbox + a type check + a virtual dispatch, versus V8/JSC compiling typed-array access to a direct memory load. Over 1M elements (fill + stencil) that overhead dominates. The backing storage itself is fine — it is the access *wrapper* that is slow.
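To make the per-access cost concrete, here is a rough TypeScript model of the two access shapes. It is only an illustration: the class and function names (`TypedArrayModel`, `getIndex`, `sumBoxed`, `sumDirect`) are hypothetical stand-ins that mirror the description above, not the emitted IL or the real runtime API.

```typescript
// Illustrative model only; names are hypothetical, not the compiler's runtime.
abstract class TypedArrayModel {
  // Stand-in for the per-type virtual element accessor.
  abstract elementGet(index: number): unknown;
}

class Float64ArrayModel extends TypedArrayModel {
  constructor(readonly data: Float64Array) {
    super();
  }
  elementGet(index: number): unknown {
    // The value leaves as `unknown`, modelling the boxed `object` return.
    return this.data[index];
  }
}

// Stand-in for the central dispatcher: (object, object) -> object.
function getIndex(target: unknown, index: unknown): unknown {
  // Type check on every access (models isinst + castclass).
  if (target instanceof TypedArrayModel && typeof index === "number") {
    // Virtual dispatch on every access (models callvirt).
    return target.elementGet(index);
  }
  return undefined; // other receiver kinds are omitted from this model
}

// Slow shape: each read goes through the dispatcher and is narrowed back.
function sumBoxed(a: Float64ArrayModel, n: number): number {
  let s = 0;
  for (let i = 0; i < n; i++) s += getIndex(a, i) as number; // models unbox
  return s;
}

// Fast shape: each read is a direct load from the backing buffer.
function sumDirect(data: Float64Array, n: number): number {
  let s = 0;
  for (let i = 0; i < n; i++) s += data[i];
  return s;
}
```

Both functions return the same sum; the difference is only how many steps sit between the loop and the buffer on each iteration.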

## Suggested fix

Add a typed-array-aware fast path analogous to #857/#872: when the static type of the indexed object is a known TypedArray (`Float64Array`/`Int32Array`/…), emit a direct **unboxed** access to the backing buffer instead of `GetIndex` → `GetTypedArrayElement`. Options:
- expose an unboxed typed accessor on `$TypedArray` (e.g. `double GetF64(int)` / `void SetF64(int, double)`) and bind it directly when the element type is statically known; or
- expose the backing buffer so the arithmetic loop can `ldelem.r8` / `stelem.r8` directly.

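Either removes the per-element boxing + virtual dispatch on the hot path. Out-of-range and detached-buffer semantics must be preserved: in JavaScript an out-of-range typed-array read yields `undefined` and an out-of-range write is silently ignored, and a detached buffer leaves the array with length 0. An unboxed `double` cannot represent `undefined`, so the fast path would need a bounds check (or an equivalent guard) that falls back to the existing boxed path when the index is out of range.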
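The semantics a fast path has to keep can be sketched in TypeScript as a guarded load and store. `loadF64` and `storeF64` are hypothetical names chosen for illustration; in the compiler the in-range branch would be the unboxed direct access and the other branch the existing fallback.

```typescript
// Hypothetical helpers modelling a guarded fast path; not the real runtime API.
function loadF64(data: Float64Array, index: number): number | undefined {
  // In-range integer index: direct load from the backing buffer.
  // A detached buffer reports length 0, so the same check rejects it.
  if (Number.isInteger(index) && index >= 0 && index < data.length) {
    return data[index];
  }
  // Out of range: JavaScript typed arrays read as undefined.
  return undefined;
}

function storeF64(data: Float64Array, index: number, value: number): void {
  if (Number.isInteger(index) && index >= 0 && index < data.length) {
    data[index] = value;
  }
  // Out of range: the write is silently ignored and the array does not grow.
}
```

For example, with `const a = Float64Array.of(1.5, 2.5)`, `loadF64(a, 1)` gives `2.5`, `loadF64(a, 2)` gives `undefined`, and `storeF64(a, 9, 7)` leaves `a` unchanged.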

## Repro

Run `benchmarks/scripts/typed-arrays.ts` via `benchmarks/run-benchmarks.ps1`. Every runtime computes identical results, so the work is equal and the gap comes from how element access is compiled rather than from the workload.
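For readers without the repository at hand, the workload shape (fill + 3-point stencil) can be approximated as below. This is a minimal sketch, not the contents of `typed-arrays.ts`: the function names, fill values, and checksum are assumptions, and timing is left to whatever harness runs it. The `number[]` variant is included because the issue notes it currently takes the faster typed path.

```typescript
// Assumed shape of the workload; the real benchmark script may differ.
function fillAndStencilTyped(n: number): number {
  const src = new Float64Array(n);
  const dst = new Float64Array(n);
  for (let i = 0; i < n; i++) src[i] = i * 0.5; // fill
  for (let i = 1; i < n - 1; i++) {
    dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3; // 3-point stencil
  }
  let checksum = 0;
  for (let i = 0; i < n; i++) checksum += dst[i];
  return checksum;
}

// Same work on a plain number[] for comparison.
function fillAndStencilPlain(n: number): number {
  const src: number[] = new Array<number>(n).fill(0);
  const dst: number[] = new Array<number>(n).fill(0);
  for (let i = 0; i < n; i++) src[i] = i * 0.5;
  for (let i = 1; i < n - 1; i++) {
    dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3;
  }
  let checksum = 0;
  for (let i = 0; i < n; i++) checksum += dst[i];
  return checksum;
}
```

Returning a checksum keeps the loops from being optimized away and gives a value to compare across runtimes.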

