perf(stdlib): drop per-call Vec alloc for List/String/Deque indexing 📇#160

Merged: timfennis merged 1 commit into master from perf/index-simple-native on May 24, 2026

@timfennis
Owner

Summary

  • The `[]` indexing native was registered with `NativeFunc::WithVm` for every overload, which forces the VM to run `stack.drain(args..).collect::<Vec<Value>>()` on every call so that it can pass `&mut Vm` into the callback.
  • Only the Map default-function path actually needs `&mut Vm` (it can call user code when a key is missing). List/String/Deque indexing doesn't touch the VM at all.
  • Split off `make_get_func_simple` (`NativeFunc::Simple`, zero-copy slice argument) for the type-specific overloads. The Map and Any/Any overloads stay on `WithVm` because a Map can still reach the callback path through either.
  • With c12ee66 (specificity-aware overload resolution) already on master, a typed list index dispatches to the new simple overload directly.
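
To make the difference between the two registration styles concrete, here is a minimal sketch of how a VM might call each variant. Only `NativeFunc::Simple`, `NativeFunc::WithVm`, `Vm`, `Value` and the `stack.drain(args..).collect()` pattern come from this PR; the field layout, the callback signatures, the `allocs` counter and the helpers `call_native`, `index_list` and `index_any` are assumptions made for illustration, not the actual ndc code.

```rust
// Hypothetical stand-ins for the interpreter's types; the real definitions
// are not shown in this PR description.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    List(Vec<Value>),
}

pub struct Vm {
    pub stack: Vec<Value>,
    pub allocs: usize, // counts argument Vecs built, for illustration only
}

pub enum NativeFunc {
    /// Borrows its arguments straight from the VM stack (no allocation).
    Simple(fn(&[Value]) -> Result<Value, String>),
    /// Needs `&mut Vm`, so the arguments must be moved off the stack first.
    WithVm(fn(&mut Vm, Vec<Value>) -> Result<Value, String>),
}

impl Vm {
    /// Calls `func` with the top `argc` stack values as arguments.
    pub fn call_native(&mut self, func: &NativeFunc, argc: usize) -> Result<Value, String> {
        let args = self
            .stack
            .len()
            .checked_sub(argc)
            .ok_or_else(|| "stack underflow".to_string())?;
        match func {
            NativeFunc::Simple(f) => {
                // The callback only sees a slice of the stack.
                let result = f(&self.stack[args..]);
                self.stack.truncate(args);
                result
            }
            NativeFunc::WithVm(f) => {
                // The stack cannot stay borrowed while `&mut Vm` is handed
                // out, so the arguments are collected into a fresh Vec.
                let owned: Vec<Value> = self.stack.drain(args..).collect();
                self.allocs += 1;
                f(self, owned)
            }
        }
    }
}

/// Hypothetical list-index body that never touches the VM.
fn index_list(args: &[Value]) -> Result<Value, String> {
    match args {
        [Value::List(items), Value::Int(i)] if *i >= 0 => items
            .get(*i as usize)
            .cloned()
            .ok_or_else(|| "index out of bounds".to_string()),
        _ => Err("expected (List, non-negative Int)".to_string()),
    }
}

/// Hypothetical fallback with the `WithVm` signature; it ignores the VM here.
fn index_any(_vm: &mut Vm, args: Vec<Value>) -> Result<Value, String> {
    index_list(&args)
}
```

In this sketch, calling `index_list` through `NativeFunc::Simple` leaves `allocs` at zero, while the same lookup through `NativeFunc::WithVm(index_any)` builds one `Vec` per call, which is the overhead this PR removes for the List/String/Deque overloads.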

Benchmarks

Real-world memoized DFS (AoC 2022 day 16, 5 runs, 2 warmup):

  • master: 15.772s ± 0.130s
  • patched: 15.328s ± 0.283s
  • 1.03× faster

Profile delta (perf record, cpu_core cycles):

| Symbol | master | patched |
| --- | --- | --- |
| `Vec::from_iter` | 4.46% | 1.81% |
| `run_to_depth` | 46.23% | 48.22% |
| `dispatch_call_with_memo` | 15.16% | 15.17% |

Total cycles drop ~4% (59.9B → 57.5B).
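
The headline figures can be re-derived from the raw numbers quoted above. The snippet below is only an arithmetic check; the helper names `speedup`, `percent_drop` and `report` are made up for this example.

```rust
/// Ratio of baseline to patched; values above 1.0 mean the patch is faster.
pub fn speedup(baseline: f64, patched: f64) -> f64 {
    baseline / patched
}

/// Relative reduction, as a percentage of the baseline.
pub fn percent_drop(baseline: f64, patched: f64) -> f64 {
    (baseline - patched) / baseline * 100.0
}

/// Formats the two headline numbers from the measurements in this PR.
pub fn report() -> String {
    format!(
        "{:.2}x faster, {:.1}% fewer cycles",
        speedup(15.772, 15.328),   // mean wall time in seconds, master vs patched
        percent_drop(59.9, 57.5),  // total cycles in billions, master vs patched
    )
}
```

`report()` returns `"1.03x faster, 4.0% fewer cycles"`, matching the 1.03× and ~4% quoted in this description.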

In the new profile, `[]_simple` and the original `[]` show up at ~1.3% and ~1.2% respectively: the same total time is spent in the native body, but the surrounding allocation overhead is gone for the List path.
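
Both symbols appear in the profile because dispatch now chooses between several `[]` overloads. The sketch below shows one way specificity-aware resolution could pick the typed overload over the Any/Any fallback. It is not the algorithm from c12ee66, whose details are not part of this PR; the `Type` enum, the `Overload` struct, the parameter lists and the scoring rule (count of non-`Any` parameters) are all assumptions.

```rust
// Hypothetical static types; the real type lattice is richer.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Type {
    Any,
    Int,
    List,
    Map,
}

pub struct Overload {
    pub name: &'static str,
    pub params: [Type; 2],
}

/// A parameter accepts an argument if it is `Any` or the exact same type.
fn accepts(param: Type, arg: Type) -> bool {
    param == Type::Any || param == arg
}

/// Assumed scoring rule: more non-`Any` parameters means more specific.
fn specificity(overload: &Overload) -> usize {
    overload.params.iter().filter(|p| **p != Type::Any).count()
}

/// Returns the most specific overload whose parameters accept `args`.
pub fn resolve<'a>(overloads: &'a [Overload], args: [Type; 2]) -> Option<&'a Overload> {
    overloads
        .iter()
        .filter(|o| o.params.iter().zip(args.iter()).all(|(p, a)| accepts(*p, *a)))
        .max_by_key(|o| specificity(o))
}

/// Illustrative overload table for `[]`; the names and signatures are invented.
pub fn index_overloads() -> Vec<Overload> {
    vec![
        Overload { name: "[] (Any/Any, WithVm)", params: [Type::Any, Type::Any] },
        Overload { name: "[] (Map, WithVm)", params: [Type::Map, Type::Any] },
        Overload { name: "[]_simple (List, Simple)", params: [Type::List, Type::Int] },
    ]
}
```

With this table, a call typed as `(List, Int)` resolves to the simple overload, a call typed as `(Map, Int)` resolves to the Map overload, and a call whose argument types are only known as `Any` falls back to the Any/Any overload, which is why that fallback has to keep the `WithVm` signature.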

Test plan

  • `cargo test`: all functional tests pass, including the 27 in `007_map_and_set` (covers Map default-value and default-function paths)
  • `cargo fmt`, `cargo clippy -p ndc_stdlib`: no new warnings
  • `2216.ndc` output identical to master

🤖 Generated with Claude Code

The `[]` native was uniformly `NativeFunc::WithVm`, forcing the VM to
`stack.drain(args..).collect::<Vec<Value>>()` before every call so it
could pass `&mut Vm`. Only the Map default-function path actually
needs the VM — most indexes don't.

Split off a `NativeFunc::Simple` variant for the type-specific
overloads (List/String/Deque). The Map and Any/Any overloads stay on
WithVm because Maps can reach the callback path. With specificity-aware
overload resolution (c12ee66), typed list lookups dispatch to the
Simple overload directly.

Profile delta on a memoized-DFS workload (AoC 2022 day 16):
- `Vec::from_iter` overhead: 4.46% → 1.81%
- Wall time: 15.77s → 15.33s (1.03×)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@timfennis merged commit 560e0c3 into master on May 24, 2026 (1 check passed)
@timfennis deleted the perf/index-simple-native branch on May 24, 2026 at 16:33