perf(vm): swap Box for Rc in Function::Memoized inner function 🧠#159

Merged
timfennis merged 1 commit into master from perf/memoized-rc-inner-function
May 25, 2026
@timfennis
Owner

Summary

  • Function::Memoized held the inner function as Box<Self>, so every clone of the variant (one per memoized callsite via resolve_callee) deep-cloned the boxed function — a heap alloc + recursive Function::clone on every call.
  • Switch to Rc<Self>: outer clones become refcount bumps, and the inner is only materialized on cache miss via Rc::unwrap_or_clone.
  • Identity semantics are unchanged — they were already pointer-based on the cache Rc.
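
The three points above can be sketched in isolation. This is a minimal illustration, not the actual ndc_vm code: the `Function` enum below is a hypothetical stand-in with only two variants, and the helper names (`make_memoized`, `inner_refcount`, `materialize_inner`, `same_memoized`) and the `HashMap<i64, i64>` cache type are assumptions made for the example.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for the VM's function type (assumption: the real
// enum has more variants and different fields).
#[derive(Clone, Debug)]
enum Function {
    Compiled { name: String },
    Memoized {
        cache: Rc<RefCell<HashMap<i64, i64>>>,
        // Rc<Self> instead of Box<Self>: cloning the variant bumps a refcount.
        function: Rc<Function>,
    },
}

/// Builds a memoized wrapper around a trivial compiled function.
fn make_memoized(name: &str) -> Function {
    Function::Memoized {
        cache: Rc::new(RefCell::new(HashMap::new())),
        function: Rc::new(Function::Compiled { name: name.to_string() }),
    }
}

/// Strong count of the inner function, or 0 for non-memoized variants.
fn inner_refcount(f: &Function) -> usize {
    match f {
        Function::Memoized { function, .. } => Rc::strong_count(function),
        _ => 0,
    }
}

/// Cache-miss path: take the inner function out of the Rc, cloning it only
/// if other handles still share it.
fn materialize_inner(f: Function) -> Option<Function> {
    match f {
        Function::Memoized { function, .. } => Some(Rc::unwrap_or_clone(function)),
        _ => None,
    }
}

/// Pointer-based identity: two handles are the same memoized function if
/// they share one cache allocation.
fn same_memoized(a: &Function, b: &Function) -> bool {
    match (a, b) {
        (Function::Memoized { cache: ca, .. }, Function::Memoized { cache: cb, .. }) => {
            Rc::ptr_eq(ca, cb)
        }
        _ => false,
    }
}

fn main() {
    let memo = make_memoized("f");

    // Per-call clone: no new allocation for the inner function.
    let per_call = memo.clone();
    assert_eq!(inner_refcount(&memo), 2);
    assert!(same_memoized(&memo, &per_call));

    // Cache miss: only now is the inner function actually cloned.
    let inner = materialize_inner(per_call).expect("memoized variant");
    if let Function::Compiled { name } = &inner {
        println!("materialized inner function {name}");
    }
    assert_eq!(inner_refcount(&memo), 1);
}
```

With `Box<Self>` in place of `Rc<Self>`, the `memo.clone()` line would instead allocate a new box and recursively clone the inner function on every call, whether or not the cache is hit.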

Benchmarks

Targeted micro-bench: 10M memoized calls, 99% cache hits, minimal body work.

| Workload | Baseline | Patched | Speedup |
| --- | --- | --- | --- |
| pure fn f(x) { x + 1 } (compiled, no captures) | 1.656s ± 0.021s | 1.566s ± 0.019s | 1.06× |
| let f = pure fn(x) { x + bias } (closure) | 1.713s ± 0.025s | 1.593s ± 0.038s | 1.08× |

10 runs each, 3 warmup. The closure case sees a slightly bigger win because the old Box clone walked a Vec<Rc<UpvalueCell>>; the new path just bumps a single Rc.
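
The difference in the closure case can be observed through reference counts. The sketch below is an assumption-laden illustration: `Closure` and the `UpvalueCell` alias are hypothetical stand-ins for the VM's types, and `upvalue_refcounts_after_clone` is a helper invented for this example. It reports how many strong references the first upvalue has after cloning the closure through a `Box` versus through an `Rc`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-ins for the VM's types (assumption, not the real definitions).
type UpvalueCell = RefCell<i64>;

#[derive(Clone)]
struct Closure {
    upvalues: Vec<Rc<UpvalueCell>>,
}

/// Returns the strong count of the first upvalue after cloning a closure
/// held in a Box, and after cloning one held in an Rc. Requires n >= 1.
fn upvalue_refcounts_after_clone(n: usize) -> (usize, usize) {
    let cells: Vec<Rc<UpvalueCell>> =
        (0..n).map(|i| Rc::new(RefCell::new(i as i64))).collect();

    // Box path: the clone allocates a new Box and a new Vec, then bumps
    // the refcount of every upvalue in it.
    let boxed = Box::new(Closure { upvalues: cells.clone() });
    let boxed_copy = boxed.clone();
    assert_eq!(boxed_copy.upvalues.len(), n);
    let after_box = Rc::strong_count(&cells[0]);
    drop(boxed_copy);
    drop(boxed);

    // Rc path: the clone bumps one refcount on the closure itself and
    // never touches the upvalue vector.
    let shared = Rc::new(Closure { upvalues: cells.clone() });
    let shared_copy = Rc::clone(&shared);
    assert_eq!(Rc::strong_count(&shared_copy), 2);
    let after_rc = Rc::strong_count(&cells[0]);

    (after_box, after_rc)
}

fn main() {
    let (after_box, after_rc) = upvalue_refcounts_after_clone(3);
    println!("upvalue refcount after Box clone: {after_box}");
    println!("upvalue refcount after Rc clone:  {after_rc}");
}
```

The Box clone leaves one extra reference on each upvalue (the original vector, the boxed closure, and its copy), while the Rc clone leaves the upvalue counts unchanged, so its cost does not grow with the number of captures.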

On a real workload (AoC 2022 day 16, a memoized DFS), the per-call savings drown in the body's own work and the result is within noise. The change showed no measurable regression there, and it is faster wherever memoization dispatch is the bottleneck.

Test plan

  • cargo test — all 300+ functional tests pass
  • cargo fmt, cargo clippy -p ndc_vm — no new warnings
  • Output of both microbenchmarks matches between baseline and patched builds

🤖 Generated with Claude Code

Cloning Function::Memoized previously deep-cloned the boxed inner
function on every callsite (resolve_callee). Switching to Rc<Self>
makes the per-call clone a refcount bump; the inner is only cloned
on cache miss via Rc::unwrap_or_clone.

Micro-benchmarks (10M memoized calls, 99% cache hits):
- compiled body: 1.66s → 1.57s (1.06×)
- closure body:  1.71s → 1.59s (1.08×)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@timfennis force-pushed the perf/memoized-rc-inner-function branch from 8bed219 to a0907d1 on May 25, 2026 10:49
@timfennis merged commit cb978df into master on May 25, 2026
1 check passed
@timfennis deleted the perf/memoized-rc-inner-function branch on May 25, 2026 10:50