
bench: N=100 hugepage spawn numbers (deferred from #230) #234

@WaylandYang

Context. #230 (merged in v0.5.2) shipped the hugepage-backed memfd path end-to-end, along with bench/live-fork-pause-window/bench-hugepages.py. @theflashwin ran the bench at N=4 on his hardware (numbers in the PR README) and noted he didn't have the compute to push to N=100.

I committed in PR review to running N=100 on my dev box — that didn't happen because of two stacked issues that surfaced during the attempt:

What went wrong

  1. Dev-box Firecracker version drift. /usr/local/bin/firecracker was v1.12.0 (unpatched upstream). The MemfdShared backend requires the deeplethe-patched FC fork (mem_backend.shared = true field is forkd-specific) — the patched binary was sitting at ~/firecracker/build/.../firecracker v1.16.0-dev but wasn't on PATH.
  2. Snapshot bincode incompatibility. Re-snapping with the patched FC works, but old snapshots (taken with v1.12) can't load on v1.16 (bincode error from FC's snapshot deserializer). The bench needs a fresh snapshot.

I got partway through both fixes (installed patched FC, rebuilt a benchsrc-v116 parent, started bench) before hitting an unrelated host issue and rolling back. The bench script itself is correct — the failures were environmental.
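A preflight check along these lines would have caught the version drift in item 1 before the bench started. This is a minimal sketch, not part of the bench script: the function names and the `MIN_VERSION` constant are hypothetical, and it assumes `firecracker --version` prints a string containing `vMAJOR.MINOR.PATCH`. It only compares version numbers, so it cannot tell the patched fork apart from an unpatched upstream build of the same version.

```python
import re
import shutil
import subprocess

# Assumption: taken from the "v1.16.0-dev" string quoted above.
MIN_VERSION = (1, 16, 0)


def parse_fc_version(text):
    """Extract (major, minor, patch) from `firecracker --version` output, or None."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def fc_version_ok(text, minimum=MIN_VERSION):
    """True if the reported version is at least `minimum`."""
    version = parse_fc_version(text)
    return version is not None and version >= minimum


def check_firecracker_on_path():
    """Fail fast if the firecracker binary on PATH is missing or too old."""
    path = shutil.which("firecracker")
    if path is None:
        raise SystemExit("firecracker not found on PATH")
    out = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    ).stdout
    if not fc_version_ok(out):
        raise SystemExit(f"{path} reports {out.strip()!r}; need >= v1.16.0")
    return path
```

Calling `check_firecracker_on_path()` at the top of a bench run turns the silent v1.12.0 fallback into an immediate, readable failure.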

To-do

  • Provision a clean host (or container) with patched FC v1.16.0-dev installed by default
  • Build a small benchsrc snapshot fresh under that FC
  • sudo bash scripts/netns-setup.sh 100 to provision per-child netns (the bench needs N netns up front; the script is idempotent)
  • Bump /proc/sys/vm/nr_hugepages to ≥ (parent_mem_mib / 2) + n * 2. For a 512 MiB parent + N=100 that's 256 + 200 = 456 pages (~460); round up to 1024 for headroom
  • Run sudo python3 bench/live-fork-pause-window/bench-hugepages.py --source-tag <fresh-tag> --n 100 --iterations 5 --branch-mode diff
  • Post the CSV + summary table (interleaved p50/p99 spawn_ms, ms/child, pause_ms for baseline vs hugepages) here and in bench/live-fork-pause-window/RESULTS-v0.5.md
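The hugepage sizing step above can be sanity-checked with a few lines of Python before the run. This is a rough sketch under two assumptions: that the `/ 2` in the formula reflects 2 MiB hugepages, and that each child needs 2 pages as the `n * 2` term suggests. The helper names are hypothetical and not part of the bench.

```python
import math

HUGEPAGE_MIB = 2  # assumption: default 2 MiB hugepage size


def required_hugepages(parent_mem_mib, n_children, pages_per_child=2):
    """Minimum nr_hugepages per the formula (parent_mem_mib / 2) + n * 2."""
    parent_pages = math.ceil(parent_mem_mib / HUGEPAGE_MIB)
    return parent_pages + n_children * pages_per_child


def hugepages_total(meminfo_text):
    """Return HugePages_Total parsed from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("HugePages_Total:"):
            return int(line.split()[1])
    raise ValueError("HugePages_Total not found")
```

For the 512 MiB parent and N=100 case, `required_hugepages(512, 100)` gives 456, so the suggested 1024 leaves more than 2x headroom. Feeding the contents of `/proc/meminfo` to `hugepages_total` shows whether the kernel actually reserved the requested pages, which can fall short on a fragmented host.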

Acceptance

A reproducible N=100 number that either:

  • Validates the hugepages path (≥1.3× spawn-time speedup at p50 or measurable pause-window reduction), OR
  • Surfaces that the win only shows for memory-heavy parents (in which case the bench should grow a --mem-mib knob and we re-run at 2 GiB + 4 GiB parents to characterize where it pays off)

Either outcome is shippable as a follow-up patch with a one-paragraph note in the README.
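The first acceptance criterion could be checked mechanically once the CSV exists. The sketch below assumes the per-iteration `spawn_ms` values for each mode have already been loaded into two lists; the function names are hypothetical and the CSV column layout is not assumed here.

```python
import statistics


def p50_speedup(baseline_spawn_ms, hugepages_spawn_ms):
    """Ratio of median baseline spawn time to median hugepages spawn time."""
    return statistics.median(baseline_spawn_ms) / statistics.median(hugepages_spawn_ms)


def meets_spawn_threshold(baseline_spawn_ms, hugepages_spawn_ms, threshold=1.3):
    """True if the p50 speedup reaches the 1.3x bar from the acceptance list."""
    return p50_speedup(baseline_spawn_ms, hugepages_spawn_ms) >= threshold
```

A ratio below the threshold does not by itself fail the issue, since a measurable pause-window reduction or the memory-heavy-parent finding also counts; it only indicates which of the two outcomes the run supports.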
