Found during the containerized E2E for forkd quickstart (#240). Every hub snapshot pack fails to restore on any machine other than the one that baked it. This likely explains the near-zero hub asset usage — anyone who tried forkd pull + fork on their own host hit this and bounced silently.
Two stacked defects
1. snapshot.json carries the packing host's absolute paths — FIXED
forkd unpack extracted snapshot.json verbatim, so vmstate / memory pointed at e.g. /home/<packer>/.local/share/forkd/snapshots/<tag>/vmstate. First restore failed with FC's Failed to open snapshot file: No such file or directory.
Fix landed alongside the quickstart E2E PR: unpack now rewrites vmstate/memory to the extraction dir (post-rename, idempotent, unknown-field-preserving). Unit tests cover relocate + no-op cases.
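For discussion, the relocation step described above might look roughly like the sketch below. It is not the landed code: the function name relocate_snapshot_json is hypothetical, and it assumes snapshot.json keeps the two paths under top-level "vmstate" and "memory" keys, which may not match the real schema.

```python
import json
from pathlib import Path

# Assumption: the two path-bearing fields are top-level keys with these names.
PATH_FIELDS = ("vmstate", "memory")


def relocate_snapshot_json(extract_dir: Path) -> bool:
    """Point vmstate/memory at files inside extract_dir.

    Returns True if snapshot.json was rewritten, False if it was already
    correct (so running it twice is a no-op the second time).
    """
    meta_path = extract_dir / "snapshot.json"
    meta = json.loads(meta_path.read_text())
    changed = False
    for field in PATH_FIELDS:
        old = meta.get(field)
        if not isinstance(old, str):
            continue  # field absent or not a path string: leave untouched
        # Keep only the file name and re-root it under the extraction dir.
        new = str(extract_dir / Path(old).name)
        if new != old:
            meta[field] = new
            changed = True
    if changed:
        # The whole dict is written back, so unknown fields survive.
        meta_path.write_text(json.dumps(meta, indent=2))
    return changed
```

Round-tripping the full JSON object, rather than building a new one, is what keeps fields the unpacker does not know about intact.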
2. The FC vmstate hardcodes the rootfs path, and packs don't include the rootfs — OPEN
After fixing (1), restore proceeds further and dies on:
Failed to restore MMIO device: Cannot restore devices: Block: Virtio backend error:
Error manipulating the backing file: No such file or directory (os error 2)
/var/cache/forkd/python-3-12-slim.ext4
Firecracker restores block devices by reopening the exact backing path recorded in the vmstate at snapshot time. Two problems compound:
- The pack format (SNAPSHOT_FILES = ["snapshot.json", "vmstate", "rootfs.ext4", "memory.bin"]) would include a rootfs from the snapshot dir, but from-image-produced snapshots keep their rootfs in /var/cache/forkd/<image>.ext4, outside the snapshot dir, so it never gets packed.
- Even if shipped, FC v1.12 has no block-path override at restore (network_overrides and vsock_override exist; block does not), so the file must land at the exact absolute path recorded in the vmstate.
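To make the first problem concrete, here is a minimal sketch of a packer that only looks inside the snapshot dir. Only the SNAPSHOT_FILES list comes from the report; collect_pack_files is a hypothetical name, and the real packer may be structured differently.

```python
from pathlib import Path

SNAPSHOT_FILES = ["snapshot.json", "vmstate", "rootfs.ext4", "memory.bin"]


def collect_pack_files(snapshot_dir: Path) -> list[Path]:
    """Return the SNAPSHOT_FILES entries that exist directly in snapshot_dir.

    Hypothetical helper: a rootfs stored elsewhere (for example under
    /var/cache/forkd/) is never considered, so it silently drops out.
    """
    return [
        snapshot_dir / name
        for name in SNAPSHOT_FILES
        if (snapshot_dir / name).is_file()
    ]
```

Under this model a from-image snapshot produces a pack with three of the four entries and no error, which matches the observed behaviour.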
Proposed fix (design discussion welcome)
The recorded path /var/cache/forkd/<image>.ext4 is already a forkd-controlled, machine-independent convention — unlike $HOME. So:
- Pack: when snapshot.json's boot config references a rootfs under /var/cache/forkd/, include it in the pack (new optional entry; bump pack minor rev). Size impact: ~300-500 MB compressed per pack. Consider a separate .rootfs.zst sidecar asset + manifest pointer so the 16 MB memory-delta packs stay small and rootfs blobs dedupe across packs of the same base image.
- Unpack: place the rootfs at the recorded /var/cache/forkd/<name>.ext4 path (skip if present + sha matches).
- Interim: forkd pull should warn loudly when the pack's vmstate references a backing file that doesn't exist locally, instead of letting the user hit a cryptic FC error at first fork. quickstart already prints "if the restore fails, install Docker and re-run for a local bake" before the hub fallback.
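A rough sketch of the Unpack and Interim items, to anchor the design discussion. All names here (sha256_of, place_rootfs, missing_backing_files) are hypothetical. It assumes the expected SHA-256 comes from the pack manifest and that the recorded backing paths have already been extracted from the vmstate; how to do that extraction is not covered.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large rootfs images are not read into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def place_rootfs(src: Path, recorded_path: Path, expected_sha256: str) -> bool:
    """Copy src to recorded_path unless a matching file is already there.

    Returns True if a copy happened, False if the existing file was kept.
    """
    if recorded_path.is_file() and sha256_of(recorded_path) == expected_sha256:
        return False  # present + sha matches: skip
    recorded_path.parent.mkdir(parents=True, exist_ok=True)
    # Copy to a temp name first so a partial copy never sits at the final path.
    tmp = recorded_path.with_name(recorded_path.name + ".tmp")
    shutil.copyfile(src, tmp)
    tmp.replace(recorded_path)
    return True


def missing_backing_files(recorded_paths: list[str]) -> list[str]:
    """Return the recorded backing paths that do not exist on this host."""
    return [p for p in recorded_paths if not Path(p).is_file()]
```

forkd pull could call something like missing_backing_files right after unpack and print a warning naming each missing path, so the failure surfaces before the first fork.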
Repro
# any machine that isn't the packing host:
forkd pull deeplethe/python-numpy
sudo -E forkd fork --tag python-numpy -n 1 --per-child-netns
# → Block: Error manipulating the backing file: No such file or directory