hub packs are not portable across hosts — vmstate hardcodes the rootfs absolute path and packs don't ship the rootfs #242

@WaylandYang

Found during the containerized E2E for forkd quickstart (#240). Every hub snapshot pack fails to restore on any machine other than the one that baked it. This likely explains the near-zero hub asset usage — anyone who tried forkd pull + fork on their own host hit this and bounced silently.

Two stacked defects

1. snapshot.json carries the packing host's absolute paths — FIXED

forkd unpack extracted snapshot.json verbatim, so vmstate / memory pointed at e.g. /home/<packer>/.local/share/forkd/snapshots/<tag>/vmstate. The first restore then failed with the Firecracker (FC) error "Failed to open snapshot file: No such file or directory".

Fix landed alongside the quickstart E2E PR: unpack now rewrites vmstate/memory to the extraction dir (post-rename, idempotent, unknown-field-preserving). Unit tests cover relocate + no-op cases.
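
For discussion, here is a minimal Python sketch of the relocation described above. It is not the code that landed. It assumes vmstate and memory are top-level string fields in snapshot.json and that both files sit next to it after extraction; relocate_snapshot is a hypothetical name.

```python
import json
from pathlib import Path

# Assumption: snapshot.json stores these two paths as top-level string fields.
PATH_FIELDS = ("vmstate", "memory")


def relocate_snapshot(snapshot_json: Path) -> bool:
    """Point vmstate/memory at the directory that holds snapshot.json.

    Returns True if the file was rewritten, False if it was already correct.
    """
    extract_dir = snapshot_json.parent.resolve()
    doc = json.loads(snapshot_json.read_text())
    changed = False
    for field in PATH_FIELDS:
        old = doc.get(field)
        if not isinstance(old, str):
            continue  # field absent or not a path string: leave it untouched
        # Keep the file name, swap the packer's directory for the local one.
        new = str(extract_dir / Path(old).name)
        if new != old:
            doc[field] = new
            changed = True
    if changed:
        # The whole document is round-tripped, so unknown fields survive.
        snapshot_json.write_text(json.dumps(doc, indent=2) + "\n")
    return changed
```

Calling it a second time on the same file returns False and writes nothing, which is the idempotent no-op case.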

2. The FC vmstate hardcodes the rootfs path, and packs don't include the rootfs — OPEN

After fixing (1), restore proceeds further and dies on:

Failed to restore MMIO device: Cannot restore devices: Block: Virtio backend error:
Error manipulating the backing file: No such file or directory (os error 2)
/var/cache/forkd/python-3-12-slim.ext4

Firecracker restores block devices by reopening the exact backing path recorded in the vmstate at snapshot time. Two problems compound:

  • The pack format (SNAPSHOT_FILES = ["snapshot.json", "vmstate", "rootfs.ext4", "memory.bin"]) would include a rootfs from the snapshot dir, but from-image-produced snapshots keep their rootfs in /var/cache/forkd/<image>.ext4 — outside the snapshot dir — so it never gets packed.
  • Even if shipped, FC v1.12 has no block-path override at restore (network_overrides and vsock_override exist; block does not), so the file must land at the exact absolute path recorded in the vmstate.
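
The first bullet can be illustrated with a small Python sketch. It assumes the packer collects only the SNAPSHOT_FILES entries that exist in the snapshot dir and silently skips the rest; that skipping behaviour, and the helper names files_to_pack and is_outside, are assumptions for illustration.

```python
from pathlib import Path

SNAPSHOT_FILES = ["snapshot.json", "vmstate", "rootfs.ext4", "memory.bin"]


def files_to_pack(snapshot_dir: Path) -> list[Path]:
    """Return the SNAPSHOT_FILES entries that actually exist in snapshot_dir."""
    candidates = [snapshot_dir / name for name in SNAPSHOT_FILES]
    return [path for path in candidates if path.is_file()]


def is_outside(rootfs: Path, snapshot_dir: Path) -> bool:
    """True if the rootfs lives outside snapshot_dir, so it can never be packed."""
    return not rootfs.resolve().is_relative_to(snapshot_dir.resolve())
```

With a rootfs at /var/cache/forkd/<image>.ext4, is_outside is true for any snapshot dir under the user's home, and files_to_pack never sees a rootfs.ext4.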

Proposed fix (design discussion welcome)

The recorded path /var/cache/forkd/<image>.ext4 is already a forkd-controlled, machine-independent convention — unlike $HOME. So:

  1. Pack: when snapshot.json's boot config references a rootfs under /var/cache/forkd/, include it in the pack (new optional entry; bump pack minor rev). Size impact: ~300-500 MB compressed per pack — consider a separate .rootfs.zst sidecar asset + manifest pointer so the 16 MB memory-delta packs stay small and rootfs blobs dedupe across packs of the same base image.
  2. Unpack: place the rootfs at the recorded /var/cache/forkd/<name>.ext4 path (skip if present + sha matches).
  3. Interim: forkd pull should warn loudly when the pack's vmstate references a backing file that doesn't exist locally, instead of letting the user hit a cryptic FC error at first fork. quickstart already prints "if the restore fails, install Docker and re-run for a local bake" before the hub fallback.
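
Steps 2 and 3 could look roughly like the Python sketch below. It assumes the pack manifest carries a SHA-256 for the rootfs and that the list of backing paths has already been read from snapshot.json; install_rootfs and missing_backing_files are hypothetical names, not existing forkd functions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large rootfs images are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def install_rootfs(src: Path, dest: Path, expected_sha256: str) -> bool:
    """Place src at the recorded path dest; skip if dest already matches.

    Returns True if a copy happened, False if dest was already in place.
    """
    if dest.is_file() and sha256_of(dest) == expected_sha256:
        return False
    if sha256_of(src) != expected_sha256:
        raise ValueError(f"rootfs digest mismatch for {src}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy to a temporary name first so a partial file never sits at dest.
    tmp = dest.with_name(dest.name + ".partial")
    shutil.copyfile(src, tmp)
    tmp.replace(dest)
    return True


def missing_backing_files(paths: list[str]) -> list[str]:
    """Return the recorded backing paths that do not exist on this host."""
    return [p for p in paths if not Path(p).is_file()]
```

A pull could call missing_backing_files right after unpack and print each missing path with a hint to bake locally, so the failure surfaces before the first fork.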

Repro

# any machine that isn't the packing host:
forkd pull deeplethe/python-numpy
sudo -E forkd fork --tag python-numpy -n 1 --per-child-netns
# → Block: Error manipulating the backing file: No such file or directory