m-19 Phase 1: VM test harness foundations #60
Merged
Promote ADR-016 (Linux VM test harness) and ADR-017 (device persona fixtures) from Proposed to Accepted to unblock TASK-321 (Phase 1) and TASK-322 (Phase 3). Fix ADR-017 Phase 5 header count (9 → 12). Mark TASK-290 Done with final summary capturing deferred review feedback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single source of truth for podkit's three-tier device-testing stack.
Lands every subtask of TASK-321 except TASK-321.02 (hardware persona
captures, deferred to HITL).
New package @podkit/device-testing
- DevicePersona + SystemState schemas (verbatim per ADR-017)
- TestRuntime interface + local-linux runner with auto-registration
- Subprocess capture/replay framework (PODKIT_SNAPSHOT_CAPTURE /
PODKIT_SNAPSHOT_REPLAY) with 21 framework tests
- SystemState registry seeded with 6 starter states (healthy,
no-ffmpeg, no-libgpod, no-udev, no-sg-perms, corrupt-configfs)
plus a golden-file fixture for the healthy state
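A minimal sketch of how a registry seeded with the six starter states might look. The `SystemState` shape and the `SystemStateRegistry` class below are hypothetical; the real schema is defined by ADR-017 and is not reproduced here.

```typescript
// Hypothetical shape: the real SystemState schema comes verbatim from ADR-017.
interface SystemState {
  name: string;
  description: string;
}

// Illustrative registry: rejects duplicates and unknown lookups loudly.
class SystemStateRegistry {
  private readonly states = new Map<string, SystemState>();

  register(state: SystemState): void {
    if (this.states.has(state.name)) {
      throw new Error(`SystemState already registered: ${state.name}`);
    }
    this.states.set(state.name, state);
  }

  get(name: string): SystemState {
    const state = this.states.get(name);
    if (state === undefined) {
      throw new Error(`Unknown SystemState: ${name}`);
    }
    return state;
  }

  names(): string[] {
    return [...this.states.keys()].sort();
  }
}

// Seed with the six starter state names listed above.
const STARTER_STATES: string[] = [
  "healthy",
  "no-ffmpeg",
  "no-libgpod",
  "no-udev",
  "no-sg-perms",
  "corrupt-configfs",
];

const registry = new SystemStateRegistry();
for (const name of STARTER_STATES) {
  registry.register({ name, description: `starter state: ${name}` });
}
```

Failing on duplicate or unknown names keeps a typo in a test from silently selecting the wrong state.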
Cross-package refactor (cycle-free)
- SubprocessRunner interface in @podkit/device-types so production
packages can depend on the type without importing the test harness
- defaultSubprocessRunner in @podkit/core; device-testing re-exports
it to keep behaviour in lockstep
- podkit-core callsites threaded through the runner: usb-enumeration,
usb-path-resolution, device/platforms/{macos,linux}, diagnostics
video-encoder check, transcode/ffmpeg.exec()
- Streaming spawns (FFmpegTranscoder.transcode, video probe/transcode,
music pipeline transcode, artwork resize) left on existing _spawnFn
DI; documented in subprocess.md
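The runner seam described above could be sketched roughly as follows. Every name here (`SubprocessResult`, `run`, `capturingRunner`, `replayingRunner`, `snapshotKey`) is an assumption made for illustration; the actual `SubprocessRunner` interface in @podkit/device-types and the capture/replay framework may be shaped differently.

```typescript
// Hypothetical result and runner types, not the real @podkit/device-types API.
interface SubprocessResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

interface SubprocessRunner {
  run(command: string, args: string[]): Promise<SubprocessResult>;
}

type Snapshot = Map<string, SubprocessResult>;

// Key a recorded call by its full argv so different invocations never collide.
function snapshotKey(command: string, args: string[]): string {
  return JSON.stringify([command, ...args]);
}

// Capture mode: delegate to a real runner and record each result.
function capturingRunner(inner: SubprocessRunner, snapshot: Snapshot): SubprocessRunner {
  return {
    async run(command: string, args: string[]): Promise<SubprocessResult> {
      const result = await inner.run(command, args);
      snapshot.set(snapshotKey(command, args), result);
      return result;
    },
  };
}

// Replay mode: answer only from the snapshot; an unrecorded call is an error.
function replayingRunner(snapshot: Snapshot): SubprocessRunner {
  return {
    async run(command: string, args: string[]): Promise<SubprocessResult> {
      const key = snapshotKey(command, args);
      const hit = snapshot.get(key);
      if (hit === undefined) {
        throw new Error(`No snapshot recorded for ${key}`);
      }
      return hit;
    },
  };
}
```

Because production code depends only on the interface, a test can swap in the replaying runner without the production package importing anything from the harness.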
Per-OS test tagging convention
- *.darwin.test.ts / *.linux.test.ts via describe.skipIf
- Canary tests prove the convention works on both hosts
- Documented in agents/testing.md
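As an illustration of the tagging convention, the helpers below derive the target OS from a test filename and decide whether it should be skipped on a given host. `osTagOf` and `shouldSkipOn` are hypothetical names, not part of the harness; the trailing comment shows how a tagged file might guard itself with `describe.skipIf`, which test runners such as bun:test and Vitest provide.

```typescript
type HostOs = "darwin" | "linux";

// Returns the OS a test file is tagged for, or null for an untagged file
// that should run on every host.
function osTagOf(filename: string): HostOs | null {
  const match = /\.(darwin|linux)\.test\.ts$/.exec(filename);
  return match ? (match[1] as HostOs) : null;
}

// True when a file tagged for one OS is loaded on a different platform.
function shouldSkipOn(filename: string, platform: string): boolean {
  const tag = osTagOf(filename);
  return tag !== null && tag !== platform;
}

// Inside a file such as example.linux.test.ts (hypothetical filename):
//   describe.skipIf(process.platform !== "linux")("linux-only suite", () => {
//     /* tests */
//   });
```

Skipping rather than excluding keeps the tagged suites visible in the test report on both hosts.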
Linux native build pipeline
- tools/prebuild/build-linux-glibc.sh: single shared script invoked
by .github/workflows/prebuild.yml glibc matrix AND
tools/device-testing/lima/builder.yaml (Debian 12.10 pinned)
- tools/device-testing/lima/abi-verify.yaml: stock-Debian VM used
for the ldd static-link spike
- turbo tasks @podkit/device-testing#build:linux-prebuild and
build:linux-binary with full cache inputs
- mise tasks device-testing:build-linux*
- ABI spike (TASK-321.07 AC #12): ran end-to-end on aarch64 Apple
Silicon Lima; ldd reported only linux-vdso, libc, libpthread,
libdl, libm, ld-linux-aarch64, with zero references to
libgpod/libglib/libgdk_pixbuf/libplist. x64 verification deferred
to first CI run on ubuntu-24.04.
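The ldd check from the ABI spike could be automated along these lines. This is a sketch, not the script the spike used: `linkedLibraries` and `forbiddenDynamicLinks` are hypothetical helpers, and the parsing assumes the usual ldd line shapes (`name => path (addr)`, `name (addr)`, or an absolute loader path).

```typescript
// Library name prefixes that must NOT appear as dynamic dependencies,
// taken from the list in the ABI spike note above.
const FORBIDDEN_PREFIXES: string[] = ["libgpod", "libglib", "libgdk_pixbuf", "libplist"];

// Extracts the library basename from each non-empty line of ldd output.
function linkedLibraries(lddOutput: string): string[] {
  return lddOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => line.split(/\s+/)[0] ?? "")
    .map((token) => token.split("/").pop() ?? token);
}

// Returns every linked library whose name starts with a forbidden prefix.
// An empty result means the static-link expectation holds.
function forbiddenDynamicLinks(lddOutput: string): string[] {
  return linkedLibraries(lddOutput).filter((lib) =>
    FORBIDDEN_PREFIXES.some((prefix) => lib.startsWith(prefix)),
  );
}
```

A CI step could feed the captured ldd output for the prebuild artefact into `forbiddenDynamicLinks` and fail the job when the returned array is non-empty.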
Documentation
- New agents/device-testing.md: canonical reference for the harness
- agents/testing.md gained a Three-Tier Test Stack section
- All 11 tasks TASK-301..TASK-311 swept with harness-integration
notes so implementers land on the new stack
Cross-references
- adr/adr-016-linux-vm-test-harness.md (Accepted)
- adr/adr-017-device-persona-fixtures.md (Accepted)
- agents/device-testing.md
- packages/device-testing/README.md
- tools/device-testing/lima/README.md
Deferred / follow-ups (not blocking)
- TASK-321.02 hardware persona captures (HITL, awaiting devices)
- Streaming spawn callsites threading through SubprocessRunner
(needs runStreaming extension)
- DiagnosticContext widening so video-encoder doesn't construct
its own default runner
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
build-static-deps.sh has used cmake since 7b61c7c (libusb static build, 2026-04-25) — zlib, libusb, and libplist all build via cmake -B build. The Alpine apk install in both musl jobs (prebuild-musl-x64 cached path and prebuild-musl-arm64 docker run path) was never updated, so the musl prebuilds have been failing with "cmake: command not found" since that commit. Last successful prebuild run was 2026-03-10. Glibc and darwin matrix entries are unaffected — both ship cmake out of the box on their respective runners. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The usb npm package (used by @podkit/ipod-firmware since e825ee1) has a node-gyp postinstall that compiles libusb against libudev. Alpine ships the udev headers via the eudev-dev apk package; without it, bun install fails on the usb package's native build with "libudev.h: No such file". Same pre-existing musl regression class as the cmake fix in 8c9239d — glibc and darwin runners already have udev headers available. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Lands every TASK-321 subtask except hardware persona captures (TASK-321.02, deferred to HITL). Establishes the foundation that Phase 3 VM tests (TASK-322) and the TASK-301..311 test sweep depend on.
- @podkit/device-testing: DevicePersona + SystemState schemas (verbatim per ADR-017), TestRuntime + local-linux runner, registry pattern, subprocess capture/replay framework
- SubprocessRunner interface in @podkit/device-types (cycle-free), defaultSubprocessRunner in @podkit/core, every short-lived subprocess callsite in podkit-core threaded through the runner; streaming spawns left on existing DI seam
- SystemState registry seeded with 6 starter states, plus a golden-file fixture for healthy
- Per-OS test tagging: *.darwin.test.ts / *.linux.test.ts via describe.skipIf, documented in agents/testing.md, canaries committed
- tools/prebuild/build-linux-glibc.sh invoked by both .github/workflows/prebuild.yml (glibc matrix) AND Lima builder VM at tools/device-testing/lima/builder.yaml; turbo + mise tasks; stock-Debian ABI verify VM
- agents/device-testing.md canonical reference; agents/testing.md three-tier section; all 11 TASK-301..311 swept with harness-integration notes
ABI spike (TASK-321.07 AC #12)
Ran end-to-end on aarch64 Apple Silicon Lima. Stock Debian 12.10 VM (tools/device-testing/lima/abi-verify.yaml) received the binary via limactl copy. ldd reported only linux-vdso, libc, libpthread, libdl, libm, and ld-linux-aarch64: zero libgpod/libglib/libgdk_pixbuf/libplist references. x64 verification deferred to this PR's CI run on ubuntu-24.04.
Test plan
- prebuild workflow runs cleanly on ubuntu-24.04 (linux-x64 glibc, linux-arm64 glibc): verifies the shared script refactor didn't break the release pipeline
- prebuild musl jobs (linux-x64-musl, linux-arm64-musl) remain green: confirms no Alpine regression
- ldd on the linux-x64 prebuild artefact shows only stable system libs (no libgpod/libglib/libgdk_pixbuf/libplist)
- bun run typecheck + bun run test:unit green across all packages locally: verified pre-push (core 2459/2459, ipod-firmware 226/226, device-testing 81+2skip)
Deferred to follow-ups
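- Streaming spawn callsites threading through SubprocessRunner (needs runStreaming extension)
- DiagnosticContext widening so video-encoder check doesn't construct its own default runner
ADRs
- ADR-016 (Linux VM test harness)
- ADR-017 (device persona fixtures)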
🤖 Generated with Claude Code