
m-19 Phase 1: VM test harness foundations #60

Merged
jvgomg merged 4 commits into main from feat/m-19-phase-1 on May 13, 2026

@jvgomg jvgomg commented May 13, 2026

Owner

Summary

Lands every TASK-321 subtask except hardware persona captures (TASK-321.02, deferred to HITL). Establishes the foundation that Phase 3 VM tests (TASK-322) and the TASK-301..311 test sweep depend on.

  • New package @podkit/device-testing: DevicePersona + SystemState schemas (verbatim per ADR-017), TestRuntime + local-linux runner, registry pattern, subprocess capture/replay framework
  • Subprocess refactor: SubprocessRunner interface in @podkit/device-types (cycle-free), defaultSubprocessRunner in @podkit/core, every short-lived subprocess callsite in podkit-core threaded through the runner; streaming spawns left on existing DI seam
  • 6 SystemState fixtures — healthy / no-ffmpeg / no-libgpod / no-udev / no-sg-perms / corrupt-configfs, with golden-file fixture for healthy
  • Per-OS test tagging: *.darwin.test.ts / *.linux.test.ts via describe.skipIf, documented in agents/testing.md, canaries committed
  • Linux native build pipeline — single shared tools/prebuild/build-linux-glibc.sh invoked by both .github/workflows/prebuild.yml (glibc matrix) AND Lima builder VM at tools/device-testing/lima/builder.yaml; turbo + mise tasks; stock-Debian ABI verify VM
  • Documentation — new agents/device-testing.md canonical reference; agents/testing.md three-tier section; all 11 TASK-301..311 swept with harness-integration notes

ABI spike (TASK-321.07 AC #12)

Ran end-to-end on aarch64 Apple Silicon Lima. Stock Debian 12.10 VM (tools/device-testing/lima/abi-verify.yaml) received the binary via limactl copy:

$ ldd /usr/local/bin/podkit
	linux-vdso.so.1 (0x0000ffff9b68e000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff9b4a0000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff9b651000)
	libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff9b470000)
	libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff9b440000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff9b3a0000)

Zero libgpod/libglib/libgdk_pixbuf/libplist references. x64 verification deferred to this PR's CI run on ubuntu-24.04.
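
The manual `ldd` inspection above could also be automated. The following is a minimal sketch, not code from this PR: `linkedLibraries` and `forbiddenLibraries` are hypothetical helper names, and the prefix list simply mirrors the four libraries named above.

```typescript
// Hypothetical helpers (not part of the PR): parse `ldd` output and report
// any linked library whose file name starts with a forbidden prefix.
const FORBIDDEN_PREFIXES = ["libgpod", "libglib", "libgdk_pixbuf", "libplist"];

function linkedLibraries(lddOutput: string): string[] {
  return lddOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    // The first whitespace-delimited token is the soname or the loader path.
    .map((line) => line.split(/\s+/)[0] ?? "")
    // Keep only the file name: /lib/ld-linux-aarch64.so.1 -> ld-linux-aarch64.so.1
    .map((token) => token.slice(token.lastIndexOf("/") + 1));
}

function forbiddenLibraries(
  lddOutput: string,
  prefixes: string[] = FORBIDDEN_PREFIXES,
): string[] {
  return linkedLibraries(lddOutput).filter((lib) =>
    prefixes.some((prefix) => lib.startsWith(prefix)),
  );
}
```

An empty result from `forbiddenLibraries` corresponds to the "zero references" outcome reported here; the same check could be pointed at the linux-x64 artefact once CI produces it.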

Test plan

  • CI prebuild workflow runs cleanly on ubuntu-24.04 (linux-x64 glibc, linux-arm64 glibc) — verifies the shared script refactor didn't break the release pipeline
  • CI prebuild musl jobs (linux-x64-musl, linux-arm64-musl) remain green — confirms no Alpine regression
  • CI darwin jobs remain green — confirms macOS prebuild path unaffected
  • x64 ABI: ldd on the linux-x64 prebuild artefact shows only stable system libs (no libgpod/libglib/libgdk_pixbuf/libplist)
  • bun run typecheck + bun run test:unit green across all packages locally — verified pre-push (core 2459/2459, ipod-firmware 226/226, device-testing 81+2skip)

Deferred to follow-ups

  • TASK-321.02 — 3 hardware persona captures (HITL session)
  • Streaming spawn callsites threading through SubprocessRunner (needs runStreaming extension)
  • DiagnosticContext widening so video-encoder check doesn't construct its own default runner

ADRs

  • adr/adr-016-linux-vm-test-harness.md (Accepted)
  • adr/adr-017-device-persona-fixtures.md (Accepted)

🤖 Generated with Claude Code

jvgomg and others added 4 commits May 13, 2026 18:00
Promote ADR-016 (Linux VM test harness) and ADR-017 (device persona
fixtures) from Proposed to Accepted to unblock TASK-321 (Phase 1) and
TASK-322 (Phase 3). Fix ADR-017 Phase 5 header count (9 → 12). Mark
TASK-290 Done with final summary capturing deferred review feedback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Single source of truth for podkit's three-tier device-testing stack.
Lands every subtask of TASK-321 except TASK-321.02 (hardware persona
captures, deferred to HITL).

New package @podkit/device-testing
- DevicePersona + SystemState schemas (verbatim per ADR-017)
- TestRuntime interface + local-linux runner with auto-registration
- Subprocess capture/replay framework (PODKIT_SNAPSHOT_CAPTURE /
  PODKIT_SNAPSHOT_REPLAY) with 21 framework tests
- SystemState registry seeded with 6 starter states (healthy,
  no-ffmpeg, no-libgpod, no-udev, no-sg-perms, corrupt-configfs)
  plus a golden-file fixture for the healthy state

Cross-package refactor (cycle-free)
- SubprocessRunner interface in @podkit/device-types so production
  packages can depend on the type without importing the test harness
- defaultSubprocessRunner in @podkit/core; device-testing re-exports
  it to keep behaviour in lockstep
- podkit-core callsites threaded through the runner: usb-enumeration,
  usb-path-resolution, device/platforms/{macos,linux}, diagnostics
  video-encoder check, transcode/ffmpeg.exec()
- Streaming spawns (FFmpegTranscoder.transcode, video probe/transcode,
  music pipeline transcode, artwork resize) left on existing _spawnFn
  DI; documented in subprocess.md
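
As a rough illustration of the runner seam and the capture/replay idea described above, here is a minimal sketch. All shapes and names in it (`SubprocessResult`, the `run` signature, `capturingRunner`, `replayingRunner`, `hasFfmpeg`) are assumptions made for the example; the actual interface in @podkit/device-types and the PODKIT_SNAPSHOT_CAPTURE / PODKIT_SNAPSHOT_REPLAY wiring are not shown in this PR description and may differ.

```typescript
// Hypothetical shapes for illustration only; the real SubprocessRunner
// in @podkit/device-types may look different.
interface SubprocessResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

interface SubprocessRunner {
  run(command: string, args: string[]): Promise<SubprocessResult>;
}

type Snapshot = Map<string, SubprocessResult>;

const snapshotKey = (command: string, args: string[]): string =>
  JSON.stringify([command, ...args]);

// Capture mode: delegate to an inner runner and record every result.
function capturingRunner(inner: SubprocessRunner, snapshot: Snapshot): SubprocessRunner {
  return {
    async run(command, args) {
      const result = await inner.run(command, args);
      snapshot.set(snapshotKey(command, args), result);
      return result;
    },
  };
}

// Replay mode: never spawn; serve recorded results and fail loudly on a miss.
function replayingRunner(snapshot: Snapshot): SubprocessRunner {
  return {
    async run(command, args) {
      const hit = snapshot.get(snapshotKey(command, args));
      if (hit === undefined) {
        throw new Error(`no recorded result for ${snapshotKey(command, args)}`);
      }
      return hit;
    },
  };
}

// A callsite threaded through the runner: it receives the runner as a
// parameter instead of spawning a process itself.
async function hasFfmpeg(runner: SubprocessRunner): Promise<boolean> {
  const result = await runner.run("ffmpeg", ["-version"]);
  return result.exitCode === 0;
}
```

Because the callsite depends only on the interface, a test can hand it a replaying runner and never touch the host's real binaries, which is presumably what lets production packages import the type without importing the harness.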

Per-OS test tagging convention
- *.darwin.test.ts / *.linux.test.ts via describe.skipIf
- Canary tests prove the convention works on both hosts
- Documented in agents/testing.md
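
The file-name convention above can be expressed as a small predicate. This is a sketch under stated assumptions rather than code from the repository: `taggedPlatform` and `shouldSkipOn` are hypothetical names, and the host platform is passed in as a plain string so the logic stays independent of any test runner.

```typescript
// Hypothetical helpers (not taken from the PR): derive the target OS from a
// test file name following the *.darwin.test.ts / *.linux.test.ts convention.
type TaggedPlatform = "darwin" | "linux";

function taggedPlatform(fileName: string): TaggedPlatform | null {
  const match = /\.(darwin|linux)\.test\.ts$/.exec(fileName);
  return match ? (match[1] as TaggedPlatform) : null;
}

// True when the suite should be skipped on the given host platform.
// Untagged files run everywhere.
function shouldSkipOn(fileName: string, hostPlatform: string): boolean {
  const tag = taggedPlatform(fileName);
  return tag !== null && tag !== hostPlatform;
}
```

In an actual tagged file the condition would typically be written inline, for example `describe.skipIf(process.platform !== "linux")("...", () => { ... })`, so that the suite reports as skipped rather than silently absent on the other host.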

Linux native build pipeline
- tools/prebuild/build-linux-glibc.sh: single shared script invoked
  by .github/workflows/prebuild.yml glibc matrix AND
  tools/device-testing/lima/builder.yaml (Debian 12.10 pinned)
- tools/device-testing/lima/abi-verify.yaml: stock-Debian VM used
  for the ldd static-link spike
- turbo tasks @podkit/device-testing#build:linux-prebuild and
  build:linux-binary with full cache inputs
- mise tasks device-testing:build-linux*
- ABI spike (TASK-321.07 AC #12): ran end-to-end on aarch64 Apple
  Silicon Lima; ldd reported only linux-vdso, libc, libpthread,
  libdl, libm, ld-linux-aarch64 -- zero
  libgpod/libglib/libgdk_pixbuf/libplist references. x64
  verification deferred to first CI run on ubuntu-24.04.

Documentation
- New agents/device-testing.md: canonical reference for the harness
- agents/testing.md gained a Three-Tier Test Stack section
- All 11 tasks TASK-301..TASK-311 swept with harness-integration
  notes so implementers land on the new stack

Cross-references
- adr/adr-016-linux-vm-test-harness.md (Accepted)
- adr/adr-017-device-persona-fixtures.md (Accepted)
- agents/device-testing.md
- packages/device-testing/README.md
- tools/device-testing/lima/README.md

Deferred / follow-ups (not blocking)
- TASK-321.02 hardware persona captures (HITL, awaiting devices)
- Streaming spawn callsites threading through SubprocessRunner
  (needs runStreaming extension)
- DiagnosticContext widening so video-encoder doesn't construct
  its own default runner

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

build-static-deps.sh has used cmake since 7b61c7c (libusb static build,
2026-04-25) — zlib, libusb, and libplist all build via cmake -B build.
The Alpine apk install in both musl jobs (prebuild-musl-x64 cached path
and prebuild-musl-arm64 docker run path) was never updated, so the musl
prebuilds have been failing with "cmake: command not found" since that
commit. Last successful prebuild run was 2026-03-10.

Glibc and darwin matrix entries are unaffected — both ship cmake out of
the box on their respective runners.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The usb npm package (used by @podkit/ipod-firmware since e825ee1) has a
node-gyp postinstall that compiles libusb against libudev. Alpine ships
the udev headers via the eudev-dev apk package; without it, bun install
fails on the usb package's native build with "libudev.h: No such file".

Same pre-existing musl regression class as the cmake fix in 8c9239d —
glibc and darwin runners already have udev headers available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jvgomg jvgomg merged commit 234c104 into main May 13, 2026
12 checks passed
@jvgomg jvgomg deleted the feat/m-19-phase-1 branch May 13, 2026 18:43