m-19 VM Testing restructure (CI sanity) #63
Closed
jvgomg wants to merge 56 commits into
The container-aware sync PRD link pointed at `../../backlog/docs/`, which
resolves outside the docs site tree and fails Starlight's internal-link
validation. Replace it with a prose pointer to the source repo path; the
PRD is not currently published via the docs site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the Tier-3 test harness from ADR-016: a separate `podkit-test-vm`
Lima VM (binary-only, no dev tooling), snapshot-based state layering,
host→VM binary transfer, a TestRuntime runner for macOS dev hosts, a
FunctionFS userspace daemon scaffold, and the Tier-3 baseline test
shape. All six TASK-322 subtasks (322.01-322.06) implemented.
The harness is structurally complete and auto-skipped on macOS by
default; opt in with PODKIT_DEVTEST_RUN_TIER3=1 once a `podkit-test-vm`
instance is set up. Two assertion families are deliberately deferred to
explicit follow-up tasks rather than scaffolded as skipped tests:
- doctor-vs-state — blocked by TASK-333 (doctor `--scope system`)
- USB device synthesis — blocked by TASK-322.05.01 (FunctionFS
descriptor handshake; needs a live VM to verify)
A third follow-up (TASK-322.02.01) tracks the Lima 2.x VZ-driver
snapshot gap surfaced during the first live-VM smoke; today's runner
degrades silently to apply-state.sh-every-time when snapshots return
"unimplemented".
Includes:
- runner: PODKIT_HOST_ARCH cache-key plumbing so cross-arch turbo
cache shares don't surface wrong-arch binaries
- runner: stopDaemon idempotency (treat systemctl exit 5 as success)
- refactor: extract lima-limactl.ts (runLimactl/limactlError/shellQuote
consolidated from four duplicate copies)
- build: `device-testing:build-linux` now builds the dummy-hcd-daemon
too; `transfer-binary` ships it to the test VM
- fix: Lima 2.x --workdir must precede the instance name; `--` is not
a separator (was breaking build-linux-prebuild/binary scripts)
- .gitignore: *.bun-build (orphan temp files) +
packages/libgpod-node/prebuilds/ (locally-built native bindings)
- agents/device-testing.md: refreshed; documents the Tier-3 test shape
Tests: 210 pass / 11 skip / 0 fail in @podkit/device-testing
32 pass / 0 fail in @podkit/dummy-hcd
57/57 turbo tasks green
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
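The `--workdir` ordering fix and the consolidated `shellQuote` can be pictured with a small argv builder. This is a sketch only: `buildLimactlShellArgs` and this `shellQuote` signature are assumptions for illustration, not the contents of lima-limactl.ts.

```typescript
// Hypothetical helpers; the real lima-limactl.ts signatures are not shown in this PR.

/** POSIX single-quote escaping: wrap in '...' and rewrite each embedded ' as '\'' */
function shellQuote(arg: string): string {
  return `'${arg.replace(/'/g, `'\\''`)}'`;
}

/**
 * Build argv for `limactl shell`. Flags such as --workdir are placed before
 * the instance name, and no `--` separator is emitted.
 */
function buildLimactlShellArgs(instance: string, workdir: string, command: string[]): string[] {
  const script = command.map(shellQuote).join(' ');
  return ['shell', '--workdir', workdir, instance, 'sh', '-c', script];
}
```

Keeping the ordering in one builder means the four former call sites cannot drift apart on where the flag goes.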
`build-linux-binary.sh` ran `bun install` inside the builder VM with
`--workdir $REPO_ROOT` pointed at the macOS-mounted repo, then attempted
to redirect `node_modules` to a VM-local /tmp path via:
mv node_modules /tmp/podkit-builder-nm-host-saved
ln -s /tmp/podkit-builder-nm node_modules
Both operations executed against the host-mounted tree. `mv` between
filesystems is a copy-then-delete, so the host's `node_modules` got
moved into the VM's tmpfs and deleted from macOS. The follow-up symlink
left the host with a broken `node_modules → /tmp/podkit-builder-nm`
pointer to a VM-only path.
Switch to the rsync-to-VM-local-checkout pattern already proven by
`mise vipod:install`:
1. rsync $REPO_ROOT → $VM_NAME:/tmp/podkit-builder-src (no
node_modules, .turbo, dist, .git, bin, *.img, src-tauri/target)
2. Build inside /tmp/podkit-builder-src (bun install + turbo build
+ compile.sh) — host tree is untouchable
3. `limactl copy` the resulting podkit binary back to
packages/podkit-cli/bin/podkit-linux-${arch}
The libgpod-node prebuild is NOT copied back — it was written to the
host by the prerequisite `build:linux-prebuild` turbo task before this
script ran, so the rsync carries it INTO the VM checkout and the host
copy is already canonical.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a CLI surface to podkit doctor that runs only the system-scope
checks, without requiring a registered device. Needed by Tier-3
baseline tests (TASK-322.06) to assert system-state against a
SystemState snapshot without first running `podkit device add`.
- `--scope system` skips device resolution; emits {success, status,
healthy, scope: 'system', checks[]} with no readiness section.
- `--scope device` requires -d (same DEVICE_REQUIRED error as repair).
- `--scope all` (default) preserves existing output byte-for-byte and
continues to honour the legacy --no-system flag.
Exit-code semantics match TASK-308: warn or fail in any check sets
exit 2. Exported resolveDoctorScopes() and runSystemOnlyDoctor() for
unit-test injection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
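The scope rules above can be sketched as a pure function. The names `resolveScopes`, `ScopeOptions`, and `exitCodeFor` are hypothetical; the exported `resolveDoctorScopes()` may have a different shape, and whether `--scope all` itself requires a device is not modelled here.

```typescript
type DoctorScope = 'system' | 'device';

interface ScopeOptions {
  scope?: 'system' | 'device' | 'all'; // --scope, defaults to 'all'
  system?: boolean;                    // false when the legacy --no-system flag is passed
  device?: string;                     // -d value, if any
}

type ScopeResolution =
  | { ok: true; scopes: DoctorScope[] }
  | { ok: false; code: 'DEVICE_REQUIRED' };

function resolveScopes(opts: ScopeOptions): ScopeResolution {
  const scope = opts.scope ?? 'all';
  if (scope === 'system') return { ok: true, scopes: ['system'] }; // no device resolution
  if (scope === 'device') {
    return opts.device
      ? { ok: true, scopes: ['device'] }
      : { ok: false, code: 'DEVICE_REQUIRED' };
  }
  // 'all': the legacy --no-system flag drops the system scope
  return { ok: true, scopes: opts.system === false ? ['device'] : ['system', 'device'] };
}

/** A warn or fail in any check sets exit 2 (TASK-308); otherwise 0. */
function exitCodeFor(statuses: Array<'pass' | 'warn' | 'fail' | 'skip'>): 0 | 2 {
  return statuses.some((s) => s === 'warn' || s === 'fail') ? 2 : 0;
}
```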
Lima 2.x's vz driver exits "unimplemented" on every limactl snapshot
call. Measured apply-state.sh cost on podkit-test-vm (aarch64, warm
cache) is sub-2-second per state flip (~740ms reinstall, ~860ms
purge+install), well inside the test budget for the current 6-state
matrix.

Decision (recorded in ADR-016 §"Test speed strategy"):
- stay with apply-state.sh-every-time on vz;
- pin vmType: vz in test-vm.yaml so the choice is explicit;
- keep isSnapshotUnsupported() as a documented contingency so Linux/qemu
hosts get the snapshot fast path automatically.

Revisit when the matrix exceeds ~20 states or per-state cost exceeds 5s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
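A minimal sketch of the contingency described here: try the snapshot path, and fall back to apply-state when the driver reports "unimplemented". The `StateOps` interface, the synchronous calls, and the message-matching rule are assumptions made for brevity; the real runner shells out to limactl and apply-state.sh.

```typescript
// Hypothetical interface standing in for the limactl / apply-state.sh calls.
interface StateOps {
  restoreSnapshot(state: string): void; // throws; on vz the message contains "unimplemented"
  applyState(state: string): void;      // slower path that always works
}

/** Assumed detection rule: the error message mentions "unimplemented". */
function looksUnimplemented(err: unknown): boolean {
  return err instanceof Error && /unimplemented/i.test(err.message);
}

function createStateRunner(ops: StateOps) {
  let snapshotsDisabled = false; // remembered so the driver is probed only once
  return function enterState(state: string): 'snapshot' | 'apply-state' {
    if (!snapshotsDisabled) {
      try {
        ops.restoreSnapshot(state);
        return 'snapshot';
      } catch (err) {
        if (!looksUnimplemented(err)) throw err; // genuine failures still surface
        snapshotsDisabled = true;
      }
    }
    ops.applyState(state);
    return 'apply-state';
  };
}
```

On a host whose driver supports snapshots, the same runner takes the fast path without any configuration change.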
…ion)

Closes the deferred enumeration loop on the dummy-hcd daemon. Without
the handshake, mounting FunctionFS and opening ep0 wasn't enough to make
dummy_hcd enumerate a device: podkit device scan saw nothing.

New tools/device-testing/dummy-hcd/src/descriptors.ts builds the
FUNCTIONFS_DESCRIPTORS_MAGIC_V2 head + FS/HS interface+bulk-IN endpoint
descriptors and the empty-strings table. Pure byte-packing; 14 host-side
unit tests cover magic / length / endpoint counts.

runFunctionFs() now writes both buffers to ep0, starts the read loop,
calls a caller-supplied attachUdc() hook, and resolves the handle only
after FUNCTIONFS_BIND fires (10s watchdog). The UDC write is what causes
the BIND event, so the read loop must already be live to observe it.

Teardown order reworked to unbindGadget → ffs.shutdown → destroyGadget.
shutdown() uses umount -l (lazy) and fire-and-forget ep0.close() to
break a kernel deadlock where awaiting ep0.close() blocked on a pending
read that itself waited on the gadget being unbound. Added
unbindGadget() helper to split the UDC-write step out of
destroyGadget(). Added teardownStarted flag for clean simultaneous
SIGINT+SIGTERM handling.

Live-VM verified on podkit-test-vm (aarch64): both
ipod-video-5g-iflash-1tb (05ac:1209) and ipod-nano-7g-space-gray
(05ac:1267) enumerate as Apple, Inc. USB devices via lsusb and disappear
cleanly on SIGTERM (configfs tree empty, lsusb empty).

Tier-3 test in personas-baseline.tier3.test.ts now:
- Cross-checks persona vendor/product via `lsusb -d`.
- Asserts `podkit doctor --scope system --json` against the SystemState
fixture (exit code + overall healthy bit).
- Keeps the device-scan check at envelope-shape level because Linux's
findIpodDevices is lsblk-based and FFS-only personas legitimately have
no block device.

Known gaps (documented in the task's implementation notes, not closed
here): echo-mini lacks fixture data and is excluded from the sidecar;
podkit's Linux device scan would need a USB-walk path to satisfy AC #4
strict reading; dummy_hcd's /sys/class/udc/.../state field is
sticky-stale and doesn't reset on unbind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
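The byte-packing that descriptors.ts performs can be sketched as below. The magic and flag constants and the v2 head layout follow `<linux/usb/functionfs.h>`; the function names, the vendor-specific interface class, and the endpoint address are assumptions, since the actual file is not shown in this PR.

```typescript
// Constants from <linux/usb/functionfs.h>.
const FUNCTIONFS_DESCRIPTORS_MAGIC_V2 = 3;
const FUNCTIONFS_STRINGS_MAGIC = 2;
const FUNCTIONFS_HAS_FS_DESC = 1;
const FUNCTIONFS_HAS_HS_DESC = 2;

/** 9-byte interface descriptor followed by a 7-byte bulk-IN endpoint descriptor. */
function interfaceWithBulkIn(maxPacketSize: number): number[] {
  return [
    // bLength, INTERFACE, number, alt, bNumEndpoints, class (0xff assumed), subclass, protocol, iInterface
    9, 0x04, 0, 0, 1, 0xff, 0, 0, 0,
    // bLength, ENDPOINT, address (EP1 IN), bmAttributes (bulk), wMaxPacketSize (LE), bInterval
    7, 0x05, 0x81, 0x02, maxPacketSize & 0xff, maxPacketSize >> 8, 0,
  ];
}

function buildDescriptors(): Uint8Array {
  const fs = interfaceWithBulkIn(64);  // full-speed bulk max packet size
  const hs = interfaceWithBulkIn(512); // high-speed bulk max packet size
  const headLen = 12 + 4 + 4;          // magic/length/flags, then fs_count and hs_count
  const buf = new Uint8Array(headLen + fs.length + hs.length);
  const view = new DataView(buf.buffer);
  view.setUint32(0, FUNCTIONFS_DESCRIPTORS_MAGIC_V2, true);
  view.setUint32(4, buf.length, true);
  view.setUint32(8, FUNCTIONFS_HAS_FS_DESC | FUNCTIONFS_HAS_HS_DESC, true);
  view.setUint32(12, 2, true); // fs_count: interface + endpoint
  view.setUint32(16, 2, true); // hs_count: interface + endpoint
  buf.set(fs, headLen);
  buf.set(hs, headLen + fs.length);
  return buf;
}

/** Strings head with zero strings and zero languages (16 bytes). */
function buildEmptyStrings(): Uint8Array {
  const buf = new Uint8Array(16);
  const view = new DataView(buf.buffer);
  view.setUint32(0, FUNCTIONFS_STRINGS_MAGIC, true);
  view.setUint32(4, buf.length, true);
  return buf; // str_count and lang_count remain 0
}
```

Both buffers would be written to ep0 in this order (descriptors, then strings) before the UDC is attached.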
Four post-Phase-3 reflection items landed:
- **TASK-322.04.01**: prepare() now installs dummy-hcd-daemon@.service
into /etc/systemd/system/ and runs daemon-reload, sha256-idempotent.
Closes the first-run "Unit not found" tripwire that hit any freshly-
provisioned podkit-test-vm. New module lima-test-vm-systemd.ts mirrors
the binary-transfer pattern; reviewer GO with one nit (optional-
chaining on resolveDummyHcdDaemonUnit — asymmetric but correct).
- **TASK-322.06.01**: groupPersonasByState filters personas with no
daemon payload (sysInfoExtendedXml === null AND massStorageBackingFile
=== null), emits one stderr warning per excluded persona once per
session, references TASK-324 in the message. echo-mini is excluded
today; the canary test flips when TASK-324 captures real data. The
filter stays as a tripwire for future bare personas.
- **TASK-334**: device scan --format json now emits USB-only devices
with usbOnly: true + a usbDescriptor block; block-device entries also
get usbDescriptor when the USB join finds a match. The lsusb-cross-
check stopgap in personas-baseline.tier3.test.ts is removed in favour
of a direct vendor/product assertion. Schema is additive; macOS is
unchanged. Implementation lived in packages/podkit-cli/src/commands/
device/scan.ts (and the join in podkit-core/src/device/usb-
enumeration.ts already existed) — not platforms/linux.ts as originally
framed; the real gap was the JSON envelope, not the join.
- **TASK-335**: three polish hardenings.
1. runDiagnostics now bypasses the device-type filter when scopes ===
['system'], so future system-scope checks declared for mass-storage
only fire under --scope system regardless of device type.
2. IpodDatabase.open() is gated on allowedScopes.includes('device'),
skipping the wasted call on system-only invocations.
3. lima-test-vm-snapshots emits one stderr warning the first time
isSnapshotUnsupported() returns true in a process, naming the vz
driver and linking to TASK-322.02.01.
Tests: 57/57 turbo tasks green. @podkit/core 2468 unit + 67 integration
passes; @podkit/device-testing 241 pass / 0 fail; @podkit/e2e-tests 27/0;
no behavioural regressions in TASK-333 or TASK-322.02.01 tests.
Updated TASK-324 (Phase 5 persona expansion) with a new AC + note
flagging the echo-mini Tier-3 gap that 322.06.01 papers over and 324
actually fixes.
m-19 remaining work: TASK-301..308 (doctor-coverage matrices), TASK-324
(Phase 5 persona expansion), TASK-331 (ReadinessLevel: 'unsupported').
Every harness primitive these tasks need is now landed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
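The TASK-322.06.01 filter can be sketched as follows. The `Persona` shape (in particular the `state` field), the function name `groupByState`, and the warning wording are assumptions; only the two null checks and the once-per-persona warning come from the description above.

```typescript
interface Persona {
  id: string;
  state: string; // assumed: the SystemState key the persona runs under
  sysInfoExtendedXml: string | null;
  massStorageBackingFile: string | null;
}

const warned = new Set<string>(); // one warning per excluded persona per session

function groupByState(
  personas: Persona[],
  warn: (msg: string) => void = console.error,
): Map<string, Persona[]> {
  const groups = new Map<string, Persona[]>();
  for (const p of personas) {
    // A persona with neither payload gives the daemon nothing to serve.
    const bare = p.sysInfoExtendedXml === null && p.massStorageBackingFile === null;
    if (bare) {
      if (!warned.has(p.id)) {
        warned.add(p.id);
        warn(`skipping persona ${p.id}: no daemon payload (see TASK-324)`);
      }
      continue;
    }
    const list = groups.get(p.state) ?? [];
    list.push(p);
    groups.set(p.state, list);
  }
  return groups;
}
```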
Six tasks landed across three sub-phases. Tier-1 coverage; Tier-3
deferrals continue per the TASK-322.05.01 dep.

**5a - forcing decisions:**
- **TASK-308**: warn-counts-as-unhealthy decision recorded in
agents/testing.md §"Doctor exit-code & overall-health semantics".
Exit-code table: 0 = clean / 1 = CliError or hard rejection / 2 = issues
found. 27-test matrix in doctor-exit-code.test.ts covering every
(readiness × check-status × scope × persona-type) cell. Surfaced the
long-standing AC-text-vs-code discrepancy (ACs said "exit 1" for
unhealthy-diagnostic-ran, code emits exit 2); pinned reality and updated
AC text after reviewer confirmed code was right.
- **TASK-331**: `ReadinessLevel` gains 'unsupported' variant +
`ReadinessResult.unsupportedReason?: string`. determineLevel()
short-circuits when the cascade hits a recognised-but-rejected device
(Apple unsupported PIDs, iOS-range fallback, or non-Apple vendor without
a preset). New devices-mass-storage/unsupported.ts carries the Sony
Walkman entry. Threaded through readiness-display, device-scan-render,
doctor, device/info, device/init. Both rejection personas (touch-5g,
sony-nwz-e384) flipped to 'unsupported' with canonical reason text.

**5b - system-scope coverage:**
- **TASK-307**: 33-test flag matrix in doctor-flag-matrix.test.ts
covering all 17 ACs including the new --scope flag from TASK-333. AC
#16's --scope × --json × --no-system cross-product is parametric.
Extracted runDoctorAction() from doctor.ts's Commander action callback
to expose the validation flow to in-process tests: pure refactor, 52
existing tests still green.
- **TASK-301**: 23-test matrix in system-scope-matrix.test.ts covering
inquiry-methods, codec-encoders, video-encoder, udev-rule across
system-state permutations. 4 ACs reconciled; text was aspirational
(inquiry-methods is SCSI-axis-only by design, codec-encoders correctly
returns warn-not-fail for missing encoders, etc.). 4 ACs deferred to new
TASK-336 (udev-rule lacks detection logic). video-encoder.ts refactored
to expose checkVideoEncoderForRunner() pure function.

**5c - device-scope coverage:**
- **TASK-302**: 34-test stage-matrix.test.ts driving the readiness
pipeline through all 6 stages × multiple-state permutations.
Skip-cascade (ACs #17-#19) and derived-level (#20) are both parametric.
Format parity (#21) compares JSON-vs-text structurally, no string
snapshots. 2 ACs deferred (database pass-path is libgpod-bound;
platform-stage skip path doesn't reach the cascade). Two pipeline
observability gaps flagged in implementation notes for follow-up.
- **TASK-303**: extended sysinfo-consistency.test.ts with persona smoke
block driving the captured XML fixtures through the production parse →
identify → axis-compare path. All 15 ACs covered. Surfaced a model-table
tension between `pid 0x1209 → video_5g` and `ModelNumStr A446 →
video_5_5g` for the iPod 5.5G persona; non-blocking, documented inline.

**New follow-up filed:**
- **TASK-336** (Low): udev-rule check needs rule-presence + staleness
detection. Closes TASK-301 ACs #11-#14 once landed.

Quality gates: 57/57 turbo tasks green; 2577 @podkit/core tests pass;
@podkit/device-testing 247/0; tsc + oxlint clean throughout. No
behavioural regressions in any pre-existing test.

m-19 remaining: TASK-304/305/306 (artwork + orphan-files coverage),
TASK-324 (persona expansion), TASK-336 (the udev-rule follow-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
**5d - device-scope check matrices:**
- **TASK-304** (artwork-rebuild + artwork-reset): 25-test matrix in
artwork-matrix.test.ts. Detection paths (#1-#6) use real temp-dir
ArtworkDB binaries built via the existing builder; repair paths (#7-#12)
drive against a stateful in-memory IpodDatabase fake that makes the
idempotency check (#12) genuine: the second run sees the first run's
mutations and short-circuits. Reset paths (#13-#14) + applicableTo (#15)
round it out. 3 findings flagged in notes (none material).
- **TASK-305** (orphan-files iPod): 16 tests across two files: detection
+ repair in orphans-matrix.test.ts (core), rendering helpers (CSV
escape, verbose grouping, top-10-largest) in doctor-flag-matrix.test.ts
(CLI). All 14 ACs.
- **TASK-306** (orphan-files mass-storage): 14-test matrix exercising
echo-mini / generic / rockbox presets + the per-device >
device-defaults > preset-default content-path resolution chain.
Mass-storage / iPod check exclusivity (#11-#12) asserted both
directions.

All three pass through the warn-counts-as-unhealthy → exit 2 rule locked
in by TASK-308.

**5e - synthesised persona expansion (AFK half of TASK-324):**
- `ipod-shuffle-not-supported`: Apple unsupported-PID rejection (shuffle
3G 0x05ac:0x1302).
- `non-ipod-usb-disk`: Non-Apple vendor-no-preset rejection (SanDisk
Cruzer Blade 0x0781:0x5567). Adds SanDisk to devices-mass-storage
UNSUPPORTED_VENDORS table.
- `malformed-sysinfo`: SIE-parser error path. Real iPod 5G USB identity
+ deliberately-truncated SIE XML (500-byte cut).

18 new persona-smoke tests. All three personas wired into the registry,
smoke-tested, and documented in agents/device-testing.md +
documents/test-devices.md. provenance.md per persona records the
synthesis recipe.

TASK-324 ACs #3 + #4 ticked. ACs #1, #2, #5-#8 deferred to HITL hardware
sessions (corrupt-db, populated echo-mini, Rockbox firmware).

**New follow-ups filed:**
- **TASK-337** (Low): JSON shape symmetry. Surfaced by all three 5d
workers: pass-path on orphans/orphans-mass-storage/artwork omits the
`details` object so JSON consumers see undefined instead of zero-valued
fields.

Quality gates: 57/57 turbo tasks green; @podkit/core 2627/0;
@podkit/device-testing 269/0; tsc + oxlint clean.

m-19 substantively complete pending HITL hardware captures +
Low-priority follow-ups (TASK-336 udev-rule detection, TASK-337 JSON
shape, hardware ACs in TASK-324).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…stage details

Three Low-priority follow-ups landed together; closes every flagged gap
from Phase 5 reviews.

- **TASK-336** udev-rule check: rule-presence + staleness detection.
Dropped `repairOnly: true`; added `checkUdevRule()` with an injectable
`readFile` seam. Returns pass / fail+repairable / warn+repairable / fail
(EACCES) on Linux; skip on macOS+win32. Stale-diff text is intentionally
terse (`"installed N bytes / M lines, expected N' / M'"`) so JSON
consumers can spot drift without bloat. Repair-side path verifiably
unchanged: all 15 pre-existing repair tests pass. TASK-301 ACs #11-#14
ticked, deferral notes replaced by a cross-reference to TASK-336.
- **TASK-337** JSON shape symmetry: pass-path emits zero-valued details.
orphans / orphans-mass-storage / artwork checks now emit
`{ orphanCount: 0, ... }` / `{ corruptEntries: 0, ... }` on pass instead
of an undefined `details` object. ~20 line Δ total. Text renderer
already guards on status; no UX change.
- **TASK-338** readiness stage details enrichment. usb stage pass-path
now mirrors the unsupported-path shape `{ identifier, vendorId,
productId, usbModel }`. Partition stage threads `PartitionLayout` from
the platform probe through to details (`{ partitionCount, partitions:
[{ index, filesystem, sizeBytes, identifier?, volumeUuid? }] }`).
`PlatformDeviceInfo` widened additively with `filesystem?` +
`partitionLayout?`. Platform asymmetries documented inline: Linux
surfaces full kernel partition table, macOS surfaces user-visible only;
filesystem strings differ ("vfat" vs "MS-DOS FAT32"). Net +3 tests in
stage-matrix.test.ts.

Quality gates: 57/57 turbo tasks green; 2643 @podkit/core pass; tsc +
oxlint clean throughout. No commits made by individual workers;
integrated state verified end-to-end before this commit.

m-19 status: substantively COMPLETE. Only deferred work is the HITL
hardware captures in TASK-324 (corrupt-db iPod 5G, populated echo-mini,
Rockbox install) which require physical-device sessions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
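A hedged sketch of the TASK-336 detection flow with an injectable `readFile` seam. The function name `checkRule`, the result shape, and the mapping of missing-rule to fail and stale-rule to warn are assumptions; the commit text lists the four outcomes but not which condition produces each.

```typescript
type CheckResult =
  | { status: 'pass' | 'skip' }
  | { status: 'warn' | 'fail'; repairable: boolean; detail: string };

/** Terse size summary used in the stale-diff text. */
function describeRule(text: string): string {
  const bytes = new TextEncoder().encode(text).length;
  const lines = text.split('\n').filter((l) => l.length > 0).length;
  return `${bytes} bytes / ${lines} lines`;
}

function checkRule(
  platform: string,
  expected: string,
  readFile: () => string, // injectable seam: throws an error carrying .code on failure
): CheckResult {
  if (platform !== 'linux') return { status: 'skip' };
  let installed: string;
  try {
    installed = readFile();
  } catch (err) {
    const code = (err as { code?: string }).code;
    if (code === 'ENOENT') {
      return { status: 'fail', repairable: true, detail: 'rule not installed' };
    }
    // e.g. EACCES: reinstalling would not help, so this is not marked repairable
    return { status: 'fail', repairable: false, detail: `cannot read rule (${code ?? 'unknown error'})` };
  }
  if (installed === expected) return { status: 'pass' };
  return {
    status: 'warn',
    repairable: true,
    detail: `installed ${describeRule(installed)}, expected ${describeRule(expected)}`,
  };
}
```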
…ary counts warns
Two small fixes from the reflection sweep.
**Persona registry: Bun native asset imports (was: base64 codegen)**
The previous attempt at fixing the persona registry's module-eval
fs.readFileSync problem used a generator script that emitted
base64-encoded raw fixtures into `raw.generated.ts` files. That was
correct but ugly — codegen step + base64 blobs + extra build script.
Switched to Bun's native asset-import attributes:
```ts
import sysInfoExtendedXml from './raw/sysinfo-extended.xml' with { type: 'text' };
import lsblkJson from './raw/lsblk.json' with { type: 'json' };
```
Bun's bundler inlines the file content at build time; dev (TS source)
runs see the raw bytes via its loader. The persona files now show
their actual fixture filenames instead of base64 imports.
Deleted: 16 `raw.generated.ts` files, `scripts/generate-raw-fixtures.ts`,
`src/personas/lazy.ts`, the `generate:raw-fixtures` + `prebuild`
package.json scripts. Added: `src/personas/text-imports.d.ts`
(ambient declarations for the `with` import attributes).
The no-fs-at-load smoke test still passes — 0 readFileSync calls at
module-eval. Bundle output (`dist/index.js`, 260 KB) contains the
fixture content inlined (verified by grep for fixture tokens).
16 persona files updated.
**Doctor text summary: count warns alongside fails**
TASK-308 locked in warn-counts-as-unhealthy → exit 2, but the human-
text summary line ("N issues found." or "All checks passed.") was
still counting `fail` only. Same fixture would emit `exit=2` paired
with `"All checks passed."` when only warns existed — confusing.
Fixed in `packages/podkit-cli/src/commands/doctor.ts:816-829`: the
`issueCount` accumulation now includes `c.status === 'warn'` alongside
`'fail'`. The readiness-stages filter likewise. The mass-storage and
system-only paths were already correct.
Updated `doctor-exit-code.test.ts` AC #9 assertion to match new
behaviour (was: "stale comment removed; behavior unchanged"; now:
"2 issues found" emitted when 1 fail + 1 warn).
**Quality gates**: 57/57 turbo green; tsc + oxlint clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
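The summary-line fix amounts to using the same predicate for the text summary and the exit code. A minimal sketch, assuming a hypothetical `summarize` helper rather than the actual code in doctor.ts; the singular/plural handling is also an assumption.

```typescript
type Status = 'pass' | 'warn' | 'fail' | 'skip';

function summarize(checks: Array<{ status: Status }>): { line: string; exitCode: 0 | 2 } {
  // Count warns alongside fails so the text agrees with the exit-2 rule from TASK-308.
  const issueCount = checks.filter((c) => c.status === 'fail' || c.status === 'warn').length;
  if (issueCount === 0) return { line: 'All checks passed.', exitCode: 0 };
  const noun = issueCount === 1 ? 'issue' : 'issues'; // pluralisation assumed
  return { line: `${issueCount} ${noun} found.`, exitCode: 2 };
}
```

Deriving both values from one count makes the "exit 2 paired with All checks passed" combination unrepresentable.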
…evice scan

Refuses HFS+ iPods on Linux at `podkit device add` (non-zero exit + JSON
error code `UNSUPPORTED_FILESYSTEM_ON_LINUX`) and surfaces a clear
filesystem-not-supported warning through the readiness pipeline at
`podkit device scan` instead of running readiness stages or suggesting
destructive remediation.

The Linux kernel hfsplus driver refuses RW on journaled HFS+ (the iPod
default), udev/blkid don't surface a filesystem UUID for HFS+ on Linux
(breaking the volumeUuid identity model), and udisksctl mount paths fall
back to a generic name with no label. Refusing cleanly with a docs link
is structurally cleaner than patching all three friction points and
sharpens podkit's Linux story to "FAT32 iPods, supported well." macOS
HFS+ is unchanged.

Architecture
- New `packages/podkit-core/src/device/filesystem-policy.ts` as the
single source of truth: `isFilesystemUnsupportedHere(filesystem,
platform)`, `formatHfsplusOnLinuxRefusal()` (returns `string[]`).
- Readiness pipeline gains an HFS+-on-Linux short-circuit that emits
`level: 'unsupported'` with the canonical refusal text joined into
`unsupportedReason` (main's existing string field; no discriminated
union). No placeholder "Skipped" rows are pushed.
- `device add`: refusal injected in BOTH iPod branches (explicit
`--path` + scan-found) BEFORE any state mutation. Trailing-slash path
normalisation on the `--path` lookup.
- Docs: new `docs/devices/linux-filesystems.md`.

AC #5 (real-hardware) deferred to TASK-319.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
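The policy check can be pictured as a pure predicate plus a refusal formatter. This is an illustrative sketch: `isUnsupportedHere` and `refusalLines` are stand-in names, the filesystem spellings are guesses at what the platform probe reports, and the refusal wording is invented rather than the canonical text in filesystem-policy.ts.

```typescript
// Assumed spellings; the real module may normalise filesystem names differently.
const HFS_PLUS_NAMES = new Set(['hfsplus', 'hfs+', 'hfsx']);

function isUnsupportedHere(filesystem: string | undefined, platform: string): boolean {
  if (platform !== 'linux' || !filesystem) return false; // macOS HFS+ is unchanged
  return HFS_PLUS_NAMES.has(filesystem.toLowerCase());
}

/** Illustrative refusal text, one line per array element. */
function refusalLines(docsUrl: string): string[] {
  return [
    'This iPod is formatted as HFS+, which podkit does not support on Linux.',
    `See ${docsUrl} for supported filesystems.`,
  ];
}

/** The readiness short-circuit: no stages run, only the joined reason is emitted. */
function readinessFor(filesystem: string | undefined, platform: string, docsUrl: string) {
  if (isUnsupportedHere(filesystem, platform)) {
    return { level: 'unsupported' as const, unsupportedReason: refusalLines(docsUrl).join('\n') };
  }
  return { level: 'pending-stages' as const }; // placeholder for the normal pipeline
}
```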
Replaces `unsupportedReason: string` with structured `unsupported:
ReadinessUnsupportedReason` across the readiness pipeline, CLI
rendering, and JSON output. Carries machine-readable fields (kind
discriminator, headline, optional details, optional docs URL, filesystem
+ path for filesystem-policy rejections) so JSON consumers get rich
diagnostics and the CLI emits multi-line messages without parsing
strings.

Migrated producers: filesystem-policy.ts (HFS+ on Linux), classify.ts
(Apple iPod / iOS PIDs), unsupported.ts (mass-storage no preset), the
pass-through paths in scan.ts and elsewhere.

Migrated consumers: readiness-display.ts (multi-line headline + details
+ docs link), device-scan-render.ts, doctor.ts, device/info.ts,
device/init.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
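One possible shape for such a discriminated union is sketched below. The field list follows the description above, but the `kind` values, property names, and the renderer are hypothetical and not taken from the actual `ReadinessUnsupportedReason` type.

```typescript
interface ReasonBase {
  headline: string;
  details?: string[];
  docsUrl?: string;
}

// Hypothetical kind values; the real discriminator strings are not shown in this PR.
type UnsupportedReason =
  | (ReasonBase & { kind: 'filesystem-policy'; filesystem: string; path: string })
  | (ReasonBase & { kind: 'unsupported-model' | 'no-preset' });

/** Multi-line CLI rendering without any string parsing. */
function renderUnsupported(r: UnsupportedReason): string[] {
  const lines = [r.headline, ...(r.details ?? []).map((d) => `  ${d}`)];
  if (r.kind === 'filesystem-policy') {
    lines.push(`  Filesystem: ${r.filesystem} at ${r.path}`); // narrowed by the discriminator
  }
  if (r.docsUrl) lines.push(`  See: ${r.docsUrl}`);
  return lines;
}
```

JSON consumers can serialise the same object directly, so the CLI text and the JSON envelope cannot disagree.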
…evice scan

Adds a pure `reconcileIpodDiscovery(blockDevices, classifiedUsb)`
primitive in `@podkit/core` that folds the two `device scan` pipelines
into one record per physical iPod (match priority: serial →
disk-identifier with partition suffix stripped on both sides →
emit-separate). Wires it into `runDeviceScan`, replacing the ad-hoc
disk-name correlation that produced a double-entry on Linux.

Replaces the destructive `Needs partitioning — see: podkit device init`
readiness copy with a docs link (the suggestion was doubly wrong:
`device init` doesn't partition, and won't run on an unmounted device).

Also extends `stripPartitionSuffix` to handle macOS BSD names and plumbs
the sysfs USB fingerprint through `findIpodDevices` on Linux so
block-side records carry the data reconciliation needs.

AC #5 (real-hardware) deferred to TASK-319.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
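The match priority (serial, then stripped disk identifier, then emit-separate) can be sketched as a two-pass fold. The record shapes and the names `stripSuffix` and `reconcile` are assumptions for illustration; the real `reconcileIpodDiscovery` and `stripPartitionSuffix` operate on richer types.

```typescript
interface BlockRecord { diskIdentifier: string; serial?: string }
interface UsbRecord { diskIdentifier?: string; serial?: string; productId: number }
interface Reconciled { block?: BlockRecord; usb?: UsbRecord }

/** sdb1 -> sdb, nvme0n1p2 -> nvme0n1, mmcblk0p1 -> mmcblk0, disk4s2 -> disk4 */
function stripSuffix(id: string): string {
  const name = id.replace(/^\/dev\//, '');
  const macos = /^(disk\d+)s\d+$/.exec(name);
  if (macos) return macos[1];
  const pStyle = /^(nvme\d+n\d+|mmcblk\d+)p\d+$/.exec(name);
  if (pStyle) return pStyle[1];
  const sd = /^(sd[a-z]+)\d+$/.exec(name);
  return sd ? sd[1] : name;
}

function reconcile(blocks: BlockRecord[], usbs: UsbRecord[]): Reconciled[] {
  const remaining = [...usbs];
  const take = (pred: (u: UsbRecord) => boolean): UsbRecord | undefined => {
    const i = remaining.findIndex(pred);
    return i === -1 ? undefined : remaining.splice(i, 1)[0];
  };
  const out: Reconciled[] = blocks.map((block) => ({ block }));
  // Pass 1: serial matches win over everything else.
  for (const r of out) {
    const serial = r.block?.serial;
    if (serial) r.usb = take((u) => u.serial === serial);
  }
  // Pass 2: disk identifier with the partition suffix stripped on both sides.
  for (const r of out) {
    const disk = r.block ? stripSuffix(r.block.diskIdentifier) : undefined;
    if (!r.usb && disk) {
      r.usb = take((u) => u.diskIdentifier !== undefined && stripSuffix(u.diskIdentifier) === disk);
    }
  }
  // Anything left on the USB side is emitted as its own record.
  for (const usb of remaining) out.push({ usb });
  return out;
}
```

Running the serial pass over every block first prevents a weaker disk-name match from consuming a USB record that another block could claim by serial.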
… unparseable SIE

Fixes three correctness bugs in `podkit doctor`'s repair flow:

- `--repair sysinfo-consistency` now overwrites a stale on-disk
SysInfoExtended via a new `force` option on `ensureSysInfoExtended`.
Previously the function short-circuited when a file was present, so the
repair reported success without actually re-reading from firmware.
- `--repair sysinfo-extended` (and any future repair that doesn't read
the iTunesDB) no longer requires the DB to be openable. New `'database'`
value on `RepairRequirement`; `runRepair()` only opens the DB when the
repair declares it. Artwork + orphans repairs declare it; sysinfo
repairs don't.
- The readiness `SysInfoExtended:` status line distinguishes "not
present" from "present but unparseable" so users can see when the file
exists but won't parse (rather than thinking it's missing).

Bug 3 (wires-crossed failure-explanation text) was already fixed on main
by m-19's `buildCheckFailureDetails` switch; not duplicated here.

AC #6 (real-hardware) deferred to TASK-319.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
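The requirement-gated database open can be sketched as follows. Only the `'database'` value comes from the commit text; the second requirement value, the `Repair` interface, and this `runRepair` signature are made up for the example.

```typescript
// 'writable-device' is a made-up second value so the union has more than one member.
type RepairRequirement = 'database' | 'writable-device';

interface Repair {
  id: string;
  requires: RepairRequirement[];
  run(ctx: { db?: object }): string;
}

/** Opens the database only when the repair declares that it needs it. */
function runRepair(repair: Repair, openDb: () => object): string {
  const ctx: { db?: object } = repair.requires.indexOf('database') !== -1 ? { db: openDb() } : {};
  return repair.run(ctx);
}
```

With this gating, a device whose iTunesDB cannot be opened can still have its SysInfoExtended repaired.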
…k id

The database-health issues loop in `runDoctor` unconditionally pushed
the artwork-out-of-sync text into every failing check's details, so a
failing `sysinfo-consistency` check would surface "The artwork database
is out of sync with the thumbnail files. Affected tracks display wrong
or missing artwork on the iPod." (wires crossed).

Route by check id:
- `artwork-rebuild`: keep the ithmb stats + artwork copy
- `sysinfo-consistency`: new copy referencing the stale on-disk file +
the `--repair sysinfo-consistency` command
- Other check ids: no detail text from this loop (the check's own
`summary` already carries the message)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `packages/podkit-core/src/docs-urls.ts` exporting:
- `DOCS_BASE_URL = 'https://jvgomg.github.io/podkit'` (current Starlight
host on GitHub Pages)
- `docsUrl(slug)` helper for ad-hoc URL construction
- `DOCS_URLS` registry of canonical pages (supportedDevices,
linuxFilesystems, troubleshooting, artworkRepair, macosMounting,
soundCheck, userGuideConfiguration, cleanArtists)

Migrates every existing literal docs URL in core + cli to import from
the registry. Replaces the prior forward-looking `docs.podkit.app` host
(used by TASK-317.11/.12 work) with the live `jvgomg.github.io/podkit`
host so refusal messages and troubleshooting pointers resolve today. A
single host or path layout change now lives in one file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
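A minimal sketch of what such a module might look like. `DOCS_BASE_URL` is quoted from the commit; the slug values and the leading-slash handling are placeholders, since the real page paths are not listed here.

```typescript
const DOCS_BASE_URL = 'https://jvgomg.github.io/podkit';

/** Join a slug onto the base URL, tolerating a leading slash. */
function docsUrl(slug: string): string {
  return `${DOCS_BASE_URL}/${slug.replace(/^\/+/, '')}`;
}

// Placeholder slugs: the real registry's paths are not shown in this PR.
const DOCS_URLS = {
  linuxFilesystems: docsUrl('devices/linux-filesystems'),
  troubleshooting: docsUrl('troubleshooting'),
} as const;
```

Callers import `DOCS_URLS.linuxFilesystems` instead of embedding a literal, so a host move touches one constant.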
…emediation

Default firmware-inquiry failure output now names every transport tried
(USB, SCSI) with each transport's reason on its own line, includes a
remediation hint (podkit doctor --repair udev-rule for EACCES on
/dev/sg* or /dev/bus/usb/...), and appends a (re-run with -vv for more
detail) footer when verbose is not set. -vv adds libusb/ioctl detail and
drops the footer.

The linka EACCES repro now renders both transports' permission-denied
paths instead of the previous one-line "Could not read device identity
from USB" black box.

Confirmed open question: the orchestrator's USB-then-SCSI plan already
falls through to SCSI on any USB transport-layer throw (including
EACCES); no plan-selection bug to fix. Added an orchestrator unit test
to guard against any future regression that special-cases EACCES on USB
and skips the planned SCSI fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The podkit udev rule previously only matched
`SUBSYSTEM=="scsi_generic"`, granting plugdev access to `/dev/sg*` for
SCSI VPD INQUIRY. It left `/dev/bus/usb/<bus>/<dev>` at the kernel
default `0664 root:root`, so libusb-based firmware inquiry hit EACCES
from SSH sessions, headless boxes, Docker containers, and CI:
systemd-logind's `uaccess` ACL grants those nodes to active console
seats only.

Reproduced on linka 2026-05-09: even with the SCSI rule installed and
james in plugdev, the USB-inquiry half of `doctor --repair
sysinfo-extended` failed with no detail (orchestrator messaging fixed
separately in TASK-317.14 / eed4126).

Extends the rule with a second clause:

    SUBSYSTEM=="usb", ATTR{idVendor}=="05ac", MODE="0660", GROUP="plugdev", TAG+="uaccess"

Attribute case matters: ATTR{} (singular) on the USB device's own
attribute, ATTRS{} (plural) on the SCSI scope because scsi_generic has
to walk up to the parent USB device.

Renames the rule file `91-podkit-ipod-scsi.rules` →
`91-podkit-ipod.rules` (it covers more than SCSI now). The repair
installs the new filename and issues `sudo rm -f` for every legacy path
in a new `LEGACY_TARGET_PATHS` constant, so users upgrading from an
earlier podkit don't end up with two rule files loaded by udev. Cleanup
runs only AFTER the new rule is in place: if `sudo cp` fails, the old
rule stays untouched.

In-source `UDEV_RULE_CONTENT` and the shipped
`packages/podkit-cli/share/91-podkit-ipod.rules` are now byte-identical;
a new test reads the share file and asserts string equality so they
can't drift.

Tests:
- Rule-content shape tests assert both SCSI and USB clauses, the ATTR vs
ATTRS distinction, exactly two `idVendor` matches.
- New share-file equality test (covers AC #6 snapshot semantics).
- `runUdevRuleInstall` tests for legacy cleanup: rm -f per legacy path,
ordering after sudo cp, atomicity on cp failure, dry-run summary
mentions the legacy paths.
- 44/44 tests pass for the udev-rule module; 2701 unit tests + 69
integration tests pass for @podkit/core + podkit.

Hardware verification (linka replug, real sudo, nano 3G + nano 4G)
deferred to TASK-319 per task scope; the CLI flow is fully tested with
mocked filesystem and executor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eadable

Replaces the legacy synthetic-UUID fallback (if any survives) with a
clean refusal. Without a real UUID, podkit can't identify the iPod
across replug cycles, so it is better to surface the problem at add-time
than break downstream commands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New doctor check `sysinfo-modelnum-mismatch` surfaces stale/edited classic
SysInfo where ModelNumStr disagrees with the firmware-derived identity. The
identity cascade in `resolveIpodModel` silently picks ModelNumStr today,
so podkit would misidentify TERAPOD (iPod 5G with iFlash mod) as video_5g
when the firmware-stamped serial 9C642MEFV9M → V9M → A446 puts it on
video_5_5g. The existing `sysinfo-consistency` check compares ModelNumStr
vs USB (both 5G — agreed) so no signal fires; this new check compares
classic ModelNumStr vs the SysInfoExtended-derived serial suffix.
The check is `warn` (not `fail`): the device still works, it's just
misidentified. Skips silently when the on-disk ModelNumStr is the only
identity available (mini 2G S4G regression target) or when firmware truth
is unobtainable.
Repair `--repair sysinfo-modelnum-mismatch` rewrites the on-disk
ModelNumStr line in place from the firmware-derived variant, after
backing up the original to SysInfo.podkit-backup. Other lines preserved
verbatim to protect uncatalogued keys.
Firmware-truth cascade: SysInfoExtended.SerialNumber first (richest;
firmware-stamped; gives variant via suffix lookup), then liveIdentity.model
(USB-derived; generation only). USB-only truth can detect the mismatch but
the repair refuses to write (no model number to project back).
Injection seams (SysInfoFsReader + SieReader) keep tests off the real
filesystem without `mock.module('@podkit/ipod-firmware', ...)`, which
would leak across Bun test files and break unrelated readiness suites.
Hardware verification of the TERAPOD before/after flow and the 5-device
regression sweep deferred to TASK-319 per the task spec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
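The in-place rewrite that leaves every other line verbatim can be sketched as below. It assumes the classic SysInfo file is a list of `Key: value` lines and that the function is named `rewriteModelNumStr`; both are assumptions, and the backup to SysInfo.podkit-backup is left out of the sketch.

```typescript
/**
 * Replace only the value on the ModelNumStr line. Every other line, including
 * unknown keys and any trailing \r, is returned unchanged.
 */
function rewriteModelNumStr(sysInfo: string, modelNum: string): { text: string; changed: boolean } {
  let changed = false;
  const text = sysInfo
    .split('\n')
    .map((line) => {
      const m = /^(ModelNumStr:\s*)(.*?)(\r?)$/.exec(line);
      if (!m || m[2] === modelNum) return line; // not the target line, or already correct
      changed = true;
      return `${m[1]}${modelNum}${m[3]}`;
    })
    .join('\n');
  return { text, changed };
}
```

Returning `changed` lets a caller skip both the backup and the write when the file already agrees with the firmware-derived value.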
Bridge `makeUnsupportedReasonFromAssessment` in `@podkit/core` is the
single source of truth: every device command imports it instead of
re-deriving wording. No command leaks `libgpod` into user-facing copy.

Behaviour changes:
- `device add` on an unsupported device now warns and offers "Add
anyway? [y/N]" (--yes flips default). Confirmed devices land in config
with `unsupported: true`.
- `device add` consults USB classification when disk scan finds nothing
so iOS devices surface the canonical unsupported message instead of the
generic "No iPod devices found".
- `device scan` shows the resolved model label (or `iOS device` for
unknown iOS-range PIDs) instead of `Unknown iPod`.
- `sync --dry-run` refuses cleanly on unsupported generations before
opening anything; no track plan is generated.
- `sync`'s `open-device.ts` composes the full identity cascade (SIE +
USB + libgpod fallback) so the libgpod-only "Could not identify" warning
is gone for SIE-present devices.
- `device info` renders the cascade `displayName`.
- `doctor` suppresses mutating repair suggestions on unsupported
readiness AND refuses explicit `--repair` invocations against
unsupported devices (guards against corruption on hashAB nanos).

New `DeviceConfig.unsupported?: boolean` field records the user's
warn-allow confirmation. TOML round-trips through loader + writer.

AC #11 (hardware) deferred to TASK-319.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before: iPods got a clean three-section structure (System / Device
Readiness / Database Health) but mass-storage devices (Echo Mini)
collapsed every check into one "Device Health" bucket AND ran (then
mis-categorised) three system-scope checks plus iPod-only Firmware
Inquiry Methods.
After: a single unified renderer with the same section ordering on
every device type. Empty sections are omitted, so an Echo Mini with
no readiness-category checks just shows System then Database Health.
Architecture: kept it additive (option B from the spec). New
`category?: 'readiness' | 'database'` field on `DiagnosticCheck`
discriminates device-scope checks into the right subsection without
breaking the existing `scope` union (which would have touched every
check). Forwarded through `DiagnosticReport.checks` and the JSON
`DoctorCheckOutput` envelope so consumers can re-render the same
grouping.
Renderer: extracted `printGroupedChecks(out, checks)` in doctor.ts
that the mass-storage path now calls directly. iPod path keeps its
readiness-stage pipeline for the Device Readiness section but uses
the same scope/category filters for System + Database Health, and
skips device+readiness checks in Database Health to avoid double
rendering once a mass-storage readiness check is added.
iPod-only system check: `inquiry-methods` is now
`applicableTo: ['ipod']`. The check probes SCSI/USB transports
specific to iPod firmware inquiry, so surfacing it on an Echo Mini
under "System" misleads users into thinking iPodDriver.kext matters
for their device. The existing applicableTo filter in the registry
already handles the routing.
Per-check categorisation (every device-scope check tagged
`category: 'database'`):
- artwork-rebuild, artwork-reset
- orphan-files, orphan-files-mass-storage
- sysinfo-extended, sysinfo-consistency, sysinfo-modelnum-mismatch
Tests:
- scope-category-matrix.test.ts: per-check assertion that scope +
category + applicableTo are declared correctly. Single
expectation table is the source of truth for which section each
check lands in; new checks add a row here.
- doctor-grouped-render.test.ts: drives `printGroupedChecks` with
synthetic check fixtures to pin section ordering, empty-section
omission, repairOnly skipping, the legacy-no-category fallback,
and the Echo Mini scenario.
- inquiry-methods.test.ts: pinned applicableTo to ['ipod'].
Quality gates: lint 0 errors, build OK, 2762 unit tests pass, 69
integration tests pass.
AC #1-#8 covered; AC #9 (real-hardware verification) deferred to
TASK-319 per task spec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Status + AC checks for the m-18 hygiene cluster work landed this session. AC #N (real-hardware) on each task stays open — tracked under TASK-319. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TASK-317.08 introduced `category: 'readiness' | 'database'` as an additive
second field next to `scope: 'system' | 'device'`, with a renderer fallback
that defaulted device-scope checks without a category to Database Health.
That was the deferred Approach B — additive and unenforced. Every new
device-scope check had to remember to tag itself, and a forgotten tag
silently rendered in the wrong section.
This commit lands Approach A: a single required scope discriminator on
every check, with no defaulting and no fallback. `DiagnosticCheck.scope`
is now `'system' | 'device-readiness' | 'database-health'` and is no
longer optional — the compiler rejects any check that omits it. The
renderer branches on `scope` directly with three buckets and prints
sections in fixed order (System → Device Readiness → Database Health),
omitting empty ones.
The user-facing CLI surface is unchanged. `--scope system | device | all`
still accepts the same three values; `device` simply expands internally
to the two device-side scopes (`device-readiness` + `database-health`)
before being forwarded to `core.runDiagnostics({ scopes })`. `--no-system`
keeps its existing meaning of "skip system checks; run everything else".
JSON shape change: `DoctorCheckOutput` and `DiagnosticReport.checks[]`
no longer carry a `category` field, and the `scope` values are the new
3-way union. The additive field only landed in TASK-317.08 (commit
78b0c71), so there are no external consumers depending on the prior
shape — no migration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
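A minimal sketch of the three-way scope discriminator and the `--scope device` expansion described in the commit above. Only the three scope values, the three CLI values, and the section order come from the commit message; `expandCliScope`, `groupChecks`, and the trimmed `DiagnosticCheck` shape are hypothetical names used for illustration, not the actual `@podkit/core` API.

```typescript
// Scope values as named in the commit; everything else here is illustrative.
type CheckScope = 'system' | 'device-readiness' | 'database-health';
type CliScope = 'system' | 'device' | 'all';

// Hypothetical, trimmed-down check shape. `scope` is required, so a check
// that omits it fails to compile.
interface DiagnosticCheck {
  id: string;
  scope: CheckScope;
}

// Fixed section order: System, then Device Readiness, then Database Health.
const SECTION_ORDER: readonly CheckScope[] = ['system', 'device-readiness', 'database-health'];

const SECTION_TITLES: Record<CheckScope, string> = {
  system: 'System',
  'device-readiness': 'Device Readiness',
  'database-health': 'Database Health',
};

// `device` expands to the two device-side scopes; the CLI surface stays 3-valued.
function expandCliScope(scope: CliScope): CheckScope[] {
  switch (scope) {
    case 'system':
      return ['system'];
    case 'device':
      return ['device-readiness', 'database-health'];
    case 'all':
      return [...SECTION_ORDER];
  }
}

// Bucket checks by scope in fixed order and omit empty sections.
function groupChecks(checks: DiagnosticCheck[]): Array<{ title: string; ids: string[] }> {
  return SECTION_ORDER.map((scope) => ({
    title: SECTION_TITLES[scope],
    ids: checks.filter((check) => check.scope === scope).map((check) => check.id),
  })).filter((section) => section.ids.length > 0);
}
```

With no readiness-scope checks in the input, `groupChecks` yields only the System and Database Health sections, which matches the Echo Mini scenario described earlier in this PR.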
…econcile, DOCS_URLS) Demo build mocks `@podkit/core` to avoid loading the real bundle. New exports added by recent m-18 integration work need corresponding stubs so the demo CLI build typechecks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ReadinessUnsupportedReason moves to @podkit/device-types so resolveIpodModel can return it directly on IpodModel. Removes the bridge functions in @podkit/core. Single source of truth — every consumer reads model.unsupportedReason or assessment.model?.unsupportedReason without a bridge import. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces flat boolean with {kind, confirmedAt}. The kind captures
which unsupported-reason class triggered the warn-allow prompt
(ios-device, unsupported-device, etc.) so a future reader can tell
why the device was confirmed without re-running assessment. The
confirmedAt ISO timestamp records when.
Preserves truthy-check semantics (sync.ts gates on truthiness — an
object is truthy). Silently coerces legacy boolean shape (unsupported
= true) to {kind:'unsupported-device', confirmedAt:<epoch>} on load.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
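The legacy coercion described above could look roughly like the sketch below. `normalizeUnsupported` is a hypothetical name, `kind` is typed as a plain string because the commit lists the kinds only partially, and the reading of `<epoch>` as the Unix epoch rendered as an ISO string is an assumption.

```typescript
// Rich shape from the commit: which reason class was confirmed, and when.
interface UnsupportedConfirmation {
  kind: string; // e.g. 'ios-device' or 'unsupported-device'
  confirmedAt: string; // ISO 8601 timestamp
}

// Assumption: "<epoch>" means the Unix epoch as an ISO string.
const EPOCH_ISO = new Date(0).toISOString();

// Hypothetical loader helper: accepts the legacy boolean or the rich object.
function normalizeUnsupported(raw: unknown): UnsupportedConfirmation | undefined {
  // Legacy `unsupported = true` is coerced silently.
  if (raw === true) {
    return { kind: 'unsupported-device', confirmedAt: EPOCH_ISO };
  }
  if (typeof raw === 'object' && raw !== null) {
    const { kind, confirmedAt } = raw as Partial<UnsupportedConfirmation>;
    if (typeof kind === 'string' && typeof confirmedAt === 'string') {
      return { kind, confirmedAt };
    }
  }
  // `false`, missing, or malformed values mean "not confirmed".
  return undefined;
}
```

Because the result is either an object or `undefined`, a caller that gates on truthiness keeps working unchanged.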
…-ups TASK-285 / 287 / 288 superseded by P0–P4 split and the TASK-317 hygiene cluster — closed with explanatory final summaries. New tasks: - TASK-341 (m-19): Linux VM test coverage matrix for every TASK-317 hygiene shipped behaviour — persona-driven, hardware-deferred. - TASK-342 (m-18): macOS-specific regression coverage (HFS+ stays supported, system_profiler bsd_name partition-level case, etc.). - TASK-343 (m-18): tech-debt sweep — three other shapes carrying bare-string notSupportedReason, docs-live cherry-pick gap, test-style + mocking inconsistency, stale worktrees, DOCS_URLS trailing slash drift, worktree-then-integrate waste, backlog state churn, pre-existing lint warnings, large CLI command files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rget TASK-317.15's volumeUuid refusal broke e2e tests that drive `device add --path` against the tmpdir-backed dummy iPod target (no real filesystem, no real UUID). Add a test-only escape hatch: - `synthesizeTestVolumeUuid(path)` in `device/add.ts` returns a deterministic `test-<slug>` UUID when `PODKIT_TEST_SYNTHETIC_VOLUME_UUID=1` is set in the env; both `--path` and scan-found refusal sites consult it before throwing. - The e2e CLI runner (`packages/e2e-tests/src/helpers/cli-runner.ts`) sets the env var unconditionally so every dummy-target test gets the hatch. Real users never set this variable; production refusal is unchanged. All 27 e2e workflows now pass against the dummy target. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
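A rough sketch of the escape hatch described above. The real `synthesizeTestVolumeUuid(path)` takes only the path; this version takes the environment as a second parameter so the example stays self-contained, and the slug rule (lowercase, non-alphanumerics collapsed to `-`) is an assumption rather than the shipped behaviour.

```typescript
const HATCH_ENV = 'PODKIT_TEST_SYNTHETIC_VOLUME_UUID';

// Assumed slug rule; the commit only states that the UUID is `test-<slug>`.
function slugify(path: string): string {
  return path
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Returns a deterministic UUID only when the hatch is switched on; otherwise
// returns undefined so the caller falls through to the production refusal.
function synthesizeTestVolumeUuid(
  path: string,
  env: Record<string, string | undefined>,
): string | undefined {
  if (env[HATCH_ENV] !== '1') return undefined;
  return `test-${slugify(path)}`;
}
```

The same path always maps to the same UUID, which keeps config files written by repeated test runs stable.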
…st flows Captures the design problem behind the env-var hatch shipped in 3a332be: real users + e2e tests both need a way to skip the platform discovery routine when they already know the device. Includes current workaround (PODKIT_TEST_SYNTHETIC_VOLUME_UUID), proposed --no-scan API, open design questions, and migration plan to remove the env-var once the flag lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`device add`'s warn-allow flow (TASK-317.03) was unconditionally
attempting `ensureSysInfoExtendedAndReassess` after the user accepted
"Add anyway? [y/N]". Writing SysInfoExtended to a device we've just
recorded as `unsupported = { kind, confirmedAt }` is wasted work — and
fails outright against tmpdir-backed test paths (`EACCES: mkdir
/Volumes/TOUCH`) since the offered firmware-inquiry write is the only
step in the flow that actually requires a writable filesystem.
Gate `offerFirmwareInquiry` on `!recordUnsupported` in both the
explicit-`--path` branch and the scan-found branch. The skipped write
matches the no-database-init behaviour already in place for unsupported
devices.
Fixes the failing unit test
`runDeviceAdd: nano 2G slick-flow > persists unsupported rich shape
when the user accepts the warn-allow prompt (TASK-317.03)`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three were legitimate console.warn / console.log calls that lacked
eslint-disable comments (ipod-adapter's best-effort tag-write warnings;
device-testing's no-fs-at-load probe script). Adds the disable
directives with explanatory comments.
One was a real fix: mass-storage-tag-writer.ts's `new Array(n)` swapped
for `Array.from({ length: n })` per oxlint's unicorn/no-new-array
recommendation. Behaviour-equivalent.
oxlint now reports 0 warnings, 0 errors across 783 files. TASK-343
item 8 (pre-existing lint warnings) closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Starlight serves docs pages with `trailingSlash: 'always'`, so URLs
without a trailing slash were redirected at request time. `docsUrl()`
now appends `/` (unless already present), so every `DOCS_URLS` entry
matches the served URL exactly.
- Drop the `${...}/` appends previously needed in `tips.ts`
- Bring the few remaining inlined docs URLs (devices-ipod, device/add
fallback prompt, demo mock-core, exit-code test fixture) into line
- New `docs-urls.test.ts` asserts every entry ends with `/` so future
drift is caught by CI
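A minimal sketch of the trailing-slash normalisation and the drift check. The base URL and the two `DOCS_URLS` entries are placeholders invented for the example; only the "append `/` unless already present" rule and the "every entry ends with `/`" assertion come from the commit message.

```typescript
// Placeholder origin: the real docs host is not named in the commit.
const DOCS_BASE = 'https://docs.example.invalid';

// Join a docs path onto the base and guarantee exactly one trailing slash.
function docsUrl(path: string): string {
  const joined = `${DOCS_BASE}/${path.replace(/^\/+/, '')}`;
  return joined.endsWith('/') ? joined : `${joined}/`;
}

// Hypothetical entries, one written without and one with a trailing slash.
const DOCS_URLS = {
  deviceAdd: docsUrl('devices/add'),
  doctor: docsUrl('/reference/doctor/'),
} as const;

// The check a test like docs-urls.test.ts would make: list any entry that
// would be redirected by `trailingSlash: 'always'`.
const offenders = Object.entries(DOCS_URLS).filter(([, url]) => !url.endsWith('/'));
```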
…dReason
`IpodModel.unsupportedReason` already carries the structured `ReadinessUnsupportedReason` payload (kind + headline + details + docsUrl). Three sibling shapes still carried the legacy bare-string `notSupportedReason` field, forcing every consumer to re-derive the `kind` discriminator and re-attach the docs URL:
- `IpodIdentity` (@podkit/device-types)
- `IpodClassification` (@podkit/devices-ipod)
- `DeviceScanDeviceEntry` (podkit-cli JSON envelope)
All three now expose `unsupportedReason?: ReadinessUnsupportedReason`. The shared producer lives in `@podkit/devices-ipod` as `lookupUnsupportedReadinessReason(productId)`, which combines the existing PID-table lookup + iOS-range fallback and picks the `kind` discriminator (`ios-device` for PIDs in 0x1290–0x12af, `unsupported-device` otherwise). The CLI's local `makeIpodUnsupportedReason` helper goes away.
The JSON envelope rename is a user-facing breaking change covered by the `.changeset/device-scan-unsupported-reason.md` minor bump: consumers reading `device.notSupportedReason` now read `device.unsupportedReason.headline` for the same single-line message.
Also files TASK-345 for the doctor.ts / device/add.ts split (TASK-343 item 9).
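The producer described above might be sketched as follows. The field names and the PID range come from the commit message; the table row (PID `0x12ff`), the headline and details wording, and the docs URL are placeholders, not rows or copy from the real `@podkit/devices-ipod` tables.

```typescript
type UnsupportedKind = 'ios-device' | 'unsupported-device';

interface ReadinessUnsupportedReason {
  kind: UnsupportedKind;
  headline: string;
  details: string;
  docsUrl: string;
}

// Placeholder PID table: this entry is illustrative, not a real table row.
const UNSUPPORTED_PID_LABELS = new Map<number, string>([[0x12ff, 'Example unsupported iPod']]);

// iOS-range bounds as stated in the commit message.
const IOS_PID_MIN = 0x1290;
const IOS_PID_MAX = 0x12af;

function lookupUnsupportedReadinessReason(
  productId: number,
): ReadinessUnsupportedReason | undefined {
  const inIosRange = productId >= IOS_PID_MIN && productId <= IOS_PID_MAX;
  // Table lookup first, then the iOS-range fallback label.
  const label = UNSUPPORTED_PID_LABELS.get(productId) ?? (inIosRange ? 'iOS device' : undefined);
  if (label === undefined) return undefined; // not known to be unsupported
  return {
    kind: inIosRange ? 'ios-device' : 'unsupported-device',
    headline: `${label} is not supported`, // wording is illustrative
    details: 'See the documentation for the list of supported devices.',
    docsUrl: 'https://docs.example.invalid/devices/supported/', // placeholder URL
  };
}
```

A consumer that used to read `device.notSupportedReason` would read `lookupUnsupportedReadinessReason(pid)?.headline` in this sketch.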
Add an explicit Mocking, Assertion-style, and Canonical-fake-builders section to agents/testing.md. The CLI deps-injection seam was already documented in detail, but the rule that `mock.module()` is restricted — because of Bun's process-global registry leakage observed during the m-18 readiness work — wasn't written down anywhere. Lists the five remaining `mock.module()` call sites that are being migrated to dependency injection (TASK-343 item 3, part 2).
`mock.module(specifier, factory)` mutates Bun's process-global module
registry. A `mock.module('@podkit/ipod-firmware', …)` in one test file
was observed leaking into unrelated readiness tests during the m-18
hygiene cluster. Existing CLI runners already use dependency-injected
`Deps` shapes (`agents/testing.md` §"The deps seam, in detail"); this
commit extends the same pattern down into the four library call sites
that still used `mock.module()`:
- `ipodProvider` → `createIpodProvider({ inquireFirmware })` factory.
`ipodProvider` keeps its default-wired export; tests construct their
own provider with a stubbed firmware-inquiry function.
- `runSysInfoExtendedRepair` → optional `SysInfoExtendedRepairDeps`
parameter for `ensureSysInfoExtended`, `resolveUsbDeviceFromPath`,
and `hasCompleteUsbFingerprint`. Both `sysinfo-extended.test.ts` and
`sysinfo-consistency-repair.test.ts` now invoke the runner directly
with injected fakes; the production check objects remain unchanged.
- `VideoHandler` → optional second constructor parameter
`VideoHandlerDeps` for `transcodeVideo`, `probeVideo`, `executor-fs`
(mkdir/stat/rm), and the iPod `video.ts` helpers.
- `DirectoryAdapter` → optional second constructor parameter
`DirectoryAdapterDeps` for `glob` and `music-metadata`'s
`parseFile`. The class's internal `parseFile` private method had to
be renamed out of the way (now `parseAudioFile`): the name collision
was hiding silent recursion until the test confirmed the stub was wired.
No production callers change behaviour: every dep defaults to the real
import. Test files no longer touch `mock.module()` at all.
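A minimal sketch of the default-wired factory pattern the commit describes, using `createIpodProvider({ inquireFirmware })` as the example. The `InquireFirmware` signature, the `identify` method, and the stand-in `realInquireFirmware` are assumptions made for illustration; the real provider surface is not shown in this PR.

```typescript
// Hypothetical dependency signature for the firmware-inquiry function.
type InquireFirmware = (devicePath: string) => Promise<{ modelNumber: string } | null>;

interface IpodProviderDeps {
  inquireFirmware: InquireFirmware;
}

// Stand-in for the real import from the firmware package.
const realInquireFirmware: InquireFirmware = async () => null;

// Every dep defaults to the real implementation, so production callers
// pass nothing and see no behaviour change.
function createIpodProvider(deps: Partial<IpodProviderDeps> = {}) {
  const inquireFirmware = deps.inquireFirmware ?? realInquireFirmware;
  return {
    async identify(devicePath: string): Promise<string> {
      const info = await inquireFirmware(devicePath);
      return info ? info.modelNumber : 'unknown';
    },
  };
}

// Default-wired export for production use.
const ipodProvider = createIpodProvider();

// A test builds its own provider with a stub instead of calling mock.module().
const stubbedProvider = createIpodProvider({
  inquireFirmware: async () => ({ modelNumber: 'STUB-MODEL' }),
});
```

The stub is scoped to the one provider instance, so nothing is written to a process-global module registry and nothing can leak into another test file.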
Add `buildEnumeratedUsbDevice(persona)` to `@podkit/device-testing` and migrate `device-scan-render.unit.test.ts` to derive its iPod and mass-storage USB-descriptor fixtures from the persona registry instead of hand-coding bare hex IDs. This is the first step in a longer migration: the persona registry is already the single source of truth for "what USB descriptor does this device present?", but unit tests up the stack have been re-encoding the same descriptors inline. Going through the builder keeps the fixtures in lockstep with the registry — if a persona is renamed, recaptured, or its USB IDs change, the unit tests pick up the change without a separate edit. Three personas now feed the renderer test: - `ipodVideo5gIflash1tb` (supported iPod, PID 0x1209) - `ipodTouch5gUnsupported` (unsupported iOS device, PID 0x12aa) - `echoMini` (mass-storage DAP, PID 0x3203) Adds `@podkit/device-testing` as a devDependency of `podkit-cli` so the persona registry is available to CLI unit tests. Re-exports the individual personas from the package's public entry so callers don't need to reach through the `personas` map by id. Subsequent migrations of `device-scan.unit.test.ts` and the rest of the ad-hoc inline fixtures are deferred — the builder is in place and the migration pattern is established.
Closes 10 tasks across the milestone; Tier-3 baseline goes from broken (4 fails) to 79 pass / 0 fail / 448 expect / 12 test files.
Foundation (TASK-346, TASK-348)
- Test VM loads sg kernel module + installs dosfstools at boot
- apply-state.sh installs podkit udev rule + 99-prefix sg-perms override (Apple-vendor /dev/sg* readable from SSH session)
- Daemon: Bun event-loop drain fix on mass-storage-only branch
- Daemon mass-storage gadget smoke test (gadget bind → /dev/sg + /dev/sd appear → FAT32 mountable → teardown clean)
- 3 starter personas (ipod-video-5g-iflash-1tb, ipod-nano-7g-space-gray, echo-mini) get FAT32 backing via mkfs.vfat --invariant in-VM
- waitForScsiGenericEnumeration polls after daemon start + dumps daemon journalctl on timeout
- personas-baseline.tier3.test.ts: --format json → --json (CLI flag drift)
Schemas (TASK-332, TASK-340)
- DevicePersona schema v2: USB descriptor hierarchy (configurations[] / interfaces[] / endpoints[] / stringDescriptors), partitionLayout.luns[] (echo-mini exposes dual LUN), nullable deviceSerial (Sony NW-* migrated from '' to null). All 17 personas migrated 1→2.
- PlatformDeviceInfo schema v2: nested sub-objects with discriminated mount-state. size/blockSizeBytes/filesystem/partitionLayout fold into storage; usbFingerprint → usb; isMounted/mountPoint becomes a discriminated union. 13 production files + 10 test fixtures migrated.
- Sidecar projection omits null deviceSerial so daemon's default '000000000001' takes effect.
Test coverage (TASK-309, TASK-310, TASK-311, TASK-341)
- doctor-device-types: check-set selection per device type + preset
- doctor-output-contract: JSON schema + human-text section structure (15 ACs pinned in 816-line file)
- discovery: identify() / discoverUsbIpods / resolveUsbDeviceFromPath / inquireFirmware permutations (T1 unit + T2 native + T3 VM)
- m-18 hygiene cluster: 6 Tier-3 test files covering volume-uuid defensive, doctor consistent sections, scope refactor, udev USB rule, unsupported cascade, discovery reconciliation
Personas (TASK-324)
- ipod-video-5g-corrupt-db: synthesised 512-byte truncated iTunesDB
- echo-mini-populated: 5 × 64-byte mocked tracks via initialContent (TASK-352 still owes the runner wiring; Tier-1 smoke test exercises parsers directly via exported byte arrays)
- Sony NW-A1000/A1200/A3000/HD5: readiness shape swept to canonical 'unsupported' discriminant
Bug fixes
- libgpod no longer leaks into user-facing unsupported headlines (devices-ipod identity / resolve / tables/unsupported)
- doctor --repair gate is now symmetric: mass-storage-only repair on an iPod device rejects with INCOMPATIBLE_DEVICE_TYPE (previously only iPod-only repair on mass-storage was gated)
Measurement (TASK-339)
- Tier-3 wall-clock: 124s. Tripwire fires at 90s spec; recommendation logged for snapshot-strategy follow-up (Options A/C in task notes).
Follow-ups created
- TASK-347 Capture ipod-classic-rockbox persona (HITL Rockbox install)
- TASK-349 Test VM: enable contrib + install hfsprogs (unblocks HFS+ refusal scenarios)
- TASK-350 Test VM: build + ship gpod-tool Linux binary (unblocks doctor repair + sysinfo modelnum mismatch scenarios)
- TASK-351 dummy-hcd-daemon: per-persona FFS mountpoint (unblocks dual-iPod tests)
- TASK-352 Wire initialContent seeding in backing-file synthesiser (unblocks populated / corrupt-db Tier-3 fixtures)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
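To illustrate the "isMounted/mountPoint becomes a discriminated union" change in PlatformDeviceInfo schema v2, here is a small sketch. The field names (`mounted`, `mountPoint`) and the helper `mountStateFromLegacy` are assumptions; the actual v2 field names are not shown in the commit message.

```typescript
// Hypothetical v1 shape: two loosely related fields that can disagree.
interface LegacyMountFields {
  isMounted: boolean;
  mountPoint?: string;
}

// Hypothetical v2 shape: a mount point exists only in the mounted branch.
type MountState = { mounted: true; mountPoint: string } | { mounted: false };

// Migration helper: a "mounted" record without a mount point is treated as
// unmounted rather than carried forward as an inconsistent state.
function mountStateFromLegacy(legacy: LegacyMountFields): MountState {
  if (legacy.isMounted && legacy.mountPoint !== undefined) {
    return { mounted: true, mountPoint: legacy.mountPoint };
  }
  return { mounted: false };
}

// Narrowing on the discriminant makes `mountPoint` available only when valid.
function describeMount(state: MountState): string {
  return state.mounted ? `mounted at ${state.mountPoint}` : 'not mounted';
}
```

The benefit of the union is that "mounted with no mount point" is no longer representable, so consumers need no defensive checks after narrowing.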
…troyGadget EPERM fix
TASK-352: wire `synthesis.initialContent` into `ensureBackingFile` via mtools (`mmd` + `mcopy`) between `mkfs.vfat --invariant` and the atomic mv. SOURCE_DATE_EPOCH keeps the seeded image byte-stable across runs. Up-front validation of `path` (no leading /, no .., ASCII-safe charset) and `sourceFixture` (no .., must stat as a regular file). Two-layer cleanup: build-script `rm -rf` on the success path, runner `finally` on the failure path. New tier3 suite verifies seeded files via loop-mount + sha256 and pins determinism by re-running synthesis and comparing the post-build sha.
TASK-351: switch the systemd template from hardcoded `/dev/ffs-podkit` + `podkit-test` to per-persona `--gadget-name podkit-%i --ffs-mount /dev/ffs-podkit-%i` so two `dummy-hcd-daemon@<id>.service` units run concurrently without colliding on either resource. Add `dummy_hcd num=4` modprobe.d config so multiple virtual UDCs exist. `attachUdc` walks configfs to skip already-claimed UDCs; `runFunctionFs` drains any leftover lazy-unmounted FFS instance before mounting fresh. New dual-daemon-lifecycle smoke test boots echo-mini + ipod-video-5g together, asserts both gadgets + extra /dev/sg* nodes appear, and verifies clean teardown.
TASK-353: fix EPERM on `destroyGadget` rmdir of `functions/mass_storage.0/lun.0`. The `usb_f_mass_storage` driver pins the implicit lun.0 to its parent function and rejects the direct rmdir; the parent's rmdir auto-removes lun.0. Patch removes lun.0 from the rmdir list and clears `lun.0/file` (releases backing-file open count) before the parent rmdir.
Verified Tier-3 GREEN: 14 pass / 0 fail / ~60s across personas-baseline + mass-storage-binding + backing-file-content + dual-daemon-lifecycle. Dual-daemon runs 3x back-to-back with no configfs / UDC leak.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
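The up-front `path` validation from TASK-352 could be sketched like this. The three rules (no leading `/`, no `..`, ASCII-safe charset) come from the commit message; the function name, the error strings, and the exact character set treated as "ASCII-safe" are assumptions.

```typescript
// Assumed definition of "ASCII-safe": letters, digits, dot, underscore,
// hyphen, and forward slash as the separator.
const SAFE_PATH = /^[A-Za-z0-9._\/-]+$/;

// Returns null when the path is acceptable, or a reason string when it is not.
function validateInitialContentPath(path: string): string | null {
  if (path.length === 0) return 'path is empty';
  if (path.startsWith('/')) return 'path must be relative (no leading /)';
  if (path.split('/').includes('..')) return 'path must not contain .. segments';
  if (!SAFE_PATH.test(path)) return 'path contains characters outside the ASCII-safe set';
  return null;
}
```

Rejecting bad paths before any image is built means a malformed persona fails fast, before `mmd` or `mcopy` ever see the value.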
Five Lima VMs renamed for namespace clarity (so they group together in `limactl ls` and don't collide with VMs from other projects) and to distinguish the device-integration harness from the cross-platform test suite VMs.
builder → podkit-linux-builder
linux-tests-debian → podkit-tests-debian-glibc
linux-tests-alpine → podkit-tests-alpine-musl
podkit-test-vm → podkit-device-harness
virtual-ipod → podkit-virtual-ipod
(also: abi-verify → podkit-abi-verify in docs)
Matching changes:
- Yaml files git-renamed to match VM instance names
- `LIMA_TEST_VM_NAME` const → `LIMA_DEVICE_HARNESS_VM_NAME`
- `PODKIT_TEST_VM_NAME` env var → `PODKIT_DEVICE_HARNESS_VM_NAME`
- Inside-VM paths: `/etc/modules-load.d/podkit-device-harness.conf`, `/etc/modprobe.d/podkit-device-harness-dummy-hcd.conf`, `/var/lib/podkit-device-harness/`, `/etc/udev/rules.d/99-podkit-device-harness-sg-override.rules`
- Tauri app constants (`VM_NAME`, error message)
- Builder VM script defaults (`BUILDER_VM_NAME:-podkit-linux-builder`)
- All docs (READMEs, ADR-016, agents/device-testing.md)
- mise.toml task bodies (task aliases stay short, e.g. `device-testing:builder:stop`)
Runner module filenames (`lima-test-vm*.ts`) and the exported `limaTestVmRunner` variable kept as-is: orthogonal to VM-name clarity, defer rename.
Verified end-to-end repeatability:
- All 5 old VMs destroyed
- Turbo cache + all dist/ wiped
- All 5 VMs recreated from renamed yaml
- Cold-cache build via `mise run device-testing:build-linux`: produced podkit-linux-arm64 + dummy-hcd-daemon-linux-arm64 + libgpod-node prebuild in 5m55s
- Tier-3 baseline against fresh build: 14 pass / 0 fail / 60s (personas-baseline + mass-storage-binding + backing-file-content + dual-daemon-lifecycle)
- Secondary VMs (tests-*, virtual-ipod) boot + shell verified
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… delete dead code
Restructure prelude: vocabulary cleanup + dead-code removal across
@podkit/device-testing. No package moves yet — Phase 2 does that.
Renames
- src/tier3/*.tier3.test.ts (13) → src/vm/*.e2e.test.ts
- src/tier3/tier3-runtime-setup.ts → src/vm/vm-runtime-setup.ts
- TIER3_*, resolveTier3*, resetTier3*, [tier-3] log prefix → VM_*, vm, [vm]
- turbo task @podkit/device-testing#test:tier3 → #test:vm
- Repo-wide JSDoc/comment sweep: "Tier 1/2/3" → "unit/host/VM" tests
Opt-in mechanism
- PODKIT_DEVTEST_RUN_TIER3 env gate removed
- bun run test:vm (new root script) → turbo run test:vm --filter @podkit/device-testing
- bunfig pathIgnorePatterns now excludes **/*.e2e.test.ts from default bun test
- resolveVmAvailability is purely Lima-probe; no env precheck
Deletions
- VM disk snapshot codepath (~770 lines): lima-test-vm-snapshots.{ts,test.ts}
+ base-healthy / base-<state> fast/slow paths in applyState. ADR-016
measured the codepath as never-reachable on Apple Silicon vz driver;
apply-state.sh-every-time was the de facto behaviour. YAGNI; re-add if
Lima vz adds snapshot support.
- Subprocess capture/replay framework: CapturingSubprocessRunner,
ReplaySubprocessRunner, hashSubprocessCall, createSubprocessRunner,
SubprocessFixture, subprocess.{md,test.ts}. Zero runtime consumers;
all references were JSDoc comments. SubprocessRunner interface +
defaultSubprocessRunner re-export retained for tests.
- STARTER_PERSONA_IDS alias map → flat array of literal captured ids
(spec-intent names mapped 1:1 to capture ids, no value added).
ADR updates
- ADR-016 §"Snapshot-based state layering" now records the historical
design + the deletion. Decision is "stay with apply-state.sh-every-time";
the snapshot scaffolding has been removed.
- ADR-017 example using ReplaySubprocessRunner rewritten to a hand-rolled
SubprocessRunner stub.
95 files changed, +774 / -2578.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restructure packages so packages/ holds published / published-adjacent code
and a new test-packages/ glob holds testing infrastructure.
Workspace glob
- root package.json: workspaces ["packages/*", "test-packages/*"]
Package moves
- packages/device-testing/ → test-packages/device-testing/
- packages/e2e-tests/ → test-packages/e2e-host-tests/ (renamed)
- packages/gpod-testing/ → test-packages/gpod-testing/
- packages/test-fixtures/ → test-packages/test-fixtures/
- packages/compatibility/ stays in packages/ (published-adjacent)
New workspace packages
- @podkit/device-testing-daemon (test-packages/device-testing-daemon/)
Promoted from tools/device-testing/dummy-hcd/. Replaces brittle relative
cross-package import (../../../packages/device-testing/...) with a real
workspace dep (@podkit/device-testing: workspace:*). bun build --compile
bundles the workspace symlink without ceremony — verified on linux-arm64.
- @podkit/e2e-vm-tests (test-packages/e2e-vm-tests/)
Holds the 11 podkit-feature VM tests (discovery, doctor-*, mass-storage-
binding, udev-usb-scope, etc.) moved out of device-testing/src/vm/.
device-testing/src/vm/ retains the 2 harness self-tests (personas-
baseline + backing-file-content) plus the shared helpers
(vm-runtime-setup, persona-fixture). Helpers now exported from the
device-testing public index.
tools/device-testing/ deletion
- tools/device-testing/lima/ → test-packages/device-testing/lima/
- tools/device-testing/scripts/apply-state.sh
→ test-packages/device-testing/scripts/
- tools/device-testing/dummy-hcd/ → test-packages/device-testing-daemon/
- tools/device-testing/ directory removed.
defaultApplyStateScriptPath() now resolves two levels up from the runner
module into scripts/apply-state.sh (was four levels up + tools/...). Same
script, package-local.
Top-level script updates
- test:e2e → name unchanged (filter renamed to @podkit/e2e-host-tests)
- test:e2e:real → ditto
- test:e2e:docker → ditto + cwd updated
- test:vm → drops --filter so both device-testing (harness
self-tests) and e2e-vm-tests run
Other path updates
- turbo.json: $TURBO_ROOT$/packages/X → test-packages/X for moved pkgs;
daemon build task moved from @podkit/device-testing#build:dummy-hcd-daemon
to @podkit/device-testing-daemon#build.
- oxlint.json: console-allow override path tracked the e2e-tests rename.
- mise.toml, prebuild scripts, lima yamls: path references updated.
- AGENTS.md monorepo structure + entry-points table reflect new layout.
- ADR-016 / ADR-017 path references updated; daemon README rewritten to
describe the workspace-dep flow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A persona was coupling raw input fixture data (USB descriptors, sysinfo XML, lsblk JSON, partitions) with assertion expectations (what podkit *should* produce in response). The split made by Phase 2 (harness lives in @podkit/device-testing, podkit-feature-via-VM suite lives in @podkit/e2e-vm-tests) makes the right home for expectations the test package, not the fixture.
Schema v3 (per ADR-017 §"Schema versioning": all entries migrated in one commit, no backwards-compat shims)
- DevicePersona drops: expectedCapabilities, expectedReadiness, expectedDoctorOutput
- schemaVersion bumped 2 → 3 on all 19 personas
- The unused DoctorOutput placeholder type + index re-export removed (it had no remaining callers after the lift)
Expectations relocated
- test-packages/e2e-vm-tests/src/expectations/<persona-id>.ts (19 files)
Each module exports expectedCapabilities, expectedReadiness, expectedDoctorOutput for one persona. Types imported from @podkit/device-types.
- test-packages/e2e-vm-tests/src/expectations/index.ts
PersonaExpectations registry keyed by persona id, mirroring the personas registry in @podkit/device-testing.
Tests relocated (4)
- corrupt-db.test.ts
- rejection-personas.test.ts
- malformed-sysinfo.test.ts
- echo-mini-populated.test.ts
…all moved from test-packages/device-testing/src/personas/ to test-packages/e2e-vm-tests/src/expectations/. They now import the persona from @podkit/device-testing and the expectations from the local expectation module.
Other touches
- e2e-vm-tests/package.json: added @podkit/core + @podkit/device-types as prod deps; @podkit/ipod-db + @podkit/ipod-firmware as dev deps to support the moved tests.
- Inline DevicePersona literals in 5 unit/integration/VM-runtime tests bumped to v3 + had the three fields removed.
- ADR-017: added §"Schema v3 — May 2026"; existing v3 example reshaped to drop the three fields; "Type-enforced expectations" bullet updated.
- agents/device-testing.md: table reflects the new layout.
SystemState schema is unaffected: it keeps its own expectedDoctorSystemOutput + expectedExitCode (these belong on the host-environment fixture, not on the device).
35 files changed, +217 / -665.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three host-path resolvers walked up from their module file and assumed the parent was `packages/`. After the move to `test-packages/` they still need to find files at the old locations, which means one extra `..` segment (or, for files that moved alongside the package, fewer segments + the new package-local layout).
Fixes
- e2e-host-tests/src/helpers/cli-runner.ts: getCliPath() resolved test-packages/podkit-cli/dist/main.js (the binary doesn't live there). Now walks four `..` to repo root, then into packages/podkit-cli/dist.
- device-testing/src/runners/lima-test-vm-backing-files.ts: personasRoot() pointed at packages/device-testing/src/personas; now points at test-packages/device-testing/src/personas.
- device-testing/src/runners/local-linux.ts: defaultApplyStateScriptPath() pointed at tools/device-testing/scripts/apply-state.sh, but the script moved into the package in Phase 2d. Mirror lima-test-vm-state.ts's package-local path math: two `..` from runners/ into scripts/.
These slipped through because `bun run test:unit` and `bun run test:integration` do not exercise the e2e-host-tests `test` script (which spawns the real podkit binary). `bun run test` at the repo root does, and that is what surfaced the regression.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VS Code's TypeScript service was matching files in test-packages/ and packages/ to the root tsconfig.json instead of the package-local tsconfigs, surfacing spurious "Cannot find module 'bun:test'" and "Cannot find name 'process'" errors in the editor. `bun run typecheck` was unaffected because tsc, invoked per-package via turbo, uses each package's own tsconfig. Root cause: the root tsconfig has no `include`/`files`/`references` fields, so TypeScript defaults to claiming every TS file in the tree except node_modules/dist. VS Code's project resolution can then pick the root config for a file that already has a more-specific package config — and the root has no `@types/bun` visibility (no root-level node_modules/@types/), so `process`, `bun:test`, etc. fail to resolve. Fix: `"files": []` on the root tsconfig. The config remains a base for package tsconfigs to extend (compilerOptions inheritance is unchanged), but it no longer claims any direct sources. VS Code now falls through to the correct package tsconfig for every file. After this lands, restart the TypeScript server in VS Code (Cmd+Shift+P → "TypeScript: Restart TS server") to drop the cached project graph. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without an explicit `types` field, the TypeScript compiler auto-loads every visible `@types/*` package — which works for tsc invoked from the package directory (the resolution finds @types/bun → bun-types via the hoisted .bun/ symlink). VS Code's TypeScript service was less consistent about the auto-include after the Phase 2 directory moves and reported spurious "Cannot find module 'bun:test'" for files in @podkit/device-testing and @podkit/e2e-vm-tests, even though tsc passed. Pinning `types: ["bun"]` on those two tsconfigs makes the type-include explicit. tsc behaviour is unchanged (it already auto-included @types/bun via the devDependency). VS Code can no longer choose a project context that omits the bun-test type declarations. device-testing-daemon already declares its own ambient `bun:test` shim in src/types.d.ts so it sets `types: []` deliberately — left alone. e2e-host-tests was not reported as broken in the editor; leaving it on auto-include to avoid churn. After this lands, restart the TypeScript server in VS Code (Cmd+Shift+P → "TypeScript: Restart TS server") for the new project graph to take effect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…test
The two-package fix in d89e0c4 only solved the symptom in @podkit/device-testing and @podkit/e2e-vm-tests. The same VS Code TS-server auto-include flakiness affects every package whose source or tests `import { describe, it, expect } from 'bun:test'`. Sweeping fix.
Pinned `types: ["bun"]` on 13 more tsconfigs:
packages/
device-types, devices-ipod, devices-mass-storage, ipod-db, ipod-firmware, ipod-web, libgpod-node, podkit-cli, podkit-core, podkit-daemon, virtual-ipod-server
test-packages/
e2e-host-tests, gpod-testing
ipod-web also lists `react` and `react-dom` alongside `bun` since its JSX components depend on `@types/react`/`@types/react-dom` (also listed as devDeps) and an explicit `types` array opts out of the default auto-include for every @types/* package.
Skipped:
- test-packages/device-testing-daemon: declares its own ambient `bun:test` shim in src/types.d.ts, so its `types: []` is deliberate.
- packages/docs-site: extends astro/tsconfigs/strict (Astro app, no bun:test).
- packages/demo, packages/podkit-docker, packages/ipod-avatar, packages/compatibility, test-packages/test-fixtures: no bun:test imports.
- packages/virtual-ipod-app: Tauri/React app, doesn't import bun:test and lacks @types/bun in devDeps; would break with this pin.
After this lands, restart the TypeScript server in VS Code (Cmd+Shift+P → "TypeScript: Restart TS server") for the new project graphs to take effect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
src/types.d.ts was a hand-rolled ambient declaration block for `bun:test`,
`process`, `Buffer`, etc. It existed because the dummy-hcd daemon used
to live outside `packages/*` as a non-workspace, so it could not resolve
`@types/bun` from the workspace's hoisted node_modules.
Phase 2a promoted the daemon to `@podkit/device-testing-daemon` — a real
workspace member with `@types/bun` already in devDependencies. The shim
is redundant.
- Delete src/types.d.ts.
- tsconfig.json: replace `types: []` with `types: ["bun"]`; drop the
explicit `"src/types.d.ts"` include entry and the now-redundant
`moduleResolution: "bundler"` / `skipLibCheck: true` overrides (both
already provided by the inherited root tsconfig).
Verified:
- bun run typecheck: 32/32 pass.
- bash test-packages/device-testing-daemon/scripts/build.sh: produces
dist/dummy-hcd-daemon-linux-arm64 cleanly.
- mcp__ide__getDiagnostics on src/main.ts: zero diagnostics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@types/bun is a thin re-exporter (its index.d.ts is just `/// <reference types="bun-types" />`). Until now every workspace relied on bun-types being transitively visible via Bun's hoisted node_modules/.bun/; no package had bun-types in its own devDependencies.

That works in practice but couples our type resolution to Bun's hoisting behaviour. If Bun ever changes how it hoists transitive deps, or someone tries to install with npm/pnpm, the `/// <reference>` directive can't find bun-types and bun:test imports fail to type-check.

Belt-and-braces fix: pin bun-types directly in every workspace that already lists @types/bun. Now node_modules/bun-types is materialised per package, the reference directive resolves locally, and the editor + tsc agree on the same physical d.ts file path.

18 package.jsons touched (sorted via manypkg fix to keep deps in alphabetical order). bun install → 0 changes (bun-types was already present in the hoisted cache).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
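The rule in this commit (every workspace that lists @types/bun should also list bun-types) lends itself to a mechanical check. The sketch below is an assumption about how such a check could look; it is not part of the PR. `Manifest`, `missingBunTypesPin`, the sample manifests and the `"*"` version strings are all hypothetical.

```typescript
// Minimal shape of the package.json fields this check needs.
interface Manifest {
  name: string;
  devDependencies?: Record<string, string>;
}

// Returns the names of workspaces that list @types/bun in devDependencies
// without also pinning bun-types directly.
function missingBunTypesPin(manifests: Manifest[]): string[] {
  return manifests
    .filter((manifest) => {
      const dev = manifest.devDependencies ?? {};
      return "@types/bun" in dev && !("bun-types" in dev);
    })
    .map((manifest) => manifest.name);
}

// Hypothetical sample data; the version strings are placeholders.
const sampleManifests: Manifest[] = [
  { name: "@podkit/device-testing", devDependencies: { "@types/bun": "*", "bun-types": "*" } },
  { name: "@podkit/e2e-vm-tests", devDependencies: { "@types/bun": "*" } },
  { name: "virtual-ipod-app", devDependencies: {} },
];

const unpinned = missingBunTypesPin(sampleManifests);
```

With this sample data only `@podkit/e2e-vm-tests` is reported, because it lists @types/bun but not bun-types; a workspace without @types/bun is ignored.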
…mode paths

User feedback: silently skipping VM tests when Lima isn't ready makes test reports lie. Replace the skip pattern with a preflight that exits fast with an actionable message before any test loads. While exposing the new gate, two pre-existing bugs surfaced; fixed in this commit.

Preflight + fail-fast
- New `test-packages/device-testing/scripts/preflight-vm.ts` (registered as the `podkit-vm-preflight` bin on @podkit/device-testing). Probes Lima instance metadata AND verifies SSH actually answers (`limactl shell podkit-device-harness -- /bin/true`); both must pass or the script exits 1 with a remediation hint.
- `test:vm` scripts in @podkit/device-testing and @podkit/e2e-vm-tests now invoke preflight first, then `bun test … --path-ignore-patterns=` (the bunfig `**/*.e2e.test.ts` ignore is cleared so VM tests actually run; previously the script matched only non-e2e files in src/).
- `describe.skipIf(!vmAvailable)` removed from all 13 VM test files. The outer suite-level wrapping is gone; only `groupPersonasByState`-style persona-payload filtering (a different concern: input quality, not VM reachability) remains.
- `resolveVmAvailability`, `resetVmSkipWarning`, `skipWarningEmitted` and the matching tests/exports removed from `vm-runtime-setup.ts`.

dist-mode path resolution
- Pre-existing bug exposed once tests stopped silently skipping: several runner helpers walked `..` segments from `import.meta.url` assuming the file lived at `src/runners/X.ts`. After `bun build` flattens the module graph into `dist/index.js`, the walk goes one level too far, e.g. `repoRoot()` resolved to the parent of the repo. Consumers like `@podkit/e2e-vm-tests` import via the dist bundle, so they hit the broken path; @podkit/device-testing's own tests run from src and silently dodged it.
- New shared helper `test-packages/device-testing/src/runners/paths.ts` anchors on the `/test-packages/device-testing/` substring inside `import.meta.url`, which is identical for src and dist layouts.
- Five runner functions migrated to the helper:
  - `lima-test-vm.ts:repoRoot()`
  - `lima-test-vm-state.ts:defaultApplyStateScriptPath()`
  - `local-linux.ts:defaultApplyStateScriptPath()`
  - `lima-test-vm-systemd.ts:resolveDefaultDummyHcdDaemonUnit()`
  - `lima-test-vm-backing-files.ts:personasRoot()`
- Unused `fileURLToPath` imports cleared in those five files.

Verified locally
- typecheck 32/32, lint 0/0, build 18/18, test:unit 32/32, test:integration 31/31.
- `bun run test:vm` with VM up → 24 + 67 VM tests pass end-to-end (full Lima device-harness path: persona sidecar, FunctionFS gadget, podkit binary inside the VM, doctor/discovery/mass-storage-binding etc).
- `bun run test:vm` with VM SSH refused → exits 1 with the remediation message; no test files even load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
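The difference between walking `..` segments and anchoring on the package substring can be sketched as follows. This is a guess at the shape of the technique, not the contents of `paths.ts`: `naiveRepoRoot`, `anchoredRepoRoot`, the four-level walk and the sample file URLs are hypothetical, and the sketch only handles POSIX-style `file://` URLs.

```typescript
const PACKAGE_MARKER = "/test-packages/device-testing/";

// Strip the file:// scheme; enough for POSIX paths in this sketch.
function urlToPath(moduleUrl: string): string {
  return moduleUrl.startsWith("file://")
    ? moduleUrl.slice("file://".length)
    : moduleUrl;
}

// Old approach (assumed): drop the file name, then walk a fixed number of
// directories up. Only correct when the module sits at src/runners/X.ts.
function naiveRepoRoot(moduleUrl: string, levelsUp = 4): string {
  const segments = urlToPath(moduleUrl).split("/");
  return segments.slice(0, -(1 + levelsUp)).join("/");
}

// Anchored approach: locate the package marker, which appears in both the
// src and the dist layout, and cut the path there.
function anchoredRepoRoot(moduleUrl: string, marker = PACKAGE_MARKER): string {
  const path = urlToPath(moduleUrl);
  const index = path.lastIndexOf(marker);
  if (index === -1) {
    throw new Error(`marker ${marker} not found in ${moduleUrl}`);
  }
  return path.slice(0, index);
}

// Hypothetical checkout location used for illustration.
const srcUrl =
  "file:///home/dev/podkit/test-packages/device-testing/src/runners/lima-test-vm.ts";
const distUrl =
  "file:///home/dev/podkit/test-packages/device-testing/dist/index.js";

const naiveFromSrc = naiveRepoRoot(srcUrl); // "/home/dev/podkit"
const naiveFromDist = naiveRepoRoot(distUrl); // "/home/dev" (one level too far)
const anchoredFromSrc = anchoredRepoRoot(srcUrl); // "/home/dev/podkit"
const anchoredFromDist = anchoredRepoRoot(distUrl); // "/home/dev/podkit"
```

The anchored variant gives the same answer for both layouts because it never depends on how deep the calling module sits inside the package.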
CI sanity green. Closing: work landed on main locally.
Sanity-check PR to surface CI feedback on the m-19 restructure. Do not merge — the work is already on main locally; this PR exists to exercise CI workflows against the new repo layout.
Restructure highlights
- test-packages/ test-packages/ workspace glob alongside packages/
- e2e-tests → e2e-host-tests; new e2e-vm-tests; new device-testing-daemon workspace package
Test plan
- test-packages/ layout
🤖 Generated with Claude Code