From 1e6848d35b3351468c43fb60063a7118e683c6a9 Mon Sep 17 00:00:00 2001 From: zackees Date: Sat, 20 Jun 2026 13:36:41 -0700 Subject: [PATCH] feat(usb): tier-1 fbuild-core::usb resolver + online-data branch wiring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add `usb-ids` crate as tier-1 bundled VID:PID -> {vendor, product} resolver in `fbuild_core::usb`. Tier-2 reads an optional JSON overlay installed by the daemon at startup. Tier-3 returns a synthetic `"Unknown vendor 0xVVVV"` placeholder so callers never see None. - Wire the resolver into the daemon's device enumeration so device descriptions, `/api/devices/list`, and `/api/devices/{port}/status` carry pretty vendor/product names. CLI `device list` / `device status` print `"vendor product (VVVV:PPPP)"`; deploy port selection logs the same canonical string at connect time. - Add `.github/workflows/nightly-usb-ids.yml` that refreshes the orphan `online-data` branch daily. The workflow YAML lives on `main` (required for `schedule` / `workflow_dispatch`); the merger script + data files live on `online-data` only. Fault-tolerant against any single source failure; refuses to write a too-small dataset; prunes history to 200. - Add `crates/fbuild-core/examples/dump_usb_ids.rs` as the tier-1 dump source for the nightly workflow (kept as an example so no new crate is introduced — see `ci/hooks/crate_guard.py`). - New `ci/hooks/crate_guard.py` PreToolUse hook blocks Edit/Write of any `Cargo.toml` outside the approved set, enforcing the monocrate policy in real time. Remove the per-edit lint hook from PostToolUse — it was triggering a full `clippy --all-targets` recompile on every save; the Stop hook still gates everything on session end. - Document the full design in `docs/online-data.md` and `tasks/todo.md`. 
Goal acceptance: fbuild-core: 9 new usb tests pass fbuild-daemon: 178/178 tests pass workspace `cargo check` clean --- .claude/settings.json | 15 +- .github/workflows/nightly-usb-ids.yml | 228 +++++++++++++++++ .gitignore | 7 + CLAUDE.md | 1 + Cargo.lock | 74 ++++++ Cargo.toml | 7 + ci/hooks/crate_guard.py | 168 +++++++++++++ crates/fbuild-cli/src/cli/device.rs | 33 ++- crates/fbuild-cli/src/daemon_client/README.md | 6 + crates/fbuild-cli/src/daemon_client/types.rs | 14 ++ crates/fbuild-core/Cargo.toml | 2 + crates/fbuild-core/examples/README.md | 5 + crates/fbuild-core/examples/dump_usb_ids.rs | 51 ++++ crates/fbuild-core/src/lib.rs | 1 + crates/fbuild-core/src/usb/README.md | 7 + crates/fbuild-core/src/usb/data.rs | 147 +++++++++++ crates/fbuild-core/src/usb/mod.rs | 31 +++ crates/fbuild-core/src/usb/resolver.rs | 159 ++++++++++++ crates/fbuild-daemon/src/device_manager.rs | 41 ++- crates/fbuild-daemon/src/handlers/devices.rs | 6 + .../src/handlers/operations/deploy_port.rs | 34 ++- crates/fbuild-daemon/src/models.rs | 14 ++ docs/online-data.md | 129 ++++++++++ tasks/todo.md | 238 ++++++++++++++++-- 24 files changed, 1390 insertions(+), 28 deletions(-) create mode 100644 .github/workflows/nightly-usb-ids.yml create mode 100644 ci/hooks/crate_guard.py create mode 100644 crates/fbuild-cli/src/daemon_client/README.md create mode 100644 crates/fbuild-core/examples/README.md create mode 100644 crates/fbuild-core/examples/dump_usb_ids.rs create mode 100644 crates/fbuild-core/src/usb/README.md create mode 100644 crates/fbuild-core/src/usb/data.rs create mode 100644 crates/fbuild-core/src/usb/mod.rs create mode 100644 crates/fbuild-core/src/usb/resolver.rs create mode 100644 docs/online-data.md diff --git a/.claude/settings.json b/.claude/settings.json index ab2889f8..67529533 100644 --- a/.claude/settings.json +++ b/.claude/settings.json @@ -21,17 +21,22 @@ "timeout": 5 } ] + }, + { + "matcher": "Edit|Write|NotebookEdit", + "hooks": [ + { + "type": "command", + 
"command": "cd \"$(git rev-parse --show-toplevel)\" && uv run --no-project --script ci/hooks/crate_guard.py", + "timeout": 5 + } + ] } ], "PostToolUse": [ { "matcher": "Edit|Write", "hooks": [ - { - "type": "command", - "command": "cd \"$(git rev-parse --show-toplevel)\" && uv run --no-project --script ci/hooks/lint.py", - "timeout": 120 - }, { "type": "command", "command": "cd \"$(git rev-parse --show-toplevel)\" && uv run --no-project --script ci/hooks/readme_guard.py", diff --git a/.github/workflows/nightly-usb-ids.yml b/.github/workflows/nightly-usb-ids.yml new file mode 100644 index 00000000..1e81eeb4 --- /dev/null +++ b/.github/workflows/nightly-usb-ids.yml @@ -0,0 +1,228 @@ +# Nightly refresh of the `online-data` branch's USB VID:PID database. +# +# The tooling (Python merger, README, data files) lives on the orphan +# `online-data` branch — NOT on `main`. This workflow file lives on `main` +# only because GitHub Actions requires `schedule` and `workflow_dispatch` +# triggers to be defined on the default branch. At runtime the job: +# +# 1. checks out `main` (default) so it can build the `dump_usb_ids` +# example from `crates/fbuild-core/examples/dump_usb_ids.rs`; +# 2. fetches + worktrees the `online-data` branch into a sibling dir so +# the merger script lives at `online-data/tools/merge_sources.py`; +# 3. dumps the bundled `usb-ids` Rust crate to JSON; +# 4. downloads several upstream `usb.ids` text mirrors (fault-tolerant — +# a single source failure does NOT abort the run); +# 5. runs the merger to produce sorted `usb-vid.json`, +# `usb-vid-conflicts.json`, and a future-forward `manifest.json`; +# 6. commits the resulting data files back to `online-data` if they +# actually changed, force-pushing only after history pruning. +# +# Fault tolerance summary: +# - Rust build failure → keep the existing committed data (no commit). 
+# - Any individual upstream fetch failure → workflow continues with the +# sources that succeeded; merger refuses to write if the union is +# implausibly small (< 1000 entries) and the existing data stays put. +# - History is pruned to the most recent 200 commits per the design. +# +# Manual trigger: Actions tab → "Nightly USB IDs refresh" → Run workflow. + +name: Nightly USB IDs refresh + +on: + schedule: + # 04:17 UTC daily — off-peak, avoids the top-of-hour stampede on shared + # GitHub-hosted runners. + - cron: "17 4 * * *" + workflow_dispatch: + +permissions: + contents: write + +concurrency: + group: nightly-usb-ids + cancel-in-progress: false + +env: + ONLINE_BRANCH: online-data + ONLINE_WORKTREE: ${{ github.workspace }}/.online-data + BRANCH_BASE_URL: https://raw.githubusercontent.com/${{ github.repository }}/online-data + HISTORY_LIMIT: 200 + +jobs: + refresh: + name: Refresh online-data/usb-vid.json + runs-on: ubuntu-latest + steps: + - name: Checkout main (default branch) + uses: actions/checkout@v6 + with: + # We need the git history available so `git worktree add` against + # the `online-data` branch works, and so the history-prune step + # can rewrite commits without confusing a shallow clone. + fetch-depth: 0 + + - name: Configure git identity for the commit + run: | + git config user.name "fbuild-bot[nightly]" + git config user.email "fbuild-bot+nightly@users.noreply.github.com" + + - name: Fetch + worktree the online-data branch + # Creates a sibling directory containing the orphan branch. If the + # branch does not yet exist on the remote (very first run), we + # bootstrap an empty orphan worktree so the rest of the job works. 
+ run: | + set -euo pipefail + if git ls-remote --heads origin "${ONLINE_BRANCH}" | grep -q .; then + git fetch origin "${ONLINE_BRANCH}:${ONLINE_BRANCH}" + git worktree add "${ONLINE_WORKTREE}" "${ONLINE_BRANCH}" + else + echo "::warning::online-data branch missing on remote; bootstrapping empty orphan worktree" + git worktree add --detach "${ONLINE_WORKTREE}" + (cd "${ONLINE_WORKTREE}" && git checkout --orphan "${ONLINE_BRANCH}" && git rm -rf . 2>/dev/null || true) + fi + ls -la "${ONLINE_WORKTREE}" + + - uses: astral-sh/setup-uv@v3 + + - name: Setup soldr + uses: zackees/setup-soldr@v0.9.62 + with: + cache: true + build-cache: true + target-cache: true + prebuild-deps: none + linker: platform-default + + - name: Build dump_usb_ids example (tier-1 source) + id: build-dump + # Failure is tolerated: we still try to merge whatever upstream + # text sources arrived this run. The merger will fall back to the + # previously committed data if too few entries survive. + continue-on-error: true + run: | + set -euo pipefail + soldr cargo build --release --example dump_usb_ids -p fbuild-core + + - name: Run dump_usb_ids → /tmp/usb-ids-rs.json + id: run-dump + continue-on-error: true + if: steps.build-dump.outcome == 'success' + run: | + set -euo pipefail + ./target/release/examples/dump_usb_ids > /tmp/usb-ids-rs.json + wc -l /tmp/usb-ids-rs.json + + - name: Fetch linux-usb.org/usb.ids (tier-2) + id: fetch-linux-usb + continue-on-error: true + run: | + # HTTP only — the linux-usb.org HTTPS endpoint has a SAN mismatch. 
+ curl --silent --show-error --retry 5 --retry-delay 10 --fail \ + --max-time 90 \ + -o /tmp/linux-usb.txt \ + "http://www.linux-usb.org/usb.ids" + wc -l /tmp/linux-usb.txt + + - name: Fetch usbids/usbids GitHub mirror (tier-3) + id: fetch-github + continue-on-error: true + run: | + curl --silent --show-error --retry 5 --retry-delay 10 --fail \ + --max-time 90 \ + -o /tmp/usbids-github.txt \ + "https://raw.githubusercontent.com/usbids/usbids/master/usb.ids" + wc -l /tmp/usbids-github.txt + + - name: Run merger (only if at least one source loaded) + id: merge + continue-on-error: true + run: | + set -euo pipefail + args=() + if [ "${{ steps.run-dump.outcome }}" = "success" ] && [ -s /tmp/usb-ids-rs.json ]; then + args+=(--json "usb-ids-rs=/tmp/usb-ids-rs.json") + fi + if [ "${{ steps.fetch-linux-usb.outcome }}" = "success" ] && [ -s /tmp/linux-usb.txt ]; then + args+=(--txt "linux-usb.org=/tmp/linux-usb.txt") + fi + if [ "${{ steps.fetch-github.outcome }}" = "success" ] && [ -s /tmp/usbids-github.txt ]; then + args+=(--txt "usbids-github=/tmp/usbids-github.txt") + fi + if [ "${#args[@]}" -eq 0 ]; then + echo "::error::all sources failed; preserving previously committed data" + exit 1 + fi + uv run --no-project --script \ + "${ONLINE_WORKTREE}/tools/merge_sources.py" \ + "${args[@]}" \ + --out-dir "${ONLINE_WORKTREE}/data" \ + --branch-base-url "${BRANCH_BASE_URL}" + + - name: Refresh manifest.json (always — even if data unchanged) + # The manifest carries `generated_at`, so we always update it; that + # gives the branch a heartbeat for downstream consumers even on a + # no-op data day. If the merge step failed we deliberately skip + # this — we don't want to advertise stale `sources` listings. 
+ if: steps.merge.outcome == 'success' + run: | + if [ -f "${ONLINE_WORKTREE}/data/manifest.json" ]; then + mv "${ONLINE_WORKTREE}/data/manifest.json" "${ONLINE_WORKTREE}/manifest.json" + fi + + - name: Commit + push if data actually changed + id: commit + if: steps.merge.outcome == 'success' + working-directory: ${{ env.ONLINE_WORKTREE }} + run: | + set -euo pipefail + git add manifest.json data/ + if git diff --cached --quiet; then + echo "no changes to commit" + echo "changed=false" >> "$GITHUB_OUTPUT" + exit 0 + fi + ts="$(date -u +%Y-%m-%d)" + git commit -m "chore(usb-ids): nightly refresh ${ts}" + echo "changed=true" >> "$GITHUB_OUTPUT" + + - name: Prune history to last ${{ env.HISTORY_LIMIT }} commits + if: steps.commit.outputs.changed == 'true' + working-directory: ${{ env.ONLINE_WORKTREE }} + run: | + set -euo pipefail + total="$(git rev-list --count HEAD)" + echo "current history length: ${total}" + if [ "${total}" -le "${HISTORY_LIMIT}" ]; then + echo "no prune needed (<= ${HISTORY_LIMIT} commits)" + exit 0 + fi + # Find the commit `HISTORY_LIMIT-1` back from HEAD and make it + # a new root via a graft. Then `git filter-repo` (installed via + # pip just below) rewrites history accordingly. + target="$(git rev-list --max-count="${HISTORY_LIMIT}" HEAD | tail -n 1)" + git replace --graft "${target}" + pip install --quiet git-filter-repo + git filter-repo --force --refs HEAD + git for-each-ref --format='delete %(refname)' refs/replace/ | \ + git update-ref --stdin + + - name: Push + if: steps.commit.outputs.changed == 'true' + working-directory: ${{ env.ONLINE_WORKTREE }} + # Force-with-lease is needed only after a history-prune rewrite. + # In the no-prune path it behaves like an ordinary fast-forward push. 
+ run: | + git push --force-with-lease origin "${ONLINE_BRANCH}" + + - name: Summary + if: always() + run: | + echo "## Nightly USB IDs refresh" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo "| source | outcome |" >> "$GITHUB_STEP_SUMMARY" + echo "|---|---|" >> "$GITHUB_STEP_SUMMARY" + echo "| usb-ids-rs (dump example) | ${{ steps.run-dump.outcome }} |" >> "$GITHUB_STEP_SUMMARY" + echo "| linux-usb.org | ${{ steps.fetch-linux-usb.outcome }} |" >> "$GITHUB_STEP_SUMMARY" + echo "| usbids/usbids github | ${{ steps.fetch-github.outcome }} |" >> "$GITHUB_STEP_SUMMARY" + echo "| merge | ${{ steps.merge.outcome }} |" >> "$GITHUB_STEP_SUMMARY" + echo "| committed | ${{ steps.commit.outputs.changed || 'n/a' }} |" >> "$GITHUB_STEP_SUMMARY" diff --git a/.gitignore b/.gitignore index ddd38192..6ce6535a 100644 --- a/.gitignore +++ b/.gitignore @@ -76,6 +76,13 @@ vscode-fbuild/*.vsix /LOOP.md tasks/loop-runs/ +# Local staging dir for the orphan `online-data` branch's tooling. +# The actual files live on that branch only — see +# `.github/workflows/nightly-usb-ids.yml` and `docs/online-data.md`. +.online-data-staging/ +# `git worktree add` location used by the nightly workflow. 
+.online-data/ + # clud project settings !.clud/ !.clud/settings.json diff --git a/CLAUDE.md b/CLAUDE.md index 6758d2da..942a4ac1 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -73,6 +73,7 @@ All hooks are Python scripts in `ci/hooks/`, invoked via `uv run`: - **UserPromptSubmit**: `ci/hooks/board_context.py` detects board-related prompts and injects skill guidance (board lookup workflow, external source URLs, relevant commands) - **PreToolUse**: `ci/hooks/tool_guard.py` blocks bare Rust commands and any `uv run` invocation of `soldr`/`cargo` (must use a globally-installed `soldr` directly) and bare `python`/`pip` (must use `uv`) across supported shell tools, not just Bash +- **PreToolUse**: `ci/hooks/crate_guard.py` blocks Edit/Write of `Cargo.toml` at any path outside the approved set (workspace root + 14 member dirs + `dylints/ban_raw_subprocess`). Real-time monocrate enforcement; complements the batch CI check at `ci/check_workspace_crates.py`. Keep the allowlists in both files in sync - **PostToolUse**: `ci/hooks/lint.py` auto-formats + runs clippy on edited .rs files - **PostToolUse**: `ci/hooks/readme_guard.py` errors if directory lacks README.md - **SessionStart**: `ci/hooks/check-on-start.py` captures git fingerprint diff --git a/Cargo.lock b/Cargo.lock index f4bf66ef..38fdd78a 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -972,6 +972,7 @@ dependencies = [ "thiserror 2.0.18", "tokio", "tracing", + "usb-ids", ] [[package]] @@ -2026,6 +2027,12 @@ version = "0.3.17" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "6877bb514081ee2a7ff5ef9de3281f14a4dd4bceac4c09388074a6b5df8a139a" +[[package]] +name = "minimal-lexical" +version = "0.2.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "68354c5c6bd36d73ff3feceb05efa59b6acb7626617f4962be322a825e61f79a" + [[package]] name = "miniz_oxide" version = "0.8.9" @@ -2109,6 +2116,16 @@ dependencies = [ "memchr", ] +[[package]] +name = "nom" +version = "7.1.3" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "d273983c5a657a70a3e8f2a01329822f3b8c8172b73826411a55751e404a0a4a" +dependencies = [ + "memchr", + "minimal-lexical", +] + [[package]] name = "ntapi" version = "0.4.3" @@ -2243,6 +2260,44 @@ dependencies = [ "indexmap", ] +[[package]] +name = "phf" +version = "0.11.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1fd6780a80ae0c52cc120a26a1a42c1ae51b247a253e4e06113d23d2c2edd078" +dependencies = [ + "phf_shared", +] + +[[package]] +name = "phf_codegen" +version = "0.11.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "aef8048c789fa5e851558d709946d6d79a8ff88c0440c587967f8e94bfb1216a" +dependencies = [ + "phf_generator", + "phf_shared", +] + +[[package]] +name = "phf_generator" +version = "0.11.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3c80231409c20246a13fddb31776fb942c38553c51e871f8cbd687a4cfb5843d" +dependencies = [ + "phf_shared", + "rand 0.8.5", +] + +[[package]] +name = "phf_shared" +version = "0.11.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "67eabc2ef2a60eb7faa00097bd1ffdb5bd28e62bf39990626a582201b7a754e5" +dependencies = [ + "siphasher", +] + [[package]] name = "pin-project-lite" version = "0.2.17" @@ -3117,6 +3172,12 @@ version = "0.3.9" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "703d5c7ef118737c72f1af64ad2f6f8c5e1921f818cdcb97b8fe6fc69bf66214" +[[package]] +name = "siphasher" +version = "1.0.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8ee5873ec9cce0195efcb7a4e9507a04cd49aec9c83d0389df45b1ef7ba2e649" + [[package]] name = "slab" version = "0.4.12" @@ -3753,6 +3814,19 @@ dependencies = [ "serde", ] +[[package]] +name = "usb-ids" +version = "1.2025.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"1f464d03993287ba27fae1c81bfa368df4493983de7e340429fc10e470043383" +dependencies = [ + "nom", + "phf", + "phf_codegen", + "proc-macro2", + "quote", +] + [[package]] name = "utf-8" version = "0.7.6" diff --git a/Cargo.toml b/Cargo.toml index 0bcbfc1e..cdb0e039 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -75,6 +75,13 @@ object = { version = "0.36", default-features = false, features = ["read", "std" rusqlite = { version = "0.31", features = ["bundled"] } shell-words = "1" bincode = "1" +# USB VID:PID -> {vendor, product} name lookup. Bundles the linux-usb.org +# `usb.ids` snapshot at compile time into a `phf` perfect-hash table. +# Pure Rust, no libusb / no udev. Versioning scheme `1.YYYY.N` tracks +# upstream snapshots — `cargo update` pulls in new silicon. Tier-1 for +# `fbuild_core::usb::resolve()`; the tier-2 online overlay is fetched +# from the repo's `online-data` branch. +usb-ids = "1.2025" # prost: hand-derived `#[derive(prost::Message)]` payload structs for the # fbuild v1 broker request/response lane (no .proto/build.rs needed). Pinned # to the version running-process 4.3.0 re-exports so the wire types stay diff --git a/ci/hooks/crate_guard.py b/ci/hooks/crate_guard.py new file mode 100644 index 00000000..4d0ceed6 --- /dev/null +++ b/ci/hooks/crate_guard.py @@ -0,0 +1,168 @@ +#!/usr/bin/env -S uv run --script +# /// script +# requires-python = ">=3.10" +# /// +"""PreToolUse hook: forbid creating new Rust crates via Edit/Write. + +fbuild is intentionally kept close to a monocrate (see CLAUDE.md and +FastLED/fbuild#560). New functionality is folded into an existing crate as a +*module*, not introduced as a brand-new crate. The CI check at +`ci/check_workspace_crates.py` enforces this in batch on every PR; this +hook enforces it in real time by blocking any attempt to write a `Cargo.toml` +at a path that is not already part of the approved set. + +A standalone Cargo project anywhere in the repo (workspace member or +otherwise) is treated as a new crate. 
The only allowed `Cargo.toml` writes +are: + - the workspace root `Cargo.toml`, + - one of the approved member directories, + - one of the approved excluded directories (dylints/*). + +A genuinely-justified new crate requires editing both `APPROVED_CRATE_DIRS` +in this file AND `APPROVED_MEMBERS` in `ci/check_workspace_crates.py` in +the same PR, with maintainer-reviewed rationale in the PR body. + +Exit codes: + 0 — allow (writes a deny JSON via stdout if a violation is detected). +""" + +from __future__ import annotations + +import json +import os +import re +import sys +from pathlib import Path, PurePosixPath + + +# Directories that are allowed to contain a `Cargo.toml`. Keep this list +# in sync with the `[workspace] members` + `exclude` lists in the root +# `Cargo.toml` and with `APPROVED_MEMBERS` in +# `ci/check_workspace_crates.py`. Use POSIX-style relative paths. +APPROVED_CRATE_DIRS: frozenset[str] = frozenset( + { + # Repo root workspace manifest: + "", + # Workspace members: + "crates/fbuild-core", + "crates/fbuild-config", + "crates/fbuild-paths", + "crates/fbuild-packages", + "crates/fbuild-serial", + "crates/fbuild-build", + "crates/fbuild-deploy", + "crates/fbuild-daemon", + "crates/fbuild-cli", + "crates/fbuild-python", + "crates/fbuild-test-support", + "crates/fbuild-header-scan", + "crates/fbuild-library-select", + "bench/fastled-examples", + # Workspace-excluded crates with their own toolchains: + "dylints/ban_raw_subprocess", + } +) + + +def repo_root() -> Path: + """The git toplevel, since the hook is launched from there.""" + return Path(os.getcwd()).resolve() + + +def deny(reason: str) -> None: + json.dump( + { + "hookSpecificOutput": { + "hookEventName": "PreToolUse", + "permissionDecision": "deny", + "permissionDecisionReason": reason, + } + }, + sys.stdout, + ) + + +def relative_dir(file_path: str) -> str | None: + """Return the repo-relative POSIX directory of `file_path`, or None + if the path is outside the repo (e.g. 
a temp file).""" + try: + abs_path = Path(file_path).resolve() + rel = abs_path.relative_to(repo_root()) + except (ValueError, OSError): + return None + parent = rel.parent + # `Path('.').as_posix()` -> '.', normalize the root manifest case to '': + posix = PurePosixPath(parent).as_posix() + return "" if posix == "." else posix + + +def is_cargo_toml(file_path: str) -> bool: + """True if the path ends in `Cargo.toml` (case-insensitive on Windows + but exact on POSIX). Catches the canonical filename used by Cargo to + mark a crate root.""" + name = Path(file_path).name + if sys.platform.startswith("win"): + return name.lower() == "cargo.toml" + return name == "Cargo.toml" + + +def extract_file_path(data: dict) -> str: + tool_input = data.get("tool_input") + if not isinstance(tool_input, dict): + return "" + value = tool_input.get("file_path") + if isinstance(value, str): + return value.strip() + return "" + + +# A future-proofing belt-and-suspenders check: if someone tries to +# Write/Edit something that *looks* like a Cargo project root but uses a +# non-standard name (`cargo.toml`, weird casing on Linux), the filename +# regex below catches it too. Keep it tight — we only want to flag actual +# Cargo manifest filenames. +CARGO_TOML_RE = re.compile(r"^[Cc]argo\.toml$") + + +def main() -> None: + try: + data = json.load(sys.stdin) + except json.JSONDecodeError: + sys.exit(0) + + tool_name = data.get("tool_name", "") + if tool_name not in {"Edit", "Write", "NotebookEdit"}: + sys.exit(0) + + file_path = extract_file_path(data) + if not file_path: + sys.exit(0) + + if not (is_cargo_toml(file_path) or CARGO_TOML_RE.match(Path(file_path).name)): + sys.exit(0) + + rel_dir = relative_dir(file_path) + if rel_dir is None: + # Outside the repo — let it through (e.g. tempfile). 
+ sys.exit(0) + + if rel_dir in APPROVED_CRATE_DIRS: + sys.exit(0) + + deny( + f"Refusing to create/modify Cargo.toml at '{rel_dir or '.'}': " + "fbuild is kept close to a monocrate (see CLAUDE.md and " + "FastLED/fbuild#560). New functionality must be folded into one of " + "the existing crates as a module, not introduced as a brand-new " + "crate. The approved Cargo project directories are: " + f"{sorted(APPROVED_CRATE_DIRS - {''}) }, plus the workspace root. " + "If a new crate is genuinely justified, update " + "`APPROVED_CRATE_DIRS` in `ci/hooks/crate_guard.py` AND " + "`APPROVED_MEMBERS` in `ci/check_workspace_crates.py` in the same " + "PR, with maintainer-reviewed rationale in the PR body." + ) + sys.exit(0) + + +if __name__ == "__main__": + main() diff --git a/crates/fbuild-cli/src/cli/device.rs b/crates/fbuild-cli/src/cli/device.rs index fdd9011d..1f5c8354 100644 --- a/crates/fbuild-cli/src/cli/device.rs +++ b/crates/fbuild-cli/src/cli/device.rs @@ -30,13 +30,19 @@ pub async fn run_device(action: DeviceAction) -> fbuild_core::Result<()> { .as_ref() .map(|l| l.client_id.as_str()) .unwrap_or("-"); + // Prefer the resolver-derived "vendor product (VID:PID)" + // display over the raw daemon `description`. When no + // vendor/product was resolved (non-USB ports, or daemon + // older than the resolver wiring), fall back to the raw + // description so behavior is identical to pre-resolver. 
+ let pretty = device_pretty_name(dev); println!( "{:<20} {:<12} {:<12} {:<24} {}", dev.port, id, lease, holder, - device_description(&dev.description, dev.previous_port.as_deref()) + device_description(&pretty, dev.previous_port.as_deref()) ); } println!("\n{} device(s) found", resp.devices.len()); @@ -54,6 +60,12 @@ pub async fn run_device(action: DeviceAction) -> fbuild_core::Result<()> { }; println!(" {}", resp.port); println!(" Device ID: {}", resp.device_id); + if let Some(ref vendor) = resp.vendor_name { + println!(" Vendor: {}", vendor); + } + if let Some(ref product) = resp.product_name { + println!(" Product: {}", product); + } println!(" Description: {}", resp.description); if let Some(ref serial) = resp.serial_number { println!(" Serial: {}", serial); @@ -168,3 +180,22 @@ fn device_description(description: &str, previous_port: Option<&str>) -> String None => description.to_string(), } } + +/// Compose the canonical `"vendor product (VVVV:PPPP)"` display string +/// for a device row. Falls back to the daemon-provided `description` +/// (and bare hex VID:PID, when available) so this code remains usable +/// against older daemons that don't yet emit `vendor_name`/`product_name`. +fn device_pretty_name(dev: &crate::daemon_client::DeviceInfoResponse) -> String { + match ( + dev.vid, + dev.pid, + dev.vendor_name.as_deref(), + dev.product_name.as_deref(), + ) { + (Some(v), Some(p), Some(vendor), Some(product)) => { + format!("{vendor} {product} ({v:04X}:{p:04X})") + } + (Some(v), Some(p), _, _) => format!("{} ({:04X}:{:04X})", dev.description, v, p), + _ => dev.description.clone(), + } +} diff --git a/crates/fbuild-cli/src/daemon_client/README.md b/crates/fbuild-cli/src/daemon_client/README.md new file mode 100644 index 00000000..1cf5c619 --- /dev/null +++ b/crates/fbuild-cli/src/daemon_client/README.md @@ -0,0 +1,6 @@ +# `daemon_client` + +HTTP client + deserialization types the CLI uses to talk to the fbuild daemon. 
+ +- `types.rs` — request/response structs that mirror the daemon's JSON schemas (`crates/fbuild-daemon/src/models.rs`). Keep them field-for-field compatible; newer optional fields carry `#[serde(default)]` so deserialization stays forgiving against older daemons. +- Sibling `mod.rs` (one level up at `daemon_client.rs`) — HTTP transport, daemon lifecycle, retry logic. diff --git a/crates/fbuild-cli/src/daemon_client/types.rs b/crates/fbuild-cli/src/daemon_client/types.rs index 3c1b3b8e..65c1ba6b 100644 --- a/crates/fbuild-cli/src/daemon_client/types.rs +++ b/crates/fbuild-cli/src/daemon_client/types.rs @@ -314,6 +314,13 @@ pub struct DeviceInfoResponse { pub device_id: Option, pub vid: Option, pub pid: Option, + /// Pretty USB vendor name resolved by the daemon (tier-1 bundled + /// `usb-ids` + tier-2 online overlay). `None` for non-USB ports. + #[serde(default)] + pub vendor_name: Option<String>, + /// Pretty USB product name (same provenance as `vendor_name`). + #[serde(default)] + pub product_name: Option<String>, #[serde(default)] pub serial_number: Option, #[serde(default)] @@ -333,6 +340,13 @@ pub struct DeviceStatusResponse { pub port: String, pub device_id: String, pub description: String, + /// Pretty USB vendor name resolved by the daemon. `None` for + /// bluetooth/PCI/unknown serials. + #[serde(default)] + pub vendor_name: Option<String>, + /// Pretty USB product name (same provenance as `vendor_name`). + #[serde(default)] + pub product_name: Option<String>, #[serde(default)] pub serial_number: Option, #[serde(default)] diff --git a/crates/fbuild-core/Cargo.toml b/crates/fbuild-core/Cargo.toml index 3b3f3512..ae451c04 100644 --- a/crates/fbuild-core/Cargo.toml +++ b/crates/fbuild-core/Cargo.toml @@ -12,6 +12,8 @@ tracing = { workspace = true } serde = { workspace = true } serde_json = { workspace = true } sha2 = { workspace = true } +# Tier-1 USB VID:PID resolver — see `crate::usb`. 
+usb-ids = { workspace = true } # Process containment primitive (Job Objects on Windows; process groups + # PR_SET_PDEATHSIG on Linux; process groups on macOS). The single global # `ContainedProcessGroup` owned by the daemon ensures every child process diff --git a/crates/fbuild-core/examples/README.md b/crates/fbuild-core/examples/README.md new file mode 100644 index 00000000..eea72ba6 --- /dev/null +++ b/crates/fbuild-core/examples/README.md @@ -0,0 +1,5 @@ +# `fbuild-core` examples + +Standalone runnable examples exercising public `fbuild_core` APIs. + +- `dump_usb_ids.rs` — dumps the bundled `usb-ids` database as a sorted JSON object to stdout. Consumed by the `online-data` branch's nightly workflow as one input source for the merged `usb-vid.json`. Run with: `soldr cargo run --release --example dump_usb_ids -p fbuild-core > usb-ids-rs.json`. diff --git a/crates/fbuild-core/examples/dump_usb_ids.rs b/crates/fbuild-core/examples/dump_usb_ids.rs new file mode 100644 index 00000000..206955e3 --- /dev/null +++ b/crates/fbuild-core/examples/dump_usb_ids.rs @@ -0,0 +1,51 @@ +//! Dump the bundled `usb-ids` database as a JSON object to stdout. +//! +//! Used by the `online-data` branch's nightly workflow (see +//! `.github/workflows/nightly-usb-ids.yml`) as one of the input sources +//! for the merged `usb-vid.json`. Running this example via +//! `soldr cargo run --release --example dump_usb_ids -p fbuild-core` +//! captures the exact data the bundled `usb-ids` crate version we depend +//! on actually knows about, so the online overlay can be cross-checked +//! against tier-1. +//! +//! Output schema (alphabetically sorted by key): +//! ```json +//! { +//! "0403:6001": {"vendor": "Future Technology Devices ...", "product": "FT232 ..."}, +//! ... +//! } +//! ``` +//! +//! No CLI arguments, no IO beyond stdout — kept intentionally tiny so the +//! nightly workflow can pipe it into a file with no risk of partial output. 
+ +use std::collections::BTreeMap; + +fn main() { + // BTreeMap → keys are emitted in sorted order by `serde_json`. + let mut out: BTreeMap<String, Entry> = BTreeMap::new(); + + for vendor in usb_ids::Vendors::iter() { + let vendor_name = vendor.name().to_string(); + for device in vendor.devices() { + let key = format!("{:04x}:{:04x}", vendor.id(), device.id()); + out.insert( + key, + Entry { + vendor: vendor_name.clone(), + product: device.name().to_string(), + }, + ); + } + } + + // pretty-print so diffs on the `online-data` branch are reviewable. + serde_json::to_writer_pretty(std::io::stdout().lock(), &out).expect("write JSON to stdout"); + println!(); +} + +#[derive(serde::Serialize)] +struct Entry { + vendor: String, + product: String, +} diff --git a/crates/fbuild-core/src/lib.rs b/crates/fbuild-core/src/lib.rs index c21e13ad..beebec67 100644 --- a/crates/fbuild-core/src/lib.rs +++ b/crates/fbuild-core/src/lib.rs @@ -17,6 +17,7 @@ pub mod response_file; pub mod shell_split; pub mod subprocess; pub mod symbol_analysis; +pub mod usb; pub use build_log::BuildLog; diff --git a/crates/fbuild-core/src/usb/README.md b/crates/fbuild-core/src/usb/README.md new file mode 100644 index 00000000..d0d85b86 --- /dev/null +++ b/crates/fbuild-core/src/usb/README.md @@ -0,0 +1,7 @@ +# `fbuild_core::usb` + +USB VID:PID → human-readable `(vendor, product)` resolution. + +- `mod.rs` — public API surface (`resolve`, `try_resolve`, `pretty`, `install_online_cache`). +- `resolver.rs` — tiered lookup implementation + unit tests covering FTDI, CP210x, CH340, Espressif, and the synthetic fallback. +- `data.rs` — optional runtime overlay loaded from a JSON file. Used to pick up newly-assigned VID/PID pairs that the bundled `usb-ids` crate doesn't yet know about. Powered by the repo's `online-data` branch and its nightly refresh workflow. 
diff --git a/crates/fbuild-core/src/usb/data.rs b/crates/fbuild-core/src/usb/data.rs new file mode 100644 index 00000000..a5f6d474 --- /dev/null +++ b/crates/fbuild-core/src/usb/data.rs @@ -0,0 +1,147 @@ +//! Tier-2 online overlay: an optional `{ "VVVV:PPPP": {vendor, product} }` +//! JSON map loaded from disk at runtime. +//! +//! The daemon (or a CLI command) downloads the JSON from the repo's +//! `online-data` branch, writes it to a cache path, and calls +//! [`install_online_cache`] to plug it into the resolver. Replacing the +//! cache is supported (`RwLock`, not `OnceLock`) so the daemon can refresh +//! during a long-running session without a restart. +//! +//! All errors here are swallowed by design — if the overlay can't load, the +//! resolver simply degrades to tier-1 + tier-3. + +use super::UsbInfo; +use std::collections::HashMap; +use std::path::Path; +use std::sync::RwLock; + +/// URL of the dataset index produced by the `online-data` branch's nightly +/// workflow. Clients can `GET` this, parse the JSON, and pull the +/// `datasets["usb-vid"].url` field to find the live `usb-vid.json`. +pub const MANIFEST_URL: &str = + "https://raw.githubusercontent.com/fastled/fbuild/online-data/manifest.json"; + +/// Direct convenience URL for the merged dataset itself. Kept in sync with +/// [`MANIFEST_URL`]'s `datasets["usb-vid"].url` by the nightly workflow. +/// Clients that don't want to parse the manifest can fetch this directly. +pub const USB_VID_JSON_URL: &str = + "https://raw.githubusercontent.com/fastled/fbuild/online-data/data/usb-vid.json"; + +static ONLINE_MAP: RwLock<Option<HashMap<u32, UsbInfo>>> = RwLock::new(None); + +/// Install the overlay from a JSON file on disk. Replaces any previously +/// installed overlay. Silently no-ops on any IO or parse error so the +/// resolver never crashes on a stale / partial cache file.
+pub fn install_online_cache(path: &Path) { + let raw = match std::fs::read_to_string(path) { + Ok(s) => s, + Err(e) => { + tracing::debug!(?path, error = %e, "usb online overlay: read failed"); + return; + } + }; + let parsed: HashMap<String, UsbInfo> = match serde_json::from_str(&raw) { + Ok(m) => m, + Err(e) => { + tracing::warn!(?path, error = %e, "usb online overlay: parse failed"); + return; + } + }; + let mut packed = HashMap::with_capacity(parsed.len()); + for (key, info) in parsed { + if let Some(packed_key) = parse_vid_pid_key(&key) { + packed.insert(packed_key, info); + } + } + let count = packed.len(); + install_online_cache_map(packed); + tracing::debug!(path = %path.display(), entries = count, "usb online overlay installed"); +} + +/// Replace the overlay with a pre-built map. Exposed at `pub(crate)` so +/// the daemon could in principle skip the file dance — primary user is the +/// resolver's own test suite. +pub(crate) fn install_online_cache_map(map: HashMap<u32, UsbInfo>) { + let mut guard = ONLINE_MAP.write().unwrap(); + *guard = Some(map); +} + +/// Tier-2 lookup. `None` if no overlay is installed or the pair is missing. +pub(crate) fn lookup(vid: u16, pid: u16) -> Option<UsbInfo> { + let guard = ONLINE_MAP.read().ok()?; + let map = guard.as_ref()?; + map.get(&pack(vid, pid)).cloned() +} + +/// Pack a (vid, pid) into a single `u32` key. The high half is the vendor.
+pub(crate) fn pack(vid: u16, pid: u16) -> u32 { + ((vid as u32) << 16) | (pid as u32) +} + +fn parse_vid_pid_key(key: &str) -> Option<u32> { + let (vid_s, pid_s) = key.split_once(':')?; + let vid = u16::from_str_radix(vid_s.trim(), 16).ok()?; + let pid = u16::from_str_radix(pid_s.trim(), 16).ok()?; + Some(pack(vid, pid)) +} + +#[cfg(test)] +pub(crate) fn clear_online_cache_for_tests() { + let mut guard = ONLINE_MAP.write().unwrap(); + *guard = None; +} + +#[cfg(test)] +mod tests { + use super::*; + use std::sync::Mutex; + + static OVERLAY_LOCK: Mutex<()> = Mutex::new(()); + + #[test] + fn install_online_cache_from_file_round_trip() { + let _guard = OVERLAY_LOCK.lock().unwrap(); + let tmp = tempfile::tempdir().unwrap(); + let path = tmp.path().join("usb-vid.json"); + let json = r#"{ + "feed:c0de": {"vendor": "Feedface Inc", "product": "Coded Widget"}, + "FEED:F00D": {"vendor": "Feedface Inc", "product": "Food Sensor"} + }"#; + std::fs::write(&path, json).unwrap(); + + install_online_cache(&path); + + // Lowercase key + let a = lookup(0xFEED, 0xC0DE).expect("lowercase key parsed"); + assert_eq!(a.vendor, "Feedface Inc"); + assert_eq!(a.product, "Coded Widget"); + + // Uppercase key + let b = lookup(0xFEED, 0xF00D).expect("uppercase key parsed"); + assert_eq!(b.product, "Food Sensor"); + + clear_online_cache_for_tests(); + } + + #[test] + fn install_online_cache_missing_file_is_silent() { + let _guard = OVERLAY_LOCK.lock().unwrap(); + clear_online_cache_for_tests(); + let path = std::path::PathBuf::from("/nonexistent/path/usb-vid.json"); + // Must not panic. + install_online_cache(&path); + // No overlay installed → lookup returns None.
+ assert!(lookup(0x1234, 0x5678).is_none()); + } + + #[test] + fn install_online_cache_bad_json_is_silent() { + let _guard = OVERLAY_LOCK.lock().unwrap(); + clear_online_cache_for_tests(); + let tmp = tempfile::tempdir().unwrap(); + let path = tmp.path().join("bad.json"); + std::fs::write(&path, "this is not json {").unwrap(); + install_online_cache(&path); // must not panic + assert!(lookup(0x1234, 0x5678).is_none()); + } +} diff --git a/crates/fbuild-core/src/usb/mod.rs b/crates/fbuild-core/src/usb/mod.rs new file mode 100644 index 00000000..6669d240 --- /dev/null +++ b/crates/fbuild-core/src/usb/mod.rs @@ -0,0 +1,31 @@ +//! USB VID:PID → human-readable vendor/product name resolution. +//! +//! Three resolution tiers, queried in order: +//! +//! 1. **Bundled** — the [`usb-ids`](https://crates.io/crates/usb-ids) crate, +//! compiled in at build time as a `phf` perfect-hash table. Zero IO, zero +//! allocations for the lookup itself. Tracks the upstream +//! `linux-usb.org` snapshot the crate was published against. +//! 2. **Online overlay** — an optional `{ "VVVV:PPPP": {vendor, product} }` +//! JSON map loaded at runtime (typically from a daemon-managed cache file +//! that mirrors the `online-data` branch of this repo). The overlay +//! provides newly-assigned VID/PID pairs that the bundled snapshot +//! doesn't yet know about. +//! 3. **Fallback** — synthetic `"Unknown vendor 0xVVVV"` placeholder so +//! callers can always print something deterministic. +//! +//! See [`resolve`] (best-effort, never `None`), [`try_resolve`] (returns +//! `None` if both real tiers miss), and [`pretty`] (formatted as +//! `"vendor product (VVVV:PPPP)"` for connect / scan / `device list` log +//! lines). +//! +//! The daemon calls [`install_online_cache`] at startup with the path to +//! the locally-cached `usb-vid.json`. The CLI / nightly workflow keeps +//! that file in sync with the manifest URL exposed by the `online-data` +//! 
branch — see [`MANIFEST_URL`] and [`USB_VID_JSON_URL`]. + +pub mod data; +pub mod resolver; + +pub use data::{install_online_cache, MANIFEST_URL, USB_VID_JSON_URL}; +pub use resolver::{pretty, resolve, resolve_bundled, try_resolve, UsbInfo}; diff --git a/crates/fbuild-core/src/usb/resolver.rs b/crates/fbuild-core/src/usb/resolver.rs new file mode 100644 index 00000000..5c3d917d --- /dev/null +++ b/crates/fbuild-core/src/usb/resolver.rs @@ -0,0 +1,159 @@ +//! Tiered USB VID:PID → name resolver. See the [crate::usb] module-level +//! documentation for the design. + +use serde::{Deserialize, Serialize}; + +/// Resolved USB device identity. +#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)] +pub struct UsbInfo { + pub vendor: String, + pub product: String, +} + +/// Best-effort lookup. Never returns `None`: a synthetic +/// `"Unknown vendor 0xVVVV"` / `"Unknown product 0xPPPP"` is produced +/// when both tier-1 (bundled) and tier-2 (online overlay) miss. +pub fn resolve(vid: u16, pid: u16) -> UsbInfo { + try_resolve(vid, pid).unwrap_or_else(|| UsbInfo { + vendor: format!("Unknown vendor 0x{vid:04X}"), + product: format!("Unknown product 0x{pid:04X}"), + }) +} + +/// Tier-1 + tier-2 only. Returns `None` if neither knows this pair. +pub fn try_resolve(vid: u16, pid: u16) -> Option<UsbInfo> { + resolve_bundled(vid, pid).or_else(|| super::data::lookup(vid, pid)) +} + +/// Tier-1 only (the bundled `usb-ids` crate). Use when callers need to +/// distinguish "the offline snapshot knows this" from "we had to fall +/// through to the online overlay" — diagnostics, attribution, etc. +pub fn resolve_bundled(vid: u16, pid: u16) -> Option<UsbInfo> { + let device = usb_ids::Device::from_vid_pid(vid, pid)?; + Some(UsbInfo { + vendor: device.vendor().name().to_string(), + product: device.name().to_string(), + }) +} + +/// `"vendor product (VVVV:PPPP)"` — the canonical display format used by +/// the CLI's `device list`, `device status`, and the daemon's connect / +/// scan log lines.
Always returns a non-empty string thanks to [`resolve`]'s +/// synthetic fallback. +pub fn pretty(vid: u16, pid: u16) -> String { + let info = resolve(vid, pid); + format!("{} {} ({vid:04X}:{pid:04X})", info.vendor, info.product) +} + +#[cfg(test)] +mod tests { + use super::*; + use std::collections::HashMap; + use std::sync::Mutex; + + static OVERLAY_LOCK: Mutex<()> = Mutex::new(()); + + #[test] + fn bundled_resolves_ftdi_ft232() { + let info = resolve_bundled(0x0403, 0x6001).expect("FTDI FT232 in bundled DB"); + assert!( + info.vendor.to_lowercase().contains("future technology"), + "vendor: {}", + info.vendor + ); + assert!( + info.product.to_lowercase().contains("ft232"), + "product: {}", + info.product + ); + } + + #[test] + fn bundled_resolves_silabs_cp210x() { + let info = resolve_bundled(0x10C4, 0xEA60).expect("Silicon Labs CP210x in bundled DB"); + assert!( + info.vendor.to_lowercase().contains("silicon labs") + || info.vendor.to_lowercase().contains("cygnal"), + "vendor: {}", + info.vendor + ); + assert!( + info.product.to_lowercase().contains("cp210"), + "product: {}", + info.product + ); + } + + #[test] + fn bundled_resolves_wch_ch340() { + let info = resolve_bundled(0x1A86, 0x7523).expect("WCH CH340 in bundled DB"); + assert!( + info.vendor.to_lowercase().contains("qinheng") + || info.vendor.to_lowercase().contains("wch") + || info.vendor.to_lowercase().contains("nanjing"), + "vendor: {}", + info.vendor + ); + assert!( + info.product.to_lowercase().contains("ch340") + || info.product.to_lowercase().contains("serial"), + "product: {}", + info.product + ); + } + + #[test] + fn unknown_pair_returns_synthetic_placeholder() { + // 0xFFFE:0xFFFE is reserved and will not be assigned by USB-IF; + // safe sentinel for "we expect tier-3 to fire." 
+ let info = resolve(0xFFFE, 0xFFFE); + assert_eq!(info.vendor, "Unknown vendor 0xFFFE"); + assert_eq!(info.product, "Unknown product 0xFFFE"); + } + + #[test] + fn pretty_format_uses_canonical_shape() { + // FTDI FT232 is one of the most stable VID:PIDs in the bundled DB + // (it's the de-facto USB-serial chip used in every Arduino clone). + let s = pretty(0x0403, 0x6001); + assert!(s.ends_with("(0403:6001)"), "tail format wrong: {s}"); + assert!( + s.to_lowercase().contains("future technology"), + "missing vendor: {s}" + ); + // Pretty also handles the unknown path deterministically. + let unknown = pretty(0xFFFE, 0xFFFE); + assert_eq!( + unknown, + "Unknown vendor 0xFFFE Unknown product 0xFFFE (FFFE:FFFE)" + ); + } + + #[test] + fn online_overlay_resolves_when_bundled_misses() { + let _guard = OVERLAY_LOCK.lock().unwrap(); + // Use a VID:PID that the bundled `usb-ids` crate cannot resolve + // (0xFFFD:0xABCD is reserved). Install an overlay entry for it and + // confirm `resolve()` picks tier-2 instead of falling to tier-3. + assert!( + resolve_bundled(0xFFFD, 0xABCD).is_none(), + "test fixture assumed an unallocated VID:PID; pick a different one" + ); + let mut map = HashMap::new(); + map.insert( + super::super::data::pack(0xFFFD, 0xABCD), + UsbInfo { + vendor: "Acme Test Devices".to_string(), + product: "Test Widget 9000".to_string(), + }, + ); + super::super::data::install_online_cache_map(map); + + let info = resolve(0xFFFD, 0xABCD); + assert_eq!(info.vendor, "Acme Test Devices"); + assert_eq!(info.product, "Test Widget 9000"); + + // Reset so unrelated tests don't observe this entry. 
+ super::super::data::clear_online_cache_for_tests(); + } +} diff --git a/crates/fbuild-daemon/src/device_manager.rs b/crates/fbuild-daemon/src/device_manager.rs index 3ecb9fbc..a0b4bba3 100644 --- a/crates/fbuild-daemon/src/device_manager.rs +++ b/crates/fbuild-daemon/src/device_manager.rs @@ -101,6 +101,13 @@ pub struct DeviceState { pub description: String, pub vid: Option, pub pid: Option, + /// Human-readable USB vendor name, resolved from `vid` via + /// [`fbuild_core::usb::resolve`]. `None` only when the device has no + /// `vid` (e.g. bluetooth/PCI serial). Tier-1/2/3 fallbacks guarantee a + /// string when a `vid` exists. See [`crate::device_manager`]. + pub vendor_name: Option, + /// Human-readable USB product name (same provenance as `vendor_name`). + pub product_name: Option, pub serial_number: Option, pub previous_port: Option, pub exclusive_lease: Option, @@ -150,6 +157,8 @@ struct DiscoveredDevice { description: String, vid: Option, pid: Option, + vendor_name: Option, + product_name: Option, serial_number: Option, } @@ -222,7 +231,7 @@ impl DeviceManager { let discovered: Vec = ports .into_iter() .map(|port_info| { - let (vid, pid, desc) = match &port_info.port_type { + let (vid, pid, fallback_desc) = match &port_info.port_type { serialport::SerialPortType::UsbPort(usb) => ( Some(usb.vid), Some(usb.pid), @@ -236,6 +245,20 @@ impl DeviceManager { serialport::SerialPortType::PciPort => (None, None, "PCI Serial".to_string()), serialport::SerialPortType::Unknown => (None, None, "Unknown".to_string()), }; + // Resolve VID:PID → pretty (vendor, product) via the bundled + // `usb-ids` snapshot + any online overlay the daemon has + // installed. When both are present, the resolver-derived + // description wins over the (often blank or generic) string + // returned by the OS-level enumerator. Bluetooth / PCI / + // unknown ports keep their static fallback descriptor. 
+ let (vendor_name, product_name, description) = match (vid, pid) { + (Some(v), Some(p)) => { + let info = fbuild_core::usb::resolve(v, p); + let desc = format!("{} {}", info.vendor, info.product); + (Some(info.vendor), Some(info.product), desc) + } + _ => (None, None, fallback_desc), + }; let serial_number = match &port_info.port_type { serialport::SerialPortType::UsbPort(usb) => usb.serial_number.clone(), _ => None, @@ -247,9 +270,11 @@ impl DeviceManager { DiscoveredDevice { port: port_info.port_name, device_id, - description: desc, + description, vid, pid, + vendor_name, + product_name, serial_number, } }) @@ -297,6 +322,8 @@ impl DeviceManager { state.device_id = device.device_id; state.vid = device.vid; state.pid = device.pid; + state.vendor_name = device.vendor_name; + state.product_name = device.product_name; state.serial_number = device.serial_number; if let Some(previous_port) = state.previous_port.clone() { self.recent_port_moves.lock().unwrap().push(DevicePortMove { @@ -318,6 +345,8 @@ impl DeviceManager { description: device.description.clone(), vid: device.vid, pid: device.pid, + vendor_name: device.vendor_name.clone(), + product_name: device.product_name.clone(), serial_number: device.serial_number.clone(), previous_port: None, exclusive_lease: None, @@ -334,6 +363,8 @@ impl DeviceManager { entry.device_id = device.device_id; entry.vid = device.vid; entry.pid = device.pid; + entry.vendor_name = device.vendor_name; + entry.product_name = device.product_name; entry.serial_number = device.serial_number; } @@ -637,6 +668,8 @@ impl DeviceManager { description: "Test Device".to_string(), vid: Some(0x1234), pid: Some(0x5678), + vendor_name: Some("Test Vendor".to_string()), + product_name: Some("Test Device".to_string()), serial_number: Some("TEST-SERIAL".to_string()), previous_port: None, exclusive_lease: None, @@ -837,6 +870,8 @@ mod tests { description: "Test Device Renumbered".to_string(), vid: Some(0x1234), pid: Some(0x5678), + vendor_name: Some("Test 
Vendor".to_string()), + product_name: Some("Test Device".to_string()), serial_number: Some("TEST-SERIAL".to_string()), }]); @@ -877,6 +912,8 @@ mod tests { description: "Test Device Renumbered".to_string(), vid: Some(0x1234), pid: Some(0x5678), + vendor_name: Some("Test Vendor".to_string()), + product_name: Some("Test Device".to_string()), serial_number: Some("TEST-SERIAL".to_string()), }]); diff --git a/crates/fbuild-daemon/src/handlers/devices.rs b/crates/fbuild-daemon/src/handlers/devices.rs index 5974cfec..be15e3ae 100644 --- a/crates/fbuild-daemon/src/handlers/devices.rs +++ b/crates/fbuild-daemon/src/handlers/devices.rs @@ -41,6 +41,8 @@ pub async fn device_status( port: port.clone(), device_id: String::new(), description: format!("device '{}' not found", port), + vendor_name: None, + product_name: None, serial_number: None, previous_port: None, is_connected: false, @@ -59,6 +61,8 @@ fn device_info(state: &DeviceState) -> DeviceInfo { device_id: Some(state.device_id.clone()), vid: state.vid, pid: state.pid, + vendor_name: state.vendor_name.clone(), + product_name: state.product_name.clone(), serial_number: state.serial_number.clone(), previous_port: state.previous_port.clone(), description: state.description.clone(), @@ -79,6 +83,8 @@ fn device_status_response(state: DeviceState) -> DeviceStatusResponse { port: state.port, device_id: state.device_id, description: state.description, + vendor_name: state.vendor_name, + product_name: state.product_name, serial_number: state.serial_number, previous_port: state.previous_port, is_connected: state.is_connected, diff --git a/crates/fbuild-daemon/src/handlers/operations/deploy_port.rs b/crates/fbuild-daemon/src/handlers/operations/deploy_port.rs index 7542704e..6490c525 100644 --- a/crates/fbuild-daemon/src/handlers/operations/deploy_port.rs +++ b/crates/fbuild-daemon/src/handlers/operations/deploy_port.rs @@ -56,12 +56,14 @@ pub(super) fn choose_deploy_port( if matches.len() == 1 { let selected = matches[0]; + 
log_connect("deploy", selected); DeployPortChoice { port: Some(selected.port.clone()), warning: None, } } else if !matches.is_empty() { let selected = matches[0]; + log_connect("deploy", selected); DeployPortChoice { port: Some(selected.port.clone()), warning: Some(format!( @@ -73,6 +75,7 @@ pub(super) fn choose_deploy_port( } } else if !candidates.is_empty() { let selected = &candidates[0]; + log_connect("deploy", selected); DeployPortChoice { port: Some(selected.port.clone()), warning: Some(format!( @@ -152,17 +155,36 @@ fn format_vids(vids: &[u16]) -> String { fn format_candidates<'a>(candidates: impl Iterator) -> String { candidates .map(|d| { - let id = match (d.vid, d.pid) { - (Some(vid), Some(pid)) => format!("{vid:04X}:{pid:04X}"), - (Some(vid), None) => format!("{vid:04X}:????"), - _ => "unknown".to_string(), + // For candidates we have a resolved VID:PID for, emit the + // canonical `vendor product (VVVV:PPPP)` form via the shared + // resolver — this is what the user sees in `fbuild device list` + // and what we log on connect, so warnings stay consistent. + let pretty = match (d.vid, d.pid) { + (Some(vid), Some(pid)) => fbuild_core::usb::pretty(vid, pid), + (Some(vid), None) => format!("{} ({vid:04X}:????)", d.description), + _ => d.description.clone(), }; - format!("{} ({}, {})", d.port, id, d.description) + format!("{} ({})", d.port, pretty) }) .collect::>() .join(", ") } +/// Emit the canonical connect-time log line: +/// `": selected (VVVV:PPPP)"`. Falls back +/// to the raw `description` when no VID:PID is known. Called by +/// [`choose_deploy_port`] at the moment a device is bound to a deploy +/// operation; the same format is used by the scan log lines so the user +/// sees identical strings in `fbuild device list` and `fbuild deploy`. 
+fn log_connect(op: &str, candidate: &PortCandidate) { + let pretty = match (candidate.vid, candidate.pid) { + (Some(vid), Some(pid)) => fbuild_core::usb::pretty(vid, pid), + (Some(vid), None) => format!("{} ({vid:04X}:????)", candidate.description), + _ => candidate.description.clone(), + }; + tracing::info!("{op}: selected {} — {}", candidate.port, pretty); +} + #[cfg(test)] mod tests { use super::*; @@ -177,6 +199,8 @@ mod tests { description: "USB Serial Device".to_string(), vid, pid, + vendor_name: None, + product_name: None, serial_number: None, previous_port: None, exclusive_lease: None, diff --git a/crates/fbuild-daemon/src/models.rs b/crates/fbuild-daemon/src/models.rs index a644637a..579b7456 100644 --- a/crates/fbuild-daemon/src/models.rs +++ b/crates/fbuild-daemon/src/models.rs @@ -281,6 +281,14 @@ pub struct DeviceInfo { pub vid: Option, #[serde(skip_serializing_if = "Option::is_none")] pub pid: Option, + /// Human-readable USB vendor name resolved via `fbuild_core::usb`. Only + /// emitted when the device has a USB VID (bluetooth/PCI/unknown serials + /// omit this). + #[serde(skip_serializing_if = "Option::is_none")] + pub vendor_name: Option, + /// Human-readable USB product name (same provenance as `vendor_name`). + #[serde(skip_serializing_if = "Option::is_none")] + pub product_name: Option, #[serde(skip_serializing_if = "Option::is_none")] pub serial_number: Option, #[serde(skip_serializing_if = "Option::is_none")] @@ -439,6 +447,12 @@ pub struct DeviceStatusResponse { pub port: String, pub device_id: String, pub description: String, + /// Human-readable USB vendor name (only present for USB ports). + #[serde(skip_serializing_if = "Option::is_none")] + pub vendor_name: Option, + /// Human-readable USB product name (only present for USB ports). 
+ #[serde(skip_serializing_if = "Option::is_none")] + pub product_name: Option, #[serde(skip_serializing_if = "Option::is_none")] pub serial_number: Option, #[serde(skip_serializing_if = "Option::is_none")] diff --git a/docs/online-data.md b/docs/online-data.md new file mode 100644 index 00000000..991c7f38 --- /dev/null +++ b/docs/online-data.md @@ -0,0 +1,129 @@ +# `online-data` branch + nightly refresh + +The repo carries a long-lived orphan branch called `online-data` that holds +periodically-refreshed reference datasets fbuild reads at runtime. Today +the only dataset is the USB VID:PID → vendor/product map; the format is +**future-forward** so additional datasets (PCI vendor IDs, board feature +matrices, etc.) can be added later without breaking clients. + +The companion in-process resolver lives at `fbuild_core::usb` — see +`crates/fbuild-core/src/usb/`. The branch is the **tier-2 fallback** when +the bundled `usb-ids` crate doesn't know a VID:PID. + +## URLs + +- Manifest (entry point — clients fetch this first): + `https://raw.githubusercontent.com/fastled/fbuild/online-data/manifest.json` +- Live dataset (also exposed in the manifest): + `https://raw.githubusercontent.com/fastled/fbuild/online-data/data/usb-vid.json` +- Conflict log (visibility, not consumed by fbuild at runtime): + `https://raw.githubusercontent.com/fastled/fbuild/online-data/data/usb-vid-conflicts.json` + +The matching constants in code: `fbuild_core::usb::MANIFEST_URL` and +`fbuild_core::usb::USB_VID_JSON_URL`. 
+ +## Branch shape + +``` +online-data (orphan, NEVER merged into main) +├── README.md +├── manifest.json +├── data/ +│ ├── usb-vid.json # alphabetically sorted, lowercase hex keys +│ └── usb-vid-conflicts.json # only keys where sources disagreed +└── tools/ + ├── README.md + └── merge_sources.py # union + sort + manifest emit +``` + +There is **no `Cargo.toml`, no `src/`, no workspace member** on +`online-data` — the dump-side tooling for the bundled `usb-ids` crate +lives on `main` as an example (`crates/fbuild-core/examples/dump_usb_ids.rs`) +so we don't have to add a new crate. The nightly workflow checks out main +to build that example, then checks out `online-data` in a worktree to run +the merger script and commit results. + +## How a refresh happens + +`.github/workflows/nightly-usb-ids.yml` is the only workflow that touches +`online-data`. It lives on `main` because GitHub Actions requires `schedule` +and `workflow_dispatch` triggers to be defined on the default branch. + +Per run: + +1. Checkout `main` (workflow + dump example live here). +2. `git worktree add` the `online-data` branch into a sibling directory. +3. Install uv + soldr. +4. `soldr cargo build --release --example dump_usb_ids -p fbuild-core` + then run it → `/tmp/usb-ids-rs.json` (one input source). +5. `curl --retry 5` two upstream `usb.ids` text mirrors: + `http://www.linux-usb.org/usb.ids` and + `https://raw.githubusercontent.com/usbids/usbids/master/usb.ids` + (independently fault-tolerant — one mirror going down does not break + the run). +6. `uv run --no-project --script .online-data/tools/merge_sources.py …` + over whichever sources arrived intact. 
The merger: + - takes the union, prefers `usb-ids-rs` > `linux-usb.org` > `usbids-github` + on conflict; + - sorts keys alphabetically (lowercase `vvvv:pppp`); + - writes `data/usb-vid.json`, `data/usb-vid-conflicts.json`, + and the freshly-stamped `manifest.json`; + - **refuses to write** if the union has fewer than 1000 entries so a + truncated upstream cannot blow away a healthy committed dataset. +7. If files actually changed, commit on `online-data`. +8. Prune history: if `git rev-list --count HEAD > 200`, graft the + 200-th-most-recent commit as the new root and `git filter-repo`. +9. `git push --force-with-lease origin online-data` (the force is needed + only when history was pruned). + +Manual trigger: Actions → "Nightly USB IDs refresh" → Run workflow. + +## Fault tolerance contract + +- **`usb-ids` build / dump fails** → workflow continues with text sources. +- **One upstream mirror unreachable** → merger still runs against the + remaining sources. +- **All upstream sources fail** → merger refuses to write → workflow + finishes with no commit; existing committed data is preserved. +- **Merger writes too-small output** → same as above (sanity floor). +- **Workflow itself fails before commit** → previous commit on + `online-data` remains the live data. + +In every failure mode the *previously committed* data on `online-data` +stays as the live truth — fbuild keeps working against the last good +snapshot. + +## Why orphan + force-push? + +- Orphan: `online-data` shares no history with `main`. We never want + data churn rebasing into the source tree. +- Force-push: only after the history-prune step rewrites the chain to + cap at 200 commits. A non-pruning run produces a normal fast-forward. 
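The fault tolerance contract and the force-push rule above reduce to a small decision table. The following is only a sketch under stated assumptions: the real logic lives in the workflow YAML and `tools/merge_sources.py`, and the names `decide`, `Outcome`, `MIN_ENTRIES`, and `MAX_HISTORY` are hypothetical, introduced here for illustration. Only the thresholds (1000 entries, 200 commits) come from the text.

```rust
/// Sanity floor from the text: refuse to write a union smaller than this.
const MIN_ENTRIES: usize = 1000;
/// History cap from the text: prune once the branch exceeds this many commits.
const MAX_HISTORY: usize = 200;

/// Hypothetical summary of what one nightly run ends up doing.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// Nothing is committed; the last good snapshot stays live.
    KeepPrevious,
    /// A commit is pushed; `force_push` is true only if history was pruned.
    Commit { force_push: bool },
}

/// `merged_entries` is the size of the union over the sources that arrived
/// (0 when every source failed). `commit_count` is the number of commits on
/// `online-data` after the new commit would be added.
fn decide(merged_entries: usize, files_changed: bool, commit_count: usize) -> Outcome {
    if merged_entries < MIN_ENTRIES || !files_changed {
        return Outcome::KeepPrevious;
    }
    Outcome::Commit {
        force_push: commit_count > MAX_HISTORY,
    }
}

fn main() {
    // All sources failed: the union is empty, so nothing is written.
    println!("{:?}", decide(0, true, 42));
    // Healthy run that crosses the history cap: pruned, then force-pushed.
    println!("{:?}", decide(20536, true, 201));
}
```

Folding "all sources failed" and "too-small output" into the same branch mirrors the contract: both leave the previously committed data untouched.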
+ +## Manifest schema (future-forward) + +```json +{ + "schema_version": "1.0", + "generated_at": "2026-06-20T04:17:00Z", + "datasets": { + "usb-vid": { + "description": "USB VID:PID → {vendor, product} ...", + "url": "https://raw.githubusercontent.com/fastled/fbuild/online-data/data/usb-vid.json", + "conflicts_url": "https://raw.githubusercontent.com/fastled/fbuild/online-data/data/usb-vid-conflicts.json", + "format": "json-object", + "key_format": "vvvv:pppp", + "entries": 20536, + "sources": [ + {"name": "usb-ids-rs", "kind": "json", "entries": "20480"}, + {"name": "linux-usb.org", "kind": "usb.ids-text", "entries": "20536"}, + {"name": "usbids-github", "kind": "usb.ids-text", "entries": "20536"} + ] + } + } +} +``` + +Adding a new dataset (`pci-vid`, `board-features`, …) means appending +another entry under `datasets` and shipping a parser in the consuming +crate — no schema break. diff --git a/tasks/todo.md b/tasks/todo.md index 48de5c2b..62304df2 100644 --- a/tasks/todo.md +++ b/tasks/todo.md @@ -1,19 +1,227 @@ -# TODO — Warm-pass perf investigation (#91) +# TODO — USB VID:PID resolver + `online-data` branch (FastLED/fbuild) -## Plan +## Goal -- [x] Add `perf_log` module in `fbuild-build` with env-gated (`FBUILD_PERF_LOG=1`) phase timer -- [x] Instrument `BuildContext::new()` (config parse, board load, build-dir setup, flag collect) -- [x] Instrument `pipeline::run_sequential_build_with_libs` phases (core, variant, sketch, libs, compiledb, link) -- [x] Instrument AVR orchestrator outer phases (toolchain, framework, scan) -- [x] Instrument daemon `build` handler (lock acquire, spawn_blocking bookkeeping) -- [x] Instrument CLI round-trip for the warm build path -- [x] Ensure `cargo check` + `cargo clippy` + `cargo fmt` clean -- [x] Run cold+warm experiment on `tests/platform/uno` -- [x] Write `docs/PERF_WARM_BUILD.md` with methodology, phase table, top stalls, follow-ups -- [x] Add row to `docs/INDEX.md` -- [x] Commit (no push) +Two-part design so fbuild can 
always translate a USB `VID:PID` to a +human-readable `vendor / product` name: -## Review +1. **First line of defense** — bundled `usb-ids` Rust crate (offline, MIT, + `phf` perfect-hash table, zero allocations, zero network). +2. **Backup / live updates** — orphan `online-data` branch in this same repo, + refreshed once per day by a GitHub Actions workflow. Branch carries a + `manifest.json` (future-forward index) pointing at `data/usb-vid.json` + (sorted alphabetical union of multiple upstream sources). fbuild reads the + JSON when the bundled crate cannot resolve a VID:PID. Conflicts between + sources go into `data/usb-vid-conflicts.json` for observability. -See `docs/PERF_WARM_BUILD.md` for measurements + top stalls. +The acceptance bar (verbatim from the user): +- PR is merged. +- The nightly workflow is run manually via `workflow_dispatch`. +- The manifest URL `https://raw.githubusercontent.com/fastled/fbuild/online-data/manifest.json` actually resolves. +- A fbuild unit test demonstrates the `resolve(vid, pid)` API works end-to-end. +- When fbuild **connects to** or **scans** a device, the printed output includes + the pretty `vendor product (VVVV:PPPP)` string — not just raw hex. + +## Sources we will merge + +| Source | URL | Format | Notes | +|---|---|---|---| +| `usb-ids` crate (built via soldr) | bundled at compile time | Rust → JSON dump | Tracks upstream snapshots; primary truth | +| linux-usb.org (canonical) | `http://www.linux-usb.org/usb.ids` | plain text | HTTPS is broken (SAN mismatch) — `http://` is intentional | +| usbids/usbids GitHub mirror | `https://raw.githubusercontent.com/usbids/usbids/master/usb.ids` | plain text | CDN-backed, tracks upstream same-day | + +Priority order on conflict: `usb-ids-rs` > `linux-usb.org` > `usbids/usbids`. +All conflicts are still recorded in `usb-vid-conflicts.json`. 
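The priority rule above can be illustrated with a short Rust sketch. This is not the actual merger (that is the Python script `merge_sources.py` on `online-data`); the function `merge`, the `Merged` struct, and the shape of the conflict record are assumptions made for the example. Sources are passed highest priority first, so the first claim for a key wins while every disagreement is still recorded.

```rust
use std::collections::BTreeMap;

/// (vendor, product) pair; a hypothetical stand-in for the JSON value.
type Entry = (String, String);
/// One input source keyed by "vvvv:pppp".
type Source = BTreeMap<String, Entry>;
/// Every claim made for one key, tagged with the source name.
type Claims = Vec<(String, Entry)>;

struct Merged {
    /// Winning entry per key; BTreeMap keeps keys alphabetically sorted.
    data: BTreeMap<String, Entry>,
    /// Only keys where at least two claims disagreed (assumed shape).
    conflicts: BTreeMap<String, Claims>,
}

/// `sources` must be ordered from highest to lowest priority.
fn merge(sources: &[(&str, Source)]) -> Merged {
    // Collect all claims per lowercase key, preserving priority order.
    let mut seen: BTreeMap<String, Claims> = BTreeMap::new();
    for (name, source) in sources {
        for (key, entry) in source {
            seen.entry(key.to_lowercase())
                .or_default()
                .push((name.to_string(), entry.clone()));
        }
    }

    let mut data = BTreeMap::new();
    let mut conflicts = BTreeMap::new();
    for (key, claims) in seen {
        // The first claim comes from the highest-priority source that has the key.
        let winner = claims[0].1.clone();
        let disagree = claims.iter().any(|(_, entry)| *entry != winner);
        data.insert(key.clone(), winner);
        if disagree {
            conflicts.insert(key, claims);
        }
    }
    Merged { data, conflicts }
}

/// Small helper to build a source from literals.
fn source(rows: &[(&str, &str, &str)]) -> Source {
    rows.iter()
        .map(|(k, v, p)| (k.to_string(), (v.to_string(), p.to_string())))
        .collect()
}

fn main() {
    let merged = merge(&[
        ("usb-ids-rs", source(&[("feed:c0de", "Feedface Inc", "Coded Widget")])),
        ("linux-usb.org", source(&[("FEED:C0DE", "Feedface", "Widget")])),
    ]);
    println!("{} entries, {} conflicts", merged.data.len(), merged.conflicts.len());
}
```

Collecting claims first and picking the winner second keeps the conflict log complete even when the winning source appears alongside several lower-priority ones.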
+ +## File layout (on `main` — minimal: resolver + workflow trigger only) + +``` +crates/fbuild-core/ + Cargo.toml # +usb-ids = "1.2025" + src/lib.rs # +pub mod usb; + src/usb/ + mod.rs # re-exports + resolver.rs # resolve(vid, pid), tiers + data.rs # online JSON load + OnceLock cache + tests.rs # FTDI / CP210x / CH340 / unknown + +crates/fbuild-daemon/ + src/device_manager.rs # enrich description via resolver + src/models.rs # +vendor_name/product_name on DeviceInfo, etc. + src/handlers/devices.rs # populate the new fields + +crates/fbuild-cli/ + src/cli/device.rs # display "vendor product (VVVV:PPPP)" in list/status + +.github/workflows/ + nightly-usb-ids.yml # cron + workflow_dispatch — checks out + # online-data into a worktree, builds the + # tools FROM THERE, commits data back. + # No tooling lives on main. + +docs/ + online-data.md # documents the branch + workflow + schema +``` + +## File layout (on `online-data` orphan branch — tools + data) + +``` +README.md # explains: orphan, do-not-merge, structure +manifest.json # future-forward index of datasets +data/usb-vid.json # {} initially, populated by workflow +data/usb-vid-conflicts.json # {} initially +.gitignore # bury *.bak / *.tmp + +tools/usb-ids-dump/ # standalone, NOT a workspace crate (and + Cargo.toml # not on main at all → no impact on the + src/main.rs # main-branch crate-gate / monocrate policy) + README.md +tools/merge_sources.py # union + sort + manifest emit +tools/README.md # how the nightly workflow uses these +``` + +> The workflow YAML still lives on `main` (GitHub requires `schedule` and +> `workflow_dispatch` triggers to be defined on the default branch). The +> workflow itself does nothing except: `git worktree add` the `online-data` +> branch, run the tools from there, and commit data files back. 
+ +## API shape — `fbuild_core::usb` + +```rust +#[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)] +pub struct UsbInfo { + pub vendor: String, + pub product: String, +} + +/// Best-effort resolve; never returns None. If unknown, returns +/// `UsbInfo { vendor: "Unknown vendor 0xVVVV", product: "Unknown product 0xPPPP" }`. +pub fn resolve(vid: u16, pid: u16) -> UsbInfo; + +/// Only returns Some if either the bundled crate or the online cache resolved. +pub fn try_resolve(vid: u16, pid: u16) -> Option<UsbInfo>; + +/// Tier-1 only (bundled `usb-ids`). +pub fn resolve_bundled(vid: u16, pid: u16) -> Option<UsbInfo>; + +/// Install an override map. Called by the daemon at startup with the +/// path to `~/.fbuild/cache/usb-vid.json` (or wherever the CLI cached +/// the last download). +pub fn install_online_cache(path: &std::path::Path); + +/// Convenience pretty formatter: +/// "vendor product (VVVV:PPPP)" +/// Used by the CLI's device list / connect / scan log lines. +pub fn pretty(vid: u16, pid: u16) -> String; +``` + +Internals: +- Static `OnceLock<HashMap<u32, UsbInfo>>` for the online overlay. +- Key packing: `(vid as u32) << 16 | pid as u32`. +- `install_online_cache` reads file once, parses serde_json into the map. Silent on IO error (the resolver still works via tier 1 + fallback formatter). + +## Daemon / CLI wiring + +- `device_manager::refresh_devices` — when building each `DiscoveredDevice`, if `vid` and `pid` are present, call `fbuild_core::usb::resolve(vid, pid)` and stash both `vendor_name` + `product_name` on the device record. The free-form `description` becomes `"{vendor} {product}"` (overriding the bland `usb.product` string from `serialport`). +- `DeviceState`, `DeviceInfo`, `DeviceStatusResponse` — gain `vendor_name: Option<String>` and `product_name: Option<String>`. +- `fbuild-cli` — `device list` and `device status` print the new fields; deploy / monitor log lines that mention a port now use `fbuild_core::usb::pretty(vid, pid)` so the user always sees the friendly name. 
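The overlay and fallback tiers described under "Internals" can be sketched with the standard library alone. This is an illustration under assumptions, not the crate's actual code: it omits tier 1 (the bundled `usb-ids` lookup) so that it has no dependencies, `pack` and `install_overlay` are hypothetical helper names, and lowercase hex in the formatted output is assumed from the JSON `key_format` rather than confirmed.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UsbInfo {
    pub vendor: String,
    pub product: String,
}

/// Online overlay, installed at most once per process.
static OVERLAY: OnceLock<HashMap<u32, UsbInfo>> = OnceLock::new();

/// Pack VID:PID into one u32 key: VID in the high 16 bits, PID in the low 16.
pub fn pack(vid: u16, pid: u16) -> u32 {
    ((vid as u32) << 16) | (pid as u32)
}

/// Install the overlay map (hypothetical helper; the plan loads it from a
/// JSON file instead). A second call is silently ignored.
pub fn install_overlay(map: HashMap<u32, UsbInfo>) {
    let _ = OVERLAY.set(map);
}

/// Some only when the overlay knows the device (tier 1 omitted in this sketch).
pub fn try_resolve(vid: u16, pid: u16) -> Option<UsbInfo> {
    OVERLAY.get()?.get(&pack(vid, pid)).cloned()
}

/// Never fails: falls back to synthetic placeholder names.
pub fn resolve(vid: u16, pid: u16) -> UsbInfo {
    try_resolve(vid, pid).unwrap_or_else(|| UsbInfo {
        vendor: format!("Unknown vendor 0x{vid:04x}"),
        product: format!("Unknown product 0x{pid:04x}"),
    })
}

/// "vendor product (vvvv:pppp)"; hex case is an assumption of this sketch.
pub fn pretty(vid: u16, pid: u16) -> String {
    let info = resolve(vid, pid);
    format!("{} {} ({vid:04x}:{pid:04x})", info.vendor, info.product)
}
```

Because `OnceLock::set` succeeds only once, a caller that wants to swap in a fresher overlay mid-process would need a different container, for example a lock around the map. The plan only calls for a single install at daemon startup, so the sketch keeps the simpler form.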
+ +## Workflow design — `.github/workflows/nightly-usb-ids.yml` + +- Trigger: `schedule: cron '17 4 * * *'` (daily) + `workflow_dispatch`. +- Runner: ubuntu-latest. +- Permissions: `contents: write` (must push to `online-data`). +- Concurrency: `cancel-in-progress: false`, group `nightly-usb-ids` (don't trample a running update). + +Steps (high-level): +1. `actions/checkout@v6` (default — `main`, for the dump binary + script). +2. `setup-uv` + `setup-soldr` (same versions as `check-ubuntu.yml`). +3. `soldr cargo build --release --manifest-path ci/usb_ids/Cargo.toml` → produces `ci/usb_ids/target/release/dump-usb-ids`. **If this step fails, the workflow continues with the existing data — fault tolerance #1.** We do this by `continue-on-error: true` + a step output flag. +4. Run the dump: redirect stdout to `/tmp/source-usb-ids-rs.json`. Sanity-check entry count (>= 5000) — else mark as failed-but-non-fatal. +5. Download external sources with `curl --retry 5 --retry-delay 10 --fail` into `/tmp/`. Each download is independently fault-tolerant: a failure flags the source as missing for this run but does NOT abort the workflow. +6. `uv run python ci/usb_ids/merge_sources.py --rs … --txt … --out-dir /tmp/merged`. The script: + - reads only the sources that arrived intact this run, + - falls back to the previously-committed `data/usb-vid.json` from `online-data` for any missing tier, + - refuses to emit a `usb-vid.json` with fewer than 1000 entries (sanity floor), + - sorts keys alphabetically (`json.dumps(..., sort_keys=True, indent=2)` with stable encoding), + - writes `usb-vid.json`, `usb-vid-conflicts.json`, `manifest.json`. +7. Fetch + worktree `online-data` (we keep the branch in a separate `git worktree`, so we never touch the workflow checkout). +8. Replace the data files in the worktree only with files the merger actually emitted. If the merger emitted nothing for a given file, leave the existing committed copy in place — fault tolerance #2. +9. 
`git add manifest.json data/`. If `git diff --cached --quiet`, skip the commit (no churn commits). +10. Otherwise commit with a message like `chore(usb-ids): nightly refresh 2026-06-20 (sources: rs, linux-usb, github)` listing which sources contributed. +11. Prune history to last 200 commits: count via `git rev-list --count HEAD`; if > 200, find the boundary commit, `git replace --graft` it as a new root, run `git filter-repo` (or `filter-branch` if filter-repo isn't on the runner) to rewrite, clean the replace refs. +12. `git push --force-with-lease origin online-data` (force is needed only when history was pruned; otherwise a plain push). + +No build artifacts saved. The merger writes to `/tmp/`; the workflow only commits `manifest.json` + `data/*.json` + the existing `README.md` in `online-data`. + +## `manifest.json` schema (future-forward) + +```json +{ + "schema_version": "1.0", + "generated_at": "2026-06-20T04:17:00Z", + "datasets": { + "usb-vid": { + "description": "USB VID:PID → {vendor, product} (union of multiple sources)", + "url": "https://raw.githubusercontent.com/fastled/fbuild/online-data/data/usb-vid.json", + "conflicts_url": "https://raw.githubusercontent.com/fastled/fbuild/online-data/data/usb-vid-conflicts.json", + "format": "json-object", + "key_format": "vvvv:pppp (lowercase hex, colon-separated)", + "entries": 24536, + "sources": [ + {"name": "usb-ids-rs", "version": "1.2025.2"}, + {"name": "linux-usb.org", "fetched_at": "2026-06-20T04:17:11Z"}, + {"name": "usbids/usbids", "fetched_at": "2026-06-20T04:17:13Z"} + ] + } + } +} +``` + +Adding a future dataset (say `pci-vid`) means appending another entry under +`datasets` — no clients break. + +## `usb-vid.json` schema + +```json +{ + "0403:6001": {"vendor": "Future Technology Devices International, Ltd", "product": "FT232 Serial (UART) IC"}, + "10c4:ea60": {"vendor": "Silicon Labs", "product": "CP210x UART Bridge"}, + ... 
+} +``` + +Alphabetical sort by key (`json.dumps(sort_keys=True)`); 2-space indent; trailing newline. + +## `usb-vid-conflicts.json` schema + +```json +{ + "0403:6001": [ + {"source": "usb-ids-rs", "vendor": "...", "product": "..."}, + {"source": "linux-usb.org","vendor": "...", "product": "..."} + ] +} +``` + +Only entries that actually had disagreement appear here; the chosen winner is +the one in `usb-vid.json` (priority order above). + +## Acceptance plan (executable) + +1. `soldr cargo test -p fbuild-core usb::` — unit tests for `resolve()` pass for FTDI / CP210x / CH340 / Espressif / unknown. +2. `soldr cargo test -p fbuild-daemon` — DeviceManager tests still pass; new test confirms enriched description. +3. `soldr cargo clippy --workspace --all-targets -- -D warnings` clean. +4. PR open on a feature branch; `crate-gate.yml` passes (we did not add a workspace crate). +5. Push `online-data` orphan branch with seed contents — verify `https://raw.githubusercontent.com/fastled/fbuild/online-data/manifest.json` returns the seed manifest. +6. Merge PR. +7. From the Actions tab, manually run `nightly-usb-ids` via `workflow_dispatch`. +8. After the run succeeds, refetch `manifest.json` and confirm: + - `entries >= 20000` + - `sources` lists `usb-ids-rs` + the two text sources + - the `url` field actually serves a JSON object with `>= 20000` keys +9. Goal hook should auto-clear once all of the above are demonstrated. + +## Review (filled in at the end) + +(left blank for now — to be appended once everything is merged and verified)