Goal
zccache ships three dylints (ban_std_pathbuf, ban_unrooted_tempdir, ban_raw_subprocess_in_daemon) to lock in architectural invariants that have caused real bugs. fbuild has the equivalent of the third (ban_raw_subprocess) but not the others, and the codebase has its own set of recurring gotchas worth pinning with custom lints.
This issue is a punch list from a multi-agent sweep across six gotcha dimensions: path normalization, async/sync boundaries, process & device containment, module-level env state, test isolation, and panic discipline. Each item lists a candidate lint name, why it matters (with a concrete bug class the rule prevents), evidence file:line examples, and a "live violations" count to help triage which lints land as net-new enforcement vs. fix-and-then-lint.
Candidate lints (ranked by impact)
Tier 1 — direct ports from zccache, would catch real bugs today
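1. ban_std_pathbuf — direct port from zccache
std::path::PathBuf doesn't carry fbuild's normalization invariant. Mixed-separator and case-variant paths flowing into cache keys / fingerprints / compile args have caused Windows-only cache key mismatches (#436, #437, #282).
fbuild already has a NormalizedPath type at crates/fbuild-core/src/path.rs; the lint forces its use outside an explicit legacy allowlist.
Evidence: in crates/fbuild-build/src/build_info.rs, the BuildInfo struct uses NormalizedPath for its *_path fields (correct pattern), but in crates/fbuild-packages/src/cache.rs, Cache::cache_root: PathBuf (raw) gets serialized, and in crates/fbuild-build/src/zccache.rs:34, FingerprintWatch::{cache_file, root}: PathBuf participates in cache-key generation.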
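To make the bug class concrete, here is a minimal sketch of how two spellings of the same Windows path produce different cache keys until they are normalized. The normalize_for_key and cache_key functions are hypothetical stand-ins written for this illustration; they are not fbuild's NormalizedPath API, and the lowercase folding assumes a case-insensitive filesystem.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a normalizing path type; NOT fbuild's real
/// `NormalizedPath`. Folds `\` to `/` and lowercases ASCII letters
/// (the latter is only valid on case-insensitive filesystems).
fn normalize_for_key(raw: &str) -> String {
    raw.replace('\\', "/").to_ascii_lowercase()
}

/// Hash a string roughly the way a cache-key builder might.
fn cache_key(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}
```

Feeding the raw strings C:\Users\dev\project\src\main.cpp and c:/users/dev/project/src/main.cpp into cache_key gives two different keys for one file; running both through normalize_for_key first gives the same key.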
2. ban_unrooted_tempdir — direct port from zccache
tempfile::TempDir::new() / tempfile::tempdir() land in $TMPDIR, which on Windows hits MAX_PATH and on CI may be tmpfs/RAM-disk — known cause of flaky tests.
~250 unrooted-TempDir uses across the workspace (production + tests). Highest concentration in crates/fbuild-packages/src/toolchain/esp32.rs (49+) and crates/fbuild-test-support/src/mini_framework.rs:76.
Allowlist legacy callers; force new code to pass an explicit root path.
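As a rough illustration of "pass an explicit root path", the following std-only sketch creates a temp directory under a caller-supplied root and removes it on drop. RootedTempDir and new_in are hypothetical names invented for this example, not an existing fbuild or zccache helper.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical helper: a temp directory created under an explicit root
/// instead of the process-wide temp location. Removed on drop.
pub struct RootedTempDir {
    path: PathBuf,
}

/// Per-process counter so repeated calls get distinct directory names.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

impl RootedTempDir {
    /// Create `<root>/<prefix>-<pid>-<counter>`; the caller must supply `root`.
    pub fn new_in(root: &Path, prefix: &str) -> io::Result<Self> {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let path = root.join(format!("{prefix}-{}-{id}", std::process::id()));
        fs::create_dir_all(root)?;
        // `create_dir` fails if the directory already exists, so two callers never share one.
        fs::create_dir(&path)?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for RootedTempDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored because Drop cannot return them.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

With the tempfile crate itself, tempfile::tempdir_in(root) and tempfile::Builder::new().tempdir_in(root) are the rooted counterparts of the two banned calls, so migrated call sites would not need a custom type like this one.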
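3. ban_std_sync_mutex_in_async / prefer_tokio_mutex_in_handlers
std::sync::Mutex in async-reachable code is a poison-panic + scheduler-starvation hazard. clippy's await_holding_lock catches the worst cases but misses many.
~40 std::sync::Mutex + RwLock declarations across fbuild-daemon and fbuild-serial. Examples: crates/fbuild-daemon/src/device_manager.rs:209,288, crates/fbuild-daemon/src/context.rs:580, crates/fbuild-daemon/src/handlers/health.rs:193 (.write().unwrap() on RwLock).
Implementation note: the lint should target the declaration in types reachable from async fn, not just the .await call site clippy already covers.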
Tier 2 — high-value lints to lock in current clean state and prevent regression
4. ban_direct_serialport — explicitly tracked at #605 Phase 1
Only fbuild-serial may touch the serialport crate; everyone else uses SharedSerialManager via the daemon's HTTP API.
Live violations (intentional, exempt): crates/fbuild-cli/src/cli/port_scan.rs:63, crates/fbuild-cli/src/cli/serial_probe.rs:224,272 — diagnostic subcommands per the crates/CLAUDE.md exception list.
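Allowlist: fbuild-serial (owner), fbuild-deploy (hardware reset ops), plus the two diagnostic CLI files above.
5. ban_deploy_tool_direct_invocation
Nothing should invoke esptool / avrdude / picotool / dfu-util / pyocd directly; go through fbuild deploy. The existing ban_raw_subprocess catches .spawn() / .output() / .status() on Command but doesn't pattern-match on the binary name.
Currently clean: no Command::new("esptool" | "avrdude" | ...) matches. The lint locks in the invariant.
Allowlist: fbuild-deploy (legitimate owner), tests with --ignored hardware-required gates.
6. ban_file_based_locks
Currently clean: OpenOptions::create_new(true), fs2::FileExt, flock() all return zero hits.
7. ban_await_in_spawn_blocking
spawn_blocking is a sync escape hatch; .await inside requires Handle::current().block_on(...) which defeats the point and risks deadlock.
Currently clean: the spawn_blocking call sites in fbuild-packages/src/disk_cache.rs and friends are purely sync.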
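For ban_deploy_tool_direct_invocation, the missing piece relative to ban_raw_subprocess is a check on the program name passed to Command::new. A real dylint would inspect the HIR of the call expression; the sketch below only shows the name-matching core, and both the function name and the matching rules (ignore directory prefix, extension, and ASCII case) are assumptions.

```rust
use std::path::Path;

/// Tool names as listed for ban_deploy_tool_direct_invocation.
const DEPLOY_TOOLS: [&str; 5] = ["esptool", "avrdude", "picotool", "dfu-util", "pyocd"];

/// Return true when `program` (as passed to `Command::new`) names a deploy tool,
/// ignoring any directory prefix and a file extension such as `.exe` or `.py`.
fn is_deploy_tool(program: &str) -> bool {
    Path::new(program)
        .file_stem()
        .and_then(|stem| stem.to_str())
        .map(|stem| DEPLOY_TOOLS.iter().any(|tool| stem.eq_ignore_ascii_case(tool)))
        .unwrap_or(false)
}
```

A string-literal check like this cannot see program names built at runtime, so it would complement ban_raw_subprocess, not replace it.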
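Tier 3 — error-handling discipline (high-volume, allowlist-heavy)
8. ban_unwrap_in_daemon_handlers (or narrower: ban_serde_json_unwrap_in_handlers)
A panic inside a handler kills the daemon and disconnects every client. clippy's unwrap_used is too noisy for the whole workspace; scoping to crates/fbuild-daemon/src/handlers/** makes it precise.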
~8 high-confidence violations all in one file: crates/fbuild-daemon/src/handlers/websockets.rs:119,131,140,173,187 (serde_json::to_string(&err_msg).unwrap() in error-reply paths).
2 violations on streaming output parsing: crates/fbuild-daemon/src/handlers/operations/build.rs:639,683 (str::from_utf8(&chunk).unwrap() on subprocess stdout — toolchains can and do emit invalid UTF-8).
Recommended scope: handlers/ only; fix the ~10 sites first, then enable the lint to prevent regression.
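For the streaming-output sites, one possible non-panicking replacement for str::from_utf8(&chunk).unwrap() is sketched below. The function names are illustrative; which variant fits depends on whether the handler should forward degraded text or report the bad chunk.

```rust
/// Decode a chunk of subprocess output without panicking on invalid UTF-8.
/// Invalid byte sequences become U+FFFD instead of aborting the handler.
fn decode_chunk(chunk: &[u8]) -> String {
    String::from_utf8_lossy(chunk).into_owned()
}

/// Fallible alternative for callers that want to surface the decoding error.
fn decode_chunk_strict(chunk: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(chunk)
}
```

One caveat for chunked streams: a multi-byte character split across two chunks is replaced by the lossy variant, so a handler that cares about exact output would need to buffer incomplete trailing bytes.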
9. require_env_lock_in_tests
Process-wide env mutation in parallel-running cargo tests is a classic flake source. crates/fbuild-daemon/src/handlers/emulator/tests_npm_cache.rs::env_lock() is the canonical pattern.
Violations: crates/fbuild-build/tests/eh_frame_strip_esp32.rs:93,109,112,113,125 mutate FBUILD_KEEP_EH_FRAME without locking. crates/fbuild-build/tests/avr_build.rs:401 has an EnvGuard but it's per-test, not cross-test serializing.
Scope: #[test] and #[tokio::test] functions only.
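A cross-test serializing lock can be as small as the sketch below. It shows the general shape only: the real env_lock() in tests_npm_cache.rs may differ, and with_env_var is a hypothetical convenience wrapper. The unsafe blocks match the edition-2024 signature of set_var and remove_var; on older editions they only produce an unused-unsafe warning.

```rust
use std::sync::{Mutex, MutexGuard};

/// One process-wide lock shared by every test that touches the environment.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Acquire the lock, recovering from poisoning: the unit payload cannot be
/// left in a bad state by a test that panicked while holding it.
fn env_lock() -> MutexGuard<'static, ()> {
    ENV_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Run `body` with `key` set to `value`, then restore the previous state.
/// If `body` panics the variable is not restored; a Drop guard would cover that.
fn with_env_var<T>(key: &str, value: &str, body: impl FnOnce() -> T) -> T {
    let _guard = env_lock();
    let previous = std::env::var_os(key);
    // SAFETY: the lock serializes env mutation among callers of this helper.
    unsafe { std::env::set_var(key, value) };
    let result = body();
    match previous {
        Some(old) => unsafe { std::env::set_var(key, old) },
        None => unsafe { std::env::remove_var(key) },
    }
    result
}
```

The lock only helps if every env-mutating test goes through it, which is the gap a per-test EnvGuard leaves open and the reason a lint is useful here.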
Tier 4 — preventive lints worth considering, lower priority
10. ban_env_var_set_after_import — Rust analog of the Python paths.py import-time bug. crates/fbuild-daemon/src/main.rs:38 does unsafe { std::env::set_var("FBUILD_DEV_MODE", "1") } AFTER importing fbuild_paths / fbuild_build. fbuild_paths uses lazy accessors (safe), but any future OnceLock/Lazy that reads FBUILD_DEV_MODE at first-access time would freeze the wrong value.
11. require_multi_thread_flavor_when_spawning — #[tokio::test] defaults to current-thread; tests that tokio::spawn() may serialize and deadlock. Examples: crates/fbuild-daemon/tests/build_streaming.rs:33-42, crates/fbuild-serial/src/manager/tests.rs:252.
12. cli_no_build_deploy_direct_use — fbuild-cli is supposed to be a thin HTTP client. Currently clean (diagnostic subcommands are explicitly exempted per crates/CLAUDE.md) — lint locks it in.
13. require_oncelock_install_before_use — crates/fbuild-build/src/compile_backend.rsGLOBAL: OnceLock<CompileBackend> requires install_global() before any compile fires. The existing error message is good, but a lint that flags .unwrap() on get_global() outside of compile-flow contexts would catch ordering bugs at compile-time instead of runtime.
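The freeze that ban_env_var_set_after_import guards against is easy to reproduce. In this sketch dev_mode is a hypothetical cached accessor of the kind a future change might introduce; nothing like it is claimed to exist in fbuild today.

```rust
use std::sync::OnceLock;

/// Hypothetical cached flag: the environment is read exactly once, on first
/// access, and the result is kept for the lifetime of the process.
fn dev_mode() -> bool {
    static DEV_MODE: OnceLock<bool> = OnceLock::new();
    *DEV_MODE.get_or_init(|| std::env::var_os("FBUILD_DEV_MODE").is_some())
}
```

If anything calls dev_mode() before main reaches its set_var("FBUILD_DEV_MODE", "1") line, the accessor keeps returning false afterwards, because the OnceLock has already been initialized from the unset variable.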
Implementation notes
Each lint should follow the pattern in dylints/ban_raw_subprocess/ and zccache's dylints/ban_std_pathbuf/: separate crate pinning its own nightly toolchain, registered via [workspace.metadata.dylint] libraries = [{ path = "dylints/*" }].
Allowlists: zccache's ban_std_pathbuf uses include_str!("allowlist.txt") for legacy paths. Adopt the same shape so the grandfathered call sites are visible in the repo, not buried in lint code.
Order: land Tier 1 first (ban_std_pathbuf, ban_unrooted_tempdir, ban_std_sync_mutex_in_async) — these catch current bugs. Tier 2 prevents regression on already-clean invariants. Tier 3 needs a fix-first sweep before enabling. Tier 4 is "if/when we have time."
CI: .github/workflows/dylint.yml already runs cargo dylint --all, so new crates added under dylints/* are picked up automatically.
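For the allowlist note above, a sketch of how a lint crate might consume a plain-text allowlist follows. The file format (one path prefix per line, # comments) and the function names are assumptions, since zccache's actual allowlist.txt layout is not described here; an inline constant stands in for include_str!("allowlist.txt") so the example is self-contained, and its entries are examples only.

```rust
/// Stand-in for `include_str!("allowlist.txt")`; format and entries are assumed.
const ALLOWLIST: &str = "\
# one path prefix per line; '#' starts a comment
crates/fbuild-packages/src/cache.rs
crates/fbuild-build/src/zccache.rs
";

/// Parse the allowlist, skipping blank lines and comments.
fn allowlisted_prefixes(raw: &str) -> Vec<&str> {
    raw.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .collect()
}

/// Return true when `file` falls under any allowlisted prefix.
/// Backslashes are folded first so Windows-style paths match too.
fn is_allowlisted(file: &str, raw: &str) -> bool {
    let file = file.replace('\\', "/");
    allowlisted_prefixes(raw)
        .iter()
        .any(|prefix| file.starts_with(prefix))
}
```

Keeping the list in a text file means a reviewer sees each grandfathered path as a one-line diff when it is added or removed.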
Out of scope (mentioned, deliberately deferred)
require_normalized_path_for_cache_keys (Tier 1 candidate but overlaps with ban_std_pathbuf — fold into it).
Mocks-as-primary-test-surface (a CLAUDE.md principle but very hard to lint without false positives).
Ignored-test rot audit (process question, not a lint).
process::exit outside main (current state clean; very low priority).
🤖 Drafted by Claude Code from a 5-agent parallel sweep