diff --git a/.gitignore b/.gitignore
index 6ce6535a..49d2bcd2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -87,3 +87,9 @@ tasks/loop-runs/
 !.clud/
 !.clud/settings.json
 .claude/tmp/
+
+# `fbuild port scan` ad-hoc diagnostic captures (added with PR #741).
+# Users sometimes redirect stderr/stdout to inspect the device list;
+# the captures are local-only and should never be checked in.
+port_scan_stderr*.txt
+port_scan_stdout*.txt
diff --git a/ci/bench-results/REPORT.md b/ci/bench-results/REPORT.md
index 953f6f1a..e0bb440e 100644
--- a/ci/bench-results/REPORT.md
+++ b/ci/bench-results/REPORT.md
@@ -4,26 +4,24 @@
 
 User's stated goal: *"rebuild incrementally when rust changes, don't tolerate
 stale artifacts."* uv should auto-trigger rebuild when a `.rs` file is edited,
-and that rebuild should be at the soldr-incremental floor.
+the rebuild should be at the cargo-incremental floor, and untouched-source
+reinstalls (version bumps, lockfile churn) should be effectively free.
 
-## What was actually wrong
-
-Two distinct problems were getting conflated:
+## Two distinct problems were getting conflated
 
 1. **Forced reinstall path** — `uv sync --reinstall-package fbuild` rebuilds
    the wheel via setup.py → `soldr cargo build`. Even when no source had
-   actually changed, this was 25-30s (cold cargo cache in a temp dir from
-   PEP 517 build isolation).
-2. **Edit-detection path** — when fbuild is editable (which it is:
-   `source = { editable = "." }` in uv.lock), uv decides whether to re-sync
-   based on `[tool.uv] cache-keys = [...]`. The default cache-keys only
-   watches `pyproject.toml`. So `.rs` edits were silently producing **stale
-   artifacts** — `uv run fbuild ...` would use whatever `_native.pyd` was
-   last built, no matter what you edited.
-
-## Fixes applied
-
-### setup.py
+   changed, this was 14.9s (cold cargo cache in a temp dir from PEP 517
+   build isolation).
+2. **Real-edit rebuild path** — when an `.rs` file actually changes,
+   `cargo build --release` cascaded through 8+ first-party crates with
+   opt-level=3 codegen + slow `link.exe` linker → **100s** for a single
+   one-line edit.
+
+## Fixes applied (two PRs, both merged)
+
+### PR #743 — no-source-change reinstall path
+
 - **`CARGO_TARGET_DIR` pinned** to `~/.fbuild/cargo-target/wheel-build`
   (absolute, persistent). Survives PEP 517 temp-dir copies. Deliberately
   separate from `/target/` so it doesn't churn against `soldr cargo
@@ -31,8 +29,6 @@ Two distinct problems were getting conflated:
 - **mtime-skip in `BuildWithCargo.run`**: if the staged binary is newer
   than every `.rs` / `Cargo.toml` / `Cargo.lock` / `rust-toolchain.toml`,
   skip the cargo invocation entirely.
-
-### pyproject.toml
 - **`[tool.uv] no-build-isolation-package = ["fbuild"]`** — build runs in
   the real repo against the real venv, not a temp copy. mtime-skip can then
   see the persistent `ci/bin/` staged binary.
@@ -41,69 +37,93 @@ Two distinct problems were getting conflated:
   chicken-and-egg.
 - **`[tool.uv] cache-keys = [..., "crates/**/*.rs", ...]`** — uv re-syncs
   the editable fbuild install when any of these change. **This is what
-  prevents the "stale artifact" failure.** It also means `uv run` after a
-  `.rs` edit now actually costs cargo + linker time (no free lunch).
+  prevents the "stale artifact" failure.** Without it, `.rs` edits silently
+  produce a `_native.pyd` mismatch.
 
-## Measurements
+### PR #744 — real-edit rebuild path
+
+- **`setup.py` defaults to the dev profile**, not `--release`. Set
+  `FBUILD_BUILD_RELEASE=1` to opt back in for perf tests. The PyPI release
+  flow bypasses setup.py (it calls `cargo zigbuild --release` directly in
+  `release-auto.yml`), so published wheels are unaffected.
+- **`[profile.dev.package."*"] opt-level = 3`** in `Cargo.toml` — third-party
+  deps stay optimized even in dev profile, so runtime hot paths (serde,
+  tokio, reqwest) don't tank.
+- **`rust-lld` as the Windows linker** via `.cargo/config.toml`. Ships with
+  the Rust toolchain, faster than `link.exe`, cross-profile.
 
-All scenarios use a real content edit (append `\n` to the file), not just a
-mtime touch — touched-but-unchanged files would hit zccache on rustc and
-not exercise the real rebuild path.
+## Measurements
 
-### Scenario 1 — forced reinstall, no source change
+All scenarios use a *real* content edit (append `\n` to the file), not just a
+mtime touch — touched-but-unchanged files would hit zccache on rustc and not
+exercise the real rebuild path.
 
-This fires on version bumps, lockfile churn, explicit `--reinstall-package
-fbuild`. The mtime-skip fast path:
+### Headline: real `.rs` edit + `uv run python --version`
 
-| | Baseline | After fixes | Speedup |
-|---|---:|---:|---:|
-| `uv sync --reinstall-package fbuild` | **14.9s** | **1.1s** | **13.6×** |
+| Build profile (setup.py) | Time |
+|---|---:|
+| Release (pre-#744) | **100.1s** |
+| Dev (post-#744, current default) | **18.9s** |
 
-### Scenario 2 — real `.rs` edit + `uv run` (cache-keys watching `.rs`)
+**5.3× speedup** on the path that fires when a Rust source actually changes.
 
-This is the "no stale artifacts" path:
+### No source change + forced reinstall
 
-| | Baseline | After fixes |
+| | Pre-#743 | Post-#743 |
 |---|---:|---:|
-| `.rs` edit + `uv run python --version` (round 1) | 15.7s | 14.4s |
-| `.rs` edit + `uv run python --version` (round 2, warmer cache) | 15.9s | 14.3s |
+| `uv sync --reinstall-package fbuild` | **14.9s** | **1.1s** |
 
-**Only ~1-2s saved on this path.** The bottleneck is cargo recompiling
-`fbuild-core` (zccache misses because content actually changed) + cascading
-to dependents + linking `fbuild-cli` on Windows. zccache doesn't cache the
-link step. `CARGO_INCREMENTAL=1` made no measurable difference on this
-workspace (release profile already strips intermediates).
+**13.6× speedup**. mtime-skip never invokes cargo.
 
-### Scenario 3 — warm `uv run` (no edit)
+### Warm `uv run` (no edit at all)
 
-| | Baseline | After fixes |
+| | Pre | Post |
 |---|---:|---:|
-| `uv run python --version` | 110ms | 100ms |
+| `uv run python --version` | ~110ms | ~100ms |
+
+Unchanged — uv's audit-only path is already optimal.
+
+## What's left: the touch-only / soldr overhead floor
+
+Touch-only edits (any tool bumps a `.rs` file's mtime without changing
+content) still cost ~14s in the `uv run` path. Profiling pinned it precisely:
+
+| Component | Time |
+|---|---:|
+| Cargo's own `Finished` report | 1.8s |
+| **soldr + zccache wrapper overhead (RUSTC_WRAPPER='' baseline)** | **7.6s** |
+| uv sync + reinstall (mtime-skip fires, zero cargo) | 1.5s |
+| **Total touch-only rebuild via `uv run`** | **~9-15s** |
+
+The 5 consecutive identical builds spanned 7.6s — 19s. The variance
+correlates with `--private-daemon` lifecycle behavior in soldr's zccache
+wrapper. Filed upstream as **[soldr#883](https://github.com/zackees/soldr/issues/883)**
+with full reproduction script and environment data. When that's resolved,
+the touch-only path should land closer to ~3-4s.
+
+## How to reproduce
 
-Unchanged. uv's audit-only path is already optimal.
+```bash
+python ci/bench_uv_run.py