Merged
6 changes: 6 additions & 0 deletions .gitignore
@@ -87,3 +87,9 @@ tasks/loop-runs/
!.clud/
!.clud/settings.json
.claude/tmp/

# `fbuild port scan` ad-hoc diagnostic captures (added with PR #741).
# Users sometimes redirect stderr/stdout to inspect the device list;
# the captures are local-only and should never be checked in.
port_scan_stderr*.txt
port_scan_stdout*.txt
148 changes: 84 additions & 64 deletions ci/bench-results/REPORT.md
@@ -4,35 +4,31 @@

User's stated goal: *"rebuild incrementally when rust changes, don't tolerate
stale artifacts."* uv should auto-trigger rebuild when a `.rs` file is edited,
and that rebuild should be at the soldr-incremental floor.
the rebuild should be at the cargo-incremental floor, and untouched-source
reinstalls (version bumps, lockfile churn) should be effectively free.

## What was actually wrong

Two distinct problems were getting conflated:
## Two distinct problems were getting conflated

1. **Forced reinstall path** — `uv sync --reinstall-package fbuild` rebuilds
the wheel via setup.py → `soldr cargo build`. Even when no source had
actually changed, this was 25-30s (cold cargo cache in a temp dir from
PEP 517 build isolation).
2. **Edit-detection path** — when fbuild is editable (which it is:
`source = { editable = "." }` in uv.lock), uv decides whether to re-sync
based on `[tool.uv] cache-keys = [...]`. The default cache-keys only
watches `pyproject.toml`. So `.rs` edits were silently producing **stale
artifacts** — `uv run fbuild ...` would use whatever `_native.pyd` was
last built, no matter what you edited.

## Fixes applied

### setup.py
changed, this was 14.9s (cold cargo cache in a temp dir from PEP 517
build isolation).
2. **Real-edit rebuild path** — when an `.rs` file actually changes,
`cargo build --release` cascaded through 8+ first-party crates with
opt-level=3 codegen + slow `link.exe` linker → **100s** for a single
one-line edit.

## Fixes applied (two PRs, both merged)

### PR #743 — no-source-change reinstall path

- **`CARGO_TARGET_DIR` pinned** to `~/.fbuild/cargo-target/wheel-build`
(absolute, persistent). Survives PEP 517 temp-dir copies. Deliberately
separate from `<repo>/target/` so it doesn't churn against `soldr cargo
build` from the dev CLI.
- **mtime-skip in `BuildWithCargo.run`**: if the staged binary is newer than
every `.rs` / `Cargo.toml` / `Cargo.lock` / `rust-toolchain.toml`, skip the
cargo invocation entirely.

### pyproject.toml
- **`[tool.uv] no-build-isolation-package = ["fbuild"]`** — build runs in
the real repo against the real venv, not a temp copy. mtime-skip can then
see the persistent `ci/bin/` staged binary.
@@ -41,69 +37,93 @@ Two distinct problems were getting conflated:
chicken-and-egg.
- **`[tool.uv] cache-keys = [..., "crates/**/*.rs", ...]`** — uv re-syncs
the editable fbuild install when any of these change. **This is what
prevents the "stale artifact" failure.** It also means `uv run` after a
`.rs` edit now actually costs cargo + linker time (no free lunch).
prevents the "stale artifact" failure.** Without it, `.rs` edits silently
produce a `_native.pyd` mismatch.
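
The mtime-skip described above can be sketched roughly as follows. This is a minimal illustration, not the code in `setup.py`: the function names, the `WATCHED_NAMES` set, and the `target/` exclusion are assumptions, since the report only names the real helper (`_staged_binary_is_up_to_date`) without showing its body.

```python
from pathlib import Path

# Hypothetical constant: manifest files that should invalidate the staged binary.
WATCHED_NAMES = {"Cargo.toml", "Cargo.lock", "rust-toolchain.toml"}


def iter_watched_sources(repo_root: Path):
    """Yield every file whose modification should force a cargo rebuild."""
    for path in repo_root.rglob("*"):
        # Assumption: skip cargo's own output directory to avoid walking artifacts.
        if "target" in path.relative_to(repo_root).parts:
            continue
        if path.is_file() and (path.suffix == ".rs" or path.name in WATCHED_NAMES):
            yield path


def staged_binary_is_up_to_date(staged_binary: Path, repo_root: Path) -> bool:
    """Return True only if the staged binary is strictly newer than every watched file."""
    if not staged_binary.is_file():
        return False
    binary_mtime = staged_binary.stat().st_mtime
    return all(
        src.stat().st_mtime < binary_mtime
        for src in iter_watched_sources(repo_root)
    )
```

A build command could call `staged_binary_is_up_to_date(...)` first and return early when it is true, which is what lets the forced-reinstall path avoid invoking cargo at all.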

## Measurements
### PR #744 — real-edit rebuild path

- **`setup.py` defaults to the dev profile**, not `--release`. Set
`FBUILD_BUILD_RELEASE=1` to opt back in for perf tests. The PyPI release
flow bypasses setup.py (it calls `cargo zigbuild --release` directly in
`release-auto.yml`), so published wheels are unaffected.
- **`[profile.dev.package."*"] opt-level = 3`** in `Cargo.toml` — third-party
deps stay optimized even in dev profile, so runtime hot paths (serde,
tokio, reqwest) don't tank.
- **`rust-lld` as the Windows linker** via `.cargo/config.toml`. Ships with
the Rust toolchain, is faster than `link.exe`, and applies to every build profile.
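
The env-gated profile selection might look something like the sketch below. The report names `_use_release_profile()` and `FBUILD_BUILD_RELEASE=1` but does not show the implementation, so the exact parsing (only the literal string `"1"` opts in) and the `cargo_build_args` helper are assumptions. The real build goes through `soldr cargo build`; plain `cargo build` is used here only to keep the example generic.

```python
import os
from typing import List, Mapping, Optional


def use_release_profile(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the caller explicitly opted back into --release."""
    env = os.environ if env is None else env
    # Assumption: only the exact value "1" enables the release profile.
    return env.get("FBUILD_BUILD_RELEASE", "") == "1"


def cargo_build_args(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Build the cargo command line, defaulting to the dev profile."""
    args = ["cargo", "build"]
    if use_release_profile(env):
        args.append("--release")
    return args
```

Defaulting to the dev profile keeps the common edit-and-rerun loop fast, while perf tests can set the variable for a single invocation.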

All scenarios use a real content edit (append `\n` to the file), not just an
mtime touch, because touched-but-unchanged files would hit zccache on rustc and
not exercise the real rebuild path.
## Measurements

### Scenario 1 — forced reinstall, no source change
All scenarios use a *real* content edit (append `\n` to the file), not just an
mtime touch, because touched-but-unchanged files would hit zccache on rustc and
not exercise the real rebuild path.

This fires on version bumps, lockfile churn, explicit `--reinstall-package
fbuild`. The mtime-skip fast path:
### Headline: real `.rs` edit + `uv run python --version`

| | Baseline | After fixes | Speedup |
|---|---:|---:|---:|
| `uv sync --reinstall-package fbuild` | **14.9s** | **1.1s** | **13.6×** |
| Build profile (setup.py) | Time |
|---|---:|
| Release (pre-#744) | **100.1s** |
| Dev (post-#744, current default) | **18.9s** |

### Scenario 2 — real `.rs` edit + `uv run` (cache-keys watching `.rs`)
**5.3× speedup** on the path that fires when a Rust source actually changes.

This is the "no stale artifacts" path:
### No source change + forced reinstall

| | Baseline | After fixes |
| | Pre-#743 | Post-#743 |
|---|---:|---:|
| `.rs` edit + `uv run python --version` (round 1) | 15.7s | 14.4s |
| `.rs` edit + `uv run python --version` (round 2, warmer cache) | 15.9s | 14.3s |
| `uv sync --reinstall-package fbuild` | **14.9s** | **1.1s** |

**Only ~1-2s saved on this path.** The bottleneck is cargo recompiling
`fbuild-core` (zccache misses because content actually changed) + cascading
to dependents + linking `fbuild-cli` on Windows. zccache doesn't cache the
link step. `CARGO_INCREMENTAL=1` made no measurable difference on this
workspace (release profile already strips intermediates).
**13.6× speedup**. mtime-skip never invokes cargo.
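
The speedup figures can be rechecked from the table values with a few lines of Python. The helper name `speedup` is illustrative. The inputs below are the rounded timings printed in the tables, so the reinstall ratio comes out at about 13.5; the quoted 13.6× presumably reflects the unrounded timings in the JSON snapshots.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Ratio of the old wall-clock time to the new one."""
    if after_s <= 0:
        raise ValueError("after_s must be positive")
    return before_s / after_s


# Rounded timings copied from the tables in this report.
rows = [
    ("real .rs edit + uv run", 100.1, 18.9),
    ("forced reinstall, no source change", 14.9, 1.1),
]
for label, before, after in rows:
    print(f"{label}: {speedup(before, after):.1f}x")
```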

### Scenario 3 — warm `uv run` (no edit)
### Warm `uv run` (no edit at all)

| | Baseline | After fixes |
| | Pre | Post |
|---|---:|---:|
| `uv run python --version` | 110ms | 100ms |
| `uv run python --version` | ~110ms | ~100ms |

Unchanged — uv's audit-only path is already optimal.

## What's left: the touch-only / soldr overhead floor

Touch-only edits (any tool bumps a `.rs` file's mtime without changing
content) still cost ~14s in the `uv run` path. Profiling pinned it precisely:

| Component | Time |
|---|---:|
| Cargo's own `Finished` report | 1.8s |
| **soldr + zccache wrapper overhead (RUSTC_WRAPPER='' baseline)** | **7.6s** |
| uv sync + reinstall (mtime-skip fires, zero cargo) | 1.5s |
| **Total touch-only rebuild via `uv run`** | **~9-15s** |

Five consecutive identical builds ranged from 7.6s to 19s. The variance
correlates with `--private-daemon` lifecycle behavior in soldr's zccache
wrapper. Filed upstream as **[soldr#883](https://github.com/zackees/soldr/issues/883)**
with full reproduction script and environment data. When that's resolved,
the touch-only path should land closer to ~3-4s.

## How to reproduce

Unchanged. uv's audit-only path is already optimal.
```bash
python ci/bench_uv_run.py <label>
```

## What's left
Writes `ci/bench-results/<label>.json`. Existing snapshots:

The 14s edit-rebuild is at cargo's incremental + linker floor for this
workspace. Pushing further requires build-tool-level changes:
- `baseline.json` — `main` before either PR. Forced-reinstall = 14.9s.
- `after_fixes.json` — after PR #743 (mtime-skip path). Forced-reinstall = 1.1s.

- **Faster linker** (rust-lld / mold). On Windows, `[target.x86_64-pc-windows-msvc]
linker = "rust-lld.exe"` in `.cargo/config.toml` typically cuts link time
by 30-50%. Risk: occasional linker compat issues. Not applied here — would
affect the dev CLI's `target/` too and needs broader testing.
- **Split `fbuild-cli` into multiple smaller binaries**. Each link would be
cheaper. Out of scope.
- **Migrate off `setup.py` + hand-rolled `ci/publish.py` to `setuptools-rust`**.
Wouldn't affect the rebuild floor, but would make the build-system code
much smaller and remove the dual-path-divergence class of bugs that
prompted this whole investigation. Separate refactor (see earlier thread).
The real-edit-cost measurements in this report were taken with timed manual
runs (`time uv run python --version` after `echo "" >> crates/fbuild-core/src/lib.rs`)
because `bench_uv_run.py` doesn't yet model a semantic edit — it only does
mtime touches, which fail to exercise the dev-vs-release codegen difference.
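
The manual procedure above could be scripted along these lines. This is a sketch of what a semantic-edit mode might look like, not a feature of `bench_uv_run.py`; the function name `time_command_after_edit` is hypothetical.

```python
import subprocess
import time
from pathlib import Path
from typing import List


def time_command_after_edit(source_file: Path, command: List[str]) -> float:
    """Append a newline to source_file, then return wall-clock seconds for command."""
    # A real content change, unlike an mtime touch, defeats content-hash caches.
    with source_file.open("a", encoding="utf-8") as fh:
        fh.write("\n")
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


# Example call mirroring the manual runs in this report (left commented out
# because it modifies a tracked file):
# elapsed = time_command_after_edit(
#     Path("crates/fbuild-core/src/lib.rs"),
#     ["uv", "run", "python", "--version"],
# )
```

Because each call leaves an extra blank line in the source file, the working tree should be restored afterwards, for example with `git checkout -- crates/fbuild-core/src/lib.rs`.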

## Files changed
## Files

- `setup.py` — `CARGO_TARGET_DIR` pin + `_staged_binary_is_up_to_date`
helper + mtime-skip wired into `BuildWithCargo.run`.
- `pyproject.toml` — `no-build-isolation-package`, `default-groups`,
`cache-keys`; setuptools added to `dev`.
- `ci/bench_uv_run.py` — benchmark script (new).
- `setup.py` — `CARGO_TARGET_DIR` pin, `_staged_binary_is_up_to_date`,
`_use_release_profile()` env-gated profile selection.
- `pyproject.toml` — `no-build-isolation-package`, `default-groups = ["dev"]`,
`cache-keys`; setuptools added to `dev` group.
- `Cargo.toml` — `[profile.dev.package."*"] opt-level = 3`.
- `.cargo/config.toml` — `[target.x86_64-pc-windows-msvc] linker = "rust-lld.exe"`.
- `ci/bench_uv_run.py` — benchmark script.
- `ci/bench-results/{baseline,after_fixes}.json` — raw timings.
- `ci/bench-results/REPORT.md` — this file.