WGPU/Metal backend, pure-env config loader, nextest migration (MULTI-1407) #5
Merged
WGPU is wired alongside CUDA behind a new `wgpu` feature, dispatching to Metal on macOS / Vulkan on Linux / DX12 on Windows. Precision is now backend-aware: bf16 on CUDA (the locked cloud recipe, MULTI-1386), f32 on WGPU (Metal doesn't implement bf16 arithmetic over WGPU) and ndarray. Selection order is `cuda > wgpu > ndarray`; the model and training code stay backend-generic. Bumps the crate-root recursion limit to 256 for CubeCL's `#[cube]` macro expansion.

Verified locally with `cargo test --no-default-features --features wgpu`: the forward/backward smoke runs on Metal via WGPU.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
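The selection order and precision mapping described in this commit can be sketched as two small pure functions. This is a minimal illustration, not the crate's code: `BackendKind`, `Precision`, `select_backend`, and `precision_for` are hypothetical names, and the boolean arguments stand in for Cargo features that the real crate presumably resolves at compile time.

```rust
/// Hypothetical stand-ins for the three backends named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BackendKind {
    Cuda,
    Wgpu,
    NdArray,
}

/// Hypothetical stand-ins for the two float precisions named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Precision {
    Bf16,
    F32,
}

/// Applies the documented priority `cuda > wgpu > ndarray`.
/// The booleans model whether each Cargo feature is enabled (an assumption
/// made so the sketch can run without feature flags).
fn select_backend(cuda_enabled: bool, wgpu_enabled: bool) -> BackendKind {
    if cuda_enabled {
        BackendKind::Cuda
    } else if wgpu_enabled {
        BackendKind::Wgpu
    } else {
        BackendKind::NdArray
    }
}

/// Maps a backend to the precision the commit message assigns to it:
/// bf16 on CUDA, f32 on WGPU and ndarray.
fn precision_for(backend: BackendKind) -> Precision {
    match backend {
        BackendKind::Cuda => Precision::Bf16,
        BackendKind::Wgpu | BackendKind::NdArray => Precision::F32,
    }
}
```

With both features enabled, `select_backend(true, true)` yields `BackendKind::Cuda`, so enabling `wgpu` alongside `cuda` does not change the precision used for the cloud recipe.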
…licy

`LayeredConfig::with_env` no longer reads process env. A new `read_model_env_overrides()` captures `std::env::vars()` once at the CLI boundary into an `EnvOverrides` map, which is threaded explicitly through `load_model_config` and `TrainSubcommand::resolve_model_config`. Tests build the map directly (`parse_env_overrides`) and never call `set_var` or `Jail::set_env`.

Before this change, two loader tests flaked under parallel `cargo test`: `figment::Jail::set_env` mutates process-global env via `std::env::set_var`, while tests *without* a `Jail` (`Jail` serializes on a global mutex, but tests not using it skip the lock) read `WUBBIE_MODEL_*` through figment's `Env::prefixed` provider, so the env-setting test could inject `D_FF=2222` into a concurrent test's "defaults only" extraction.

Verified deterministic with `WUBBIE_MODEL_D_FF=9999 cargo test` and 5 consecutive default-parallel runs (55/55 pass).

CLAUDE.md gains a "Testing policy" section asserting no flakes are tolerated and codifying the read-ambient-state-at-the-CLI-seam pattern so the same class of race can't be reintroduced through a different surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
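The read-ambient-state-at-the-CLI-seam pattern can be sketched with the standard library alone. The names `EnvOverrides`, `parse_env_overrides`, and `read_model_env_overrides` come from the commit message, but their signatures, the map type, the key normalization, and the `resolve_d_ff` helper are assumptions made for illustration; the real loader layers values through figment rather than a hand-written lookup.

```rust
use std::collections::BTreeMap;

/// Assumed shape of the overrides map: lowercased key (prefix stripped) -> raw value.
type EnvOverrides = BTreeMap<String, String>;

/// Prefix taken from the `WUBBIE_MODEL_*` variables mentioned in the commit message.
const PREFIX: &str = "WUBBIE_MODEL_";

/// Pure function: builds the overrides from any iterator of (name, value) pairs.
/// Tests can call this with literal data and never touch process env.
fn parse_env_overrides<I>(vars: I) -> EnvOverrides
where
    I: IntoIterator<Item = (String, String)>,
{
    vars.into_iter()
        .filter_map(|(name, value)| {
            name.strip_prefix(PREFIX)
                .map(|key| (key.to_ascii_lowercase(), value))
        })
        .collect()
}

/// The only place that reads ambient state; intended to be called once at the CLI boundary.
/// Note: `std::env::vars()` panics if a variable is not valid Unicode.
fn read_model_env_overrides() -> EnvOverrides {
    parse_env_overrides(std::env::vars())
}

/// Hypothetical consumer: resolves one field from the explicit map, falling
/// back to the default when the key is absent or unparsable.
fn resolve_d_ff(default: usize, overrides: &EnvOverrides) -> usize {
    overrides
        .get("d_ff")
        .and_then(|raw| raw.parse().ok())
        .unwrap_or(default)
}
```

Because `resolve_d_ff` depends only on its arguments, a "defaults only" test that passes an empty map cannot observe a value set by a concurrent test, which is the race the commit describes.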
CI (`on-push.yml` / `on-merge.yml`) installs `cargo-nextest` via `taiki-e/install-action@v2` and runs `cargo nextest run --workspace --locked --no-tests=pass` in place of `cargo test`. `Makefile.toml`'s `[tasks.test]` is swapped to the matching nextest invocation, and CLAUDE.md's Build Commands / Testing policy / CI Guardrails sections are updated to reference `cargo nextest run` so the documented commands match what CI actually runs.

A new `[tasks.monitor]` is added to `Makefile.toml` so the Claude Code harness's "run `cargo make monitor` on launch" hook succeeds instead of failing with `Task "monitor" not found`. It runs `bacon --headless --summary --no-help-line`. Because wubbie is a virtual workspace, bacon configuration lives under `[workspace.metadata.bacon]` (rather than `[package.metadata.bacon]` as in keystore) with `check` (default), `clippy`, and `test` jobs mirroring the CI gate; the `test` job uses `analyzer = "nextest"` and `--hide-progress-bar --failure-output final` so bacon can parse nextest output structurally.

Verified locally with `cargo make test`: 55/55 pass under nextest.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Three stacked commits on `trunk`:

- `043116d`: Add WGPU/Metal backend with backend-aware precision (MULTI-1407). New `wgpu` crate feature; selection order is `cuda > wgpu > ndarray`. Precision is now backend-aware: bf16 on CUDA (the locked cloud recipe, MULTI-1386), f32 on WGPU (Metal has no bf16 arithmetic over WGPU) and ndarray. Crate-root `#![recursion_limit = "256"]` bump for CubeCL's `#[cube]` macros. Forward/backward smoke test on the active backend, verified locally on Metal with `cargo test --no-default-features --features wgpu`.
- `a6240ce`: Make the config loader pure of `std::env`; assert no-flake testing policy. Two pre-existing loader tests flaked under default-parallel `cargo test` because `figment::Jail::set_env` mutates process-global env via `std::env::set_var` while concurrent tests (without a `Jail`) read `WUBBIE_MODEL_*` through `figment::Env::prefixed`. `LayeredConfig::with_env` no longer reads process env; `read_model_env_overrides()` captures `std::env::vars()` once at the CLI boundary into an `EnvOverrides` map that is threaded through `load_model_config` / `TrainSubcommand::resolve_model_config`. Tests build the map directly and never mutate process env. CLAUDE.md gains a Testing policy section codifying the no-flake rule and the read-ambient-state-at-the-CLI-seam pattern.
- `57be649`: Migrate test runner to nextest; add bacon-driven monitor task. CI swaps `cargo test` for `cargo nextest run --workspace --locked --no-tests=pass`; `Makefile.toml` gets matching `test` and `monitor` tasks (the latter driven by `bacon --headless`).

Acceptance criteria (MULTI-1407)

- `wgpu` feature (`#![recursion_limit = "256"]` applied): ✅

Test plan

- `cargo fmt --all --check`
- `cargo clippy --all-targets --workspace --locked -- -D warnings` (default `ndarray`, `--features wgpu`, `--features cuda`)
- `cargo build --workspace --locked` (default) and `cargo build -p wubbie --no-default-features --features {cuda,wgpu} --locked`
- `cargo nextest run --workspace --locked --no-tests=pass`: 55/55 deterministic across 5 consecutive runs
- `WUBBIE_MODEL_D_FF=9999 cargo nextest run --workspace --locked --no-tests=pass`: still 55/55 (confirms the loader is now pure of process env)
- `cargo test --no-default-features --features wgpu --lib backend::` runs the forward/backward smoke on Metal via WGPU
- `⚡ PR Ready` green (push, merge-queue contexts)

🤖 Generated with Claude Code