ci: path-filter platform builds + once-daily nightly to relieve CI back-pressure (79 workflows run every commit) #835

Description

@zackees

Context

Every push to main and every PR currently triggers all 79 per-board build workflows (.github/workflows/build-*.yml). Each one:

  • Calls the shared template_build.yml (cold ~minutes; warm cache still pays per-board setup).
  • Restores its own toolchain cache key.
  • Builds the fbuild CLI/daemon in debug, then builds the board firmware twice (--quick and --release).

The 79 workflows are concrete board wrappers (e.g. build-lpc804.yml, build-esp32dev.yml) — small files that pass a test-dir/env-name into the reusable template. They are unconditionally triggered on:

on:
  push:         { branches: [main] }
  pull_request: { branches: [main] }

Meanwhile the actual platform code is cleanly grouped by family under crates/fbuild-build/src/{avr,esp32,esp8266,nxplpc,stm32,teensy,nrf52,rp2040,sam,ch32v,apollo3,silabs,renesas,generic_arm}/ (with configs/, mcu_config.rs, orchestrator*, *_compiler.rs, *_linker.rs). Per-board test sketches live under tests/platform/<board>/. The platform layer has been stable for a while — most commits touch a single family or fbuild-cli/fbuild-daemon/fbuild-core, yet trigger the full 79-job fan-out anyway.

This is the back-pressure. The platform checks are stable enough that we can bias toward not running on every commit, as long as we (a) still catch breakage in common code, and (b) still have a daily safety net on real changes.

Proposal

Two complementary mechanisms, designed to be biased toward NOT running:

1. Per-board workflows: add paths: filters scoped to family + common code

Each build-<board>.yml keeps its push / pull_request triggers but gains a paths: allowlist that fires the build only when one of the following changed:

  • The board's own test dir: tests/platform/<board>/**
  • The board's family code: e.g. for LPC boards, crates/fbuild-build/src/nxplpc/**
  • The common-code list (any change here forces all platform builds — see below)
  • The workflow file itself + the template: .github/workflows/build-<board>.yml, .github/workflows/template_build.yml

Board → family mapping (from crates/fbuild-build/src/ layout):

| Family path | Boards |
| --- | --- |
| avr/** | atmega8, atmega8a, attiny{1604,1616,4313,85,88}, leonardo, nano_every, uno, mega (any atmega/attiny) |
| esp32/** | esp32dev, esp32{c2,c3,c5,c6,h2,p4,s2,s3} |
| esp8266/** | esp8266 |
| nxplpc/** + generic_arm/** | lpc804, lpc845, lpc845brk, lpcxpresso804, lpcxpresso845max |
| stm32/** | blackpill, bluepill, giga-r1, nucleo-f4{29,39}zi, stm32f103{c8,cb,tb}, stm32f411ce, stm32h747xi, tinystm |
| teensy/** | teensy{30,31,32,35,36,40,41,lc} |
| nrf52/** | nice_nano_nrf52840, nrf52840-sense, nrf52840_dk, nrfmicro_nrf52840, supermini_nrf52840 |
| rp2040/** | rp2040, rp2350, rpipico, rpipico2 |
| sam/** | due, sam3x8e_due, samd21, samd21_zero, samd51{j,p}, matrix_portal_m4, qtpy_m0, thingplusmatter |
| ch32v/** | ch32l103, ch32v{003,006,103,203,208,303,307}, ch32x035 |
| apollo3/** | apollo3_red, apollo3_thing_explorable |
| silabs/** | mgm240 |
| renesas/** | uno-r4-wifi, uno_r4_wifi |
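
The table uses brace shorthand such as teensy{30,31,...}. A minimal sketch of how a generator might expand that shorthand into concrete board names follows. The function names and the FAMILY_ROWS dict are hypothetical, and only a few rows are copied in (the stm32 row is deliberately partial).

```python
import re


def expand_braces(entry: str) -> list[str]:
    """Expand one 'prefix{a,b}suffix' shorthand into concrete names.

    Only a single, non-nested brace group is handled, which is all the
    table above uses.
    """
    match = re.fullmatch(r"([^{}]*)\{([^{}]*)\}([^{}]*)", entry)
    if match is None:
        return [entry]  # no braces: already a concrete board name
    prefix, alternatives, suffix = match.groups()
    return [f"{prefix}{alt}{suffix}" for alt in alternatives.split(",")]


# Hypothetical in-memory form of a few table rows (family -> shorthand entries).
FAMILY_ROWS = {
    "teensy": ["teensy{30,31,32,35,36,40,41,lc}"],
    "esp32": ["esp32dev", "esp32{c2,c3,c5,c6,h2,p4,s2,s3}"],
    "stm32": ["blackpill", "nucleo-f4{29,39}zi", "stm32f103{c8,cb,tb}"],  # partial row
}


def boards_for(family: str) -> list[str]:
    """Return every concrete board name listed for one family."""
    boards: list[str] = []
    for entry in FAMILY_ROWS[family]:
        boards.extend(expand_braces(entry))
    return boards


if __name__ == "__main__":
    for family in FAMILY_ROWS:
        print(family, boards_for(family))
```

Expanding at generation time keeps the human-edited mapping short while each rendered workflow still names exactly one board.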

Common-path force-run list (changes here trigger all platform builds — the safety net for common breakage):

crates/fbuild-build/src/{compiler.rs,linker.rs,compile_backend.rs,compile_many.rs,resolution.rs,flag_overlay.rs,framework_core_cache.rs,framework_libs.rs,parallel.rs,package_override.rs,perf_log.rs,arduino_props.rs,build_info.rs,build_output.rs,eh_frame_policy*.rs,zccache*.rs,script_runtime*.rs,lib.rs,source_scanner.rs,source_scanner/**,build_fingerprint/**,compile_database/**,pipeline/**,shrink/**,symbol_analyzer/**}
crates/fbuild-cli/**
crates/fbuild-daemon/**
crates/fbuild-core/**
crates/fbuild-paths/**
crates/fbuild-config/**
crates/fbuild-deploy/**
crates/fbuild-packages/**
crates/fbuild-header-scan/**
crates/fbuild-library-select/**
crates/fbuild-python/**
Cargo.toml
Cargo.lock
rust-toolchain.toml
.github/workflows/template_build.yml

To avoid duplicating ~25 path globs in 79 YAML files, generate the per-board paths section from a single source of truth: e.g. ci/board_families.toml (board → family) and ci/ci_common_paths.txt, with a ci/render_workflows.py (or an extension of the existing update-data.yml flow) that re-renders the trigger blocks. The renderer should expand the brace shorthand used above into one glob per line, since GitHub's paths: filter syntax has no {a,b} alternation. A CI check fails if the rendered output drifts from the committed workflow files.
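
As a rough illustration of the generation step, the sketch below renders one board's trigger block and compares it with committed text. Everything here is an assumption rather than an existing script: the function names, the shortened COMMON_PATHS sample, and the choice to emit only the push and pull_request triggers.

```python
# Shortened sample of the common-path list; the real list is longer.
COMMON_PATHS = [
    "crates/fbuild-cli/**",
    "crates/fbuild-core/**",
    "Cargo.toml",
    ".github/workflows/template_build.yml",
]


def render_paths(board: str, family_dirs: list[str], common: list[str]) -> list[str]:
    """Build the paths: allowlist for one board, in a stable order."""
    paths = [f"tests/platform/{board}/**"]
    paths += [f"crates/fbuild-build/src/{d}/**" for d in family_dirs]
    paths += common
    paths += [
        f".github/workflows/build-{board}.yml",
        ".github/workflows/template_build.yml",
    ]
    # Drop duplicates while keeping first-seen order (the template is
    # also part of the common list).
    return list(dict.fromkeys(paths))


def render_trigger_block(board: str, family_dirs: list[str], common: list[str]) -> str:
    """Render the on: block as YAML text for one build-<board>.yml."""
    lines = ["on:"]
    for event in ("push", "pull_request"):
        lines.append(f"  {event}:")
        lines.append("    branches: [main]")
        lines.append("    paths:")
        lines += [f"      - '{p}'" for p in render_paths(board, family_dirs, common)]
    return "\n".join(lines) + "\n"


def has_drift(rendered: str, committed: str) -> bool:
    """True when the committed trigger block no longer matches the render."""
    return rendered != committed


if __name__ == "__main__":
    print(render_trigger_block("lpc804", ["nxplpc", "generic_arm"], COMMON_PATHS))
```

A drift check in CI would then render every board, read the committed block from each workflow file, and fail when has_drift returns True for any of them.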

2. New nightly-platforms.yml: daily safety net, only on real changes

Add a single workflow that runs all platform builds once a day at about 1am Pacific. GitHub evaluates cron schedules in UTC, so that is 09:00 UTC during PST or 08:00 UTC during PDT (pick one; see Decisions). Strategy:

on:
  schedule:
    - cron: '0 9 * * *'   # 09:00 UTC: 1am PST in winter, 2am PDT in summer
  workflow_dispatch: {}
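
Because the schedule is fixed in UTC, the local fire time shifts by an hour across the DST boundary. A small sketch that checks both candidate cron hours against fixed PST and PDT offsets follows; the helper name is hypothetical, and fixed offsets are used instead of a timezone database to keep the example dependency-free.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: Pacific Standard Time is UTC-8, Pacific Daylight Time is UTC-7.
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")


def local_fire_time(cron_hour_utc: int, zone: timezone) -> str:
    """Local wall-clock time at which an 'M H * * *' UTC cron with H=cron_hour_utc fires."""
    # The calendar date is arbitrary; only the hour conversion matters here.
    fire = datetime(2000, 1, 1, cron_hour_utc, 0, tzinfo=timezone.utc)
    return fire.astimezone(zone).strftime("%H:%M %Z")


if __name__ == "__main__":
    for hour in (9, 8):
        print(f"0 {hour} * * * ->", local_fire_time(hour, PST), "/", local_fire_time(hour, PDT))
```

Either choice is off by one hour for part of the year, so the decision is only about which half of the year should land on exactly 1am.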

Guard step skips the run when there were no commits to main in the last 24h (so a quiet day costs nothing):

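# Exit 78 is not "neutral" on current GitHub Actions: any non-zero exit fails the step.
# Report the decision through a step output instead ("run" is a placeholder name)
# and gate the matrix job on it.
if [ -z "$(git log --since='24 hours ago' --oneline origin/main)" ]; then
  echo "no commits in last 24h, skipping nightly platform sweep"
  echo "run=false" >> "$GITHUB_OUTPUT"
else
  echo "run=true" >> "$GITHUB_OUTPUT"
fi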
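
If the guard should live next to the other Python CI tooling, one possible shape is sketched below. It assumes the decision is reported through a step output named run (a hypothetical name) written to the file that GitHub Actions exposes as GITHUB_OUTPUT. The function names are illustrative only.

```python
import os
import subprocess


def recent_commit_log(ref: str = "origin/main", since: str = "24 hours ago") -> str:
    """Return `git log --oneline` output for commits on ref newer than `since`."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--oneline", ref],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def has_recent_commits(git_log_output: str) -> bool:
    """An empty (or whitespace-only) log means there is nothing new to test."""
    return bool(git_log_output.strip())


def record_decision(run: bool, output_path: str) -> None:
    """Append 'run=true' or 'run=false' to the step-output file."""
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"run={'true' if run else 'false'}\n")


# Only act when running inside GitHub Actions, where GITHUB_OUTPUT is set.
if __name__ == "__main__" and "GITHUB_OUTPUT" in os.environ:
    decision = has_recent_commits(recent_commit_log())
    if not decision:
        print("no commits in last 24h, skipping nightly platform sweep")
    record_decision(decision, os.environ["GITHUB_OUTPUT"])
```

The script always exits 0; the matrix job would be skipped by a condition on the output rather than by a special exit code.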

The nightly workflow invokes template_build.yml through workflow_call once per board, using a board matrix and sharing toolchain caches. A failure shows up as the usual red-X notification, so a regression that slipped past a path filter still gets caught within 24h; auto-filing or reopening a tracking issue is a possible follow-up (see Decisions).

Acceptance criteria

  • A PR that only touches crates/fbuild-build/src/nxplpc/** triggers all LPC build-lpc*.yml workflows and no other platform builds.
  • A PR that only touches tests/platform/esp32dev/** triggers build-esp32dev.yml and no other platform builds.
  • A PR that only touches crates/fbuild-cli/** or crates/fbuild-build/src/compiler.rs triggers all 79 platform builds (common-code safety net).
  • A PR that only touches docs/**, README.md, .claude/**, or non-CI hooks triggers no platform builds.
  • A new nightly-platforms.yml workflow runs all 79 builds on the cron schedule iff there were commits to main in the last 24h; a no-commit day exits cleanly without running the matrix.
  • Single source of truth for board→family mapping (ci/board_families.toml or equivalent); CI fails if rendered workflow trigger blocks drift from the source of truth.
  • check-{ubuntu,macos,windows}.yml, crate-gate.yml, dylint.yml, fmt.yml, msrv.yml, loc-gate.yml, lint-subprocess.yml, docs.yml continue to run on every push/PR unchanged — they are the gate, not the back-pressure.
  • release-auto.yml is not affected (releases must build everything).
  • README / docs/DEVELOPMENT.md documents the path-filter contract and how to opt back in (touch the family path, or workflow_dispatch the specific board).
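
The first four criteria can be exercised offline with a small simulation of the path filter. The sketch below is an approximation under stated assumptions: it only understands exact paths and a trailing /** (not GitHub's full glob syntax), it uses a four-board sample of the mapping and a shortened common list, and all names are hypothetical.

```python
# Four-board sample of the board -> family mapping.
BOARD_FAMILIES = {
    "lpc804": ["nxplpc", "generic_arm"],
    "lpc845": ["nxplpc", "generic_arm"],
    "esp32dev": ["esp32"],
    "uno": ["avr"],
}

# Shortened sample of the common-path force-run list.
COMMON_PATHS = [
    "crates/fbuild-cli/**",
    "crates/fbuild-build/src/compiler.rs",
    "Cargo.toml",
    ".github/workflows/template_build.yml",
]


def matches(pattern: str, path: str) -> bool:
    """Simplified matcher: 'dir/**' matches anything under dir/, else exact match."""
    if pattern.endswith("/**"):
        return path.startswith(pattern[:-2])  # keeps the trailing '/'
    return path == pattern


def board_patterns(board: str) -> list[str]:
    """Patterns that should trigger build-<board>.yml under the proposal."""
    patterns = [f"tests/platform/{board}/**", f".github/workflows/build-{board}.yml"]
    patterns += [f"crates/fbuild-build/src/{fam}/**" for fam in BOARD_FAMILIES[board]]
    return patterns + COMMON_PATHS


def triggered_boards(changed_files: list[str]) -> list[str]:
    """Boards whose workflow would fire for the given set of changed files."""
    return sorted(
        board
        for board in BOARD_FAMILIES
        if any(matches(p, f) for p in board_patterns(board) for f in changed_files)
    )


if __name__ == "__main__":
    print(triggered_boards(["crates/fbuild-build/src/nxplpc/orchestrator.rs"]))
    print(triggered_boards(["docs/index.md", "README.md"]))
```

A table-driven test over cases like these could run in the fast gate, so a change to the mapping that breaks a criterion is caught before any platform build starts.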

Decisions

Defaults to ship with; edit on GitHub if you disagree:

  • Priority: P2. CI cost / queue depth is real friction but not blocking shipping; deferring a week is fine.
  • Scope: per-board workflows + new nightly cron. Not converting the per-board files into a single matrix workflow — keeps workflow_dispatch per board working and minimizes blast radius of the change.
  • Source of truth: ci/board_families.toml + ci/ci_common_paths.txt, with ci/render_workflows.py. Generation pattern, not hand-edited duplication. CI check enforces no drift. Re-uses the existing uv run / Python conventions.
  • Cron at 0 9 * * * UTC. That is 1am PST in winter and 2am PDT in summer; close enough to "~1am PST" without DST gymnastics. If you would rather track PDT (in effect for most of the year), use 0 8 * * *, which is 1am PDT and midnight PST.
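  • Nightly skip on no commits: git log --since="24 hours ago" origin/main, with the result reported through a step output that gates the matrix job. Exit 78 is not an option: current GitHub Actions treats any non-zero exit code as a failure, not as neutral. Cheap, no GitHub API needed.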
  • Force-run on workflow self-edits. Each rendered workflow lists itself + template_build.yml in paths:, so editing a trigger always re-tests the trigger.
  • tests/platform/<board>/** is the per-board key, not the env-name — sketch dir is the actual code surface that affects the build.
  • Common-path list is conservative (broad). Bias is toward correctness on common code, narrowness only on platform code. Better to over-run on common-code edits than miss a regression.
  • No path filter on check-* / fmt / clippy / dylint / docs / msrv. Those jobs are fast, gate every PR, and are exactly the fast feedback loop we want to preserve.
  • Do not gate by labels. Pure path-based; predictable and review-free.
  • Failure handling on nightly: for now just rely on the existing red-X notification. Auto-filing an issue on nightly failure can be a follow-up if noise warrants it.

Related issues

None found (searched path filter, CI back-pressure, platform build trigger, nightly schedule cron, paths-filter platform — no strong overlap).
