[WIP] NixOS support #379

Open
mrosseel wants to merge 163 commits into brickbots:main from mrosseel:nixos

Conversation

@mrosseel
Collaborator

Summary

  • Full NixOS-based system for PiFinder (replaces Raspbian)
  • Declarative system configuration via Nix flake
  • SD card image, netboot, and migration bootstrap tarball builds
  • Software update via nixos-rebuild with GitHub release/PR channels

Test plan

  • Flash SD image and verify boot
  • Test WiFi AP and client mode switching
  • Test software update UI channels
  • Test hostname rename via web UI

🤖 Generated with Claude Code

mrosseel and others added 30 commits February 4, 2026 19:02
- build.yml: single build + Cachix push + unstable channel updates
- release.yml: manual release workflow for stable/beta channels

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The SD image module provides filesystems, but toplevel builds need
a minimal stub to evaluate successfully.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Required for NixOS module system to accept devMode setting.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Required when module has both options and config sections.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replaces FIXME placeholders with actual SRI hashes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uses Pi5 runner when RUNNER_LABELS variable is set, falls back to
ubuntu with QEMU emulation otherwise.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Filter to only Pi 4B device tree (CM4 incompatible with our overlays)
- Use shorthand DTS syntax for PWM overlay

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Untracked file was excluded from Nix flake source tree, causing
"No module named 'PiFinder.sys_utils_base'" on SD card boot.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add camera overlay (imx477) to netboot config.txt via flake.nix
- Fix sys_utils import in main.py to use utils.get_sys_utils()
- Add hip_main.dat fetch to pifinder-src.nix for starfield plotting
- Add dma_heap udev rule for libcamera/picamera2 access
- Fix shared memory naming in solver.py (remove leading /)
- Add DNS nameservers for netboot environment
- Document power control scripts in CLAUDE.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add runtimeCameraSelection option to hardware.nix (default: true)
- SD image includes config.txt with "include camera.txt" directive
- Users can edit camera.txt and reboot to switch cameras
- Supported cameras: imx296, imx290 (imx462), imx477
- Fix cameraDriver scope in hardware.nix (moved to top-level let)
- Add sudoers rules for systemctl stop/start pifinder.service
- Add DMA heap udev rule for libcamera video group access
- Netboot config sets cameraType = "imx477" for HQ camera dev

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Refactor sys_utils modules to use common base class
- Add sys_utils_nixos.py for NixOS-specific implementations
- Add get_sys_utils() detection in utils.py for platform selection
- Add flake.lock for reproducible builds
- Add NetworkManager config to networking.nix
- Add deploy-image-to-nfs.sh for netboot development workflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
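
For reviewers, a minimal sketch of how platform selection along the lines of get_sys_utils() could work. This is not the code in utils.py: it assumes detection is based on the `ID=` field of /etc/os-release, and the function names and the Debian module name `PiFinder.sys_utils` are hypothetical. Only `sys_utils_nixos`, `sys_utils_fake`, and `sys_utils_base` are named in this PR.

```python
def os_release_id(os_release_text: str) -> str:
    """Return the ID= value from os-release content, or "unknown"."""
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ID":
            # Values may be quoted, e.g. ID="nixos"
            return value.strip().strip("\"'")
    return "unknown"


def select_sys_utils_module(os_release_text: str) -> str:
    """Map the detected platform to a module name (names partly assumed)."""
    platform_id = os_release_id(os_release_text)
    if platform_id == "nixos":
        return "PiFinder.sys_utils_nixos"
    if platform_id in ("debian", "raspbian"):
        # Assumption: the existing Raspbian implementation lives here.
        return "PiFinder.sys_utils"
    # Anything else (e.g. a developer laptop) gets the stub implementation.
    return "PiFinder.sys_utils_fake"
```

A caller would read /etc/os-release once at startup and pass the chosen name to `importlib.import_module`, which keeps the rest of the code free of platform checks.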
- Update build.yml CI workflow
- Fix fonts.py import
- Fix marking_menus.py formatting
- Add missing import to preview.py
- Simplify objects_db.py
- Add catalog_imports improvements
- Update pifinder_objects.db

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Switch to NFSv4 with caching disabled (noac, actimeo=0)
- Disable auto-optimise-store in devMode (hard links fail on NFS)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ServerAliveInterval/CountMax to prevent timeout during transfers
- Use rsync -R (relative) to preserve directory structure correctly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comets.txt is downloaded at runtime and must be in a writable
location, not the read-only Nix store.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Extend eth0 wait to 30 seconds with debug output
- Wait for link carrier before DHCP
- Add DHCP retries (3 attempts)
- Add LIBCAMERA_IPA_MODULE_PATH to pifinder service environment

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Restore SUBSYSTEM=="pwm" udev rule that was accidentally removed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Turns on keypad LEDs during sysinit for early visual boot feedback.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- boot-splash.c: displays welcome image with scanning animation
- Starts at sysinit, stops when pifinder.service starts
- Much faster than Python splash

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove nixos-hardware module (saves 659MB linux-firmware)
- Fetch nixos-rebuild at runtime (saves ~500MB llvm/nix deps)
- Remove git from systemPackages (nix has built-in git for flakes)

Target: ~150MB vs current 1.7GB
- Remove default packages (vim, nano, etc)
- Disable polkit, udisks2, speechd
- Should reduce closure significantly
NetworkManager-vpnc alone has 1.1GB closure (webkitgtk, llvm, etc).
Disable all NM plugins for bootstrap - we just need WiFi.
mrosseel and others added 3 commits March 10, 2026 22:18
SHA256 is now fetched at runtime from the .sha256 sidecar file
in the GitHub release, eliminating the need for CI to update the
migration branch after each build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
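
A rough sketch of the sidecar check described above, for illustration only. It assumes the .sha256 file uses the common `sha256sum` layout (hex digest, optionally followed by a filename); the function names and the example filename are hypothetical, and the download step is left out.

```python
import hashlib


def parse_sha256_sidecar(text: str) -> str:
    """Extract the hex digest from '<hex>  <filename>' or a bare '<hex>'."""
    fields = text.split()
    if not fields:
        raise ValueError("empty sidecar file")
    digest = fields[0].lower()
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("not a SHA-256 hex digest")
    return digest


def matches_sidecar(payload: bytes, sidecar_text: str) -> bool:
    """Return True when the payload hashes to the digest in the sidecar."""
    expected = parse_sha256_sidecar(sidecar_text)
    return hashlib.sha256(payload).hexdigest() == expected
```

Because the expected digest travels with the release asset, nothing on the migration branch has to change when a new tarball is published.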
Root cause: network-online.target was unreliable because
NetworkManager-wait-online was disabled, so pifinder-first-boot
ran before internet was available.

- Add curl-based connectivity check with 5-minute retry loop
- Add Restart=on-failure with 15s delay
- Re-enable NetworkManager-wait-online (with 30s timeout)
- Add sudo permissions for systemctl/journalctl (remote recovery)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
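
The retry loop above is implemented with curl in the service script; the sketch below only illustrates the same pattern in Python. The 5-minute budget comes from the commit message, while the 5-second interval, the function name, and the injectable `probe`, `sleep`, and `clock` parameters are assumptions made so the logic can be tested without a network.

```python
import time
from typing import Callable


def wait_for_network(
    probe: Callable[[], bool],
    timeout_s: float = 300.0,   # 5-minute budget, as in the commit message
    interval_s: float = 5.0,    # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() until it succeeds or the time budget is used up."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

If this returns False the unit can exit non-zero, at which point `Restart=on-failure` with its 15s delay takes over.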
github-actions Bot and others added 6 commits March 10, 2026 22:34
Starts boot-splash in animation mode while downloading the full
system closure, so the user sees activity instead of a static screen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
solution() can return None when solve_state() is truthy, causing
TypeError in base.py screen_update. Also fetch first-boot target
from GitHub pifinder-build.json with baked-in fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
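
A minimal sketch of the kind of guard this fix implies, not the actual change in base.py. The function name and the `RA`/`Dec` keys are hypothetical; the point is only that the solution is checked for None independently of the solve state.

```python
from typing import Optional


def format_pointing(solution: Optional[dict]) -> str:
    """Render a pointing string, tolerating a missing solution."""
    # The solve state can be truthy while no solution is available yet,
    # so the solution itself is checked before any field access.
    if solution is None:
        return "no solve"
    ra = solution.get("RA")
    dec = solution.get("Dec")
    if ra is None or dec is None:
        return "no solve"
    return f"RA {ra:.2f} Dec {dec:+.2f}"
```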
Merges upstream PiFinder changes including:
- Bottle → Flask/Jinja webserver migration (.tpl → .html)
- Harris and Lynga catalog loaders + data + DB update
- Selenium-based webserver test suite

NixOS-specific decisions:
- Keep deletion of requirements.txt, requirements_dev.txt, noxfile.py,
  version.txt (managed via flake.nix / pifinder-build.json)
- Keep Python 3.13 target (was reset to py39 by upstream)
- Keep stub sys_utils_fake (uses sys_utils_base)
- Replace bottle/cheroot with flask/flask-babel/waitress in
  nixos/pkgs/python-packages.nix; add selenium to devPackages
- Reimplement Bottle-era network status message in Jinja syntax
Includes nixos/pkgs/pifinder-src.nix fix: drop fetchurl for hip_main.dat,
which upstream now ships as a committed file in astro_data/.
github-actions Bot and others added 3 commits May 25, 2026 08:44
- nixos/RELEASE.md: document version flow + release/dev pipelines
- software.py: MIN_NIXOS_VERSION 2.5.0 → 3.0.0
- python-packages.nix: add pyerfa (used by calc_utils since upstream brickbots#423,
  silently dropped during upstream merge because requirements.txt is not
  mirrored into the Nix env)
- python-packages.nix: include hardwarePackages in devEnv so nix develop
  matches the runtime import surface
- python-packages.nix: select simplejpeg wheel by host arch (was hard-pinned
  to aarch64; failed to import on x86_64 dev shells)
- flake.nix: apply libcamera -Dpycamera=enabled overlay to the x86_64
  devShell and export PYTHONPATH so picamera2 finds the python bindings

Verified: nix develop --command python -c 'import …' on x86_64
succeeds for all 34 imports (erfa, picamera2, libcamera, PyHotKey,
pynput, hardware packages, etc.). RPi.GPIO still raises its own
"only on a Raspberry Pi" RuntimeError at import time — expected,
matches upstream pip behavior on non-Pi hardware.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. gpsd-add-uart: rename /dev/ttyAMA3 → /dev/ttyAMA1 (6 sites). The uart3
   overlay surfaces as ttyAMA1, matching hardware.nix's udev rule and the
   Debian image's gpsd.conf.

2. /etc/default/gpsd: drop the custom USBAUTO+GPSD_SOCKET pair, write
   upstream pi_config_files/gpsd.conf's three lines verbatim. DEVICES now
   opens the on-board UART at startup. gpsd-add-uart kept as the boot-time
   socket-activation kick; can retire after on-Pi confirmation.

3. pifinder-upgrade: replace fragile `nix build --dry-run | grep` progress
   with `nix --log-format internal-json build … --max-jobs 0` parsed by
   gawk, counting type=100 (actCopyPath) start/stop events. Stable across
   Nix ≥ 2.4. Validated against a real cache.nixos.org substitute (5/5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
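
To make item 3 concrete, here is a sketch of the same counting logic in Python. pifinder-upgrade does this in gawk; this version is only an illustration. It relies on the internal-json log format prefixing each event line with `@nix ` and on stop events carrying only the activity id, so started ids are remembered. The function name is hypothetical.

```python
import json
from typing import Iterable, Tuple

ACT_COPY_PATH = 100  # activity type for copying a single store path
PREFIX = "@nix "


def count_copy_events(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (copies started, copies finished) seen in the log lines."""
    started = set()
    finished = set()
    for line in lines:
        if not line.startswith(PREFIX):
            continue  # ordinary log output, not a structured event
        try:
            event = json.loads(line[len(PREFIX):])
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        action = event.get("action")
        event_id = event.get("id")
        if action == "start" and event.get("type") == ACT_COPY_PATH:
            started.add(event_id)
        elif action == "stop" and event_id in started:
            # Stop events have no type, so match them by id.
            finished.add(event_id)
    return len(started), len(finished)
```

The ratio of finished to started copies gives a progress figure that does not depend on the wording of human-readable log messages.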
github-actions Bot and others added 6 commits May 27, 2026 14:54
build.yml was defaulting VERSION=2.5.0 on push triggers (and the
workflow_dispatch default also read 2.5.0), so this branch's auto-build
was publishing v2.5.0-migration tarballs while the migration branch's
downloader (software.py _MIGRATION_VERSION_INFO and the brickbots/PiFinder
release branch's migration_gate.json) points at v3.0.0-migration. Bump
both the workflow_dispatch default and the push-trigger fallback to 3.0.0
so a normal push to nixos publishes the artifact at the URL the migration
branch actually downloads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Lint & Test workflow was using DeterminateSystems/magic-nix-cache-action,
which is backed by GitHub's Actions Cache and gets HTTP-418 rate-limited
under sustained traffic — exactly the failure mode that just broke
type-check ("--install-types failed: substituter disabled, rate limit
exceeded"). The Nix substituter is then disabled mid-run and dependent
commands like mypy --install-types fall over.

Replace it with cachix/cachix-action@v17 pointed at the pifinder cache
(read-only, no auth token needed). Same backing as build.yml, so dev-shell
substitutes hit the same store paths the system closure was built against.
cache.nixos.org remains the default fallback.

Also bump actions/checkout@v4 → @v6 in this file to align with the Node 24
migration in build.yml/release.yml.

This is a stop-gap. The real fix is standing up Attic with an S3 backend
so both build.yml and lint.yml can retire cachix.org and MNC together —
tracked separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rong)

Previous commit on this branch swapped DeterminateSystems/magic-nix-cache-action
for cachix/cachix-action@v17 thinking the MNC HTTP-418 rate-limit was the
root cause of the failed lint/type-check. That swap made things worse:
the pifinder Cachix only contains the NixOS *system closure*, not the
*dev shell* (cedar-detect-server's Rust crate builds). With MNC removed,
the dev shell had to rebuild from source, which fetched crate tarballs
from a crates.io mirror and hit 403s.

MNC was carrying real weight by caching locally-built derivations
between runs. Restoring it. The original MNC rate-limit was a transient
flake — re-runs work around it. Real fix is standing up Attic with
S3-backed storage so both build.yml and lint.yml can retire MNC and
cachix.org together.

The checkout@v4 → @v6 bump from the swap commit is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions Bot and others added 6 commits May 27, 2026 16:04
The Nix derivation was overwriting pifinder-build.json with
"nix-${gitRev}" at build time, so even released devices reported a
random short-sha instead of the release version. Three writers became
two, with consistent semantics everywhere:

- pifinder-src.nix: drop the cat > pifinder-build.json block and the
  gitRev arg — the derivation now copies the source file through
  verbatim, no version invention.
- flake.nix: drop the pifinderGitRev _module.args plumbing.
- services.nix: drop pifinderGitRev / gitRev from the pifinder-src
  import.
- release.yml: reorder so the version stamp is written into the
  working tree BEFORE the nix build (so the store path bakes in the
  release version, not the previous stamp), then re-stamp with the
  resulting store_path after the build, commit, push, tag.

Result: SD image, cachix closure, and committed JSON all agree on
the released version. Matches the flow already documented in
nixos/RELEASE.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the decision to self-host Attic at cache.pifinder.eu, backed by
SQLite + local disk initially with Cloudflare R2 as the eventual chunk
store. Covers considered alternatives (cachix.org, Magic Nix Cache,
nix-casync, harmonia) and the operational consequences for CI publishing,
on-device updates, and failure fall-through to cache.nixos.org.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Labels

testable Ready for testing via PiFinder software update