Harden cook stability: graceful shutdown, preheat race, PID guards#20
Conversation
Address the highest-value stability/safety findings from the project review: - Graceful shutdown (C1): Smoker is now IDisposable and the host's ApplicationStopping hook tears it down, so a clean stop/SIGTERM drives the auger and igniter to a safe off-state and releases GPIO/SPI/I2C instead of leaving relays in their last commanded state. All background loops (mode, fire minder, display, ADC read, preheat) now stop on a lifetime token. - Preheat sampling (C2/N1): moved PreheatMonitor.Update out of the Status property getter into a dedicated 1 Hz loop, and guarded the monitor's non-concurrent queue with a lock. Fixes a data race (display loop + every /status caller) and keeps the 60-sample window a true ~60s regardless of how many clients poll status. - CancellationTokenSource lifecycle (N3): the per-mode token is now created and disposed by ModeLoop under a lock that SetMode also takes to cancel, eliminating the dispose-vs-cancel race that could throw ObjectDisposedException and drop a mode change. - PID guards (N4): seed _lastUpdate/_lastTemp, return proportional-only on the first (or post-dropout) reading, skip integral/derivative when dt<=0, and re-seed after a NaN so a sensor dropout can't spike the derivative. - SmokerProxy (N8): add a 10s HTTP timeout so a hung API can't stall callers (e.g. the MQTT bridge), and parse p-value defensively. Adds regression tests for the PID same-instant divide and concurrent PreheatMonitor access. Full review notes captured separately.
|
🚀 The Update (preview) for CamSoper-org/inferno-deploy/main (at 0826138) was successful. ✨ Neo Code ReviewRoutine redeployment of updated application binaries to the target device; the infrastructure is unchanged. ✅ Low RiskThis is a routine code deployment. The trigger hashes on the publish/restart commands changed because the application source changed, causing Pulumi to rebuild and redeploy all three service binaries (API, CLI, MQTT) and restart the corresponding services on the target device. The PR itself is a correctness/safety hardening pass: cooperative cancellation on shutdown (so auger and igniter relays are de-energized cleanly on Resource Changes Name Type Operation
+- restart-api command:remote:Command replaced
+- publish-api command:local:Command replaced
+- publish-cli command:local:Command replaced
+- publish-mqtt command:local:Command replaced
+- restart-mqtt command:remote:Command replaced
|
What & why
Implements the highest-value, low-controversy findings from a full project review, focused on cook stability and safety. Build is clean (0 warnings) and all tests pass (68, incl. 2 new).
🔴 Critical
Graceful shutdown — relays no longer left energized on stop (C1).
Previously, hardware was built at top level and
app.Run()just blocked; nothing disposed theSmokeror devices, so asystemctl stop/ SIGTERM / Ctrl-C left the auger and igniter relays in their last commanded state. Now:SmokerimplementsIDisposable; the host'sApplicationStoppinghook tears it down and disposes the sharedGpioController.Smoker.Dispose()drives a hard safe-off (auger + igniter off) and releases every device (RelayDevice.Disposedrives pins off and closes them;RtdArrayfrees the ADC/SPI;Displayfrees the LCD).Preheat detector data race (C2).
PreheatMonitor.Update()ran inside theSmoker.Statusproperty getter, mutating a non-concurrentQueueconcurrently from the display loop, every/api/statusrequest, and the MQTT poller. Moved sampling to a dedicated 1 Hz loop and guarded the monitor with a lock.🟡 Nice to have
Status-read keeps the 60-sample window a true ~60 s regardless of how many clients poll — no more premature "Preheated."ModeLoopunder a lock thatSetModealso takes to cancel, eliminating the dispose-vs-cancel race that could throwObjectDisposedExceptionand drop a mode change._lastUpdate/_lastTemp; return proportional-only on the first (or post-dropout) reading; skip integral/derivative whendt<=0; re-seed after a NaN so a sensor dropout can't spike the derivative. (All existing PID tests still pass.)HttpClienttimeout so a hung API can't stall callers (e.g. the MQTT bridge's state loop); defensive p-value parse.Tests
SmokerPidTests: back-to-back calls (dt≈0) return finite (no divide-by-zero NaN/Inf).PreheatMonitorTests: concurrentUpdate/Resetfrom many threads doesn't throw/corrupt.Deliberately NOT included (needs a product decision)
FireMinder.GetFireCheckTemp()integer division (N2).SetPoint/180is integer division, so the reignition margin is a flat 30 °F below setpoint up to 359, then jumps to 60 °F at 360. The intended proportional scaling is lost. Fixing it changes tested cook behavior, and the right target (flat margin vs. proportionalSetPoint*(1-30/180)) is a judgment call — happy to do it in a follow-up once you pick the behavior.Other review items left for follow-up: making pins/timeouts/PID gains configurable + a dev/sim device mode (which would unblock unit-testing the
Smokerstate machine), and the home-LAN security notes (no API auth, MQTT without TLS, Pulumi writing the MQTT password into the systemd unit).Verification
dotnet build Inferno.sln→ cleandotnet test Inferno.Tests→ 68 passedhttps://claude.ai/code/session_01T8v7xa23G8v1GjsSFBwzme
Generated by Claude Code