Skip to content

fix(firmware): close strobe ISR race that leaves old channel HIGH on switch#549

Open
hongquanli wants to merge 3 commits into
masterfrom
fix/hw-trigger-strobe-race
Open

fix(firmware): close strobe ISR race that leaves old channel HIGH on switch#549
hongquanli wants to merge 3 commits into
masterfrom
fix/hw-trigger-strobe-race

Conversation

@hongquanli
Copy link
Copy Markdown
Contributor

Summary

Fixes a race in the Teensy firmware's strobe ISR where switching channels (e.g. 640 nm → 405 nm) during live HW-triggered acquisition could leave the old channel's port stuck HIGH. The strobe-start and strobe-end branches both read the global illumination_source; if Python's channel-switch sequence (turn_off_illuminationset_illumination(new)turn_on_illumination) ran between them, the strobe-end turned off the new port and the old one stayed energized.

  • Latch illumination_source into a per-camera-channel array strobe_active_source[] at strobe start, and use that latched value at strobe end — the ISR always extinguishes the port it lit.
  • Extract symmetric turn_on_illumination_source(int) and turn_off_illumination_source(int) helpers; the public turn_on_illumination()/turn_off_illumination() keep their existing behavior (read the global, update illumination_is_on).
  • Mark illumination_source and illumination_is_on volatile (they cross the ISR/main-loop boundary on Cortex-M7).
  • Add a default: case in turn_off_illumination_source() that falls back to turn_off_all_ports() — defensive against a future D-port being added to the ON switch but forgotten in the OFF switch (the same failure class this fix prevents).
  • Bump firmware 1.2 → 1.3, with matching bump in FirmwareSimSerial so version-gated Python code stays consistent under --simulation.

Test plan

  • PlatformIO build for Teensy 4.1 (pio run -e teensy41) — clean, no new warnings, firmware.hex produced
  • pytest software/tests/control/test_firmware_sim_serial.py tests/control/test_microcontroller.py tests/squid/test_filter_wheel.py tests/control/test_def.py — 87 passed, 1 pre-existing skip
  • python3 main_hcs.py --simulation --verbose — app boots clean, MCU heartbeats steady
  • Hardware verification still required — the race lives in C++ ISR code that FirmwareSimSerial does not model. Flash to a real Teensy and reproduce the original repro (HW-triggered live → switch 640 → 405) to confirm 640 nm goes dark.

Known follow-ups (not in this PR)

Surfaced by review, all pre-existing in master, deferred to keep this PR focused:

  • led_matrix_r/g/b, illumination_intensity, and illumination_port_is_on[] are not volatile despite being read in ISR context.
  • set_illumination(), set_illumination_led_matrix(), turn_off_illumination(), and turn_off_all_ports() are non-atomic vs the strobe ISR — main-loop callbacks don't use noInterrupts().
  • set_illumination() calling turn_on_illumination() mid-strobe (when illumination_is_on==true) can briefly drive two ports HIGH simultaneously.
  • turn_off_port() doesn't clear illumination_is_on, so the multi-port API can desync the flag from the physical pin set.

🤖 Generated with Claude Code

hongquanli and others added 3 commits May 23, 2026 22:50
…IGH on channel switch

In hardware-triggered live acquisition, switching channels (e.g. 640 nm -> 405 nm)
could leave the previous channel's port stuck HIGH. Root cause was a race in
ISR_strobeTimer() against Python's channel-switch sequence
(turn_off_illumination -> set_illumination -> turn_on_illumination): if the
strobe-start branch fired between turn_off and set_illumination, it turned on
the old port using the still-current illumination_source; after Python's
set_illumination updated illumination_source, the strobe-end branch turned off
the *new* port and the old one remained HIGH.

Fix: latch illumination_source into strobe_active_source[camera_channel] at
strobe start and turn off that latched source at strobe end, so the ISR always
extinguishes the port it lit. Extracted turn_off_illumination_source(int) as a
static helper; turn_off_illumination() now calls it with the global. The
strobe-end illumination_is_on clear is guarded so it does not invalidate a
Python turn_on_illumination() that already ran for the new channel.

Also marks illumination_source and illumination_is_on volatile, since they are
read/written by both ISR_strobeTimer() and main-loop command callbacks on
Teensy 4.1 (Cortex-M7).

Firmware version bumped 1.2 -> 1.3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rial 1.3

Follow-up to the strobe-source latch fix, addressing review feedback:

1. Extract turn_on_illumination_source(int) symmetric to the existing
   turn_off_illumination_source(int). The ISR now turns ON the latched source
   instead of re-reading the global illumination_source after the latch —
   closes the residual race where any future change to ISR priority or
   preemption could re-introduce the stuck-HIGH bug.

2. Add a default: case in turn_off_illumination_source() that calls
   turn_off_all_ports() as a safety net. If a future ILLUMINATION_D6 is added
   to the ON switch but forgotten in the OFF switch, the strobe end will fall
   through to a global shutdown instead of silently leaving the port HIGH —
   the same failure class this fix is preventing.

3. Mark strobe_active_source[] static so it cannot accidentally acquire
   external linkage from a future extern declaration.

4. Bump SimSerial.FIRMWARE_VERSION_MINOR to 3 (and the corresponding default
   in response_bytes_for + the test_filter_wheel mock helper) to match the
   real firmware. Keeps version-gated Python code consistent in simulation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five tests asserted the SimSerial-reported firmware version equals
(1, 2). Commit a39428e bumped SimSerial.FIRMWARE_VERSION_MINOR to 3
to match the real firmware but missed these assertions, causing CI
to fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a Teensy firmware race in ISR_strobeTimer() that could leave the previously-selected illumination channel stuck HIGH when Python switches channels mid hardware-triggered strobe. It does so by latching the active illumination source at strobe start and ensuring strobe end turns off that same latched source, and bumps the firmware version to 1.3 (with corresponding simulation/test updates).

Changes:

  • Latch illumination_source per camera channel at strobe start and use it for symmetric ON/OFF at strobe end to prevent stuck-HIGH illumination on channel switches.
  • Introduce symmetric turn_on_illumination_source(int) / turn_off_illumination_source(int) helpers and mark ISR-crossing globals as volatile.
  • Bump firmware version to 1.3 across firmware constants, SimSerial, and affected tests/mocks.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
firmware/controller/src/functions.cpp Latches strobe source in ISR; adds symmetric on/off helpers and guarded illumination_is_on clear logic
firmware/controller/src/globals.h Marks illumination_source / illumination_is_on as volatile for ISR/main-loop sharing
firmware/controller/src/globals.cpp Updates definitions to match volatile globals
firmware/controller/src/constants.h Bumps firmware version minor 2 → 3
software/control/microcontroller.py Bumps SimSerial-reported firmware version to 1.3 (including default response version)
software/tests/test_watchdog.py Updates expected SimSerial firmware version to (1, 3)
software/tests/test_multiport_illumination_protocol.py Updates version assertions to (1, 3)
software/tests/test_multiport_illumination_edge_cases.py Updates version assertions to (1, 3)
software/tests/test_multiport_illumination_bugs.py Updates version assertions to (1, 3)
software/tests/squid/test_filter_wheel.py Updates mock microcontroller default firmware version to (1, 3)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +342 to +343
// a pin stuck HIGH — the original bug class this fix is preventing.
turn_off_all_ports();
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants