fw/drivers/imu/lis2dw12: harden against FIFO/INT1 stream stalls#1484
Open
jplexer wants to merge 5 commits into
…tall detection

The INT1 watchdog used the time of the last INT1 edge to decide whether the FIFO stream had stalled. Shake/wake-up function interrupts are routed onto the same INT1 pad (CTRL7 INT2_ON_INT1), so wrist motion kept kicking the watchdog while the FIFO threshold stream itself was dead, leaving the stall undetected until the accel session was recreated (e.g. by toggling Health on the phone). Field logs from FIRM-2490 show exactly this: steps frozen and orientation stale with no watchdog warnings at all.

Track the time of the last successful FIFO burst read instead, which is the actual signal we care about. The trip threshold gains a 2x margin over the FIFO threshold period: the old 1x threshold tripped at 1004 ms against a 1000 ms period, i.e. on normal scheduling jitter.

Fixes FIRM-2490

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Joshua Jun <lets@throw.rocks>
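A minimal sketch of the detection rule this commit message describes, assuming a hypothetical `StallState` struct and helper names (none of these identifiers are taken from the driver). The point is that only a completed FIFO burst read refreshes the timestamp, and the trip condition is 2x the FIFO threshold period.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state; names are illustrative, not the driver's. */
typedef struct {
  uint32_t last_fifo_read_ms; /* refreshed only after a successful FIFO burst read */
  uint32_t fifo_period_ms;    /* time for the FIFO to fill to its threshold */
} StallState;

/* Called from the FIFO read path only. INT1 edges caused by shake/wake-up
 * events must not refresh this timestamp. */
static void stall_note_fifo_read(StallState *s, uint32_t now_ms) {
  s->last_fifo_read_ms = now_ms;
}

/* True when no burst read has completed within 2x the FIFO period. */
static bool stall_detected(const StallState *s, uint32_t now_ms) {
  uint32_t elapsed_ms = now_ms - s->last_fifo_read_ms; /* unsigned math is wrap-safe */
  return elapsed_ms > 2u * s->fifo_period_ms;
}
```

With a 1000 ms period, a read that arrives 1004 ms after the previous one no longer trips the check, since the limit under these assumptions is 2000 ms.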
…very

The watchdog and overrun recovery paths only rewrote FIFO_CTRL. That discards up to a full FIFO (~1.3 s of samples at 25 Hz) on every recovery, and it cannot repair upsets to ODR (CTRL1) or INT routing (CTRL4/5/7): field logs from FIRM-2490 show recovery firing three times in under a minute without keeping the stream alive.

Introduce a shared recovery helper used by both the watchdog and the FIFO overrun branch that quiesces INT routing (forcing a latched-high pad low so re-enabling yields a fresh rising edge), drains queued samples before they are lost to the bypass write, re-asserts ODR, FIFO mode and INT routing, and clears latched function INT sources. The previously unused num_recoveries counter is now incremented and logged so field logs can distinguish first-time from repeat stalls.

Drained samples are timestamped at read time like regular FTH reads, so their timestamps are late by up to one FIFO period; unchanged from the existing behavior.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Joshua Jun <lets@throw.rocks>
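The ordering of the recovery steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the driver's helper: `MockAccel`, `RecoveryStep`, `mock_step` and `accel_recover_stream` are hypothetical names, and each step only records itself in a trace where real code would perform register accesses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical step labels, in the order the commit message lists them. */
typedef enum {
  STEP_QUIESCE_INT,   /* disable INT routing so a latched-high pad goes low */
  STEP_DRAIN_FIFO,    /* read queued samples before the bypass write drops them */
  STEP_RESTORE_ODR,   /* re-assert CTRL1 */
  STEP_RESTORE_FIFO,  /* re-assert FIFO mode via FIFO_CTRL */
  STEP_RESTORE_INT,   /* re-assert CTRL4/5/7 routing */
  STEP_CLEAR_LATCHED, /* clear latched function INT sources */
  STEP_COUNT
} RecoveryStep;

/* Mock device: records steps instead of touching hardware. */
typedef struct {
  RecoveryStep trace[STEP_COUNT];
  size_t trace_len;
  uint32_t num_recoveries;
} MockAccel;

static void mock_step(MockAccel *dev, RecoveryStep step) {
  if (dev->trace_len < STEP_COUNT) {
    dev->trace[dev->trace_len++] = step;
  }
}

/* Shared by the watchdog and the overrun branch; returns the running count
 * so the caller can log it. */
static uint32_t accel_recover_stream(MockAccel *dev) {
  dev->trace_len = 0;
  mock_step(dev, STEP_QUIESCE_INT);
  mock_step(dev, STEP_DRAIN_FIFO);
  mock_step(dev, STEP_RESTORE_ODR);
  mock_step(dev, STEP_RESTORE_FIFO);
  mock_step(dev, STEP_RESTORE_INT);
  mock_step(dev, STEP_CLEAR_LATCHED);
  dev->num_recoveries++;
  return dev->num_recoveries;
}
```

Returning the counter lets a caller log whether a stall is a first occurrence or a repeat, which is the distinction the commit message wants visible in field logs.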
…ling

While sampling is active, accel_peek returns the last FIFO sample, so a stalled stream freezes the data consumed by stationary-mode motion detection and activity orientation checks. The stall then masks itself: the watch reports being motionless and flat, enters stationary mode on a moving wrist, and never logs anything (FIRM-2490).

Detect staleness directly in accel_peek and queue the shared stall check, giving a second, caller-driven trigger alongside the watchdog timer. The check re-validates staleness on the serialized driver work queue, so schedule/execute races with reconfiguration are benign, and a pending flag prevents frequent peek callers from flooding the work queue.

The cached (stale but bounded) sample is still returned rather than an error or a one-shot measurement: an error would regress stationary detection, and the one-shot path rewrites CTRL1 mid-stream and blocks the caller for up to 100 ms.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Joshua Jun <lets@throw.rocks>
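The peek-side trigger described above might look roughly like this. All names (`PeekState`, `peek`, `stall_check_run`) are hypothetical, the work queue is stood in for by a counter, and the pending flag is a plain `bool`; real firmware would likely need an atomic or a lock if peek callers run on other threads.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { int16_t x, y, z; } Sample;

/* Hypothetical driver state for illustration only. */
typedef struct {
  Sample last_sample;          /* most recent sample read from the FIFO */
  uint32_t last_fifo_read_ms;  /* time of the last successful burst read */
  uint32_t stall_threshold_ms;
  bool check_pending;          /* a stall check is already queued */
  uint32_t checks_queued;      /* stand-in for the driver work queue */
} PeekState;

/* Always returns the cached sample; queues at most one stall check. */
static Sample peek(PeekState *s, uint32_t now_ms) {
  bool stale = (now_ms - s->last_fifo_read_ms) > s->stall_threshold_ms;
  if (stale && !s->check_pending) {
    s->check_pending = true;
    s->checks_queued++;
  }
  return s->last_sample;
}

/* Runs on the serialized work queue. Staleness is re-validated here, so a
 * check queued just before a reconfiguration turns into a no-op.
 * Returns true when recovery should run. */
static bool stall_check_run(PeekState *s, uint32_t now_ms) {
  s->check_pending = false;
  return (now_ms - s->last_fifo_read_ms) > s->stall_threshold_ms;
}
```

Repeated peeks while a check is pending do not enqueue more work, which is the flood protection the commit message refers to.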
accel_set_num_samples discarded any samples queued in the FIFO via the bypass write, losing up to a full FIFO of data on every subscriber reconfiguration. Drain the FIFO first when sampling was previously active, resolving the long-standing FIXME.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Joshua Jun <lets@throw.rocks>
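As a rough illustration of drain-before-bypass, the sketch below models the FIFO as an in-memory array. `MockFifo`, `fifo_drain`, `fifo_enter_bypass` and `reconfigure` are hypothetical names, and the 32-sample depth is an assumption consistent with the "~1.3 s at 25 Hz" figure quoted in this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIFO_DEPTH 32 /* assumed depth: 32 samples at 25 Hz is about 1.3 s */

typedef struct { int16_t x, y, z; } Sample;

/* In-memory stand-in for the sensor FIFO. */
typedef struct {
  Sample fifo[FIFO_DEPTH];
  size_t level; /* number of queued samples */
} MockFifo;

/* Copies queued samples out, oldest first; returns how many were copied. */
static size_t fifo_drain(MockFifo *f, Sample *out, size_t out_cap) {
  size_t n = f->level < out_cap ? f->level : out_cap;
  memcpy(out, f->fifo, n * sizeof(Sample));
  memmove(f->fifo, f->fifo + n, (f->level - n) * sizeof(Sample));
  f->level -= n;
  return n;
}

/* Models the bypass write: whatever is still queued is discarded. */
static void fifo_enter_bypass(MockFifo *f) {
  f->level = 0;
}

/* Drain only if sampling was active, then reset the FIFO via bypass. */
static size_t reconfigure(MockFifo *f, bool was_sampling, Sample *out, size_t out_cap) {
  size_t drained = 0;
  if (was_sampling) {
    drained = fifo_drain(f, out, out_cap);
  }
  fifo_enter_bypass(f);
  return drained;
}
```

Passing an output buffer of at least `FIFO_DEPTH` samples ensures nothing is left behind for the bypass write to discard.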
The stall threshold scales with the FIFO threshold period, which is derived from the most demanding subscriber. An app requesting per-sample updates at 25 Hz yields a 40 ms period and thus an 80 ms threshold, which normal work-queue latency can exceed, tripping spurious recoveries. Floor the threshold at one second, matching the watchdog timer granularity.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Joshua Jun <lets@throw.rocks>
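The arithmetic above can be written out as a small helper. This is a sketch with hypothetical names (`stall_threshold_ms`, `STALL_MARGIN`, `STALL_FLOOR_MS`); only the 2x margin and the one-second floor come from this series.

```c
#include <stdint.h>

#define STALL_MARGIN 2u        /* 2x margin over the FIFO threshold period */
#define STALL_FLOOR_MS 1000u   /* floor matching the watchdog timer granularity */

/* samples_per_update: FIFO threshold in samples (most demanding subscriber).
 * odr_hz: output data rate in Hz; must be nonzero. */
static uint32_t stall_threshold_ms(uint32_t samples_per_update, uint32_t odr_hz) {
  uint32_t period_ms = (samples_per_update * 1000u) / odr_hz;
  uint32_t threshold_ms = STALL_MARGIN * period_ms;
  return threshold_ms < STALL_FLOOR_MS ? STALL_FLOOR_MS : threshold_ms;
}
```

For per-sample updates at 25 Hz the period is 40 ms and the unfloored threshold would be 80 ms, so the helper returns 1000 ms; for a 1000 ms period (25 samples at 25 Hz) it returns 2000 ms.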
gmarull reviewed Jun 11, 2026
gmarull (Member) left a comment:
let me take this carefully, first question: has this been tested on hw?
jplexer (Member, Author):
regression-tested normal steps/shake on getafix_dvt2, and verified the recovery path with a fault-injection build (killed INT routing / FIFO mode / ODR behind the driver's back), all recovering successfully
Summary
Investigation of FIRM-2490 ("Step count not working, started working after toggling health") showed the accel sample stream on obelix stalls: steps stop counting, activity sees frozen orientation, and stationary mode engages on a moving wrist. Recreating the accel session (the health toggle) fixed it because it fully re-arms the FIFO and INT1 — this series makes the driver do that itself.
Root cause mechanics:
Changes
Fixes FIRM-2490
Related: FIRM-2285, FIRM-1626, FIRM-1141
Testing
./waf test green (200 suites, including test_accel_manager)
"FIFO stream stalled for N ms" / "Recovering accel stream (count N)" followed by steps continuing to count, and no recovery storms
🤖 Generated with Claude Code