VN-300 at 400 Hz with 921600 baud + per-sample GPS timestamps#642
VN-300 at 400 Hz with 921600 baud + per-sample GPS timestamps#642sritchie wants to merge 2 commits into
Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Firmware ArtifactsMain firmware (Gen3 box)
Each External display firmware
Each The huVVer-AVI binary is built but not yet validated on real hardware — see the bring-up checklist for what to verify on first flash. X-Plane plugin
Drop the Built from
|
37695a5 to
c6fc64c
Compare
Three findings from a careful code review of the original PR: 1. CRITICAL — real-hardware production builds use InitWithStream() (which does NOT call pSerial->begin()), so the per-EFIS-type baud logic in EfisSerialPort::Init() never ran in production. The UART was hardcoded to 115200 in EfisReadTaskInit() regardless of EFIS type. Effect: Vac's VN-300 would receive ~55 KB/s on a 11.5 KB/s ceiling and get garbage. Plumbed a baud parameter through EfisReadTaskInit() and let setup() in .ino choose 921600 for VN-300 and 115200 for everything else. The Serial2 fallback path keeps working (EfisSerialPort::Init still picks per-type baud). 2. Vn300.h class doc comment said sync was "0xFA followed by 0x19" (the OLD groups byte). Implementation correctly syncs on 0xFA 0x1B. Comment fixed. 3. The VNCC commands in docs/site/docs/efis-integration/vectornav.md used *XX placeholders, which work in VNCC's terminal (which computes checksums) but FAIL silently in a plain serial terminal (which sends *XX literally and the VN-300 rejects). Doc now shows both forms — *XX for VNCC users and the precomputed *50, *7E, *57 values for hand-typed terminal use. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Addressed three findings from the adversarial review (commit b296a8a): Critical — UART baud wasn't actually being set in production builds. The reviewer flagged a missing Stale comment. Doc checksums. Findings I did NOT act on (also from the review):
pio test -e native: 1169 pass, 1 skip. pio run -e esp32s3-v4p: SUCCESS. |
## Universal writer + web-mutex hardening Five concurrent changes that improve data-logging reliability and web responsiveness at every supported IMU rate (50 / 208 Hz today; opens the door for the 416 Hz follow-up PR). Each change stands on its own; they combined to bring a benchtop stress run from 4% CSV loss to 0% and web p95 latency from ~25 sec to ~1.7 sec under aggressive load. ## Changes ### Writer hardening - **\`LogSensor.cpp\`**: bump \`WRITE_BUF_SIZE\` 2 KB → **32 KB**, **PSRAM-allocated**. A 2 KB buffer forced many small writes per PERF window, each acquiring \`xWriteMutex\` — under web load this serialised aggressively against \`/api/logs\` and \`/aoaconfig\` handlers that also need the mutex. 32 KB lets the writer drain in a small number of large iterations. PSRAM allocation matters because in-BSS 32 KB allocation fragmented the internal SRAM heap below WiFi's EAPOL contiguous-buffer threshold. - **\`LogSensor.cpp\`**: add an **8 KB size gate + 100 ms age gate** (whichever fires first) before each SD write. Universal at any rate: at 416 Hz the size gate dominates (8 KB writes ~20/sec); at 208 Hz the age gate dominates (~4 KB writes ~10/sec); at 50 Hz the age gate fires but writes wait for at least one full sector. The 5 sec \`f.sync()\` still bounds the unsynced tail. - **\`Globals.h\`**: bump \`kLoggingRingBufferBytes\` **256 KB → 1 MB** (PSRAM). Sized for the worst case (416 Hz, ~400 B rows, 200 ms SD pause = ~33 KB needed; 1 MB gives ~6 sec of headroom). Effectively free at 50/208 Hz. ### Observability - **\`SensorIO.cpp\` + \`LogSensor.cpp\`**: new \`g_uImuMaxLateUsAllTime\` counter that survives PERF window resets, exposed as \`imu_lateMaxUsAT\` in the PERF heartbeat. Catches bursty multi-ms IMU stalls that the existing per-window \`imu_lateMaxUs\` loses when the heartbeat fires mid-event. - **\`SensorIO.cpp\`**: loosen \`iLogRate\` equality checks (\`== 208\` → \`>= 208\`, \`!= 208\` → \`< 208\`) so future higher rates take the IMU-rate write path correctly. No effect at 50/208 Hz today. ### Mutex hardening - **\`ConfigWebServer.cpp\`**: replace four \`portMAX_DELAY\` \`xAhrsMutex\` takes in web handlers (HandleConfigSave, Upload, bias-cal) with bounded takes (100 ms / 500 ms / 1 s / 500 ms). Each has a 503 / skip-and-warn fallback. Web handlers no longer block \`ImuReadTask\` indefinitely when AHRS state is contended; pilot sees a "system busy" rather than a 10+ sec hang. Affected handlers are slow paths that aren't exercised in steady-state flight anyway. ### Boot-order - **\`OnSpeed-Gen3-ESP32.ino\`**: defer \`ImuReadTask\` creation to the END of \`setup()\` in the Sensors data-source mode. The task's \`delayMicroseconds()\` busy-wait at priority 5 starves the Arduino setup task (priority 1) on Core 1 — at 208 Hz the impact is benign because the busy-wait window is small relative to the period, but at higher rates it blocks \`CfgWebServerInit()\` entirely and the WiFi AP never comes up. Spawning \`ImuReadTask\` last keeps the door open for that follow-up. ## Verification \`\`\`bash pio run -e esp32s3-v4p # SUCCESS (16% flash, 22% RAM) pio test -e native # 1168 passed, 1 skipped — no regressions \`\`\` **Bench stress (45 min \`--realistic\` profile at 416 Hz, on a stacked branch):** | Metric | Before this PR | After this PR | |---|---|---| | CSV data loss | 4.14% (~15K rows) | **0.00%** | | Worst inter-sample gap | 613 ms | **3.5 ms** | | Web \`/api/logs\` p95 | ~25 s | **1.7 s** | | Web \`/api/logs\` errors | 23 timeouts | **0** | | Worst SD write_max | 230 ms | 230 ms (unchanged; absorbed by ring) | | imu_lateMaxUsAT peak | (couldn't measure) | **780 µs** | ## Related - **#644** — Deferred 416 Hz hardening (AHRS retune, native goldens, atomic AHRS snapshots, etc.) - **#645** — Decouple log-producer task from sensor/AHRS tasks (architectural follow-up enabling arbitrary log rates 1 Hz–416 Hz) - **#627** — High-rate IMU follow-ups (mutex contention, scheduling jitter) - **#642** — VN-300 wire format change (separate PR; independent of this one) --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #637. Wire format changes from a 127-byte / 50 Hz / 115200 frame to a 138-byte frame with four active groups (Common + Time + GNSS1 + AHRS). New per-sample timestamps land in vnTimeStartupNs (ns since boot, always advancing) and vnTimeGpsNs (ns since GPS epoch, valid once dateOk). The old 5 Hz GNSS1.UTC string column is dropped. OnSpeed's EFIS serial port now opens at 921600 baud for VN-300; the new frame at 400 Hz is 55.2 kB/s on the wire, which 115200 cannot carry. Other EFIS types (Dynon, Garmin, MGL) keep their 115200 baud and their native broadcast rates (10-50 Hz). The VN-300 hardware must be reconfigured to match: write reg 75 ($VNWRG,75,1,1,1B,01E3,0200,0090,0142), reg 5 ($VNWRG,5,921600,1), then $VNWNV to persist. Coupling is intentional and fail-closed — the parser memcmps the 10-byte header so an unconfigured VN-300 produces zero vn* data rather than corrupt output. Group masks verified against the canonical vnproglib enum: - Common 0x01E3 = TimeStartup + TimeGps + AngularRate + Position + Velocity + Accel - Time 0x0200 = TimeStatus - GNSS1 0x0090 = Fix + VelNed - AHRS 0x0142 = YawPitchRoll + LinearAccelBody + YprU Changes - software/Libraries/onspeed_core/src/efis/Vn300.{h,cpp}: 138-byte frame layout; new kHeader[10]; sync detection on 0xFA 0x1B; TimeStartup/TimeGps/TimeStatus decode; szTimeUTC + formatTimeHmsMs removed - software/Libraries/onspeed_core/src/types/LogRow.h: vnTimeUtc string + kLogRowUtcTimeLen replaced by three numeric fields - software/Libraries/onspeed_core/src/proto/LogCsv.{h,cpp}: column header line and row emission use the three new numeric columns; the comma-in-vnTimeUtc assertion is removed (u64/u8 can't produce commas) - software/Libraries/onspeed_core/src/proto/LogCsvHeaderIndex.{h,cpp}: three new idx fields; require-all list grows from 26 to 28; unused TakeString helper removed - software/sketch_common/src/io/EfisSerialPort.{h,cpp}: SuVN300Data carries the three new fields; Init() picks 921600 for VN-300, 115200 for everything else - software/sketch_common/src/tasks/LogSensor.cpp: plumb new fields into LogRow; drop VN-300 utcOrNull source for the sidecar meta builder (reconstruct offline from vnTimeGpsNs if needed) - software/Libraries/onspeed_core/src/test_frames/SynthFrames.cpp: 138-byte synth frame - test/test_efis_vn300: new fixture builder, new tests for per-sample timestamps and TimeStatus; old-format rejection test - test/test_efis_vn300_crc_parity: 137-byte fuzz buffer - test/test_log_csv, test/test_log_csv_header_index: new column names, new round-trip assertions; obsolete vnTimeUtc tests dropped - test/test_efis_dispatcher, test/test_synth_frames: 138-byte builder - docs/site: vectornav.md rewritten for 400 Hz / 921600 / new VNCC config recipe; log-columns.md describes the three new columns Verification pio test -e native 1169 passed, 1 skipped pio run -e esp32s3-v4p SUCCESS (15.7% flash, 22.3% RAM) tools/regression/run_snapshot all 5 goldens match Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three findings from a careful code review of the original PR: 1. CRITICAL — real-hardware production builds use InitWithStream() (which does NOT call pSerial->begin()), so the per-EFIS-type baud logic in EfisSerialPort::Init() never ran in production. The UART was hardcoded to 115200 in EfisReadTaskInit() regardless of EFIS type. Effect: Vac's VN-300 would receive ~55 KB/s on a 11.5 KB/s ceiling and get garbage. Plumbed a baud parameter through EfisReadTaskInit() and let setup() in .ino choose 921600 for VN-300 and 115200 for everything else. The Serial2 fallback path keeps working (EfisSerialPort::Init still picks per-type baud). 2. Vn300.h class doc comment said sync was "0xFA followed by 0x19" (the OLD groups byte). Implementation correctly syncs on 0xFA 0x1B. Comment fixed. 3. The VNCC commands in docs/site/docs/efis-integration/vectornav.md used *XX placeholders, which work in VNCC's terminal (which computes checksums) but FAIL silently in a plain serial terminal (which sends *XX literally and the VN-300 rejects). Doc now shows both forms — *XX for VNCC users and the precomputed *50, *7E, *57 values for hand-typed terminal use. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b296a8a to
85b227b
Compare
|
Rebased on master (which now includes #647's 416 Hz support + the two reboot-path bug fixes that landed with it). Force-pushed to 85b227b. Auto-merge handled the only overlap ( Preserved both intent surfaces:
Verification: Vac can now test the full stack: 416 Hz IMU + VN-300 at 921600 baud + 400 Hz GPS timestamps + the clean web-reboot path. Coordination requirement is unchanged: VN-300 must be reconfigured via VNCC (commands in |
VN-300 at 400 Hz with 921600 baud + per-sample GPS timestamps
Closes #637.
Wire format change for the VN-300: the 5 Hz-latched
GNSS1.UTCfield is replaced with per-sampleTimeStartup+TimeGpsfrom the Common group, andTimeStatusfrom the Time group. Output rate bumps to 400 Hz; baud bumps to 921600 to fit the 138-byte frame (55.2 kB/s on the wire, ~60% of 921600 — 115200 cannot carry it).The VN-300 hardware must be reconfigured to match (write reg 75, write reg 5,
\$VNWNVto persist). Coupling is intentional and fail-closed — the parsermemcmps the 10-byte header, so an unconfigured VN-300 produces zerovn*data rather than corrupt output. The VNCC command for Vac is\$VNWRG,75,1,1,1B,01E3,0200,0090,0142followed by\$VNWRG,5,921600,1and\$VNWNV.Other EFIS types (Dynon, Garmin, MGL) keep their 115200 baud — the new per-EFIS-type baud logic only affects VN-300. They also keep their native broadcast rates (10–50 Hz). OnSpeed's EFIS reader is byte-arrival-driven (
Serial.available()loop, not polled), so device-side rate is just whatever the EFIS sends.Changes
software/Libraries/onspeed_core/src/efis/Vn300.{h,cpp}— 138-byte frame layout, 10-bytekHeader, sync detection on0xFA 0x1B, decode of TimeStartup/TimeGps/TimeStatus, removal ofszTimeUTC+formatTimeHmsMs.LogRow.h—vnTimeUtc[32]+kLogRowUtcTimeLenreplaced byvnTimeStartupNs(u64),vnTimeGpsNs(u64),vnTimeStatus(u8).LogCsv.{h,cpp},LogCsvHeaderIndex.{h,cpp}— three new numeric columns; the comma-in-vnTimeUtc safety assertion is removed (u64/u8 can't produce commas); require-all column count grows 26 → 28.EfisSerialPort.{h,cpp}—Init()picks 921600 for VN-300, 115200 for everything else. Live reconfigure on EFIS-type changes via the existingRequestTypeChangepath; no reboot needed for runtime swaps.LogSensor.cpp, sketch-sideSuVN300Data— new fields plumbed through; sidecarutcOrNullsource for VN-300 dropped (reconstruct offline fromvnTimeGpsNsif needed).SynthFrames.cpp— 138-byte builder.test_efis_vn300fixture; new tests for per-sample timestamps andTimeStatus; old-format rejection test; updated CRC-parity buffer to 137; updatedtest_log_csvandtest_log_csv_header_indexschemas + fixture; updatedtest_efis_dispatcherandtest_synth_framesbuilders.efis-integration/vectornav.mdrewritten for 400 Hz / 921600 / new VNCC recipe;log-columns.mddescribes the three new columns.Group masks (verified against vnproglib enum)
Header (10 B) + payload (126 B) + CRC (2 B) = 138 B total.
Testing
```bash
pio test -e native # 1169 passed, 1 skipped
pio run -e esp32s3-v4p # SUCCESS — 15.7% flash, 22.3% RAM
tools/regression/run_snapshot.py # all 5 goldens match (no flight-replay drift)
```
🤖 Generated with Claude Code