
feat: SG-42630: Add crash-dump reporting capability to RV #1312

Draft
bernie-laberge wants to merge 1 commit into AcademySoftwareFoundation:main from bernie-laberge:add_crash_dump_capability

bernie-laberge (Contributor) commented Jun 19, 2026

Linked issues

Fixes #1268

Summarize your change.

Adds minidump-based crash reporting to RV on macOS, Linux, and Windows, using Google Crashpad
for in-process capture (plus the out-of-process crashpad_handler) and Google Breakpad for
offline symbolication tooling (dump_syms, minidump_stackwalk, minidump_dump). The two libraries
are kept strictly disjoint. The intended architecture and its invariants are documented normatively in
docs/crash-reporting.md.

Describe the reason for the change.

RV had no way to capture actionable crash reports from end users. This adds automatic minidump capture
with enough context — which Mu/Python script was executing, GPU info, and the attached session log —
to symbolicate and debug a crash offline, without needing a reproduction.

Describe what you have tested and on which operating system.

  • macOS (Apple Silicon) — full local validation: handler init across rv/RV/rvio; a triggered
    crash() produces a minidump with the expected annotations (platform, gpu_*, py_*,
    mu_function); symbols are stripped from the Release install; the symbols_archive zip is produced;
    a real dump symbolicates end-to-end via symbolicate_crash.sh; the unit + smoke tests pass under
    ctest.
  • Windows (VS 2022, Release) — validated: the Release install strips all PDBs; symbols_archive
    produces RV-<version>-windows-amd64-symbols.zip containing the PDBs; CrashHandlerTest builds and
    passes.
  • Linux (Rocky) — the implementation carries Linux-specific support (the crashpad_handler
    wrapper with the ulimit -t guard, .sym generation, install rules); the unit + smoke tests run in
    CI on Linux.

Add a list of changes, and note any that might need special attention during the review.

Crash capture

  • src/lib/base/TwkUtil/CrashHandler.{h,cpp} — singleton Crashpad handler: annotation table, log
    attachment, init-order-independent pre-init annotation buffering.
  • src/lib/app/MuTwkApp/CrashHandlerInit.{h,cpp} — one shared init path for rv, the macOS RV
    bundle, and rvio; per-platform handler wrappers (src/bin/apps/rv/crashpad_handler_*.sh.in).
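
The pre-init buffering described above can be illustrated with a minimal sketch. This is not the code in CrashHandler.cpp: the class name `AnnotationBuffer` and its methods are hypothetical, the live table is a plain map rather than a Crashpad annotation table, and thread safety is left out for brevity.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the handler's annotation path (illustrative names).
class AnnotationBuffer
{
public:
    // Before init the annotation is queued; after init it goes straight
    // to the live table that the crash handler would read.
    void add(const std::string& key, const std::string& value)
    {
        if (m_initialized)
        {
            m_live[key] = value;
        }
        else
        {
            m_pending.emplace_back(key, value);
        }
    }

    // Called once the handler has started successfully: replay the queued
    // annotations in arrival order so that the latest value for a key wins.
    void markInitialized()
    {
        m_initialized = true;
        for (const auto& kv : m_pending)
        {
            m_live[kv.first] = kv.second;
        }
        m_pending.clear();
    }

    // Value currently visible to the handler, or an empty string if none.
    std::string live(const std::string& key) const
    {
        const auto it = m_live.find(key);
        return it == m_live.end() ? std::string() : it->second;
    }

    std::size_t pendingCount() const { return m_pending.size(); }

private:
    bool m_initialized = false;
    std::vector<std::pair<std::string, std::string>> m_pending;
    std::map<std::string, std::string> m_live;
};
```

With this shape, a caller that sets `gpu_vendor` before the handler starts and a caller that sets it afterwards end up with the same live table, which is the init-order independence the bullet refers to.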

Script context ⚠️ please review the Mu-core touch points

  • src/lib/mu/Mu/Mu/ExecutionObserver.h + hooks in MachineRep.cpp / Thread.cpp: a
    dependency-free execution-observer hook in the Mu core (no TwkUtil/Qt dependency); the
    crash-specific MuCrashObserver lives in the app layer (src/lib/app/MuTwkApp). Context is
    recorded at function-level fidelity. The crash() Mu command (IPMu/CommandsModule.cpp) is a
    test-only trigger and carries no production logic.
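
For reviewers unfamiliar with the pattern, here is a rough sketch of a dependency-free observer hook with the crash-specific part kept outside the core. All names (`Interpreter`, `LastFunctionObserver`, `enterFunction`) are assumptions made for illustration; the real interface in ExecutionObserver.h and the hook sites in MachineRep.cpp / Thread.cpp may look different.

```cpp
#include <string>

// Core side: a plain interface that uses no Qt or TwkUtil types.
class ExecutionObserver
{
public:
    virtual ~ExecutionObserver() = default;
    virtual void enterFunction(const char* qualifiedName) = 0;
};

// Core side (hypothetical): the interpreter holds at most one observer
// and notifies it each time a function is entered.
class Interpreter
{
public:
    void setObserver(ExecutionObserver* observer) { m_observer = observer; }

    void call(const char* qualifiedName)
    {
        if (m_observer != nullptr)
        {
            m_observer->enterFunction(qualifiedName);
        }
        // ... the function body would execute here ...
    }

private:
    ExecutionObserver* m_observer = nullptr;
};

// App side (hypothetical): remembers the most recently entered function so
// it could be published as an annotation such as mu_function.
class LastFunctionObserver : public ExecutionObserver
{
public:
    void enterFunction(const char* qualifiedName) override
    {
        m_last = (qualifiedName != nullptr) ? qualifiedName : "";
    }

    const std::string& lastFunction() const { return m_last; }

private:
    std::string m_last;
};
```

The point of the split is that the core only knows about the abstract interface, so the interpreter library gains no link-time dependency on the crash handler.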

Symbols & packaging ⚠️ note the OpenRV ↔ pipeline boundary

  • cmake/macros/rv_generate_symbols.cmake — build-time .sym (and native PDBs on Windows); the
    intermediate .dSYM is removed after generation so it does not pollute the staged plugin dirs.
  • cmake/macros/rv_archive_symbols.cmake (+ rv_collect_pdbs.cmake) — the symbols_archive target
    builds a versioned, per-platform symbol zip. OpenRV only produces the archive; uploading it to a
    symbol store is left to the external build/release pipeline (no external infrastructure assumed).
  • cmake/install/pre_install*.cmake — strip symbols from the customer (Release) package.
  • src/bin/apps/rv/symbolicate_crash.sh — portable; adds a --symbols <dir> override.

Tests ⚠️ enabled by default in CI

  • src/test/CrashHandlerTest (doctest unit) and src/test/CrashDumpSmokeTest (end-to-end). The smoke
    test launches RV with crash() and asserts a dump is produced (annotations too where minidump_dump
    is available); reviewers may want to confirm the headless launch behaves on their CI runners.

Docs

  • docs/crash-reporting.md (normative/technical spec).
  • docs/rv-manuals/rv-crash-reporting.md — user-facing guide added to the docs site: how crash reporting works, where dumps are written per platform, and how to disable it (RV_CRASH_DUMPS_ENABLED=0).
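
As an illustration of the opt-out switch, a check along these lines would honor RV_CRASH_DUMPS_ENABLED=0. The function names are hypothetical, and the policy shown (only the exact value "0" disables; unset or any other value leaves reporting on) is an assumption, since the description above only documents the "0" case.

```cpp
#include <cstdlib>
#include <string>

// Pure helper so the policy can be tested without touching the environment.
// Assumption: only the exact string "0" disables crash dumps.
bool crashDumpsEnabledFor(const char* value)
{
    return value == nullptr || std::string(value) != "0";
}

// Reads the documented environment variable and applies the policy above.
bool crashDumpsEnabled()
{
    return crashDumpsEnabledFor(std::getenv("RV_CRASH_DUMPS_ENABLED"));
}
```

A caller would test `crashDumpsEnabled()` before starting the handler and skip initialization entirely when it returns false.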

If possible, provide screenshots.

N/A — this is a build/runtime backend feature with no UI.

When launching RV, the crash dumps directory is shown:

rel (run rvcfg/rvmk) labergb@MTLGJKW517F71 MacOS % ./rv
INFO: real-time thread priorities set (bus=100000kHz)
./rv
Version 4.0.0 (RELEASE), built on Jun 19 2026 at 10:52:14 (HEAD=c5a74aaf).
Copyright Contributors to the Open RV Project
INFO: Crash handler initialized successfully
INFO: Crash dumps will be saved to: /Users/labergb/Library/Logs/ASWF/Crashes/

Symbolicating a crash report:

% ./symbolicate_crash.sh /Users/labergb/Library/Logs/ASWF/Crashes/pending/4b94d812-9292-4d51-a19d-093c0954560b.dmp 
Symbolicating crash dump...

Example of a symbolicated crash

Operating system: Mac OS X
                  26.5.1 25F80
CPU: arm64
     10 CPUs

GPU: Apple Apple M1 Pro

Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Crash address: 0x0
Process uptime: 28 seconds

Thread 0 (crashed)
 0  rv!IPMu::crash(Mu::Node const&, Mu::Thread&) + 0x8
     x0 = 0x0000000116704508    x1 = 0x0000000114cbce00
     x2 = 0x0000000114cbce00    x3 = 0x0000000130e5ed11
     x4 = 0x0000000000000000    x5 = 0x0000000000000000
     x6 = 0x0000000000000000    x7 = 0xfffff0003ffff800
...

========================================
Crashpad Annotations
========================================

gpu_renderer: Apple M1 Pro
gpu_vendor: Apple
python_caller: 
py_function: _update_media_info
py_script_line: 446
py_script_file: /Users/labergb/openrv_bernie2/OpenRV/_build/stage/app/RV.app/Contents/PlugIns/Python/multiple_source_media_rep.py
mu_script_file: 
mu_function: rvui.clearEverything

bernie-laberge changed the title from "SG-42630: Add crash-dump reporting capability to RV" to "feat: SG-42630: Add crash-dump reporting capability to RV" on Jun 19, 2026
bernie-laberge marked this pull request as draft on June 19, 2026 19:55
bernie-laberge force-pushed the add_crash_dump_capability branch 4 times, most recently from 10e7ea5 to 45072e1, on June 19, 2026 20:37
bernie-laberge changed the title from "feat: SG-42630: Add crash-dump reporting capability to RV" to "[ Add crash-dump reporting capability to RV ]" on Jun 19, 2026
bernie-laberge changed the title from "[ Add crash-dump reporting capability to RV ]" to "[ SG-42630: Add crash-dump reporting capability to RV ]" on Jun 19, 2026
bernie-laberge changed the title from "[ SG-42630: Add crash-dump reporting capability to RV ]" to "feat: SG-42630: Add crash-dump reporting capability to RV" on Jun 19, 2026
bernie-laberge force-pushed the add_crash_dump_capability branch 4 times, most recently from cddefb1 to 6ed91b2, on June 20, 2026 02:13
Add minidump-based crash reporting on macOS, Linux, and Windows using Google
Crashpad (in-process capture + out-of-process handler) and Google Breakpad
(offline symbolication tooling). Breakpad and Crashpad are kept disjoint.

Capture & context
- One shared TwkApp::initializeCrashHandler() init path for rv, the macOS RV
  bundle, and rvio; per-platform crashpad_handler wrappers (Linux adds a
  ulimit -t guard). Handler naming is fixed per platform.
- Annotation table (platform, qt_version, gpu_vendor/renderer, py_*, mu_*,
  python_caller). Every key has exactly one table entry; addAnnotation() warns on
  unmapped keys in debug builds and buffers pre-init annotations, flushing them on
  successful init so delivery is init-order-independent.
- Mu execution context is captured by a dependency-free ExecutionObserver hook in
  the Mu core plus a MuCrashObserver in the app layer (function-level fidelity);
  Python context via PyEval_SetTrace. The crash() Mu command is a test-only
  trigger and carries no production logic.

Symbols
- Build-time Breakpad .sym generation (dump_syms) on macOS/Linux and native PDBs
  on Windows; the intermediate dSYM is removed after symbol generation so it does
  not pollute the staged plugin directories. Symbols are stripped from the
  customer (Release) package and packaged by the 'symbols_archive' target into a
  versioned, per-platform zip for offline symbolication; OpenRV only produces the
  archive (upload is the external pipeline's job).
- symbolicate_crash.sh (portable; with a --symbols override) resolves a dump
  against an archived symbol tree.

Tests & docs
- doctest unit test (src/test/CrashHandlerTest) and an end-to-end smoke test
  (src/test/CrashDumpSmokeTest), enabled by default in CI on every platform that
  builds the Crashpad handler. The smoke test asserts the platform annotation
  where minidump_dump is available (macOS/Linux) and verifies dump production only
  on Windows.
- Normative spec docs/crash-reporting.md; user-facing guide
  docs/rv-manuals/rv-crash-reporting.md (RV_CRASH_DUMPS_ENABLED and friends).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Bernard Laberge <bernard.laberge@autodesk.com>
bernie-laberge force-pushed the add_crash_dump_capability branch from 6ed91b2 to c449f6f on June 20, 2026 02:27
Development

Successfully merging this pull request may close these issues.

Add crash-dump generation capabilities
