feat: SG-42630: Add crash-dump reporting capability to RV #1312
Draft
bernie-laberge wants to merge 1 commit into
Add minidump-based crash reporting on macOS, Linux, and Windows using Google Crashpad (in-process capture + out-of-process handler) and Google Breakpad (offline symbolication tooling). Breakpad and Crashpad are kept disjoint.

Capture & context
- One shared TwkApp::initializeCrashHandler() init path for rv, the macOS RV bundle, and rvio; per-platform crashpad_handler wrappers (Linux adds a ulimit -t guard). Handler naming is fixed per platform.
- Annotation table (platform, qt_version, gpu_vendor/renderer, py_*, mu_*, python_caller). Every key has exactly one table entry; addAnnotation() warns on unmapped keys in debug builds and buffers pre-init annotations, flushing them on successful init so delivery is init-order-independent.
- Mu execution context is captured by a dependency-free ExecutionObserver hook in the Mu core plus a MuCrashObserver in the app layer (function-level fidelity); Python context via PyEval_SetTrace. The crash() Mu command is a test-only trigger and carries no production logic.

Symbols
- Build-time Breakpad .sym generation (dump_syms) on macOS/Linux and native PDBs on Windows; the intermediate dSYM is removed after symbol generation so it does not pollute the staged plugin directories. Symbols are stripped from the customer (Release) package and packaged by the 'symbols_archive' target into a versioned, per-platform zip for offline symbolication; OpenRV only produces the archive (upload is the external pipeline's job).
- symbolicate_crash.sh (portable; with a --symbols override) resolves a dump against an archived symbol tree.

Tests & docs
- doctest unit test (src/test/CrashHandlerTest) and an end-to-end smoke test (src/test/CrashDumpSmokeTest), enabled by default in CI on every platform that builds the Crashpad handler. The smoke test asserts the platform annotation where minidump_dump is available (macOS/Linux) and verifies dump production only on Windows.
- Normative spec docs/crash-reporting.md; user-facing guide docs/rv-manuals/rv-crash-reporting.md (RV_CRASH_DUMPS_ENABLED and friends).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Bernard Laberge <bernard.laberge@autodesk.com>
Linked issues
Fixes #1268
Summarize your change.
Adds minidump-based crash reporting to RV on macOS, Linux, and Windows, using Google Crashpad for in-process capture (plus the out-of-process crashpad_handler) and Google Breakpad for offline symbolication tooling (dump_syms, minidump_stackwalk, minidump_dump). The two libraries are kept strictly disjoint. The intended architecture and its invariants are documented normatively in docs/crash-reporting.md.
Describe the reason for the change.
RV had no way to capture actionable crash reports from end users. This adds automatic minidump capture
with enough context — which Mu/Python script was executing, GPU info, and the attached session log —
to symbolicate and debug a crash offline, without needing a reproduction.
Describe what you have tested and on which operating system.
- rv/RV/rvio; a triggered crash() produces a minidump with the expected annotations (platform, gpu_*, py_*, mu_function); symbols are stripped from the Release install; the symbols_archive zip is produced; a real dump symbolicates end-to-end via symbolicate_crash.sh; the unit + smoke tests pass under ctest.
- symbols_archive produces RV-<version>-windows-amd64-symbols.zip containing the PDBs; CrashHandlerTest builds and passes.
- crashpad_handler wrapper with the ulimit -t guard, .sym generation, install rules); the unit + smoke tests run in CI on Linux.
Add a list of changes, and note any that might need special attention during the review.
Crash capture
- src/lib/base/TwkUtil/CrashHandler.{h,cpp}: singleton Crashpad handler (annotation table, log attachment, init-order-independent pre-init annotation buffering).
- src/lib/app/MuTwkApp/CrashHandlerInit.{h,cpp}: one shared init path for rv, the macOS RV bundle, and rvio; per-platform handler wrappers (src/bin/apps/rv/crashpad_handler_*.sh.in).
Script context: ⚠️ please review the Mu-core touch points
- src/lib/mu/Mu/Mu/ExecutionObserver.h + hooks in MachineRep.cpp / Thread.cpp: a dependency-free execution-observer hook in the Mu core (no TwkUtil/Qt dependency); the crash-specific MuCrashObserver lives in the app layer (src/lib/app/MuTwkApp). Function-level fidelity.
- The crash() Mu command (IPMu/CommandsModule.cpp) is a test-only trigger and carries no production logic.
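The split described above (a hook interface in the core with no outside dependencies, plus a crash-specific observer in the app layer) might look roughly like the following. This is a minimal sketch, not the PR's code: only the names ExecutionObserver and MuCrashObserver come from the description, and the method names, the observerSlot() accessor, the ScopedCall guard, and the plain string standing in for the mu_function annotation are all assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// "Core" side: depends only on the standard library (hypothetical API).
namespace core {

class ExecutionObserver {
public:
    virtual ~ExecutionObserver() = default;
    virtual void functionEntered(const std::string& name) = 0;
    virtual void functionExited(const std::string& name) = 0;
};

// Single global observer slot; nullptr means no observer is installed.
inline ExecutionObserver*& observerSlot() {
    static ExecutionObserver* slot = nullptr;
    return slot;
}

// RAII guard an interpreter call site could place around a function call,
// so the exit notification also fires if the call unwinds via an exception.
class ScopedCall {
public:
    explicit ScopedCall(const std::string& name)
        : name_(name), observer_(observerSlot()) {
        if (observer_) observer_->functionEntered(name_);
    }
    ~ScopedCall() {
        if (observer_) observer_->functionExited(name_);
    }
    ScopedCall(const ScopedCall&) = delete;
    ScopedCall& operator=(const ScopedCall&) = delete;

private:
    std::string name_;
    ExecutionObserver* observer_;
};

}  // namespace core

// "App" side: the crash-specific observer. A real one would forward the
// value to the crash handler's annotation API; here it is just a string.
namespace app {

class MuCrashObserver : public core::ExecutionObserver {
public:
    void functionEntered(const std::string& name) override {
        stack_.push_back(name);
        publish();
    }
    void functionExited(const std::string&) override {
        if (!stack_.empty()) stack_.pop_back();
        publish();
    }
    // Stand-in for the value of a "mu_function" annotation.
    const std::string& currentFunction() const { return current_; }

private:
    void publish() { current_ = stack_.empty() ? std::string() : stack_.back(); }
    std::vector<std::string> stack_;
    std::string current_;
};

}  // namespace app
```

The point of the shape is that the core namespace never names the crash handler: the app installs its observer through the slot, and the core only ever calls through the abstract interface.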
Symbols & packaging: ⚠️ note the OpenRV ↔ pipeline boundary
- cmake/macros/rv_generate_symbols.cmake: build-time .sym (and native PDBs on Windows); the intermediate .dSYM is removed after generation so it does not pollute the staged plugin dirs.
- cmake/macros/rv_archive_symbols.cmake (+ rv_collect_pdbs.cmake): the symbols_archive target builds a versioned, per-platform symbol zip. OpenRV only produces the archive; uploading it to a symbol store is left to the external build/release pipeline (no external infrastructure assumed).
- cmake/install/pre_install*.cmake: strip symbols from the customer (Release) package.
- src/bin/apps/rv/symbolicate_crash.sh: portable; adds a --symbols <dir> override.
Tests: ⚠️ enabled by default in CI
- src/test/CrashHandlerTest (doctest unit) and src/test/CrashDumpSmokeTest (end-to-end). The smoke test launches RV with crash() and asserts a dump is produced (annotations too where minidump_dump is available); reviewers may want to confirm the headless launch behaves on their CI runners.
Docs
- docs/crash-reporting.md (normative/technical spec).
- docs/rv-manuals/rv-crash-reporting.md: user-facing guide added to the docs site, covering how crash reporting works, where dumps are written per platform, and how to disable it (RV_CRASH_DUMPS_ENABLED=0).
If possible, provide screenshots.
N/A — this is a build/runtime backend feature with no UI.
When launching RV, the crash dumps directory is shown:
Symbolicating a crash report:
Example of a symbolicated crash