codegen: disable broken PostRADualIssue pass (miscompiles on Maxwell) #4
HueponiK wants to merge 1 commit into
Can you provide a minimal test case of a shader that fails to compile properly? Dual issue is something I explicitly intended to support in UAM, and I would very much prefer to merge a PR that actually fixes issues instead of disabling the dual issue support. Looking at the code, it seems it may be possible to replace https://github.com/devkitPro/uam/blob/master/mesa-imported/codegen/nv50_ir_target_gm107.cpp#L294 with
Additionally, the overall style and formatting of both the text in the PR as well as the code strongly reminds me of LLM output. Did you fully author this yourself by hand, or were LLMs involved at any point?
Summary
PostRADualIssue is an experimental post-RA peephole pass that reorders
instructions to pack dual-issue pairs. Its commutation-legality check
(isChainedCommutationLegal() → Instruction::isCommutationLegal()) misses a
dependency, so it can move an instruction across a producer of a value it
reads. The dual-issued instruction then consumes a stale value, producing
an intermittent, timing-dependent miscompile.
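To illustrate the kind of check involved, here is a minimal sketch of a register-only commutation test. It is not the nv50_ir code: Insn, canCommute and canHoistAcross are hypothetical names, and a real check would presumably also have to consider memory, predicates, flags and scheduling barriers. It only shows why hoisting a reader above the producer of its source must be rejected.

```cpp
#include <set>
#include <vector>

// Hypothetical, simplified instruction model (NOT the nv50_ir Instruction class).
struct Insn {
    std::set<int> defs; // register ids written by the instruction
    std::set<int> srcs; // register ids read by the instruction
};

// True if the two register sets share at least one id.
static bool overlaps(const std::set<int>& a, const std::set<int>& b) {
    for (int r : a)
        if (b.count(r)) return true;
    return false;
}

// `first` currently executes before `second`. Swapping them is legal only if
// no register dependency links the two.
bool canCommute(const Insn& first, const Insn& second) {
    if (overlaps(first.defs, second.srcs)) return false; // RAW: second reads what first writes
    if (overlaps(first.srcs, second.defs)) return false; // WAR: second overwrites what first reads
    if (overlaps(first.defs, second.defs)) return false; // WAW: both write the same register
    return true;
}

// `mover` currently executes after every instruction in `chain`. Hoisting it
// above the whole chain requires it to commute with each element in turn.
bool canHoistAcross(const Insn& mover, const std::vector<Insn>& chain) {
    for (const Insn& earlier : chain)
        if (!canCommute(earlier, mover)) return false;
    return true;
}
```

In this model, skipping any one of the three overlap tests for any element of the chain is enough to let a consumer move above its producer, which matches the stale-value symptom described above.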
This PR disables the pass (it only runs at optLevel >= 2).
Symptom / reproduction
Video of the issue:
https://youtu.be/5T3hu8OA_C0
On a real Tegra X1 (Maxwell, sm_53), the miscompile shows up as dancing black
"knife-cut" artifacts in heavy multi-pass slang shaders — reproduced with
the crt-guest-advanced "deconvergence" final pass running on a deko3d backend.
The fault was isolated on-device with a per-pass disable mask over
optimizePostRA/optimizeSSA:
PostRADualIssue disabled
PostRADualIssue enabled (rest off)
So the artifact tracks PostRADualIssue specifically, not the optimization
level in general.
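A per-pass disable mask of this kind could look roughly like the sketch below. Everything here is an assumption for illustration: the PassBit values, the passEnabled helper and its parameters are hypothetical and are not taken from UAM or mesa; only the pass name PostRADualIssue and the optLevel >= 2 gate come from this PR.

```cpp
#include <cstdint>

// Hypothetical pass identifiers, one bit each.
enum PassBit : std::uint32_t {
    PASS_DUAL_ISSUE    = 1u << 0, // stands in for PostRADualIssue
    PASS_OTHER_POST_RA = 1u << 1, // placeholder for another post-RA pass
    PASS_OTHER_SSA     = 1u << 2, // placeholder for an SSA-level pass
};

// A pass runs only if the optimization level reaches its threshold and its
// bit is not set in the disable mask.
bool passEnabled(std::uint32_t disableMask, PassBit pass, int optLevel, int minLevel) {
    if (optLevel < minLevel) return false;
    return (disableMask & pass) == 0;
}
```

With such a mask, "PostRADualIssue disabled" corresponds to setting only PASS_DUAL_ISSUE, and "PostRADualIssue enabled (rest off)" to setting every other bit, so a single pass can be tested without changing the optimization level.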
Why disabling is the right call
PostRADualIssue is not part of mainline mesa/nouveau, which performs no
dual-issue scheduling. It was imported from karolherbst's never-merged
dual_issue_v3 branch plus local tweaks. Disabling it restores the
known-good mainline behaviour; the level-2 SSA passes still provide the
instruction-count reduction.
A complete fix would instead harden isChainedCommutationLegal() to account
for the missing dependency; until someone does that, disabling the pass is the
safe choice.
References
There's quite a chain of related work behind this:
https://codeberg.org/hueponik/goosestation-builder - libretro core. It includes the needed fixes as patches. Hopefully we'll be able to upstream them.
libretro/RetroArch#19043 - deko3d driver for RA
HueponiK/RetroArch#1 - shaders for deko3d driver