[release/1.12.0] Cherry-pick: Fix 'Memory access fault' for fused dense on MI350 (#325) #331

Merged
pragupta merged 1 commit into release/1.12.0 from cherry-pick/release-1.12.0/fix-fused-dense-mi350
May 26, 2026

@srinivamd srinivamd commented May 21, 2026

Cherry-pick of #325 to release/1.12.0

JIRA: ROCM-24646
Original PR: #325
Original commit: af25af4

Summary

Cherry-picks the fix for GPU memory access fault in fused_dense_cuda on MI350 (gfx950).

Root cause: gemm_lt() was using at::cuda::getCurrentCUDABlasHandle() (regular hipBLAS handle) instead of at::cuda::getCurrentCUDABlasLtHandle() (hipBLASLt handle) for hipBLASLt operations, causing a write-to-read-only-page fault.

Fix: Replace getCurrentCUDABlasHandle() with getCurrentCUDABlasLtHandle() in gemm_lt().
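
The handle mix-up can be illustrated with a small self-contained sketch. Everything in it is a hypothetical stand-in: `MockHandle`, `mock_get_blas_handle()`, `mock_get_blaslt_handle()`, and `mock_lt_matmul()` are not part of PyTorch, hipBLAS, or hipBLASLt, and the sketch assumes only that an Lt entry point interprets whatever handle it receives as its own handle type. Where the real library would touch memory it does not own, the mock reports a mismatch instead of faulting.

```cpp
#include <cassert>

// Hypothetical tag that marks which library a mock handle belongs to.
enum class HandleKind { Blas, BlasLt };

// Hypothetical handle object; the real handles are opaque library types.
struct MockHandle {
    HandleKind kind;
};

// Stand-in for at::cuda::getCurrentCUDABlasHandle(): the regular BLAS handle.
void* mock_get_blas_handle() {
    static MockHandle handle{HandleKind::Blas};
    return &handle;
}

// Stand-in for at::cuda::getCurrentCUDABlasLtHandle(): the Lt handle.
void* mock_get_blaslt_handle() {
    static MockHandle handle{HandleKind::BlasLt};
    return &handle;
}

// Stand-in for an Lt matmul call. It returns true only when it is given
// an Lt handle; a real library would not perform this check.
bool mock_lt_matmul(void* opaque_handle) {
    assert(opaque_handle != nullptr);
    const MockHandle* handle = static_cast<const MockHandle*>(opaque_handle);
    return handle->kind == HandleKind::BlasLt;
}

// Before the fix: the Lt call receives the regular BLAS handle.
bool gemm_lt_before() {
    return mock_lt_matmul(mock_get_blas_handle());
}

// After the fix: the Lt call receives the Lt handle.
bool gemm_lt_after() {
    return mock_lt_matmul(mock_get_blaslt_handle());
}
```

In this sketch `gemm_lt_before()` returns `false` and `gemm_lt_after()` returns `true`, mirroring the one-line change in `gemm_lt()`: only the handle accessor differs between the two paths.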

Use getCurrentCUDABlasLtHandle() instead of getCurrentCUDABlasHandle()
in gemm_lt() to get the correct hipBLASLt handle, fixing a
write-to-read-only-page fault on gfx950.

Cherry-pick of af25af4
@pragupta pragupta merged commit f5590ec into release/1.12.0 May 26, 2026
@pragupta pragupta deleted the cherry-pick/release-1.12.0/fix-fused-dense-mi350 branch May 26, 2026 17:22