[WIP] Enable hlo lit tests for gfx1250 by nurmukhametov · Pull Request #857 · ROCm/xla

nurmukhametov · 2026-05-12T15:12:45Z

Motivation

[Do not merge] To get review

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

i-chaochen · 2026-05-12T15:28:09Z

        "a100_pcie_80",
        "a6000",
        "b200",
+        "gfx1250",


claude · 2026-05-15T20:45:02Z

+// RUN: %if IS_ROCM %{ hlo-opt %s --platform=gpu --stage=llvm-before-optimizations --xla_gpu_target_config_filename=%S/../target_config/specs/gfx1250.txtpb | FileCheck %s --check-prefixes=CHECK-TDM %}
+// RUN: %if IS_ROCM %{ hlo-opt %s --platform=gpu --stage=llvm-before-optimizations --xla_gpu_target_config_filename=%S/../target_config/specs/mi200.txtpb | FileCheck %s --check-prefixes=CHECK-NOTDM %}


nit: Both RUN lines are guarded by %if IS_ROCM, which means this test is a no-op on CUDA builds — it will be scheduled (and vacuously pass) on all seven CUDA targets (a100, a6000, b200, h100, p100, v100) since it's not in their disabled_on_gpus lists.

Consider either:

Adding the CUDA GPUs to disabled_on_gpus for this test, or

Adding %if !IS_ROCM RUN lines that validate something on CUDA (similar to how dot_bf16.hlo handles both paths).

Thanks, addressed this

claude · 2026-05-15T20:45:04Z

+// CHECK-NOTDM-NOT: @llvm.amdgcn.tensor.load.to.lds
+// CHECK-NOTDM-NOT: @llvm.amdgcn.s.wait.tensorcnt
+// CHECK-NOTDM-NOT: @llvm.amdgcn.tensor.store.from.lds
+// CHECK-NOTDM: @llvm.amdgcn.raw.ptr.buffer.load


nit: The CHECK-NOTDM-NOT directives only check for the absence of TDM intrinsics up to the final positive CHECK-NOTDM: @llvm.amdgcn.raw.ptr.buffer.load match. If TDM intrinsics appeared after the buffer load in the output, they wouldn't be caught.

Adding a CHECK-NOTDM-NOT: @llvm.amdgcn.raw.ptr.buffer directive after line 15 would make the negative check cover the rest of the output too.

claude · 2026-05-15T20:45:06Z

+  shared_memory_per_block: 327680
+  shared_memory_per_block_optin: 327680
+  shared_memory_per_core: 327680


The TODO is noted — just flagging the downstream risk: shared_memory_per_block, shared_memory_per_block_optin, and shared_memory_per_core are all set to 327680 (320KB). On NVIDIA GPUs these form a tiered model (per_block < per_block_optin <= per_core). If gfx1250 has a similar tiered model, setting all three equal could cause the Triton codegen to request block-level shared memory beyond what the hardware supports without opt-in. Worth verifying once real hardware specs are available.

claude · 2026-05-15T20:45:19Z

Claude Review Summary

This PR adds gfx1250 support to the HLO lit test infrastructure — new GPU target config spec, a TDM elementwise lit test, and updates to Bazel build rules and lit.bzl.

Key observations:

The new triton_tdm_elemwise.hlo test is ROCm-only but will vacuously pass on CUDA targets — consider disabling it for CUDA GPUs or adding CUDA-specific checks.
The CHECK-NOTDM-NOT pattern has a minor coverage gap after its last positive check.
The shared memory placeholder values (all equal at 320KB) are acknowledged by a TODO — worth verifying against real hardware specs when available.
The disabled_on_gpus list for gfx1250 is clean (only references existing files), unlike some stale entries in mi200's list.

3 inline comments posted with details.

PiperOrigin-RevId: 917794602

Imported from GitHub PR openxla#42500 📝 Summary of Changes Clean-up rocm_configre 🎯 Justification Removing dead code not relevant to current implementation 🚀 Kind of Contribution Please remove what does not apply: ♻️ Cleanup, 📊 Benchmark (for Performance Improvements) Not relevant 🧪 Unit Tests: Not relevant 🧪 Execution Tests: Not relevant Copybara import of the project: -- 122d8d3 by Alexandros Theodoridis <atheodor@amd.com>: Clean-up rocm_configure Merging this change closes openxla#42500 COPYBARA_INTEGRATE_REVIEW=openxla#42500 from ROCm:clean_up_rocm_configure 122d8d3 PiperOrigin-RevId: 917803372

…t mode. Imported from GitHub PR openxla#42185 📝 Summary of Changes The JAX ROCm Bazel presubmit workflow was originally using 64-bit mode. For now, it is being set to 32-bit mode for broader testing. Eventually, both will be enabled for full testing. The only change is to the ROCm CI workflow file (`.github/workflows/rocm_ci.yml`). The flag `JAX_ENABLE_X64` is now set to 0 instead of 1. Please see the associated PR on the JAX side for reference: jax-ml/jax#37475 🎯 Justification This will help the ROCm JAX tests in XLA align with what is being tested in JAX itself on the ROCm side. 🚀 Kind of Contribution 🧪 Tests Copybara import of the project: -- 8b599a9 by tsrw2048 <239799652+tsrw2048@users.noreply.github.com>: [ROCm] Move JAX ROCm Bazel presubmit check to 32-bit mode. The JAX ROCm Bazel presubmit workflow was originally using 64-bit mode. For now, it is being set to 32-bit mode for broader testing. Eventually, both will be enabled for full testing. Merging this change closes openxla#42185 COPYBARA_INTEGRATE_REVIEW=openxla#42185 from ROCm:rocm-move-jax-ci-32bit 8b599a9 PiperOrigin-RevId: 917813152

Minor additions and clarifications. PiperOrigin-RevId: 917816750

Adds a gfx1250 target spec to xla/backends/gpu/target_config/specs and wires it into the hlo_lit_tests suite in xla/backends/gpu/tests/BUILD. Generalizes the existing mi200 special-case in xla/lit.bzl into a small rocm_gpus list so rocm-only tag apply to any ROCm target, not just mi200. Add hlo lit test for Triton TDM intrinsic lowering xla/backends/gpu/tests/triton_tdm_elemwise.hlo

i-chaochen reviewed May 12, 2026

View reviewed changes

Comment thread xla/backends/gpu/tests/BUILD

"a100_pcie_80",

"a6000",

"b200",

"gfx1250",

Copy link
Copy Markdown

Collaborator

i-chaochen May 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 :)

nurmukhametov force-pushed the anurmukh/hlo-tests-gfx1250 branch 3 times, most recently from 9a45087 to d425382 Compare May 15, 2026 20:32

nurmukhametov added the claude-review Request a Claude AI code review for this PR label May 15, 2026

claude Bot reviewed May 15, 2026

View reviewed changes

github-actions Bot removed the claude-review Request a Claude AI code review for this PR label May 15, 2026

nurmukhametov force-pushed the anurmukh/hlo-tests-gfx1250 branch 2 times, most recently from d621bc8 to e975b2f Compare May 15, 2026 21:14

dimitar-asenov and others added 5 commits May 19, 2026 06:21

[XLA] Update the PR size message with a link to Small CLs.

119f7d0

PiperOrigin-RevId: 917794602

[XLA] Update the contribution guidelines.

7bc2f6e

Minor additions and clarifications. PiperOrigin-RevId: 917816750

nurmukhametov force-pushed the anurmukh/hlo-tests-gfx1250 branch from e975b2f to 6ec42ef Compare May 19, 2026 14:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Enable hlo lit tests for gfx1250#857

[WIP] Enable hlo lit tests for gfx1250#857
nurmukhametov wants to merge 5 commits into
mainfrom
anurmukh/hlo-tests-gfx1250

nurmukhametov commented May 12, 2026

Uh oh!

i-chaochen May 12, 2026

Uh oh!

claude Bot May 15, 2026

Uh oh!

nurmukhametov May 15, 2026

Uh oh!

claude Bot May 15, 2026

Uh oh!

nurmukhametov May 15, 2026

Uh oh!

claude Bot May 15, 2026

Uh oh!

claude Bot commented May 15, 2026 •

edited by nurmukhametov

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

		// RUN: %if IS_ROCM %{ hlo-opt %s --platform=gpu --stage=llvm-before-optimizations --xla_gpu_target_config_filename=%S/../target_config/specs/gfx1250.txtpb \| FileCheck %s --check-prefixes=CHECK-TDM %}
		// RUN: %if IS_ROCM %{ hlo-opt %s --platform=gpu --stage=llvm-before-optimizations --xla_gpu_target_config_filename=%S/../target_config/specs/mi200.txtpb \| FileCheck %s --check-prefixes=CHECK-NOTDM %}

Conversation

nurmukhametov commented May 12, 2026

Motivation

Submission Checklist

Uh oh!

i-chaochen May 12, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

nurmukhametov May 15, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

nurmukhametov May 15, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 15, 2026 • edited by nurmukhametov Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Review Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

claude Bot commented May 15, 2026 •

edited by nurmukhametov

Loading