ideogram: use _scaled_mm for fp8 matmul, sorry non-Ada-or-newer owners by bghira · Pull Request #2740 · bghira/SimpleTuner

bghira · 2026-06-05T03:18:50Z

This pull request adds an FP8 scaled-matmul path, built on torch._scaled_mm, to the ideogram quantized loading helpers for compatible CUDA devices. The main change is a custom autograd function that accelerates FP8 linear layers when the hardware and software requirements are met.

FP8 Scaled Matmul Support:

Added a function _scaled_mm_supported to check if the current environment and tensor are suitable for using the optimized FP8 scaled matrix multiplication, with an environment variable override for manual control.
Introduced the _Fp8LinearScaledMm custom autograd function to perform forward and backward passes using torch._scaled_mm for FP8 inputs, including dynamic input scaling and proper handling of gradients.
Updated the forward method in the quantized linear class to use the new scaled FP8 path when supported, falling back to the previous dequantized path otherwise.
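
The gating and fallback described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the environment variable name `EXAMPLE_FP8_SCALED_MM`, the function signatures, and the choice to let a forced-on override skip only the compute capability check are all assumptions. The 8.9 threshold is taken from the "Ada or newer" wording in the title.

```python
import os

# Hypothetical variable name; the PR description does not give the real one.
OVERRIDE_ENV = "EXAMPLE_FP8_SCALED_MM"

# Ada Lovelace GPUs report CUDA compute capability 8.9.
MIN_CAPABILITY = (8, 9)


def scaled_mm_supported(is_cuda, capability, has_scaled_mm, env=None):
    """Return True when the scaled FP8 matmul path should be used."""
    env = os.environ if env is None else env
    override = env.get(OVERRIDE_ENV, "").strip().lower()
    if override in {"0", "false", "off"}:
        return False  # manual opt-out always wins
    if not (is_cuda and has_scaled_mm):
        return False  # the op needs a CUDA tensor and a build that ships it
    if override in {"1", "true", "on"}:
        return True  # assumed behavior: forcing on skips the capability check
    return tuple(capability) >= MIN_CAPABILITY


def choose_path(is_cuda, capability, has_scaled_mm, env=None):
    """Pick the scaled path when supported, else the dequantized fallback."""
    if scaled_mm_supported(is_cuda, capability, has_scaled_mm, env):
        return "scaled_mm"
    return "dequantized"
```

In real code the three inputs would likely come from `tensor.is_cuda`, `torch.cuda.get_device_capability()`, and `hasattr(torch, "_scaled_mm")`; they are passed in as plain values here so the sketch runs without a GPU.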

General Improvements:

Added import of the os module to support environment variable checks.
Defined FP8_INPUT_DTYPE for clarity and consistency in FP8 input handling.
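
For readers unfamiliar with the dynamic input scaling mentioned for _Fp8LinearScaledMm, the arithmetic behind per-tensor scaling can be sketched in plain Python. This assumes the FP8 input format is e4m3 (largest finite value 448), which the description does not state, and it ignores the rounding that a real FP8 cast performs. All function names here are hypothetical.

```python
# Largest finite value representable in float8 e4m3fn.
FP8_E4M3_MAX = 448.0


def dynamic_scale(values):
    """Per-tensor scale that maps the largest magnitude onto the FP8 maximum."""
    amax = max((abs(v) for v in values), default=0.0)
    # Guard against an all-zero input so the later division stays finite.
    return max(amax, 1e-12) / FP8_E4M3_MAX


def scale_and_clamp(values, scale):
    """Divide by the scale and clamp into the representable FP8 range."""
    return [min(max(v / scale, -FP8_E4M3_MAX), FP8_E4M3_MAX) for v in values]


def scaled_dot(xq, x_scale, wq, w_scale):
    """Dot product of scaled operands, rescaled back to the original units."""
    acc = sum(a * b for a, b in zip(xq, wq))
    return acc * x_scale * w_scale


x = [0.5, -2.0, 1.0]
w = [1.0, 0.25, -1.0]
x_scale = dynamic_scale(x)
w_scale = dynamic_scale(w)
result = scaled_dot(scale_and_clamp(x, x_scale), x_scale,
                    scale_and_clamp(w, w_scale), w_scale)
```

Because the scale is recomputed from each input's own maximum, activations with very different ranges still use the full FP8 range. A scaled matmul applies the same two scale factors to the accumulated product, which is why the operands can stay in FP8 during the multiply.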

ideogram: use _scaled_mm for fp8 matmul, sorry non-Ada-or-newer owners

e090bf1

bghira requested a review from Copilot June 5, 2026 03:20

Copilot started reviewing on behalf of bghira June 5, 2026 03:20

This comment was marked as resolved.


bghira added 5 commits June 5, 2026 13:26

ideogram fp8 accelerated path fix

f168641

Fix Ideogram FP8 scaled-mm bias handling

16adb40

Fix Ideogram FP8 scaled-mm output dtype

560d0a3

Fix Ideogram validation prompt device handling

ed85a4d

Fix Ideogram FP8 scaled-mm scale layout

30299b6

bghira merged commit faf9177 into main Jun 7, 2026
2 checks passed

bghira merged 6 commits into main from feature/ideogram-scaled-mm
