Enable NVFP4 grouped MLP cuDNN wgrad #2

Open
sraman-rgb wants to merge 5 commits into main from nvfp4-grouped-mlp-wgrad

@sraman-rgb (Owner) commented May 30, 2026

Based on main after PR 3048 merged.

Changes:

  • Add NVFP4 formatting for the cuDNN grouped MLP wgrad helper.
  • Enable the NVFP4 cuDNN wgrad path only when NVTE_CUTEDSL_FUSED_GROUPED_MLP_NVFP4_WGRAD=1.
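
For readers unfamiliar with this kind of opt-in gate, the second change can be sketched roughly as below. Only the environment variable name comes from this PR; the helper name `nvfp4_wgrad_enabled`, the `environ` parameter, and the strict comparison against `"1"` are assumptions for illustration and may not match how the actual code parses the flag.

```python
import os

# Variable name taken from the PR description above.
_NVFP4_WGRAD_ENV = "NVTE_CUTEDSL_FUSED_GROUPED_MLP_NVFP4_WGRAD"


def nvfp4_wgrad_enabled(environ=None):
    """Hypothetical helper: True only when the flag is set to exactly "1".

    `environ` lets callers pass a plain dict for testing; by default the
    real process environment is consulted.
    """
    env = os.environ if environ is None else environ
    # An unset variable is treated as "0", so the path stays off by default.
    return env.get(_NVFP4_WGRAD_ENV, "0") == "1"
```

Under this sketch, an unset variable or any value other than `"1"` leaves the NVFP4 cuDNN wgrad path disabled, so existing runs are unaffected unless the user opts in explicitly.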

Prior validation:

  • Python lint: pass
  • Full grouped MLP fused tests with NVFP4 wgrad enabled: pass

sraman-rgb changed the base branch from nvfp4-grouped-mlp-srelu to main on June 1, 2026 20:40
Signed-off-by: Siddhartha Raman S <sraman@nvidia.com>
sraman-rgb force-pushed the nvfp4-grouped-mlp-wgrad branch from 4446afe to 63be7c7 on June 1, 2026 20:42
pre-commit-ci Bot and others added 4 commits June 1, 2026 21:41
Signed-off-by: Siddhartha Raman S <sraman@nvidia.com>
Signed-off-by: Siddhartha Raman S <sraman@nvidia.com>
Signed-off-by: Siddhartha Raman S <sraman@nvidia.com>