
[codex] Update MLX submodule for native Vulkan MXFP quantization#56

Merged: goniz merged 2 commits into main from codex/update-mlx-native-mxfp-quant on Jun 18, 2026

goniz (Owner) commented Jun 18, 2026

Summary

Updates the mlx submodule pointer to include native Vulkan MXFP4/MXFP8 quantization support from goniz/mlx#56.

This PR intentionally contains only the submodule gitlink update:

  • mlx: 75812951eedf0a7f82b1898e56c2281500d2e6c3 -> 2679a23ab9a46c5e8545e917735fd2468d441978

Validation

Validation was run against the updated submodule checkout:

  • ./dev.sh build
  • DEVICE=gpu ./dev.sh test-py mlx/python/tests/test_quantized.py -k "mxfp4_quantize_dequantize or mxfp8_quantize_dequantize"
  • Full GPU Python suite was run by the wrapper: 699 passed, 13 skipped
  • ./dev.sh generate completed with coherent output
  • ./dev.sh model-report --json-output benchmarks/model_generation_report.json completed successfully for all default models

Benchmarks

Compared against latest matching rows in benchmarks/results.csv from 2026-06-04T07:22:00Z.

| Model | Metric | CSV baseline | Current | Diff |
| --- | --- | ---: | ---: | ---: |
| Qwen3-0.6B-bf16 | prompt_tps | 2410.127 | 2873.965 | +463.838 (+19.25%) |
| Qwen3-0.6B-bf16 | generation_tps | 65.748 | 65.964 | +0.216 (+0.33%) |
| Qwen3-0.6B-bf16 | peak_memory_gb | 2.614 | 2.614 | +0.000 (+0.00%) |
| Qwen3-0.6B-8bit | prompt_tps | 1360.581 | 1493.403 | +132.822 (+9.76%) |
| Qwen3-0.6B-8bit | generation_tps | 87.609 | 88.624 | +1.015 (+1.16%) |
| Qwen3-0.6B-8bit | peak_memory_gb | 2.056 | 2.056 | +0.000 (+0.00%) |
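
The Diff column appears to be the current value minus the CSV baseline, with the percentage taken relative to the baseline. A minimal sketch for rechecking a row follows; `BenchmarkDiff` and `compute_diff` are hypothetical names used only for illustration and are not part of the repository's benchmark tooling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type; not part of the repository's benchmark tooling.
struct BenchmarkDiff {
  double absolute;  // current - baseline
  double percent;   // 100 * (current - baseline) / baseline
};

// Computes the absolute and relative change of a metric.
// Assumes the baseline is finite and non-zero.
inline BenchmarkDiff compute_diff(double baseline, double current) {
  assert(std::isfinite(baseline) && baseline != 0.0);
  const double absolute = current - baseline;
  return {absolute, 100.0 * absolute / baseline};
}
```

For the Qwen3-0.6B-bf16 prompt_tps row, `compute_diff(2410.127, 2873.965)` gives an absolute change of about 463.838 and a relative change that rounds to 19.25%, which agrees with the reported values.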

goniz marked this pull request as ready for review on June 18, 2026 11:21

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

https://github.com/goniz/mlx-vulkan/blob/d65c00a99de9fadf1d31e1380e589862cb45f159/mlx/mlx/backend/vulkan/fast.cpp#L413
P2: Gate f16 norm shader on device support

On devices where VulkanContext::shader_float16_supported() is false, f16 LayerNorm still selects norm_f16 here. As a result, the fallback block below never casts x to float32, and dispatch_norm_op tries to create a shader that uses float16_t. The previous path always copied non-f32 inputs to f32 and used norm_f32, so this is a regression for f16 LayerNorm on Vulkan devices without shader-float16 support. Return nullopt when the feature is unavailable so that the existing f32 fallback runs.
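
The gating suggested above could look roughly like the sketch below. It is not the code in fast.cpp: `NormDtype` and `select_native_norm_shader` are hypothetical names, and the capability flag is passed in as a plain `bool` standing in for the result of `VulkanContext::shader_float16_supported()`. Only the shader names `norm_f16` and `norm_f32` and the "return nullopt to trigger the f32 fallback" behavior come from the review comment.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the input dtype; the real backend uses MLX's own dtype type.
enum class NormDtype { Float32, Float16, Other };

// Hypothetical selector. Returns the name of a shader that can consume the
// input as-is, or std::nullopt to tell the caller to cast the input to
// float32 and dispatch norm_f32 instead.
inline std::optional<std::string> select_native_norm_shader(
    NormDtype dtype, bool shader_float16_supported) {
  switch (dtype) {
    case NormDtype::Float32:
      return std::string("norm_f32");
    case NormDtype::Float16:
      // Without shader-float16 support the f16 shader cannot be created,
      // so defer to the f32 fallback path.
      if (!shader_float16_supported) {
        return std::nullopt;
      }
      return std::string("norm_f16");
    default:
      // Any other dtype takes the f32 fallback path in this sketch.
      return std::nullopt;
  }
}
```

With this shape, an f16 input on a device without the feature yields `std::nullopt`, so the caller falls through to the cast-to-f32 path rather than requesting a shader that uses `float16_t`.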

goniz force-pushed the codex/update-mlx-native-mxfp-quant branch from d65c00a to bd2b46b on June 18, 2026 11:28
goniz force-pushed the codex/update-mlx-native-mxfp-quant branch from bd2b46b to 939abdd on June 18, 2026 11:34
goniz merged commit 4f51590 into main on Jun 18, 2026
1 check passed