[codex] Update MLX submodule for native Vulkan MXFP quantization #56
💡 Codex Review
https://github.com/goniz/mlx-vulkan/blob/d65c00a99de9fadf1d31e1380e589862cb45f159/mlx/mlx/backend/vulkan/fast.cpp#L413
Gate f16 norm shader on device support
On devices where `VulkanContext::shader_float16_supported()` is false, f16 LayerNorm now still selects `norm_f16` here, so the fallback block below never casts `x` to float32 and `dispatch_norm_op` will try to create a shader that uses `float16_t`. The previous path always copied non-f32 inputs to f32 and used `norm_f32`, so f16 LayerNorm regresses on Vulkan devices without shader-float16 support. Return `nullopt` when the feature is unavailable so the existing f32 fallback runs.
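The gating suggested in this review can be sketched as follows. This is a minimal illustration, not the code in `fast.cpp`: `NormDtype`, `select_native_norm_shader`, and `resolve_norm_shader` are hypothetical names, and the boolean parameter stands in for the result of `VulkanContext::shader_float16_supported()`. Only the shader names `norm_f16` and `norm_f32` come from the review. Routing every other non-f32 dtype to the f32 fallback is an assumption based on the description of the previous path.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the input dtype; not MLX's actual Dtype type.
enum class NormDtype { Float32, Float16, BFloat16 };

// Returns the name of a shader that can consume the input as-is, or
// std::nullopt when the caller should cast the input to float32 first.
// `shader_float16_supported` is assumed to mirror
// VulkanContext::shader_float16_supported().
std::optional<std::string> select_native_norm_shader(
    NormDtype dtype, bool shader_float16_supported) {
  switch (dtype) {
    case NormDtype::Float32:
      return "norm_f32";
    case NormDtype::Float16:
      // Gate the f16 shader on device support so that devices without
      // shader-float16 never try to build a shader using float16_t.
      if (!shader_float16_supported) {
        return std::nullopt;
      }
      return "norm_f16";
    default:
      // Assumption: all other non-f32 inputs take the f32 fallback.
      return std::nullopt;
  }
}

// Applies the f32 fallback: when no native shader is selected, the input
// would be cast to float32 and norm_f32 dispatched instead.
std::string resolve_norm_shader(NormDtype dtype,
                                bool shader_float16_supported) {
  return select_native_norm_shader(dtype, shader_float16_supported)
      .value_or("norm_f32");
}
```

With this shape, an f16 input on a device without shader-float16 support resolves to `norm_f32`, which matches the pre-regression behavior described above, while devices that do support the feature keep the native `norm_f16` path.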
d65c00a to bd2b46b
bd2b46b to 939abdd
Summary
Updates the `mlx` submodule pointer to include native Vulkan MXFP4/MXFP8 quantization support from goniz/mlx#56.

This PR intentionally contains only the submodule gitlink update:
- `mlx`: `75812951eedf0a7f82b1898e56c2281500d2e6c3` -> `2679a23ab9a46c5e8545e917735fd2468d441978`

Validation
Validation was run against the updated submodule checkout:
- `./dev.sh build`
- `DEVICE=gpu ./dev.sh test-py mlx/python/tests/test_quantized.py -k "mxfp4_quantize_dequantize or mxfp8_quantize_dequantize"`
- 699 passed, 13 skipped
- `./dev.sh generate` completed with coherent output
- `./dev.sh model-report --json-output benchmarks/model_generation_report.json` completed successfully for all default models

Benchmarks
Compared against latest matching rows in `benchmarks/results.csv` from 2026-06-04T07:22:00Z.