[codex] Update MLX submodule for native Vulkan MXFP quantization by goniz · Pull Request #56 · goniz/mlx-vulkan

goniz · 2026-06-18T11:15:18Z

Summary

Updates the mlx submodule pointer to include native Vulkan MXFP4/MXFP8 quantization support from goniz/mlx#56.

This PR intentionally contains only the submodule gitlink update:

mlx: 75812951eedf0a7f82b1898e56c2281500d2e6c3 -> 2679a23ab9a46c5e8545e917735fd2468d441978

Validation

Validation was run against the updated submodule checkout:

./dev.sh build
DEVICE=gpu ./dev.sh test-py mlx/python/tests/test_quantized.py -k "mxfp4_quantize_dequantize or mxfp8_quantize_dequantize"
Full GPU Python suite was run by the wrapper: 699 passed, 13 skipped
./dev.sh generate completed with coherent output
./dev.sh model-report --json-output benchmarks/model_generation_report.json completed successfully for all default models

Benchmarks

Compared against the latest matching rows in benchmarks/results.csv, dated 2026-06-04T07:22:00Z.

Model	Metric	CSV baseline	Current	Diff
Qwen3-0.6B-bf16	prompt_tps	2410.127	2873.965	+463.838 (+19.25%)
Qwen3-0.6B-bf16	generation_tps	65.748	65.964	+0.216 (+0.33%)
Qwen3-0.6B-bf16	peak_memory_gb	2.614	2.614	+0.000 (+0.00%)
Qwen3-0.6B-8bit	prompt_tps	1360.581	1493.403	+132.822 (+9.76%)
Qwen3-0.6B-8bit	generation_tps	87.609	88.624	+1.015 (+1.16%)
Qwen3-0.6B-8bit	peak_memory_gb	2.056	2.056	+0.000 (+0.00%)
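
For readers who want to reproduce the Diff column, the following is a minimal sketch of the arithmetic, assuming the percentage is the absolute difference divided by the CSV baseline. The names Diff and compare_to_baseline are hypothetical and are not part of the repository's benchmark tooling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: one row of the "Diff" column.
struct Diff {
  double absolute;  // current - baseline, in the metric's own unit
  double percent;   // absolute difference relative to the baseline, in percent
};

// Assumption: the percentage is computed relative to the CSV baseline value.
Diff compare_to_baseline(double baseline, double current) {
  assert(baseline != 0.0);  // a zero baseline has no meaningful relative change
  const double absolute = current - baseline;
  return Diff{absolute, 100.0 * absolute / baseline};
}
```

Applied to the Qwen3-0.6B-bf16 prompt_tps row, compare_to_baseline(2410.127, 2873.965) yields an absolute difference of about 463.838 and a relative change of about 19.25%, matching the table after rounding.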

chatgpt-codex-connector

💡 Codex Review

https://github.com/goniz/mlx-vulkan/blob/d65c00a99de9fadf1d31e1380e589862cb45f159/mlx/mlx/backend/vulkan/fast.cpp#L413
Gate f16 norm shader on device support

On devices where VulkanContext::shader_float16_supported() returns false, f16 LayerNorm still selects norm_f16 here. As a result, the fallback block below never casts x to float32, and dispatch_norm_op tries to create a shader that uses float16_t. The previous path always copied non-f32 inputs to f32 and used norm_f32, so this change regresses f16 LayerNorm on Vulkan devices without shader-float16 support. Return nullopt when the feature is unavailable so that the existing f32 fallback runs.
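
The suggested gating can be sketched as follows. This is an illustrative stand-in, not the code in fast.cpp: DType, DeviceCaps, select_native_norm_shader, and resolve_norm_shader are hypothetical names, and treating every other non-f32 dtype as a fallback case is an assumption based on the review's description of the previous path.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins for the backend's dtype and device-capability types.
enum class DType { Float16, BFloat16, Float32 };

struct DeviceCaps {
  bool shader_float16 = false;  // plays the role of shader_float16_supported()
};

// Returns the native shader name, or std::nullopt when the caller should
// cast the input to f32 and use norm_f32 instead.
std::optional<std::string> select_native_norm_shader(DType dtype,
                                                     const DeviceCaps& caps) {
  switch (dtype) {
    case DType::Float32:
      return "norm_f32";
    case DType::Float16:
      // The gate the review asks for: no f16 shader without device support.
      if (!caps.shader_float16) return std::nullopt;
      return "norm_f16";
    default:
      // Assumption: other non-f32 dtypes always take the f32 fallback.
      return std::nullopt;
  }
}

// Resolves the shader to dispatch and reports whether the input needs an
// f32 cast first (the "existing f32 fallback" described in the review).
std::string resolve_norm_shader(DType dtype, const DeviceCaps& caps,
                                bool& cast_to_f32) {
  const std::optional<std::string> native =
      select_native_norm_shader(dtype, caps);
  cast_to_f32 = !native.has_value();
  const std::string name = native.value_or("norm_f32");
  assert(!cast_to_f32 || name == "norm_f32");  // fallback always uses norm_f32
  return name;
}
```

With shader_float16 set to false, an f16 input resolves to norm_f32 with cast_to_f32 set to true; with it set to true, the same input resolves to norm_f16 and no cast is requested.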

goniz marked this pull request as ready for review June 18, 2026 11:21

chatgpt-codex-connector Bot reviewed Jun 18, 2026

goniz force-pushed the codex/update-mlx-native-mxfp-quant branch from d65c00a to bd2b46b Compare June 18, 2026 11:28

Update MLX submodule for native Vulkan MXFP quantization

939abdd

goniz force-pushed the codex/update-mlx-native-mxfp-quant branch from bd2b46b to 939abdd Compare June 18, 2026 11:34

updated mlx

56b6384

goniz merged commit 4f51590 into main Jun 18, 2026
1 check passed

goniz merged 2 commits into main from codex/update-mlx-native-mxfp-quant
