Add VLM components to _default_disabled_quantizer_cfg? #1396

@harmya

Description

Hello, I'd like to propose a small change to how multimodal models are handled during quantization. Please correct me if my understanding is incorrect.

Currently, when quantizing a multimodal model with any recipe, ModelOpt attaches observers to the vision tower and multimodal projector alongside the language model. For models like moonshotai/Kimi-K2.6, this breaks at export with `ValueError: tensor column shape must be divisible by the given group_size 32 but got 4304` (raised by `compressed_tensors/quantization/lifecycle/forward_helpers.py:138` from inside `modelopt.export_hf_checkpoint`): the vision layer's column dimension of 4304 is not a multiple of the group size 32 (4304 / 32 = 134.5), so group quantization cannot be applied to it. The error is raised only at export, after calibration has already completed, so an entire calibration run is wasted before the failure surfaces.

Every published VLM NVFP4 checkpoint I could find (e.g., nvidia/Kimi-K2.5-NVFP4, wafer-ai/Kimi-K2.6-NVFP4, RedHatAI/Kimi-K2.6-NVFP4) keeps the vision and projector layers in BF16. Intuitively this makes sense: vision encoders are tiny on large VLMs (typically <1% of parameters), so quantizing them yields negligible memory savings, while quantization noise introduced at the visual-feature stage compounds through every LLM layer downstream.

I think we can append ~4 patterns to `_default_disabled_quantizer_cfg`, the same universal-disable list that already covers `lm_head`, MoE routers, BatchNorm, and Mamba conv1d (see the sketch below). Looking at the code, the convention for that list seems to be "always wrong to quantize regardless of recipe," and VLM vision/projector components appear to meet that bar.
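
A minimal sketch of what the addition could look like. The glob patterns here are assumptions on my part, based on common Hugging Face VLM module naming; the exact set would need to be validated against the architectures ModelOpt supports:

```python
# Proposed additions to _default_disabled_quantizer_cfg (pattern names are
# illustrative; they follow common Hugging Face VLM module naming).
_default_disabled_quantizer_cfg = {
    # ... existing universal-disable entries (lm_head, MoE routers,
    # BatchNorm, Mamba conv1d) stay unchanged ...
    "*vision_tower*": {"enable": False},           # vision encoder (CLIP/SigLIP-style towers)
    "*vision_model*": {"enable": False},           # alternate encoder naming
    "*multi_modal_projector*": {"enable": False},  # vision -> LLM projector
    "*mm_projector*": {"enable": False},           # LLaVA-style projector naming
}
```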

I had to fix these issues locally to get a working Kimi-K2.6 quant serving inference via SGLang. Happy to provide details if needed and open a PR!
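
In the meantime, here is roughly the workaround I used; a sketch, not the proposed fix itself, assuming the standard `mtq.quantize` flow and the same illustrative glob patterns as above:

```python
import copy

import modelopt.torch.quantization as mtq

# Workaround sketch: disable vision/projector quantizers per-run by extending
# the recipe's quant_cfg before calibration. Pattern names are assumptions
# based on common HF VLM module naming.
cfg = copy.deepcopy(mtq.NVFP4_DEFAULT_CFG)
for pattern in ("*vision_tower*", "*vision_model*",
                "*multi_modal_projector*", "*mm_projector*"):
    cfg["quant_cfg"][pattern] = {"enable": False}

# calibrate_fn is the user's calibration forward loop.
model = mtq.quantize(model, cfg, forward_loop=calibrate_fn)
```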
