feat: Adding Support for SD.Next Quantization Engine (SDNQ) (Flux1&Flux2klein4B/9B&Z-Image) by Pfannkuchensack · Pull Request #9228 · invoke-ai/InvokeAI

Pfannkuchensack · 2026-05-24T02:43:05Z

Summary

Adds support for SDNQ (SD.Next Quantization) as a new quantization format in InvokeAI, enabling memory-efficient inference for large models on consumer GPUs.

What's included:

- New sdnq quantization backend (invokeai/backend/quantization/sdnq/) with SDNQTensor, dequant utils, and safetensors loaders (incl. multi-shard support)
- Model config + loader support for SDNQ-quantized:
  - FLUX.1 transformers (with BFL ↔ diffusers norm_out scale/shift fix)
  - FLUX.2 Klein 4B/9B transformers (incl. dynamic mixed-precision Klein pipelines)
  - Z-Image full ZImagePipeline diffusers folders (all submodels dispatched via SDNQ loader)
  - T5 and Qwen3 text encoders
- Config discriminator: SDNQ-quantized diffusers folders are now correctly identified as SDNQ instead of plain diffusers (avoids crashes when reading packed uint8 weights as bf16)
- Loader treats SDNQ ZImagePipeline / Flux2KleinPipeline folders as main_is_diffusers so submodels auto-extract (no separate VAE/Qwen3 source required)
- Frontend: new SDNQ model format badge, schema/types regeneration, readiness updates, Klein FE combobox now accepts SDNQ pipeline configs
- Starter model entries + user-facing docs at docs/src/content/docs/configuration/sdnq-quantization.mdx
- Tests: tests/backend/quantization/sdnq/ covering tensor dequant + loader behavior; custom-modules tests extended

Why: SDNQ enables running FLUX, FLUX.2, and Z-Image on lower-VRAM GPUs by loading pre-quantized weight folders directly, without runtime conversion overhead.

Related Issues / Discussions

Closes #8789

QA Instructions

1. Install an SDNQ-quantized model folder for each supported architecture and verify identification:
   - FLUX (BFL + diffusers variants)
   - FLUX.2 dev + FLUX.2 Klein (dynamic mixed-precision)
   - Z-Image full pipeline
   - T5 / Qwen3 encoders (standalone + bundled in pipelines)
2. In the Model Manager, confirm the model is tagged with the SDNQ format badge.
3. Run a generation with each model and verify:
   - Submodels auto-extract from the pipeline folder (no extra VAE/text-encoder sources needed)
   - Multi-shard diffusion_pytorch_model-*-of-*.safetensors files merge correctly (Klein 9B, FLUX.2 dev)
   - No crashes from bf16 reads on packed uint8 weights
4. Verify FLUX output quality is unchanged (regression check for the BFL norm_out scale/shift swap).
5. Run the new tests: `uv run --extra cuda pytest tests/backend/quantization/sdnq/`.

Merge Plan

Needs Testing

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
❗Changes to a redux slice have a corresponding migration — n/a
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

Add support for loading SDNQ-quantized models with on-the-fly CPU dequantization, similar to existing GGUF support.

New features:
- SDNQTensor class with __torch_dispatch__ for automatic dequantization
- Support for symmetric/asymmetric int8/uint8/fp8 quantization
- Optional SVD correction (low-rank approximation)
- Model loaders for Flux and Z-Image SDNQ models
- Automatic format detection via weight+scale key pairs

New files:
- invokeai/backend/quantization/sdnq/ (core module)
- tests/backend/quantization/sdnq/ (unit tests)

Modified files:
- taxonomy.py: Add ModelFormat.SDNQQuantized
- configs/main.py: Add Main_SDNQ_FLUX_Config, Main_SDNQ_ZImage_Config
- configs/factory.py: Register SDNQ configs
- model_loaders/flux.py: Add FluxSDNQCheckpointModel
- model_loaders/z_image.py: Add ZImageSDNQCheckpointModel
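The "weight+scale key pairs" detection mentioned above could look roughly like the sketch below. It is not the PR's code: the function name `looks_like_sdnq` and the `.scale` key suffix are assumptions made for illustration, and the real detector may check more than this.

```python
from typing import Iterable


def looks_like_sdnq(keys: Iterable[str]) -> bool:
    """Return True if any layer has both a `.weight` and a `.scale` entry.

    Assumption: quantized layers store their scale under `<prefix>.scale`.
    """
    key_set = set(keys)
    for key in key_set:
        if key.endswith(".weight"):
            prefix = key[: -len(".weight")]
            if f"{prefix}.scale" in key_set:
                return True
    return False


# Hypothetical state-dict keys, used only to exercise the helper.
plain_keys = ["blocks.0.attn.to_q.weight", "blocks.0.attn.to_q.bias"]
sdnq_keys = plain_keys + ["blocks.0.attn.to_q.scale"]

print(looks_like_sdnq(plain_keys))  # False
print(looks_like_sdnq(sdnq_keys))   # True
```

Checking key names only keeps detection cheap, since no tensor data has to be read to classify a checkpoint.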

- Add uint4 per-group quantization with packed weight unpacking
- Handle 1D flattened weights (reshape to 2D before unpacking)
- Support SDNQ diffusers format for FLUX transformer and T5
- Add SDNQ VAE loading with AutoencoderKL
- Add diagnostic logging for debugging dequantization
- Fix bit order in uint4 unpacking (lower, upper)
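To illustrate the "(lower, upper)" bit order from the last bullet, here is a minimal pure-Python sketch of unpacking two 4-bit values per byte. The helper name `unpack_uint4` is hypothetical and the PR's implementation operates on tensors, so treat this only as a description of the ordering.

```python
def unpack_uint4(packed: bytes) -> list[int]:
    """Unpack two unsigned 4-bit values from each byte, lower nibble first."""
    values: list[int] = []
    for byte in packed:
        values.append(byte & 0x0F)         # lower nibble comes first
        values.append((byte >> 4) & 0x0F)  # upper nibble comes second
    return values


# 0x21 holds (1, 2) and 0x43 holds (3, 4) under the lower-then-upper order.
print(unpack_uint4(bytes([0x21, 0x43])))  # [1, 2, 3, 4]
```

Reversing the two `append` calls would yield `[2, 1, 4, 3]`, which is the kind of silent mix-up the bit-order fix addresses.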

…tion The test was checking `(weight - zero_point) * scale`, but SDNQ (Disty0/sdnq) defines asymmetric dequantization as `zero_point + weight * scale` (via torch.addcmul), where zero_point is a post-scale bias rather than a pre-scale integer offset. The implementation already follows this convention; only the test expectation was wrong.
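The difference between the two conventions in this message can be seen with a small pure-Python sketch. The function names are hypothetical. In PyTorch, `torch.addcmul(zero_point, weight, scale)` computes the same element-wise `zero_point + weight * scale`.

```python
def dequant_sdnq_asymmetric(weight, scale, zero_point):
    """SDNQ convention per the message above: zero_point is a post-scale bias."""
    return [zero_point + w * scale for w in weight]


def dequant_pre_scale_offset(weight, scale, zero_point):
    """The convention the old test assumed: zero_point is a pre-scale offset."""
    return [(w - zero_point) * scale for w in weight]


weight = [0, 1, 2, 3]
scale = 0.5
zero_point = -1.0

print(dequant_sdnq_asymmetric(weight, scale, zero_point))   # [-1.0, -0.5, 0.0, 0.5]
print(dequant_pre_scale_offset(weight, scale, zero_point))  # [0.5, 1.0, 1.5, 2.0]
```

The two formulas agree only when `zero_point` is zero, so a test written against the wrong one fails on any asymmetric tensor.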

feat(sdnq): support sidecar LoRA application on SDNQ-quantized layers

Bring SDNQ to feature parity with GGUF in the sidecar patching path so LoRA, LoKr, DoRA, FullLayer, and FluxControl patches apply correctly to SDNQ-quantized Linear and Conv2d modules. Without this, the sidecar aggregate replaced the SDNQTensor weight with a meta tensor and patches silently produced wrong results.

- Add SDNQTensor branch in CustomModuleMixin._aggregate_patch_parameters mirroring the GGMLTensor branch.
- Extend the (GGMLTensor) dtype-cast exclusion to also cover SDNQTensor in CustomLinear, CustomConv2d, CustomInvokeLinearNF4, and CustomInvokeLinear8bitLt.
- Add `linear_with_sdnq_quantized_tensor` and `linear_sdnq_quantized` fixtures so the existing custom-module test matrix exercises SDNQ alongside GGUF, BnB-8bit, and NF4.

Add T5Encoder_SDNQ_Config for diffusers-style T5 bundles whose text_encoder_2/ folder holds SDNQ-quantized safetensors (detected via quantization_config.json's quant_method or via the SDNQ-style weight+scale key pairs). Add T5EncoderSDNQLoader, which materializes the T5EncoderModel on the meta device, loads the SDNQ state dict, and re-shares the embed_tokens/shared weight per HuggingFace's tied-weight convention.

Add Main_SDNQ_Flux2_Config covering Klein 4B/9B and their Base variants (detected via _get_flux2_variant on the dequantized SDNQTensor shapes plus the existing filename heuristic), and Flux2SDNQCheckpointModel that loads diffusers-layout SDNQ FLUX.2 checkpoints straight into Flux2Transformer2DModel. Architecture (num_layers, hidden_size, attention head count, guidance presence) is detected from state-dict shapes the same way the fp16 loader does, since SDNQTensor.shape reports the dequantized shape. BFL-layout SDNQ FLUX.2 checkpoints are not supported here — that would require an SDNQTensor-aware port of the _convert_flux2_bfl_to_diffusers fuse logic.

Add Main_SDNQ_Diffusers_ZImage_Config so a complete SDNQ ZImagePipeline folder (model_index.json + transformer/ + text_encoder/ + tokenizer/ + vae/) is recognised on install and its submodels are wired up. Extend ZImageSDNQCheckpointModel to load the transformer from the subfolder using ZImageTransformer2DModel.from_config() so non-default architecture parameters (e.g. axes_lens [1536,512,512] in newer Z-Image Turbo SDNQ exports) are honoured instead of the single-file path's hardcoded [1024,512,512]. Verified end-to-end against Tongyi-MAI/Z-Image-Turbo-SDNQ-uint4-svd-r32: 269 quantized + 252 regular tensors load into a 6.15B-param model with 0 missing / 0 unexpected keys.

T5Encoder_SDNQ_Config originally only looked for text_encoder_2/ as a subfolder of mod.path, which works for standalone T5 bundles but misses the case where a parent FluxPipeline / similar config registers its T5 submodel with path_or_prefix pointing straight at the text_encoder_2 folder. Allow both layouts in both the config's detection logic and T5EncoderSDNQLoader's te_dir resolution. Verified end-to-end with Disty0/FLUX.1-schnell-SDNQ-uint4-svd-r32.
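Accepting both layouts amounts to a small path-resolution step. The sketch below shows one way it could be written with `pathlib`. The function name `resolve_text_encoder_dir` is hypothetical and the loader's actual te_dir logic may differ.

```python
import tempfile
from pathlib import Path


def resolve_text_encoder_dir(path: Path, subfolder: str = "text_encoder_2") -> Path:
    """Accept either a bundle containing `subfolder` or the subfolder itself."""
    nested = path / subfolder
    if nested.is_dir():
        return nested  # standalone bundle layout: <path>/text_encoder_2/
    if path.is_dir() and path.name == subfolder:
        return path    # parent config pointed straight at text_encoder_2/
    raise FileNotFoundError(f"No {subfolder} folder found at or under {path}")


# Demonstrate both layouts against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "bundle"
    encoder_dir = bundle / "text_encoder_2"
    encoder_dir.mkdir(parents=True)
    print(resolve_text_encoder_dir(bundle) == encoder_dir)       # True
    print(resolve_text_encoder_dir(encoder_dir) == encoder_dir)  # True
```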

The diffusers→BFL state-dict converter renamed norm_out.linear.{weight,bias} to final_layer.adaLN_modulation.1.{weight,bias} but did not swap the two halves along dim 0. diffusers' AdaLayerNormContinuous packs the linear output as (scale, shift); BFL's LastLayer packs as (shift, scale). Without the swap, the final adaLN modulation runs with scale and shift permuted, which produces structured-but-very-noisy output for every pixel. Reuse the same pattern the FLUX.2 converter applies for the analogous adaLN_modulation key.
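The swap described above is just an exchange of the two halves along the first dimension. The sketch below shows the idea on a plain list of rows. The name `swap_scale_shift` is hypothetical, and on real tensors the same effect could be had by chunking along dim 0 and concatenating the chunks in reverse order.

```python
def swap_scale_shift(rows: list) -> list:
    """Swap the first and second halves of `rows` (dim 0 of the weight or bias)."""
    half, remainder = divmod(len(rows), 2)
    if remainder:
        raise ValueError("expected an even number of rows (scale half + shift half)")
    return rows[half:] + rows[:half]


# diffusers packs (scale, shift) and BFL expects (shift, scale).
diffusers_rows = ["scale0", "scale1", "shift0", "shift1"]
print(swap_scale_shift(diffusers_rows))  # ['shift0', 'shift1', 'scale0', 'scale1']
```

Applying the function twice returns the original order, so the same helper works in either conversion direction.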

ZImageSDNQCheckpointModel only handled the Transformer submodel, so attempts to use an SDNQ ZImagePipeline as the "Qwen3 & VAE source model" (which triggers loads for TextEncoder / Tokenizer / VAE) crashed with "Only Transformer submodels are currently supported". Add per-submodel handlers that load text_encoder/ via sdnq_sd_loader into an empty Qwen3ForCausalLM (re-sharing lm_head with embed_tokens when tied), tokenizer/ via AutoTokenizer, and vae/ via AutoencoderKL.from_pretrained. The single-file SDNQ checkpoint path keeps its transformer-only behaviour but now raises a clearer error when asked for a different submodel.

Add support for SDNQ-quantized Flux2KleinPipeline folders, which mix uint4 and int5 dtypes across layers (chosen dynamically by SDNQ during quantization to stay under a per-group loss budget).

Core changes:
- Add INT5_ASYM quantization type + unpack_uint5 + dequantize_int5_per_group. Sign-extension matches Disty0/sdnq's unpack_int convention (raw 0..31 - 16). zero_point is optional (dynamic-mixed sometimes emits scale-only int5 tensors).
- _infer_quantization_type now takes a per_tensor_dtype override; the loader builds an inverted map from quantization_config.json's modules_dtype_dict.
- _get_original_shape uses the packed weight size as the authoritative source for in_features, fixing a bug where Klein 4B's group_size=64 layers were misread as group_size=128 (the previous fallback).

Pipeline integration:
- Add Main_SDNQ_Diffusers_Flux2_Config matching Flux2Pipeline / Flux2KleinPipeline folders with quantized transformer.
- Flux2SDNQCheckpointModel now dispatches all pipeline submodels: transformer (Flux2Transformer2DModel.from_config + sdnq state dict), text_encoder (Qwen3ForCausalLM SDNQ + lm_head/embed_tokens tie), tokenizer (AutoTokenizer), vae (AutoencoderKLFlux2 / AutoencoderKL).
- Extend flux2_klein_model_loader._validate_diffusers_format and the isFlux2DiffusersMainModelConfig FE filter to also accept SDNQ pipeline configs (when submodels is populated).

Verified against Disty0/FLUX.2-klein-4B-SDNQ-4bit-dynamic: 98 uint4 + 2 int5 tensors load into a 3.88B-param Flux2Transformer2DModel with 0 missing / 0 unexpected keys; both dequant paths produce reasonable zero-centred weight distributions.
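The int5 sign-extension rule (raw 0..31 minus 16) and the optional zero_point can be sketched in pure Python as follows. The function names are hypothetical, the bit-level packing of the 5-bit values is deliberately left out, and the zero_point is applied as a post-scale bias on the assumption that int5 follows the same asymmetric convention the earlier test fix describes.

```python
from typing import Optional, Sequence


def sign_extend_int5(raw_values: Sequence[int]) -> list[int]:
    """Map raw 5-bit codes 0..31 to signed values -16..15."""
    for raw in raw_values:
        if not 0 <= raw <= 31:
            raise ValueError(f"not a 5-bit value: {raw}")
    return [raw - 16 for raw in raw_values]


def dequantize_per_group(
    values: Sequence[int],
    scales: Sequence[float],
    group_size: int,
    zero_points: Optional[Sequence[float]] = None,
) -> list[float]:
    """Scale each value by its group's scale; add the group's zero_point if given."""
    out = []
    for index, value in enumerate(values):
        group = index // group_size
        result = value * scales[group]
        if zero_points is not None:  # scale-only tensors skip this step
            result = zero_points[group] + result
        out.append(result)
    return out


signed = sign_extend_int5([0, 16, 31, 8])
print(signed)  # [-16, 0, 15, -8]
print(dequantize_per_group(signed, scales=[0.5, 0.25], group_size=2))
# [-8.0, 0.0, 3.75, -2.0]
```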

Main_Diffusers_Flux2_Config so identification routes them to the SDNQ configs instead. Without this both configs accept the folder and the plain diffusers loader wins, then crashes when reading packed uint8 weights as bf16.

diffusion_pytorch_model-{00001,00002}-of-00002.safetensors and FLUX.2 dev's sharded transformer both load. Detect cross-shard key collisions as a corruption signal.
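A shard merge with collision detection, as described above, can be sketched like this. The function name `merge_shards` is hypothetical and plain dicts stand in for the per-shard state dicts, which in practice would presumably come from a safetensors loader.

```python
from typing import Any, Mapping


def merge_shards(shards: Mapping[str, Mapping[str, Any]]) -> dict[str, Any]:
    """Merge per-shard state dicts, failing if a key appears in more than one shard."""
    merged: dict[str, Any] = {}
    owner: dict[str, str] = {}
    for shard_name, tensors in shards.items():
        for key, value in tensors.items():
            if key in merged:
                # The same key in two shards is treated as a sign of corruption.
                raise ValueError(
                    f"key {key!r} found in both {owner[key]} and {shard_name}"
                )
            merged[key] = value
            owner[key] = shard_name
    return merged


# Placeholder values stand in for tensors.
shards = {
    "diffusion_pytorch_model-00001-of-00002.safetensors": {"a.weight": 1},
    "diffusion_pytorch_model-00002-of-00002.safetensors": {"b.weight": 2},
}
print(sorted(merge_shards(shards)))  # ['a.weight', 'b.weight']
```

Raising on a duplicate key, rather than letting the later shard win, keeps a damaged download from loading with silently overwritten weights.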

"main_is_diffusers" in z_image_model_loader and flux2_klein_model_loader so the auto-extract-submodels branch handles them. Without this the loader demanded a separate VAE/Qwen3 source even though the SDNQ pipeline carries those submodels itself. - Drop the ui_model_format=Diffusers hint on Klein's qwen3_source_model field so the FE combobox can also show SDNQ pipeline configs (the FE filter already accepts them).

Loading the Klein 4B SDNQ pipeline as the main model errored with "No Qwen3 Encoder selected" in the UI even though the pipeline carries its own Qwen3 + VAE submodels, and the Model Manager showed no format badge at all on SDNQ models.

- flux2_klein_model_loader now treats SDNQ-with-submodels as main_is_diffusers, so the auto-extract-submodels branch handles SDNQ pipelines exactly like plain diffusers. Drop the ui_model_format=Diffusers hint on qwen3_source_model so the combobox can also show SDNQ pipeline configs.
- readiness.ts no longer demands a standalone VAE/Qwen3 for FLUX.2 Klein when the main model is itself a pipeline (diffusers or SDNQ-with-submodels). Without this the Invoke button stayed disabled with "Non-diffusers FLUX.2 Klein models require a standalone Qwen3 Encoder" even when the SDNQ pipeline could self-source everything.
- Register sdnq_quantized in zModelFormat, the manually-edited OpenAPI schema, ModelFormatBadge, and MODEL_FORMAT_TO_LONG_NAME so SDNQ models render an "sdnq" badge instead of an empty placeholder.

- 4 new starter models covering all SDNQ pipelines verified end-to-end in this branch: FLUX.1 schnell, Z-Image Turbo, FLUX.2 Klein 4B (dynamic mixed), FLUX.2 Klein 9B (dynamic mixed + SVD). Each entry is self-contained (no separate encoder/VAE dependencies because the SDNQ pipeline folder bundles them).
- New /configuration/sdnq-quantization/ page: support matrix, VRAM footprints, install steps (Starter Models + HF + Folder), LoRA compatibility notes, SDNQ-vs-SVDQuant/Nunchaku disambiguation, comparison with GGUF/NF4/FP8, troubleshooting.
- Cross-link from fp8-storage.mdx's "no-op on quantized" caution.

Z-Image and Qwen3 SDNQ configs were missing `variant` (and `cpu_only` on Qwen3) fields that exist on the other variants of the same union, breaking TypeScript narrowing on the FE.

- Main_SDNQ_ZImage_Config: add variant (default Turbo)
- Main_SDNQ_Diffusers_ZImage_Config: add variant, detect from scheduler_config.json shift value
- Qwen3Encoder_SDNQ_Config: add cpu_only + variant, detect from embed_tokens shape
- Qwen3Encoder_SDNQ_Folder_Config: add cpu_only + variant, detect from config.json hidden_size
- Regenerate FE schema.ts

Discriminator tags are unchanged since variant has no default.

Pfannkuchensack added 17 commits May 23, 2026 23:14

- Merge multi-shard safetensors in sdnq_sd_loader so Klein 9B's diffusion_pytorch_model-{00001,00002}-of-00002.safetensors and FLUX.2 dev's sharded transformer both load. Detect cross-shard key collisions as a corruption signal.

010a83f

Chore Fix Path

75ca01f

Pfannkuchensack requested review from JPPhoto, blessedcoolant, dunkeroni and lstein as code owners May 24, 2026 02:43

github-actions Bot added the python, invocations, backend, frontend, python-tests, and docs labels May 24, 2026

Pfannkuchensack added 3 commits May 24, 2026 19:49

Fix openapi schema.ts

5b1c658

Fix Path

acad055

Pfannkuchensack wants to merge 20 commits into invoke-ai:main from Pfannkuchensack:feature/svd-quantization
