Move built-in PTQ quantization configs to YAML#1423
shengliangxu wants to merge 22 commits into main from
Conversation
Have load_config return Pydantic-normalized values when schema_type or modelopt-schema is present, including typed recipe metadata and quantization config entries. Update recipe loading, docs, and unit tests for typed config objects and normalized quant_cfg handling. Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Convert QuantizerCfgEntry into a ModeloptBaseConfig-backed Pydantic model with validation while preserving dict-style access for callers. Normalize schema-loaded quant_cfg snippets through model_dump, simplify quantizer cfg handling, and cover both dict and QuantizeConfig need_calibration inputs.

Update normalize_quant_cfg_list to accept dict entries, typed entries, and legacy dict formats while returning QuantizerCfgEntry objects. Preserve already parsed entries, handle implicit enable values in consumers, and cover mixed typed/dict inputs in tests.

Make ModeloptBaseConfig a MutableMapping and use Mapping/MutableMapping protocol checks for typed quantizer config entries and attributes. Convert predefined quantization recipes to QuantizeConfig objects while preserving dict-style callers and compatibility paths.

Cover normalization after mutating raw dict quantizer entries and schema-backed ModeloptBaseConfig entries.
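The "MutableMapping while preserving dict-style access" idea from the commits above can be sketched in a few lines. This is a hypothetical minimal stand-in, not the actual `ModeloptBaseConfig`/Pydantic implementation, and the field names are illustrative only:

```python
from collections.abc import MutableMapping


class ConfigModel(MutableMapping):
    """Minimal sketch of a typed config object that still supports
    dict-style access (illustrative, not the real ModeloptBaseConfig)."""

    def __init__(self, **fields):
        # In the real model these would be validated Pydantic fields.
        self._fields = dict(fields)

    def __getitem__(self, key):
        return self._fields[key]

    def __setitem__(self, key, value):
        self._fields[key] = value

    def __delitem__(self, key):
        del self._fields[key]

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


# Illustrative field names only.
cfg = ConfigModel(num_bits=8, enable=True)
print(cfg["num_bits"])  # dict-style reads still work
print(dict(cfg))        # and the object converts cleanly to a plain dict
```

Because the class satisfies the `MutableMapping` protocol, existing callers that do `isinstance(x, Mapping)` checks or iterate keys keep working without knowing the config became a typed object.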
Codecov Report

Additional details and impacted files:

```
@@           Coverage Diff            @@
##             main    #1423    +/-   ##
========================================
- Coverage   76.91%   76.88%   -0.04%
========================================
  Files         478      478
  Lines       51434    51619     +185
========================================
+ Hits        39563    39687     +124
- Misses      11871    11932      +61
========================================
```
What does this PR do?
Type of change: refactor
This PR is stacked on #1405 and only describes the changes added on top of that PR.
This PR moves the built-in PTQ quantization config definitions out of hard-coded Python dictionaries and into schema-backed YAML config files.
- Reusable numerics snippets under `modelopt_recipes/configs/numerics/`.
- Model presets under `modelopt_recipes/configs/ptq/presets/model/`.
- KV-cache presets under `modelopt_recipes/configs/ptq/presets/kv/`: `kv_fp8_affine`, `kv_nvfp4`, `kv_nvfp4_affine`, and `kv_nvfp4_rotate`.
- The `modelopt.torch.quantization.config` built-in config constants now load `QuantizeConfig` objects from YAML with `load_config(..., schema_type=QuantizeConfig)`, via a `_load_quantize_config` wrapper.

Usage
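The schema-typed loading pattern named above (`load_config(..., schema_type=...)`) can be sketched roughly as follows. This is not the real `load_config` implementation; it uses a plain `dataclass` in place of the Pydantic `QuantizeConfig` schema, and all field names here are illustrative:

```python
from dataclasses import dataclass, field


@dataclass
class QuantizeConfigSketch:
    """Illustrative stand-in for the QuantizeConfig schema; real fields differ."""
    quant_cfg: dict = field(default_factory=dict)
    algorithm: str = "max"


def load_config_sketch(raw: dict, schema_type):
    """Sketch: validate/normalize a raw (YAML-loaded) dict into a typed object.

    The real helper parses the YAML file first and runs Pydantic validation;
    here the constructor call stands in for both steps.
    """
    return schema_type(**raw)


# A raw dict standing in for parsed YAML content.
raw = {"quant_cfg": {"*weight_quantizer": {"num_bits": 8}}, "algorithm": "max"}
cfg = load_config_sketch(raw, QuantizeConfigSketch)
print(cfg.algorithm)                                    # max
print(cfg.quant_cfg["*weight_quantizer"]["num_bits"])   # 8
```

The point of routing every built-in constant through one such wrapper is that malformed YAML fails loudly at load time instead of surfacing later as a bad dict entry.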
Existing Python imports continue to work: the built-in constants are still schema-backed `QuantizeConfig` objects with mapping-style access, but their definitions now come from YAML snippets and presets. Reusable YAML snippets can also be composed through `$import`.

Testing
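To illustrate the `$import` composition mentioned above, a preset might pull in a shared numerics snippet roughly like this. The file path, snippet name, and keys are hypothetical, not taken from this PR:

```yaml
# Hypothetical preset file; paths and keys are illustrative only.
quant_cfg:
  "*weight_quantizer":
    $import: numerics/fp8_default.yaml   # reuse a shared numerics snippet
  "*input_quantizer":
    num_bits: 8                          # inline values alongside the import
```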
Local checks run:

- `$import` references
- `git diff --check`

Not run locally:

- `python -m pytest ...`, because the local environment used for this branch did not have `pytest` installed.

Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: N/A

Additional Information
This PR depends on #1405 because it relies on the schema-backed config loading and typed `QuantizeConfig` parsing introduced there. This PR intentionally excludes the schema/mapping changes from #1405 and focuses on converting built-in PTQ config definitions to YAML-backed presets and reusable snippets.