Guard _convert_peft_config_moe against regex-string target_modules#3232
Guard _convert_peft_config_moe against regex-string target_modules#32321fanwang wants to merge 1 commit into
Conversation
`_convert_peft_config_moe` called `set(peft_config.target_modules or [])`
unconditionally. When a user passes `target_modules` as a regex string
(e.g. ms-swift's `get_multimodal_target_regex` emits something like
`^(thinker\.model(?=\.).*\.(q_proj|k_proj|...))$` for multimodal MoE
models such as Qwen3-Omni), the `set()` iterates the string character
by character. That produces a nonsense set of regex metacharacters and
letters and fails downstream with:
ValueError: Target modules {'i', 'e', 'p', 'q', 't', '(', 'o', '|',
'.', ')', 'h', '_', 'j', 'r', '^', '\\', '$', 'd', 'k', '?', 'v',
'n', 'm', 'l', '=', '*'} not found in the base model.
The MoE conversion loop further down assumes module-name granularity,
so a regex pattern can't be remapped automatically. Detect the regex
case explicitly, emit a UserWarning that points users at the v5
module names, and return without touching the config.
Fixes huggingface#3229.
|
@1fanwang Thanks for providing a PR to fix the problem. In the future, please ask first if you can work on the issue (or instruct the agent to do that) because otherwise, this may create duplicate work. We may close PRs without inspection if they're already being worked on by someone else. As for your solution: I wonder if we can use |
|
Apologies on the duplicate-work front — happy to wait for a green light before drafting next time. On # convert_peft_config_for_transformers, transformers_weight_conversion.py L444
_convert_peft_config_moe(peft_config, model_type, model=model)then inside Two things made me back off:
If you'd prefer the resolve-via-model path I'm happy to switch — option (2) above (regex stays, only parameters/patterns updated) feels safer than bailing, but it's a bigger diff than this PR. Want me to take that direction in this PR or stack it as a follow-up after the bail lands? |
Yes, passing the model directly (instead of
There is only one call site, you can consider this to be "private".
I don't get what you mean by that. Can you give an example? |
|
One more thing, if we allow strings here, we also need to take care of |
Closes #3229.
_convert_peft_config_moecallsset(peft_config.target_modules or [])unconditionally. Whentarget_modulesis a regex string (e.g. ms-swift'sget_multimodal_target_regexproduces^(thinker\.model(?=\.).*\.(q_proj|k_proj|v_proj|...))$for Qwen3-Omni),set(str)splits the regex into individual characters, then PEFT's downstream module-matching fails with:The MoE conversion loop assumes module-name granularity — there's no sensible way to rewrite a regex into per-module mappings. The fix emits a
UserWarningpointing at the v5 module names whentarget_modulesis astr, and returns early. Callers using regextarget_modulesget a clear signal instead of a confusing downstream error.Tests
Two parametrized tests in
TestTransformersV5:^.*q_proj$, an empty regex) — all warn + skip the conversiontarget_modulescontinue to flow through normallyBoth fail on
main(the regex tests hit the bug path without warning); both pass with the fix.