fix(save_and_load): save base_layer.bias for bias="lora_only" #3307
Open
Anai-Guo wants to merge 1 commit into
When a tuner targets a layer, the original module is wrapped and its bias lives at <module>.base_layer.bias. The previous key reconstruction produced <module>.bias, so get_peft_model_state_dict and save_pretrained silently dropped the trained bias for bias="lora_only" (and bias="boft_only"), breaking adapter round-trips. Check the base_layer.bias name as well, keeping the legacy <module>.bias name for backward compatibility. Fixes huggingface#3306
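The key mismatch described above can be sketched with plain dictionaries standing in for a model state dict. This is an illustration only, not PEFT's actual code: the helper name `select_lora_only_keys`, the `check_base_layer` flag, and the placeholder string values are all hypothetical, and the key layout is assumed from the `base_model.model.proj.base_layer.bias` example in this PR.

```python
def select_lora_only_keys(state_dict, check_base_layer=True):
    """Hypothetical helper: keep LoRA weights plus the bias of each targeted layer."""
    selected = {}
    for key, value in state_dict.items():
        if "lora_" not in key:
            continue
        selected[key] = value
        # Everything before "lora_" is the targeted module's prefix,
        # e.g. "base_model.model.proj."
        prefix = key.split("lora_")[0]
        candidates = [prefix + "bias"]  # legacy <module>.bias name
        if check_base_layer:
            # Name used once the original module is wrapped by a tuner layer.
            candidates.append(prefix + "base_layer.bias")
        for name in candidates:
            if name in state_dict:
                selected[name] = state_dict[name]
    return selected


# Assumed key layout for a single wrapped layer named "proj" (values are placeholders).
state_dict = {
    "base_model.model.proj.base_layer.weight": "W",
    "base_model.model.proj.base_layer.bias": "b",
    "base_model.model.proj.lora_A.weight": "A",
    "base_model.model.proj.lora_B.weight": "B",
}

old = select_lora_only_keys(state_dict, check_base_layer=False)
new = select_lora_only_keys(state_dict, check_base_layer=True)

print(sorted(old))  # only the two lora_ keys; the bias is silently dropped
print(sorted(new))  # the two lora_ keys plus ...proj.base_layer.bias
```

With only the legacy name checked, the lookup finds nothing and raises no error, which matches the silent drop reported in the issue.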
What
LoraConfig(bias="lora_only") correctly marks a targeted layer's bias as trainable, but get_peft_model_state_dict() / save_pretrained() silently drop it, so a reloaded adapter does not reproduce the trained outputs.
Fixes #3306
Why
For a targeted layer the original module is wrapped, and its bias lives at <module>.base_layer.bias. The bias-key reconstruction produced <module>.bias, but the real trained key (here base_model.model.proj.base_layer.bias) ends in base_layer.bias, so the lookup never matched and the bias was excluded from the saved state dict. (bias="all" works because it keys off the substring "bias" directly.)
Fix
Look up <prefix>base_layer.bias (current tuner-layer structure) in addition to the legacy <prefix>bias. The same one-line pattern was applied to the identical bias="boft_only" branch for BOFT.
Test
Added TestLoraInitialization::test_lora_only_bias_is_saved_and_reloaded, which perturbs all trainable params (incl. the base_layer bias), asserts the bias key is present in both get_peft_model_state_dict() output and the saved adapter_model.safetensors, and checks that reloading reproduces the pre-save output.
Reproducer from the issue now round-trips with max diff 0.0 (was 1.21).
🤖 Generated with Claude Code
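For readers who want the shape of the round-trip check without installing PEFT, here is a toy sketch using only the standard library. It is not the test added in this PR: the scalar "layer", the JSON file standing in for adapter_model.safetensors, and every name below (`forward`, `save_adapter`, `load_adapter`, `round_trip_diff`) are assumptions made for illustration.

```python
import json
import os
import tempfile

BIAS_KEY = "proj.base_layer.bias"


def forward(params, x):
    # Toy scalar layer: (W + B * A) * x + bias
    w = params["proj.base_layer.weight"] + params["proj.lora_B"] * params["proj.lora_A"]
    return w * x + params[BIAS_KEY]


def save_adapter(params, path, include_bias):
    # Save the LoRA entries, and optionally the trained base-layer bias.
    adapter = {k: v for k, v in params.items() if "lora_" in k}
    if include_bias:
        adapter[BIAS_KEY] = params[BIAS_KEY]
    with open(path, "w") as f:
        json.dump(adapter, f)


def load_adapter(base_params, path):
    # Overlay the saved adapter entries on a fresh copy of the base parameters.
    with open(path) as f:
        adapter = json.load(f)
    return {**base_params, **adapter}


base = {"proj.base_layer.weight": 2.0, BIAS_KEY: 0.0, "proj.lora_A": 0.0, "proj.lora_B": 0.0}
# "Perturbed" trainable parameters, including the bias.
trained = {**base, "proj.lora_A": 0.5, "proj.lora_B": 0.25, BIAS_KEY: 1.5}


def round_trip_diff(include_bias):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "adapter.json")
        save_adapter(trained, path, include_bias)
        reloaded = load_adapter(base, path)
    # Largest output difference between the trained and reloaded parameters.
    return max(abs(forward(trained, x) - forward(reloaded, x)) for x in (-1.0, 0.0, 2.0))


print(round_trip_diff(include_bias=False))  # 1.5: the dropped bias shifts every output
print(round_trip_diff(include_bias=True))   # 0.0: the round trip is exact
```

The numbers here come from the toy parameters and are unrelated to the 1.21 figure from the issue's reproducer.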