Skip to content

Add UniLoRA tuner to PEFT#2968

Closed
KaiyangLi1992 wants to merge 4 commits into
huggingface:mainfrom
KaiyangLi1992:unilora-submit
Closed

Add UniLoRA tuner to PEFT#2968
KaiyangLi1992 wants to merge 4 commits into
huggingface:mainfrom
KaiyangLi1992:unilora-submit

Conversation

@KaiyangLi1992

Copy link
Copy Markdown

Motivation

This PR adds UniLoRA, a LoRA-style parameter-efficient fine-tuning method
that introduces a unified parameterization for low-rank adaptations, enabling
further reductions in the number of trainable parameters while preserving
the standard PEFT workflow.

What's included

  • UniLoRA tuner implementation
  • Configuration class and registry integration
  • Save/load support
  • Unit tests

Checklist

  • Compatible with existing PEFT APIs
  • Backward-compatible
  • Tests added

@githubnemo githubnemo left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @KaiyangLi1992, thanks for the PR :)

  • Most general: let's rename UniLoRA* to UniLora which makes it easier to remember how the method is typed in code (to be consistent with LoraModel and friends).
  • I noticed that the copyright notice says 2024, let's use the correct starting year in all newly introduced files: 2025.
  • Before pushing changes it is always good to run make style to correct any style issues automatically, otherwise the CI will not be happy

It is good to see that you've already added some tests. Let's extend those by adding UniLoRA to the TEST_CASES list in tests/test_custom_models.py - this will already give quite a bit of coverage. You can check the results by running:

pytest tests/test_custom_models.py

If those tests run we can extend the tests to test_decoder_models.py and test_encoder_decoder_models.py in a similar fashion.

For this to be mergable we also need documentation in docs/source/package_reference/unilora.md (added to the _toctree.md).

Comment thread src/peft/tuners/unilora/__init__.py Outdated
@@ -0,0 +1,23 @@
# Copyright 2024-present the HuggingFace Inc. team.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Copyright 2024-present the HuggingFace Inc. team.
# Copyright 2025-present the HuggingFace Inc. team.

Comment thread src/peft/tuners/unilora/config.py Outdated
Comment on lines +119 to +123
"help": (
"Names or patterns of modules to apply UniLoRA to. Accepts a list of "
"module name suffixes, a regex string, or the special value "
"'all-linear' to match all Linear/Conv1D layers except the output layer."
)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just copy the documentation from the docstring above for the help string of the config values.

Comment thread src/peft/tuners/unilora/layer.py
Comment thread src/peft/tuners/unilora/layer.py Outdated
Comment thread src/peft/tuners/unilora/layer.py Outdated
Comment thread src/peft/tuners/unilora/model.py Outdated
Comment on lines +171 to +196
def get_nb_savable_parameters(self, adapter="default") -> tuple[int, int]:
"""
Returns the number of savable Uni-LoRA parameters and other savable parameters.
"""
theta_d_params = 0
other_params = 0
for name, param in self.named_parameters():
if "unilora_theta_d" in name:
theta_d_params += param.numel()
elif "unilora_indices" in name:
other_params += param.numel()
elif "unilora_scales" in name:
other_params += param.numel()

unilora_params = theta_d_params
return unilora_params, other_params

def print_savable_parameters(self) -> None:
"""
Prints the number of savable Uni-LoRA parameters and total savable parameters.
"""
unilora_params, other_params = self.get_nb_savable_parameters()
print(
f"Uni-LoRA params to-be-saved (float32-equivalent): {unilora_params:,d} "
f"|| total params to-be-saved: {(unilora_params + other_params):,d}"
) No newline at end of file

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these two functions ( get_nb_savable_parameters, print_savable_parameters) have a particular use or are they for debugging? If it is the latter, let's remove them.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still open (get_nb... is removed but print_saveable_parameters still uses it).

Comment thread src/peft/tuners/unilora/model.py Outdated
Comment thread src/peft/tuners/unilora/model.py Outdated
Comment thread src/peft/tuners/unilora/model.py Outdated
Comment on lines +69 to +70
for module, (scale_a, scale_b) in zip(uni_modules, zip(*[iter(norm_factors)] * 2)):
module.update_norm(adapter_name, scale_a, scale_b)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this mean that scale_a == scale_b? Let's simplify the zip() statement then and only pass scale for both scale params. If the user wants to experiment with this setting they can set the scale attribute on the layers manually.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! For this part specifically, scale_a and scale_b are not necessarily equal — the current design allows them to differ. So we can’t simplify it to a single scale here.
All other suggestions are accepted, thanks!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand. How can scale_a and scale_b differ? They're duplicated from the [iter(norm_factors)]*2 statement.

Comment thread src/peft/tuners/unilora/model.py Outdated
@review-notebook-app

Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@KaiyangLi1992

Copy link
Copy Markdown
Author

Thanks for the helpful suggestion! I have updated the relevant code accordingly.

I ran:
pytest tests/test_custom_models.py -k UniLora
pytest tests/test_decoder_models.py -k UniLora
pytest tests/test_encoder_decoder_models.py -k UniLora

All UniLoRA-related tests passed successfully.

Please let me know if further adjustments are needed!

@shihaoji

shihaoji commented Feb 9, 2026

Copy link
Copy Markdown

Can some reviewers review this merge request? It has been in pending status for 3 weeks.

@githubnemo githubnemo left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @KaiyangLi1992,

thanks for nicely integrating the requested changes. I've flagged some comments where it seemed that they've passed under the radar and added some new ones.

Please make sure that the branch is up-to-date with main and to run make style to run the code formatter.

Once all changes are integrated it would make sense to add an experiment for Uni-LoRA in the MetaMathQA test suite, maybe using method_comparison/MetaMathQA/experiments/lora/* as a starting point. Let's also add a quick example on how to use UniLoRA, e.g. using the examples/miss_finetuning as an example.

@@ -0,0 +1,153 @@
# Copyright 2024-present the HuggingFace Inc. team.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Copyright 2024-present the HuggingFace Inc. team.
# Copyright 2025-present the HuggingFace Inc. team.

@@ -0,0 +1,293 @@
# Copyright 2024-present the HuggingFace Inc. team.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Copyright 2024-present the HuggingFace Inc. team.
# Copyright 2025-present the HuggingFace Inc. team.

# List all names of layers that may contain adapter weights
# unilora_theta_d is a shared parameter.
#But it is referenced within individual layers.
adapter_layer_names = ("unilora_theta_d",)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still open as far as I can see

Comment on lines +50 to +59
base_layer = self.get_base_layer()
if isinstance(base_layer, nn.Linear):
in_features, out_features = base_layer.in_features, base_layer.out_features
elif isinstance(base_layer, Conv1D):
in_features, out_features = (
base_layer.weight.ds_shape if hasattr(base_layer.weight, "ds_shape") else base_layer.weight.shape
)

self.in_features = in_features
self.out_features = out_features

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still open.

Comment on lines +124 to +125
"""
Updates the scaling factors.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring currently has no explanatory value. Let's remove or improve it.

Comment on lines +80 to +91
import numpy as np
total_length = lora_param_count
num_unique = theta_d_length
base_count = total_length // num_unique
remaining = total_length % num_unique
rng = np.random.default_rng(proj_seed)
data = np.repeat(np.arange(num_unique), base_count)
if remaining > 0:
extras = rng.choice(num_unique, size=remaining, replace=False)
data = np.concatenate([data, extras])
rng.shuffle(data)
return torch.tensor(data)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, isn't generate_index basically np.random.choice(np.arange(theta_d_length), size=lora_param_count)?

I can see that results might differ for low lora_param_count values (<10k) but asymptotically it should be equal?

If we decide not to use choice(), let's document why we sample the values this way.

Comment thread src/peft/tuners/unilora/model.py Outdated
Comment on lines +69 to +70
for module, (scale_a, scale_b) in zip(uni_modules, zip(*[iter(norm_factors)] * 2)):
module.update_norm(adapter_name, scale_a, scale_b)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand. How can scale_a and scale_b differ? They're duplicated from the [iter(norm_factors)]*2 statement.

Comment on lines +139 to +174
def _ensure_device(self, adapter):
"""
Ensure all UniLoRA buffers/params for the given adapter are on the same device as base_layer.
This is lazy-migration (only happens if device mismatch is detected).
"""
# get target device from base_layer
device = next(self.base_layer.parameters()).device

# ---- indices ----
if adapter in self.unilora_indices_A:
t = self.unilora_indices_A[adapter]
if t.device != device:
self.unilora_indices_A[adapter] = t.to(device)

if adapter in self.unilora_indices_B:
t = self.unilora_indices_B[adapter]
if t.device != device:
self.unilora_indices_B[adapter] = t.to(device)

# ---- scales ----
if adapter in self.unilora_scales_A:
t = self.unilora_scales_A[adapter]
if t.device != device:
self.unilora_scales_A[adapter] = t.to(device)

if adapter in self.unilora_scales_B:
t = self.unilora_scales_B[adapter]
if t.device != device:
self.unilora_scales_B[adapter] = t.to(device)

# ---- theta_d ---- (ParameterDict, but ensure consistency)
if adapter in self.unilora_theta_d:
p = self.unilora_theta_d[adapter]
if p.device != device:
# Parameter migration: need .data to avoid creating new graph edges
self.unilora_theta_d[adapter].data = p.data.to(device)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think _ensure_device can be replaced with self._move_adapter_to_device_of_base_layer(adapter) if all params are registered in adapter_layer_names or other_param_names in the layer class.

def _init_unilora_theta_d(self, config: UniLoraConfig, adapter_name: str) -> None:
unilora_theta_d = torch.zeros(config.theta_d_length)
torch.nn.init.uniform_(unilora_theta_d, -config.init_theta_d_bound, config.init_theta_d_bound)
self.unilora_theta_d[adapter_name] = unilora_theta_d

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_init_unilora_theta_d should respect the config.init_weights flag and leave the weights random if init_weights is False.

Comment on lines +187 to +190
elif config.peft_type == PeftType.UNILORA:
to_return = {}
to_return["base_model.unilora_theta_d." + adapter_name] = state_dict["base_model.unilora_theta_d." + adapter_name]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's give the user the option to save the indices alongside the theta weights. This will allow the user to use a saved adapter even though the index sampling has changed for some reason.

VeRA (a few lines below this) can be used as inspiration.

@github-actions

Copy link
Copy Markdown

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@github-actions github-actions Bot closed this Mar 23, 2026
@githubnemo githubnemo reopened this Mar 24, 2026
@githubnemo

Copy link
Copy Markdown
Collaborator

gentle ping @KaiyangLi1992

@KaiyangLi1992

Copy link
Copy Markdown
Author

gentle ping @KaiyangLi1992
Apologies for the delay as I have been preparing a new submission for NeurIPS. I will fix my pull request immediately after the deadline. With the assistance of AI agents, I am confident that these issues will be resolved very quickly. Thank you for reopening this PR!

@github-actions

Copy link
Copy Markdown

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@github-actions github-actions Bot closed this Apr 27, 2026
@KaiyangLi1992

Copy link
Copy Markdown
Author

Hi @githubnemo , apologies again for the delay. I now have time to continue this PR.

I have updated the branch against the latest main, addressed the remaining review comments, and ran make style / make quality plus the UniLora-related tests. I also added the requested documentation/example/benchmark pieces.

Could you please reopen the PR when you have a chance? If you prefer a fresh PR instead, I can open a new one and reference this PR for review history. Thank you!

@githubnemo

Copy link
Copy Markdown
Collaborator

Hi @KaiyangLi1992, unfortunately I can't reopen this PR, github doesn't let me - it seems you force-pushed or re-created the branch when the PR was already closed. The usual workaround for this is to reset your local branch to the commit that was last used in the PR (879e58d), push and then I can re-open the PR (as described here). This would let us keep the PR's history which would make reviewing easier.

Alternatively you can open a new PR with an updated description and link it here.

@KaiyangLi1992

Copy link
Copy Markdown
Author

Hi @KaiyangLi1992, unfortunately I can't reopen this PR, github doesn't let me - it seems you force-pushed or re-created the branch when the PR was already closed. The usual workaround for this is to reset your local branch to the commit that was last used in the PR (879e58d), push and then I can re-open the PR (as described here). This would let us keep the PR's history which would make reviewing easier.

Alternatively you can open a new PR with an updated description and link it here.

Hi @githubnemo, thanks for the suggestion.

I opened a fresh PR here: #3257 .

The new PR is rebased on the latest main and includes the remaining review follow-ups from this PR, including the docs, tests, example, MetaMathQA config, deterministic index generation documentation, init_weights, and optional index saving.

Thanks again for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants