Add UniLoRA tuner to PEFT by KaiyangLi1992 · Pull Request #2968 · huggingface/peft

KaiyangLi1992 · 2025-12-23T16:14:29Z

Motivation

This PR adds UniLoRA, a LoRA-style parameter-efficient fine-tuning method
that introduces a unified parameterization for low-rank adaptations, enabling
further reductions in the number of trainable parameters while preserving
the standard PEFT workflow.

What's included

UniLoRA tuner implementation
Configuration class and registry integration
Save/load support
Unit tests

Checklist

Compatible with existing PEFT APIs
Backward-compatible
Tests added

githubnemo

Hey @KaiyangLi1992, thanks for the PR :)

Most generally: let's rename UniLoRA* to UniLora, which makes it easier to remember how the method is typed in code (consistent with LoraModel and friends).
I noticed that the copyright notice says 2024; let's use the correct starting year in all newly introduced files: 2025.
Before pushing changes, it is always good to run make style to correct any style issues automatically; otherwise the CI will not be happy.

It is good to see that you've already added some tests. Let's extend those by adding UniLoRA to the TEST_CASES list in tests/test_custom_models.py - this will already give quite a bit of coverage. You can check the results by running:

pytest tests/test_custom_models.py

If those tests run we can extend the tests to test_decoder_models.py and test_encoder_decoder_models.py in a similar fashion.

For this to be mergeable we also need documentation in docs/source/package_reference/unilora.md (added to the _toctree.md).

githubnemo · 2026-01-14T09:30:14Z

@@ -0,0 +1,23 @@
+# Copyright 2024-present the HuggingFace Inc. team.


Suggested change

# Copyright 2024-present the HuggingFace Inc. team.

# Copyright 2025-present the HuggingFace Inc. team.

githubnemo · 2026-01-14T10:03:37Z

+            "help": (
+                "Names or patterns of modules to apply UniLoRA to. Accepts a list of "
+                "module name suffixes, a regex string, or the special value "
+                "'all-linear' to match all Linear/Conv1D layers except the output layer."
+            )


You can just copy the documentation from the docstring above for the help string of the config values.
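
One way to read this suggestion, sketched with a toy dataclass: the sentence that documents a field in the class docstring is reused as that field's help metadata. The names `ToyUniLoraConfig`, `TARGET_MODULES_DOC` and `help_for` are hypothetical and are not taken from the PR; the help text itself is the one quoted in the diff above.

```python
from dataclasses import dataclass, field, fields
from typing import Optional, Union

# Hypothetical shared text: the sentence that documents `target_modules`.
TARGET_MODULES_DOC = (
    "Names or patterns of modules to apply UniLoRA to. Accepts a list of "
    "module name suffixes, a regex string, or the special value "
    "'all-linear' to match all Linear/Conv1D layers except the output layer."
)


@dataclass
class ToyUniLoraConfig:
    """Toy stand-in for the real config class (illustration only)."""

    target_modules: Optional[Union[list[str], str]] = field(
        default=None,
        metadata={"help": TARGET_MODULES_DOC},
    )


def help_for(config_cls, name: str) -> str:
    """Return the help string attached to a dataclass field."""
    for f in fields(config_cls):
        if f.name == name:
            return f.metadata["help"]
    raise KeyError(name)


print(help_for(ToyUniLoraConfig, "target_modules"))
```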

githubnemo · 2026-01-14T13:11:12Z

+    def get_nb_savable_parameters(self, adapter="default") -> tuple[int, int]:
+        """
+        Returns the number of savable Uni-LoRA parameters and other savable parameters.
+        """
+        theta_d_params = 0
+        other_params = 0
+        for name, param in self.named_parameters():
+            if "unilora_theta_d" in name:
+                theta_d_params += param.numel()
+            elif "unilora_indices" in name:
+                other_params += param.numel()
+            elif "unilora_scales" in name:
+                other_params += param.numel()
+
+        unilora_params = theta_d_params 
+        return unilora_params, other_params
+
+    def print_savable_parameters(self) -> None:
+        """
+        Prints the number of savable Uni-LoRA parameters and total savable parameters.
+        """
+        unilora_params, other_params = self.get_nb_savable_parameters()
+        print(
+            f"Uni-LoRA params to-be-saved (float32-equivalent): {unilora_params:,d} "
+            f"|| total params to-be-saved: {(unilora_params + other_params):,d}"
+        )


Do these two functions (get_nb_savable_parameters, print_savable_parameters) have a particular use or are they for debugging? If it is the latter, let's remove them.

This is still open (get_nb_savable_parameters is removed but print_savable_parameters still uses it).

githubnemo · 2026-01-14T15:55:11Z

+        for module, (scale_a, scale_b) in zip(uni_modules, zip(*[iter(norm_factors)] * 2)):
+            module.update_norm(adapter_name, scale_a, scale_b)


Doesn't this mean that scale_a == scale_b? Let's simplify the zip() statement then and only pass scale for both scale params. If the user wants to experiment with this setting they can set the scale attribute on the layers manually.

Thanks for the suggestion! For this part specifically, scale_a and scale_b are not necessarily equal — the current design allows them to differ. So we can’t simplify it to a single scale here.
All other suggestions are accepted, thanks!

Sorry, I don't understand. How can scale_a and scale_b differ? They're duplicated from the [iter(norm_factors)]*2 statement.
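
For reference, a small plain-Python sketch of what the `zip(*[iter(seq)] * 2)` idiom evaluates to. `[iter(seq)] * 2` is a list that holds the same iterator object twice, so each tuple that `zip` builds draws two successive items from it. The values in `norm_factors` below are made up for illustration; what the PR's list actually contains is not shown in this thread.

```python
# Made-up values standing in for the PR's `norm_factors` list.
norm_factors = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

# Both list entries are the *same* iterator object ...
iterators = [iter(norm_factors)] * 2
assert iterators[0] is iterators[1]

# ... so zip() pulls two successive items for each tuple it builds.
pairs = list(zip(*iterators))
print(pairs)  # [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]

# A more explicit spelling of the same pairing, using slices.
explicit_pairs = list(zip(norm_factors[0::2], norm_factors[1::2]))
```

Under this reading, scale_a and scale_b are neighbouring entries of the list rather than two copies of one entry; whether they differ in value depends on how the list was filled.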

KaiyangLi1992 · 2026-01-21T18:31:27Z

Thanks for the helpful suggestion! I have updated the relevant code accordingly.

I ran:
pytest tests/test_custom_models.py -k UniLora
pytest tests/test_decoder_models.py -k UniLora
pytest tests/test_encoder_decoder_models.py -k UniLora

All UniLoRA-related tests passed successfully.

Please let me know if further adjustments are needed!

shihaoji · 2026-02-09T00:55:20Z

Can some reviewers review this merge request? It has been in pending status for 3 weeks.

githubnemo

Hey @KaiyangLi1992,

thanks for nicely integrating the requested changes. I've flagged some comments where it seemed that they've passed under the radar and added some new ones.

Please make sure the branch is up to date with main, and run make style to apply the code formatter.

Once all changes are integrated it would make sense to add an experiment for Uni-LoRA in the MetaMathQA test suite, maybe using method_comparison/MetaMathQA/experiments/lora/* as a starting point. Let's also add a quick example of how to use UniLoRA, e.g. modeled on examples/miss_finetuning.

githubnemo · 2026-02-17T16:21:12Z

@@ -0,0 +1,153 @@
+# Copyright 2024-present the HuggingFace Inc. team.


Suggested change

# Copyright 2024-present the HuggingFace Inc. team.

# Copyright 2025-present the HuggingFace Inc. team.

githubnemo · 2026-02-17T16:21:26Z

@@ -0,0 +1,293 @@
+# Copyright 2024-present the HuggingFace Inc. team.


Suggested change

# Copyright 2024-present the HuggingFace Inc. team.

# Copyright 2025-present the HuggingFace Inc. team.

githubnemo · 2026-02-17T16:22:01Z

+    # List all names of layers that may contain adapter weights
+    # unilora_theta_d is a shared parameter.
+    #But it is referenced within individual layers.
+    adapter_layer_names = ("unilora_theta_d",)


This is still open as far as I can see

githubnemo · 2026-02-17T16:23:17Z

+        base_layer = self.get_base_layer()
+        if isinstance(base_layer, nn.Linear):
+            in_features, out_features = base_layer.in_features, base_layer.out_features
+        elif isinstance(base_layer, Conv1D):
+            in_features, out_features = (
+                base_layer.weight.ds_shape if hasattr(base_layer.weight, "ds_shape") else base_layer.weight.shape
+            )
+
+        self.in_features = in_features
+        self.out_features = out_features


This is still open.

githubnemo · 2026-02-17T16:27:03Z

+        """
+        Updates the scaling factors. 


The docstring currently has no explanatory value. Let's remove or improve it.

githubnemo · 2026-02-17T17:24:56Z

+        import numpy as np
+        total_length = lora_param_count
+        num_unique = theta_d_length
+        base_count = total_length // num_unique
+        remaining = total_length % num_unique
+        rng = np.random.default_rng(proj_seed)
+        data = np.repeat(np.arange(num_unique), base_count)
+        if remaining > 0:
+            extras = rng.choice(num_unique, size=remaining, replace=False)
+            data = np.concatenate([data, extras])
+        rng.shuffle(data)
+        return torch.tensor(data)


I wonder, isn't generate_index basically np.random.choice(np.arange(theta_d_length), size=lora_param_count)?

I can see that results might differ for low lora_param_count values (<10k) but asymptotically it should be equal?

If we decide not to use choice(), let's document why we sample the values this way.
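
To make the comparison concrete, here is a NumPy-only sketch that restates the sampling from the quoted diff (without the final `torch.tensor` conversion) next to independent draws with replacement. It uses `Generator.choice` instead of the legacy `np.random.choice` mentioned above. The function names and the sizes (1003 entries over 10 unique indices) are made up for illustration.

```python
import numpy as np


def balanced_indices(total_length: int, num_unique: int, seed: int) -> np.ndarray:
    """NumPy-only restatement of the sampling in the quoted diff (no torch)."""
    base_count, remaining = divmod(total_length, num_unique)
    rng = np.random.default_rng(seed)
    # Every index appears base_count times ...
    data = np.repeat(np.arange(num_unique), base_count)
    if remaining > 0:
        # ... and `remaining` distinct indices appear one extra time.
        extras = rng.choice(num_unique, size=remaining, replace=False)
        data = np.concatenate([data, extras])
    rng.shuffle(data)
    return data


def iid_indices(total_length: int, num_unique: int, seed: int) -> np.ndarray:
    """Independent draws with replacement, as in the suggested alternative."""
    rng = np.random.default_rng(seed)
    return rng.choice(num_unique, size=total_length)


balanced_counts = np.bincount(balanced_indices(1003, 10, seed=0), minlength=10)
iid_counts = np.bincount(iid_indices(1003, 10, seed=0), minlength=10)

# Balanced: bucket sizes differ by at most one, whatever the seed.
print(balanced_counts.max() - balanced_counts.min())
# Independent draws: bucket sizes fluctuate around the mean and depend on the seed.
print(iid_counts.max() - iid_counts.min())
```

The observable difference is in the bucket sizes: the quoted code guarantees that every theta_d entry is referenced almost equally often, while independent draws only give that on average.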

githubnemo · 2026-02-17T17:38:04Z

+        for module, (scale_a, scale_b) in zip(uni_modules, zip(*[iter(norm_factors)] * 2)):
+            module.update_norm(adapter_name, scale_a, scale_b)


Sorry, I don't understand. How can scale_a and scale_b differ? They're duplicated from the [iter(norm_factors)]*2 statement.

githubnemo · 2026-02-17T17:47:52Z

+    def _ensure_device(self, adapter):
+        """
+        Ensure all UniLoRA buffers/params for the given adapter are on the same device as base_layer.
+        This is lazy-migration (only happens if device mismatch is detected).
+        """
+        # get target device from base_layer
+        device = next(self.base_layer.parameters()).device
+
+        # ---- indices ----
+        if adapter in self.unilora_indices_A:
+            t = self.unilora_indices_A[adapter]
+            if t.device != device:
+                self.unilora_indices_A[adapter] = t.to(device)
+
+        if adapter in self.unilora_indices_B:
+            t = self.unilora_indices_B[adapter]
+            if t.device != device:
+                self.unilora_indices_B[adapter] = t.to(device)
+
+        # ---- scales ----
+        if adapter in self.unilora_scales_A:
+            t = self.unilora_scales_A[adapter]
+            if t.device != device:
+                self.unilora_scales_A[adapter] = t.to(device)
+
+        if adapter in self.unilora_scales_B:
+            t = self.unilora_scales_B[adapter]
+            if t.device != device:
+                self.unilora_scales_B[adapter] = t.to(device)
+
+        # ---- theta_d ---- (ParameterDict, but ensure consistency)
+        if adapter in self.unilora_theta_d:
+            p = self.unilora_theta_d[adapter]
+            if p.device != device:
+                # Parameter migration: need .data to avoid creating new graph edges
+                self.unilora_theta_d[adapter].data = p.data.to(device)


I think _ensure_device can be replaced with self._move_adapter_to_device_of_base_layer(adapter) if all params are registered in adapter_layer_names or other_param_names in the layer class.
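
The repeated blocks in the quoted `_ensure_device` all follow one pattern, which is roughly what a generic helper driven by name lists would cover. The sketch below is a toy model of that pattern in plain Python: `FakeTensor`, `ToyLayer` and `move_adapter` are hypothetical stand-ins, and this is not PEFT's actual implementation of `_move_adapter_to_device_of_base_layer`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor: just a name and a device string."""

    name: str
    device: str = "cpu"

    def to(self, device: str) -> "FakeTensor":
        return FakeTensor(self.name, device)


@dataclass
class ToyLayer:
    """Toy layer holding per-adapter containers, keyed by attribute name."""

    # Attribute names are taken from the quoted diff; splitting them into
    # these two tuples follows the reviewer's wording and is an assumption.
    adapter_layer_names = ("unilora_theta_d",)
    other_param_names = (
        "unilora_indices_A",
        "unilora_indices_B",
        "unilora_scales_A",
        "unilora_scales_B",
    )
    base_device: str = "cuda:0"
    containers: dict = field(default_factory=dict)

    def move_adapter(self, adapter: str) -> None:
        """Move every registered container entry for `adapter` in one loop."""
        for attr in self.adapter_layer_names + self.other_param_names:
            store = self.containers.get(attr, {})
            if adapter in store and store[adapter].device != self.base_device:
                store[adapter] = store[adapter].to(self.base_device)


layer = ToyLayer()
for attr in ToyLayer.adapter_layer_names + ToyLayer.other_param_names:
    layer.containers[attr] = {"default": FakeTensor(attr)}
layer.move_adapter("default")
```

Once the names are registered in one place, the per-attribute `if` blocks collapse into a single loop, which is the simplification the comment is pointing at.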

githubnemo · 2026-02-17T17:52:53Z

+    def _init_unilora_theta_d(self, config: UniLoraConfig, adapter_name: str) -> None:
+        unilora_theta_d = torch.zeros(config.theta_d_length)
+        torch.nn.init.uniform_(unilora_theta_d, -config.init_theta_d_bound, config.init_theta_d_bound)
+        self.unilora_theta_d[adapter_name] = unilora_theta_d


_init_unilora_theta_d should respect the config.init_weights flag and leave the weights random if init_weights is False.

githubnemo · 2026-02-17T17:57:40Z

+    elif config.peft_type == PeftType.UNILORA:
+        to_return = {}
+        to_return["base_model.unilora_theta_d." + adapter_name] = state_dict["base_model.unilora_theta_d." + adapter_name]
+


Let's give the user the option to save the indices alongside the theta weights. This will allow the user to use a saved adapter even if the index sampling has changed for some reason.

VeRA (a few lines below this) can be used as inspiration.
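
A minimal sketch of what such an option could look like, using plain dicts in place of a real state dict. The flag name `save_indices`, the helper `select_unilora_state` and the key layout of the index entries are assumptions for illustration; only the `unilora_theta_d` key pattern comes from the quoted diff.

```python
def select_unilora_state(state_dict: dict, adapter_name: str, save_indices: bool) -> dict:
    """Pick the entries to save for one adapter (toy version, plain dicts)."""
    theta_key = "base_model.unilora_theta_d." + adapter_name
    to_return = {theta_key: state_dict[theta_key]}
    if save_indices:
        # Assumed layout: index buffers contain "unilora_indices" in their key
        # and end with the adapter name.
        for key, value in state_dict.items():
            if "unilora_indices" in key and key.endswith("." + adapter_name):
                to_return[key] = value
    return to_return


# Toy state dict; lists stand in for tensors.
toy_state = {
    "base_model.unilora_theta_d.default": [0.1, -0.2],
    "base_model.model.lin0.unilora_indices_A.default": [1, 0, 1],
    "base_model.model.lin0.unilora_indices_B.default": [0, 0, 1],
    "base_model.model.lin0.weight": [[1.0]],
}

small = select_unilora_state(toy_state, "default", save_indices=False)
full = select_unilora_state(toy_state, "default", save_indices=True)
print(sorted(small))
print(sorted(full))
```

With the flag off, the checkpoint stays as small as in the quoted code; with it on, the indices travel with the theta weights, so loading no longer depends on regenerating them.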

github-actions · 2026-03-14T15:07:17Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

githubnemo · 2026-03-24T15:22:04Z

gentle ping @KaiyangLi1992

KaiyangLi1992 · 2026-03-25T20:54:41Z

Apologies for the delay as I have been preparing a new submission for NeurIPS. I will fix my pull request immediately after the deadline. With the assistance of AI agents, I am confident that these issues will be resolved very quickly. Thank you for reopening this PR!

github-actions · 2026-04-19T15:14:45Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

KaiyangLi1992 · 2026-05-13T03:25:00Z

Hi @githubnemo, apologies again for the delay. I now have time to continue this PR.

I have updated the branch against the latest main, addressed the remaining review comments, and ran make style / make quality plus the UniLora-related tests. I also added the requested documentation/example/benchmark pieces.

Could you please reopen the PR when you have a chance? If you prefer a fresh PR instead, I can open a new one and reference this PR for review history. Thank you!

githubnemo · 2026-05-18T10:46:28Z

Hi @KaiyangLi1992, unfortunately I can't reopen this PR, GitHub doesn't let me - it seems you force-pushed or re-created the branch when the PR was already closed. The usual workaround for this is to reset your local branch to the commit that was last used in the PR (879e58d) and push; then I can re-open the PR (as described here). This would let us keep the PR's history, which would make reviewing easier.

Alternatively you can open a new PR with an updated description and link it here.

KaiyangLi1992 · 2026-05-23T22:56:35Z

Hi @githubnemo, thanks for the suggestion.

I opened a fresh PR here: #3257.

The new PR is rebased on the latest main and includes the remaining review follow-ups from this PR, including the docs, tests, example, MetaMathQA config, deterministic index generation documentation, init_weights, and optional index saving.

Thanks again for the review!

Add UniLoRA tuner to PEFT

ea3fb8e

KaiyangLi1992 mentioned this pull request Dec 23, 2025

Integration into PEFT KaiyangLi1992/Uni-LoRA#3

Closed

BenjaminBossan requested a review from githubnemo January 5, 2026 14:10

githubnemo reviewed Jan 14, 2026

KaiyangLi1992 added 2 commits January 21, 2026 13:22

fix: update unilora implementation

3ccc225

remove unintended files from commit

435385c

Merge branch 'main' into unilora-submit

879e58d

githubnemo reviewed Feb 17, 2026

github-actions Bot closed this Mar 23, 2026

githubnemo reopened this Mar 24, 2026

github-actions Bot closed this Apr 27, 2026

KaiyangLi1992 mentioned this pull request May 23, 2026

Add UniLora tuner to PEFT #3257

Open

3 participants