Pull Request: Adding HiRA integration into PEFT library #2668
Thanks for this PR to add HiRA to PEFT. The method looks promising and the provided code is already quite mature.
When I started reading the paper, I was at first reminded of FedPara, aka LoHa, which is already integrated into PEFT, as that method also relies on the Hadamard product. However, IIUC, the two methods are still distinct: HiRA basically corresponds to LoRA, but instead of adding dW, we multiply it. In that way, it is much closer to LoRA than to LoHa. Still, I wanted to flag this, as I'm not sure you are aware (your paper doesn't seem to reference FedPara).
At the moment, I haven't done a full in-depth review, but I think that makes more sense once we have completed the next step.
I noticed that you have formatted some unrelated files in `method_comparison`, could you please undo those changes? Usually, when you run `make style`, that directory should not be included.
I think a good next step is to add HiRA to the testing matrix we have in PEFT. For now, let's add some entries similar to the ones you can find here:
peft/tests/test_custom_models.py
Lines 70 to 72 in 92d65ca
("Vanilla MLP 1 LoRA", "MLP", LoraConfig, {"target_modules": "lin0"}),
("Vanilla MLP 2 LoRA", "MLP", LoraConfig, {"target_modules": ["lin0"]}),
("Vanilla MLP 3 LoRA", "MLP", LoraConfig, {"target_modules": ["lin1"]}),
Since you also support embedding and conv layers, please make sure to include examples with those layers as well (basically, copy the relevant examples from LoRA and adjust them).
Then, please run `pytest tests/test_custom_models.py -k "hira and not shira" -v` and see if those tests pass. Once we get there, we can discuss the best next steps.
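To illustrate the request above, here is a rough sketch of what such entries could look like and why the `-k` expression excludes "shira". Everything in it is an assumption for illustration: `HiraConfig` is a local stand-in class (not the real PEFT class), the test ids are hypothetical, and `matches_k` only approximates pytest's case-insensitive substring matching.

```python
from dataclasses import dataclass


@dataclass
class HiraConfig:
    """Hypothetical stand-in so the sketch runs without PEFT installed."""
    target_modules: object = None
    r: int = 8


# Entries modeled on the LoRA lines quoted above (ids are illustrative).
HIRA_TEST_CASES = [
    ("Vanilla MLP 1 HiRA", "MLP", HiraConfig, {"target_modules": "lin0"}),
    ("Vanilla MLP 2 HiRA", "MLP", HiraConfig, {"target_modules": ["lin0"]}),
    ("Vanilla MLP 3 HiRA", "MLP", HiraConfig, {"target_modules": ["lin1"]}),
]

# Ids of other methods that might sit in the same test matrix (hypothetical).
OTHER_TEST_IDS = ["Vanilla MLP 1 LoRA", "Vanilla MLP 1 SHiRA"]


def matches_k(test_id: str) -> bool:
    """Rough emulation of `-k "hira and not shira"` on a single test id."""
    name = test_id.lower()
    return "hira" in name and "shira" not in name


all_ids = [case[0] for case in HIRA_TEST_CASES] + OTHER_TEST_IDS
selected = [test_id for test_id in all_ids if matches_k(test_id)]
print(selected)
```

Because "hira" is a substring of "shira", a plain `-k hira` would also pick up any SHiRA tests, which is presumably why the exclusion is part of the command.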
@@ -0,0 +1,55 @@
# Copyright 2023-present the HuggingFace Inc. team.
- # Copyright 2023-present the HuggingFace Inc. team.
+ # Copyright 2025-present the HuggingFace Inc. team.
Let's update every date to 2025.
Args:
    r (`int`):
        HiRA r configuration (the "r").
Let's add a few more details here :)
@dataclass
class HiRARuntimeConfig:
I don't think we need this, as it's something DoRA-specific. Please remove the class and all associated code and doc references.
@@ -136,6 +136,8 @@ def starcoder_model_postprocess_past_key_value(past_key_values):
    "qwen3": ["q_proj", "v_proj"],
}

TRANSFORMERS_MODELS_TO_HIRA_TARGET_MODULES_MAPPING = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING
- TRANSFORMERS_MODELS_TO_HIRA_TARGET_MODULES_MAPPING = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING
+ TRANSFORMERS_MODELS_TO_HIRA_TARGET_MODULES_MAPPING = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING.copy()
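The suggestion comes without a stated reason; a likely motivation is that plain assignment makes both names refer to the same dict, so a later change to one mapping silently changes the other. A minimal sketch with stand-in names (`LORA_MAPPING` and the `"my_model"` entries are made up for illustration):

```python
# Stand-in for the LoRA mapping; the "qwen3" entry mirrors the diff context.
LORA_MAPPING = {"qwen3": ["q_proj", "v_proj"]}

# Plain assignment: both names point at the same dict object.
aliased = LORA_MAPPING
aliased["my_model"] = ["query"]  # hypothetical entry
print("my_model" in LORA_MAPPING)  # the LoRA mapping was changed as well

# Undo the accidental change before showing the copied variant.
del LORA_MAPPING["my_model"]

# .copy() creates a new top-level dict, so adding keys stays local.
copied = LORA_MAPPING.copy()
copied["my_model"] = ["query"]
print("my_model" in LORA_MAPPING)  # the LoRA mapping is untouched

# Caveat: the copy is shallow, so the inner lists are still shared objects.
print(copied["qwen3"] is LORA_MAPPING["qwen3"])
```

If entries are ever edited in place (for example by appending to one of the lists), `copy.deepcopy` would be needed to fully decouple the two mappings.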
@@ -0,0 +1,164 @@
import pytest
Missing copyright notice.
Feature request
This request proposes integrating HiRA (Hadamard High-Rank Adaptation) as described in the ICLR 2025 oral paper (https://openreview.net/pdf?id=TwJrTz9cRS) (https://iclr.cc/virtual/2025/oral/31839) and implemented in the hqsiswiliam/hira repository into the core PEFT library. This will enable users to apply HiRA through the familiar `get_peft_model` API and benefit from its high-rank updates without adding any inference overhead.
Motivation
General Motivation
PEFT methods like LoRA achieve parameter-efficient fine-tuning by injecting low-rank updates into pre-trained weights. While effective, purely low-rank adaptation can struggle to capture complex patterns in large language models.
1. Expressiveness grows with the rank
Empirically, increasing the LoRA rank in LLM training yields better downstream performance:
Higher LoRA rank correlates with improved task accuracy.
2. HiRA: Hadamard high-rank updates without extra parameters
HiRA sidesteps the expressiveness constraint by computing a Hadamard-enhanced update:
HiRA uses the Hadamard product to inject high-rank structure into the frozen weight matrix
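The figure with the update rule did not survive on this page. As a rough sketch, assuming the update has the form ΔW = W0 ⊙ (B·A), with W0 the frozen weight and B, A low-rank factors shaped as in LoRA (our reading of the method, not code from the PR), the following pure-Python toy example compares the rank of a LoRA-style update with the Hadamard variant. All helper names are made up for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally shaped matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rank(m):
    """Exact matrix rank via Gaussian elimination over Fractions."""
    rows = [[Fraction(x) for x in row] for row in m]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


# Frozen full-rank weight W0 (3x3) and rank-1 factors B (3x1), A (1x3).
W0 = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1], [2], [3]]
A = [[1, 1, 2]]

lora_delta = matmul(B, A)              # LoRA-style update: rank at most r = 1
hira_delta = hadamard(W0, lora_delta)  # assumed HiRA-style update: W0 ⊙ (B·A)

print(rank(lora_delta), rank(hira_delta))
```

In this toy case the low-rank product has rank 1 while the Hadamard update inherits the full rank of W0, even though both use the same two factors.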
3. Singular-value patterns
After training, HiRA exhibits a rich singular-value pattern, akin to full-rank fine-tuning (FFT), indicating its ability to model complex transformations without the expensive computational overhead:
HiRA’s singular-value distribution closely mirrors that of FFT.
4. Performance gains
Across commonsense reasoning benchmarks, HiRA outperforms LoRA and other PEFT baselines:
HiRA delivers notable accuracy improvements over baseline adapters.
5. No extra parameter or compute cost
Despite its high-rank behaviour, HiRA introduces no additional trainable parameters compared to LoRA:
HiRA matches LoRA’s GPU memory usage and training hours.
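The parameter claim can be checked with simple arithmetic. This sketch assumes HiRA trains the same two factors as LoRA (shapes `d_out x r` and `r x d_in`) and that the frozen weight contributes no trainable parameters; the layer size and rank below are arbitrary examples, not figures from the paper.

```python
def low_rank_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters of two factors B (d_out x r) and A (r x d_in)."""
    return r * (d_in + d_out)


d_in, d_out, r = 4096, 4096, 16  # example sizes, chosen for illustration

lora_params = low_rank_param_count(d_in, d_out, r)
# Under the stated assumption, HiRA reuses the same factor shapes; the
# Hadamard product with the frozen weight adds nothing trainable.
hira_params = low_rank_param_count(d_in, d_out, r)
full_params = d_in * d_out

print(lora_params, hira_params, lora_params / full_params)
```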
6. Complementary with LoRA (HiLoRA)
Combining HiRA and LoRA into a hybrid “HiLoRA” setup yields even stronger results than either method alone:
HiLoRA leverages both low-rank and Hadamard high-rank updates for better expressiveness.
By integrating HiRA into PEFT, users gain richer adaptation capability without sacrificing the parameter efficiency and usability that PEFT provides.
Your contribution
We would be pleased to submit a pull request integrating the HiRA implementation into the PEFT framework. We welcome suggestions for alternative integration approaches and would appreciate guidance on best practices.