Low-Rank Interconnected Adaptation across Layers
Format: Article
Language: English
Abstract: Low-rank adaptation (LoRA) is a powerful parameter-efficient fine-tuning
method that utilizes low-rank projectors $A$ and $B$ to learn weight updates
$\Delta W$ for adaptation targets $W$. Previous research has shown that LoRA is
essentially a gradient compressor, performing random projections on the
gradient using a fixed projection matrix $A_0$. However, this setup restricts
the overall weight update to be low-rank, which limits the adaptation
performance. In this paper, we propose low-rank interconnected adaptation
across layers (Lily). Specifically, we employ a hierarchical framework where
low-dimensional projectors (LPs) are retained for downward projection at a
particular level, while globally-shared high-dimensional projector (HP) experts
perform upward projection across all levels of layers. Lily uniquely connects
each LP to all HP experts, so the gradient projections are no longer
dominated by fixed projection matrices, but rather by selective combinations of
all the projectors, thereby breaking the low-rank constraint of LoRA.
Furthermore, Lily's cross-layer connections facilitate the capture of intricate
information and dependencies across different layers, thereby enhancing the
model's representational capabilities. Experiments across various modalities,
architectures, and model sizes demonstrate Lily's strong performance and
efficiency. Code is available at https://github.com/yibozhong/lily.
DOI: 10.48550/arxiv.2407.09946
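
As a rough illustration of the mechanism described in the abstract, the following PyTorch sketch pairs a per-layer low-dimensional projector (LP) with a pool of high-dimensional projector (HP) experts shared across all layers and mixes the experts' outputs with a small router. The names (`LilyAdapter`, `shared_hps`), the softmax routing, and the exact sharing scheme are assumptions made for illustration only; they are not taken from the paper or its repository.

```python
import torch
import torch.nn as nn

class LilyAdapter(nn.Module):
    """Hypothetical sketch of one Lily-style adapter (names are illustrative).

    A layer-local low-dimensional projector (LP) maps inputs down to `rank`,
    a pool of high-dimensional projector (HP) experts shared by every adapted
    layer maps back up, and a router mixes all HP experts so the update is a
    selective combination of projectors rather than a single fixed low-rank
    product as in LoRA.
    """

    def __init__(self, d_model: int, rank: int, hp_experts: nn.ModuleList):
        super().__init__()
        self.lp = nn.Linear(d_model, rank, bias=False)   # per-layer LP (down-projection)
        self.hp_experts = hp_experts                     # HP experts shared across layers
        self.router = nn.Linear(rank, len(hp_experts), bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.lp(x)                                   # (..., rank)
        gates = torch.softmax(self.router(z), dim=-1)    # (..., num_experts)
        ups = torch.stack([hp(z) for hp in self.hp_experts], dim=-1)  # (..., d_model, E)
        return (ups * gates.unsqueeze(-2)).sum(dim=-1)   # weighted mixture of HP outputs

# One shared HP pool, one LP-based adapter per adapted layer.
d_model, rank, num_experts, num_layers = 768, 8, 4, 12
shared_hps = nn.ModuleList(
    [nn.Linear(rank, d_model, bias=False) for _ in range(num_experts)]
)
adapters = nn.ModuleList(
    [LilyAdapter(d_model, rank, shared_hps) for _ in range(num_layers)]
)
delta = adapters[0](torch.randn(2, 16, d_model))  # added to the frozen layer's output
```

In this reading, sharing the HP experts is what creates the cross-layer connections the abstract refers to: each layer's LP only decides how to query the common pool, so the resulting updates are no longer confined to one fixed low-rank subspace per layer.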