Structure-Aware Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

Bibliographic details
Published in:	Mathematics (Basel), 2023-10, Vol. 11 (20), p. 4317
Main authors:	Hu, Yahao; Xie, Yifei; Wang, Tianfeng; Chen, Man; Pan, Zhisong
Format:	Article
Language:	English
Subjects:	Adaptation; Computational linguistics; intrinsic rank; Language processing; low-rank adaptation; Mathematical models; Modules; Natural language interfaces; Optimization; parameter-efficient fine-tuning; Parameters; Performance evaluation; pre-trained language models; Random variables; Rankings; Training; training efficiency; Triplets
Online access:	Full text

Description
Summary:	With the growing scale of pre-trained language models (PLMs), full parameter fine-tuning becomes prohibitively expensive and practically infeasible. Therefore, parameter-efficient adaptation techniques for PLMs have been proposed to learn through incremental updates of pre-trained weights, such as in low-rank adaptation (LoRA). However, LoRA relies on heuristics to select the modules and layers to which it is applied, and assigns them the same rank. As a consequence, any fine-tuning that ignores the structural information between modules and layers is suboptimal. In this work, we propose structure-aware low-rank adaptation (SaLoRA), which adaptively learns the intrinsic rank of each incremental matrix by removing rank-0 components during training. We conduct comprehensive experiments using pre-trained models of different scales in both task-oriented (GLUE) and task-agnostic (Yelp and GYAFC) settings. The experimental results show that SaLoRA effectively captures the structure-aware intrinsic rank. Moreover, our method consistently outperforms LoRA without significantly compromising training efficiency.
ISSN:	2227-7390
DOI:	10.3390/math11204317
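
Illustration:	The summary describes learning the rank of each incremental matrix by removing components that contribute nothing. The sketch below is one way to picture that idea, not the paper's implementation: it assumes the low-rank update is written as B · diag(gate) · A and that a component is dropped once its gate value is zero. The gate vector, the function names, and the toy numbers are all hypothetical and do not come from the record above.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(B, A, gate):
    """Low-rank incremental update delta_W = B @ diag(gate) @ A.

    B is d_out x r, A is r x d_in, gate has length r (assumed form).
    """
    gated_A = [[g * a for a in row] for g, row in zip(gate, A)]
    return matmul(B, gated_A)


def prune(B, A, gate, tol=1e-8):
    """Drop the components whose gate is numerically zero."""
    keep = [i for i, g in enumerate(gate) if abs(g) > tol]
    B_kept = [[row[i] for i in keep] for row in B]
    A_kept = [A[i] for i in keep]
    gate_kept = [gate[i] for i in keep]
    return B_kept, A_kept, gate_kept


# Toy example: d_out = 2, d_in = 3, starting rank r = 3.
B = [[1.0, 2.0, 0.5], [0.0, 1.0, -1.0]]
A = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [2.0, 2.0, 2.0]]
gate = [1.0, 0.0, 0.5]  # the second component contributes nothing

delta_full = lora_delta(B, A, gate)
B2, A2, gate2 = prune(B, A, gate)
delta_pruned = lora_delta(B2, A2, gate2)
```

In this toy case the pruned factors have rank 2 and reproduce the same update as the original rank-3 factors, which is why dropping zero-gated components costs nothing in accuracy while reducing the number of trainable parameters.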