Advancing Parameter Efficiency in Fine-tuning via Representation Editing
Format: Article
Language: eng
Abstract: Parameter Efficient Fine-Tuning (PEFT) techniques have drawn significant attention due to their ability to yield competitive results while updating only a small portion of the adjustable parameters. However, existing PEFT methods pose challenges in hyperparameter selection, such as choosing the rank for LoRA or Adapter, or specifying the length of soft prompts. To address these challenges, we propose a novel fine-tuning approach for neural models, named Representation EDiting (RED), which modifies the representations generated at some layers through the application of scaling and biasing operations. While existing PEFT methods still demonstrate over-parameterization that could potentially undermine the generalization ability acquired from pre-training, RED can substantially reduce the number of trainable parameters: by a factor of 25,700 compared to full parameter fine-tuning and by a factor of 32 relative to LoRA. Remarkably, RED achieves results comparable to or better than both full parameter fine-tuning and other PEFT methods. Extensive experiments across various model architectures and scales, including RoBERTa, GPT-2, T5, and LLaMA-2, have demonstrated the effectiveness and efficiency of RED, thereby positioning it as a promising PEFT strategy for large-scale neural models.
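To make the "scaling and biasing" operation concrete, here is a minimal PyTorch sketch of the general idea: the frozen backbone's hidden representation h at an edited layer is replaced by scale ⊙ h + bias, where the scale and bias vectors are the only trainable parameters. The class names (REDEdit, EditedBlock) and the wrapping strategy below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class REDEdit(nn.Module):
    """Element-wise scale-and-bias edit of a hidden representation.

    Minimal sketch of the idea described in the abstract; names are
    illustrative, not taken from the paper's code.
    """
    def __init__(self, hidden_size: int):
        super().__init__()
        # Initialize as the identity edit: scale = 1, bias = 0.
        self.scale = nn.Parameter(torch.ones(hidden_size))
        self.bias = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        return hidden_states * self.scale + self.bias


class EditedBlock(nn.Module):
    """Wraps a frozen block and edits its output representation."""
    def __init__(self, block: nn.Module, hidden_size: int):
        super().__init__()
        self.block = block
        for p in self.block.parameters():
            p.requires_grad = False  # backbone stays frozen
        self.edit = REDEdit(hidden_size)  # only 2 * hidden_size trainable params

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumes the wrapped block maps a tensor to a tensor of the same shape.
        return self.edit(self.block(x))


if __name__ == "__main__":
    # Toy check: edit the output of a frozen feed-forward block.
    hidden = 16
    block = nn.Sequential(nn.Linear(hidden, hidden), nn.GELU(), nn.Linear(hidden, hidden))
    edited = EditedBlock(block, hidden)
    x = torch.randn(2, 8, hidden)  # (batch, seq_len, hidden_size)
    y = edited(x)
    trainable = sum(p.numel() for p in edited.parameters() if p.requires_grad)
    print(y.shape, trainable)  # torch.Size([2, 8, 16]) 32
```

With just two vectors of size hidden_size per edited layer, the trainable parameter count stays orders of magnitude below full fine-tuning, which is where the parameter reductions reported in the abstract come from.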
DOI: 10.48550/arxiv.2402.15179