VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections

Large language models (LLMs) have recently emerged as powerful tools for tackling many language-processing tasks. Despite their success, training and fine-tuning these models is still far too computationally and memory intensive. In this paper, we identify and characterise the important components n...

Bibliographic Details
Published in:	arXiv.org 2024-10
Main authors:	Miles, Roy; Reddy, Pradyumna; Elezi, Ismail; Deng, Jiankang
Format:	Article
Language:	English
Subjects:	Algorithms; Back propagation; Effectiveness; Large language models; Memory tasks; Performance degradation
Online access:	Full text