eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models

Since Large Language Models or LLMs have demonstrated high-quality performance on many complex language tasks, there is a great interest in bringing these LLMs to mobile devices for faster responses and better privacy protection. However, the size of LLMs (i.e., billions of parameters) requires high...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Cho, Minsik, Vahid, Keivan A, Fu, Qichen, Adya, Saurabh, Del Mundo, Carlo C, Rastegari, Mohammad, Naik, Devang, Zatloukal, Peter
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!