T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives

Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices which can reduce scaling efficiency as the number of devices increases. While some distributed techniques can overlap, and thus, hide this communi...

Bibliographic Details
Published in:	arXiv.org 2024-01
Main Authors:	Pati, Suchita; Aga, Shaizeen; Islam, Mahzabeen; Jayasena, Nuwan; Sinclair, Matthew D
Format:	Article
Language:	English
Subjects:	Co-design; Communication; Hardware; Large language models; Software; Tensors
Online Access:	Full text