Merak: An Efficient Distributed DNN Training Framework With Automated 3D Parallelism for Giant Foundation Models

Foundation models are in the process of becoming the dominant deep learning technology. Pretraining a foundation model is always time-consuming due to the large scale of both the model parameter and training dataset. Besides being computing-intensive, the pretraining process is extremely memory- and...

Bibliographic Details
Published in:	IEEE Transactions on Parallel and Distributed Systems, 2023-05, Vol. 34 (5), pp. 1466-1478
Main authors:	Lai, Zhiquan; Li, Shengwei; Tang, Xudong; Ge, Keshi; Liu, Weijie; Duan, Yabo; Qiao, Linbo; Li, Dongsheng
Format:	Article
Language:	English
Subjects:	Algorithms; Automation; Computation; Computational modeling; Computer memory; Critical path; Data models; Deep learning; distributed systems; foundation model training; Mathematical models; Parallel processing; Parameters; Pipelines; Pipelining (computers); Resource utilization; Solid modeling; Tensors; Three-dimensional displays; Training