On Optimizing the Communication of Model Parallelism

We study a novel and important communication pattern in large-scale model-parallel deep learning (DL), which we call cross-mesh resharding. This pattern emerges when the two paradigms of model parallelism - intra-operator and inter-operator parallelism - are combined to support large models on large...

Bibliographic details
Main authors: Zhuang, Yonghao; Zhao, Hexu; Zheng, Lianmin; Li, Zhuohan; Xing, Eric P; Ho, Qirong; Gonzalez, Joseph E; Stoica, Ion; Zhang, Hao
Format: Article
Language: English