Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Saved in:
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Deep learning has achieved promising results on a wide spectrum of AI
applications. Larger datasets and models consistently yield better performance.
However, they generally require longer training time due to the increased
computation and communication. In this survey, we aim to provide a clear sketch
of the optimizations for large-scale deep learning with regard to model accuracy
and model efficiency. We investigate the algorithms most commonly used for
optimization, elaborate on the debated topic of the generalization gap that
arises in large-batch training, and review the state-of-the-art strategies for
addressing communication overhead and reducing memory footprints. |
DOI: | 10.48550/arxiv.2111.00856 |
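The generalization gap in large-batch training mentioned in the summary is commonly mitigated with the linear learning-rate scaling rule combined with a warmup period. The sketch below is illustrative only and is not taken from the survey; the function name, parameters, and defaults are assumptions for the example.

```python
# Minimal sketch of the linear learning-rate scaling rule with linear warmup,
# a common heuristic for large-batch training (illustrative, not the survey's code).

def scaled_lr(base_lr, base_batch, batch, epoch, warmup_epochs=5):
    """Return the learning rate for `epoch` under linear scaling with warmup."""
    # Linear scaling rule: scale the base LR by the batch-size ratio.
    target = base_lr * batch / base_batch
    if epoch < warmup_epochs:
        # Ramp up linearly from base_lr toward the scaled target
        # to avoid instability in the first few epochs.
        return base_lr + (target - base_lr) * epoch / warmup_epochs
    return target

# Example: base LR 0.1 at batch size 256, training at batch size 8192.
print(scaled_lr(0.1, 256, 8192, epoch=10))  # → 3.2 after warmup
print(scaled_lr(0.1, 256, 8192, epoch=0))   # → 0.1 at the start of warmup
```

Warmup matters here because the scaled learning rate (32x the base in this example) can diverge if applied from the first step.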