PERFORMING SYNCHRONIZATION IN THE BACKGROUND FOR HIGHLY SCALABLE DISTRIBUTED TRAINING
Main Authors:
Format: Patent
Language: English; French; German
Subjects:
Online Access: Order full text
Abstract: In one embodiment, a method for training a machine-learning model having multiple parameters includes instantiating trainers, each associated with at least a worker thread, a synchronization thread, and a local version of the parameters. The worker threads perform training operations that comprise generating an updated local version of the parameters for each trainer. While the worker threads are performing training operations, the synchronization threads perform synchronization operations that comprise generating a global version of the parameters based on the updated local versions of the parameters, and generating a synchronized local version of the parameters for each trainer based on the global version. Training operations then continue based on the synchronized local versions of the parameters, and the parameters are determined at the end of training based on at least a final local version of the parameters associated with one trainer.
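The key idea in the abstract is that synchronization runs concurrently with training rather than blocking it. The sketch below is a minimal, hypothetical Python illustration of that structure, not the patent's actual implementation: the Trainer class, the SYNC_INTERVAL constant, the toy gradient step, and the simple averaging merge rule are all assumptions made for illustration.

```python
# Minimal sketch: each trainer runs a worker thread (training) and a
# synchronization thread (background merging) over a local parameter copy.
# All names and the merge rule are illustrative assumptions.
import threading
import time
import numpy as np

N_TRAINERS = 4
DIM = 8
SYNC_INTERVAL = 0.05   # seconds between background synchronizations (assumed)
TRAIN_STEPS = 200

global_params = np.zeros(DIM)          # global version of the parameters
global_lock = threading.Lock()

class Trainer:
    def __init__(self, tid):
        self.tid = tid
        self.local = np.zeros(DIM)     # local version of the parameters
        self.lock = threading.Lock()   # guards the local version
        self.done = threading.Event()
        self.worker = threading.Thread(target=self.train)
        self.syncer = threading.Thread(target=self.synchronize)

    def train(self):
        # Worker thread: repeatedly updates the local version of the
        # parameters. The random "gradient" stands in for real training.
        rng = np.random.default_rng(self.tid)
        for _ in range(TRAIN_STEPS):
            grad = rng.normal(size=DIM)
            with self.lock:
                self.local -= 0.01 * grad
            time.sleep(0.001)          # simulate compute time
        self.done.set()

    def synchronize(self):
        # Synchronization thread: runs concurrently with train(). It folds
        # the local version into the global version, then pulls back a
        # synchronized local version, without pausing the worker.
        while not self.done.is_set():
            time.sleep(SYNC_INTERVAL)
            with self.lock:
                local_copy = self.local.copy()
            with global_lock:
                # Toy merge rule (assumed): move the global version toward
                # this trainer's local version.
                global_params[:] = 0.5 * global_params + 0.5 * local_copy
                synced = global_params.copy()
            with self.lock:
                self.local = synced    # synchronized local version

trainers = [Trainer(t) for t in range(N_TRAINERS)]
for t in trainers:
    t.worker.start()
    t.syncer.start()
for t in trainers:
    t.worker.join()
    t.syncer.join()

# As in the claim, take the final parameters from one trainer's final
# local version.
print("final parameters:", trainers[0].local)
```

Because the merge happens in a separate thread, the worker never waits on a global barrier; each trainer simply picks up the synchronized local version the next time it takes its lock. That is the property that makes the scheme scale: synchronization cost overlaps with, rather than adds to, training time.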