SPECULATIVE TRAINING USING PARTIAL GRADIENTS UPDATE

Bibliographic Details
Main Authors: KAPLAN PATRICIO, HUANG RANDY RENFU
Format: Patent
Language: Chinese; English
Description
Abstract: The exchange of weight gradients among the processing nodes can introduce a substantial bottleneck to the training process. Instead of remaining idle during the weight gradients exchange process, a processing node can update its own set of weights for the next iteration of the training process using the processing node's local weight gradients. The next iteration of training can be started by using these speculative weights until the weight gradients exchange process completes and a global weights update is available. If the speculative weights are close enough to the weight values from the global weights update, the training process at the processing node can continue training using the results computed from the speculative weights, reducing the overall training time.
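
The abstract describes this control flow in prose only; the Python sketch below illustrates one way the idea could be organized at a single processing node. It is a minimal illustration under stated assumptions, not the patented implementation: the non-blocking collective `allreduce_async`, its handle's `result()` method, the learning rate `lr`, and the closeness tolerance `tol` are hypothetical names and parameters introduced here for clarity.

```python
import numpy as np

def speculative_training_step(weights, local_grads, allreduce_async, lr=0.01, tol=1e-3):
    """One iteration of speculative training with partial (local) gradient updates.

    `allreduce_async` is an assumed non-blocking collective: it starts the
    weight-gradient exchange across processing nodes and returns a handle whose
    result() yields the averaged (global) gradients once the exchange finishes.
    """
    # 1. Kick off the (slow) gradient exchange without blocking.
    handle = allreduce_async(local_grads)

    # 2. Instead of idling, speculatively update this node's weights
    #    using only its local weight gradients.
    speculative_weights = weights - lr * local_grads

    # 3. The next iteration's forward/backward pass would start here with the
    #    speculative weights (elided), overlapping compute with the exchange.

    # 4. When the exchange completes, compute the globally updated weights.
    global_grads = handle.result()
    global_weights = weights - lr * global_grads

    # 5. If the speculative weights are close enough to the global update,
    #    keep the results computed from them; otherwise discard the
    #    speculative work and continue from the global weights.
    if np.max(np.abs(speculative_weights - global_weights)) <= tol:
        return speculative_weights, True   # speculation accepted
    return global_weights, False           # speculation rejected, recompute
```

The design choice, as the abstract frames it, is to overlap communication with useful compute: when the speculation is accepted the node has hidden the exchange latency entirely, and when it is rejected the cost is a redone iteration, which on average can still be cheaper than always waiting for the global update.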