Decoupled neural network training with re-computation and weight prediction

Bibliographic details
Published in: PLoS ONE 2023-02, Vol. 18(2), p. e0276427
Authors: Peng, Jiawei; Xu, Yicheng; Lin, Zhiping; Weng, Zhenyu; Yang, Zishuo; Zhuang, Huiping
Format: Article
Language: English
Online access: Full text
Description
Summary: To break the three lockings (forward, backward, and update locking) in the backpropagation (BP) process for neural network training, multiple decoupled learning methods have been investigated recently. These methods either lead to a significant drop in accuracy or suffer from a dramatic increase in memory usage. In this paper, a new form of decoupled learning, named the decoupled neural network training scheme with re-computation and weight prediction (DTRP), is proposed. In DTRP, a re-computation scheme is adopted to solve the memory explosion problem, and a weight prediction scheme is proposed to deal with the weight delay caused by re-computation. Additionally, a batch compensation scheme is developed that allows DTRP to run faster. Theoretical analysis shows that DTRP is guaranteed to converge to critical points under certain conditions. Experiments training various convolutional neural networks on several classification datasets show results comparable to or better than those of state-of-the-art methods and BP. These experiments also reveal that, with the proposed method, the memory explosion problem is effectively solved and a significant acceleration is achieved.
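The abstract names two ingredients: re-computation to curb activation memory, and weight prediction to offset the weight delay that re-computation introduces. The sketch below illustrates both in PyTorch as a minimal illustration under stated assumptions, not the paper's DTRP algorithm: the re-computation uses the standard torch.utils.checkpoint utility, and predict_weights is a hypothetical helper whose extrapolation rule (stepping along the momentum direction, w - lr * s * v) is borrowed from prior pipeline-parallel weight-prediction work rather than from the source.

```python
# Minimal sketch of the two ideas named in the abstract, NOT the paper's
# exact DTRP algorithm. (1) Re-computation: store only a block's input and
# re-run its forward pass during backward, trading compute for memory.
# (2) Weight prediction: extrapolate weights ahead by the known update
# delay so the forward pass uses (approximately) the weights that will be
# current when the delayed gradient is applied. The extrapolation rule
# below is an assumption, not taken from the source.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedBlock(nn.Module):
    """A block whose intermediate activations are re-computed in backward."""

    def __init__(self, dim: int):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # checkpoint() keeps only `x`; activations inside `self.body`
        # are re-computed during the backward pass, cutting peak memory.
        return checkpoint(self.body, x, use_reentrant=False)


def predict_weights(params, momenta, lr: float, delay_steps: int):
    """Hypothetical helper: extrapolate each weight `delay_steps`
    SGD-with-momentum updates ahead (w_pred = w - lr * s * v)."""
    with torch.no_grad():
        return [p - lr * delay_steps * v for p, v in zip(params, momenta)]


if __name__ == "__main__":
    block = CheckpointedBlock(dim=16)
    x = torch.randn(4, 16, requires_grad=True)
    block(x).sum().backward()  # forward inside `body` runs again here
```

The batch compensation scheme the abstract also mentions is not sketched here, since the source gives no detail on how it works.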
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0276427