NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm
Published in: IEEE Transactions on Computers, 2019-10, Vol. 68 (10), p. 1487-1497
Main authors: , ,
Format: Article
Language: English
Online access: Order full text
Abstract: Deep neural networks (DNNs) have begun to have a pervasive impact on various applications of machine learning. However, finding an optimal DNN architecture for large applications is challenging. Common approaches adopt deeper and larger DNN architectures, which may incur substantial redundancy. To address this problem, we introduce a network growth algorithm that complements network pruning to learn both weights and compact DNN architectures during training. We propose a DNN synthesis tool (NeST) that combines both methods to automate the generation of compact and accurate DNNs. NeST starts with a randomly initialized sparse network called the seed architecture. It iteratively tunes the architecture with gradient-based growth and magnitude-based pruning of neurons and connections. Our experimental results show that NeST yields accurate yet very compact DNNs across a wide range of seed architectures. For the LeNet-300-100 (LeNet-5) architecture, we reduce network parameters by 70.2× (74.3×) and floating-point operations (FLOPs) by 79.4× (43.7×). For the AlexNet, VGG-16, and ResNet-50 architectures, we reduce network parameters (FLOPs) by 15.7× (4.6×), 33.2× (8.9×), and 4.1× (2.1×), respectively. NeST's grow-and-prune paradigm delivers significant additional parameter and FLOPs reductions relative to pruning-only methods.
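The grow-and-prune loop the abstract describes can be made concrete with a small sketch. Below is a minimal, illustrative NumPy implementation of one iteration on a single weight matrix, assuming growth of dormant connections ranked by gradient magnitude and pruning of active connections ranked by weight magnitude. The function names, growth/pruning fractions, and the initialization of newly grown weights are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def grow_connections(weights, mask, grad, grow_frac=0.05):
    """Activate the dormant connections with the largest gradient magnitude."""
    dormant = (mask == 0)
    n_grow = min(int(grow_frac * mask.size), int(dormant.sum()))
    if n_grow == 0:
        return weights, mask
    # Rank dormant positions by |dL/dw|; the largest gradients are grown first.
    scores = np.where(dormant, np.abs(grad), -np.inf).ravel()
    idx = np.argsort(scores)[-n_grow:]
    mask.ravel()[idx] = 1
    # New connections start at a small value (an assumption for this sketch).
    weights.ravel()[idx] = 1e-3 * np.sign(grad.ravel()[idx])
    return weights, mask

def prune_connections(weights, mask, prune_frac=0.05):
    """Remove the active connections with the smallest weight magnitude."""
    active = (mask == 1)
    n_prune = int(prune_frac * active.sum())
    if n_prune == 0:
        return weights, mask
    scores = np.where(active, np.abs(weights), np.inf).ravel()
    idx = np.argsort(scores)[:n_prune]
    mask.ravel()[idx] = 0
    weights.ravel()[idx] = 0.0
    return weights, mask

# Toy usage: a sparse "seed" layer, a stand-in gradient, one grow-prune cycle.
W = rng.normal(size=(300, 100)) * rng.binomial(1, 0.1, size=(300, 100))
M = (W != 0).astype(np.int8)
G = rng.normal(size=W.shape)  # placeholder for a real backpropagated gradient
W, M = grow_connections(W, M, G)
W, M = prune_connections(W, M)
print("active connections:", int(M.sum()))
```

In the actual tool, the growth and pruning steps alternate with training epochs so that the architecture and the weights are learned jointly; this sketch only shows the masking mechanics of a single cycle.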
ISSN: 0018-9340, 1557-9956
DOI: 10.1109/TC.2019.2914438