Spatially-Coupled Neural Network Architectures

Bibliographic Details
Published in: arXiv.org, 2019-07
Authors: Hasanzadeh, Arman; Janakiraman, Nagaraj T; Amalladinne, Vamsi K; Narayanan, Krishna R
Format: Article
Language: English
Online Access: Full Text
Abstract: In this work, we leverage advances in sparse coding techniques to reduce the number of trainable parameters in a fully connected neural network. While most works in the literature impose \(\ell_1\) regularization, DropOut, or DropConnect to induce sparsity, our scheme uses feature importance as a criterion for allocating the trainable parameters (resources) efficiently across the network. Although it ensures sparsity, \(\ell_1\) regularization still requires training all the resources of a deep neural network. The DropOut/DropConnect techniques reduce the number of trainable parameters during training by dropping a random collection of neurons/edges in the hidden layers. However, neither technique takes the underlying structure of the data into account when dropping neurons/edges. Moreover, both frameworks require storage space equivalent to that of a fully connected neural network. We address these issues with a more structured architecture inspired by spatially-coupled sparse constructions. The proposed architecture is shown to perform on par with a conventional fully connected neural network with dropout, while achieving a \(94\%\) reduction in trainable parameters. Extensive simulations are presented, and the performance of the proposed scheme is compared against traditional neural network architectures.
ISSN: 2331-8422
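
The abstract does not spell out the exact coupling pattern used in the paper, so the following is only a minimal sketch, assuming the "spatially-coupled" structure amounts to each hidden unit connecting to a localized, overlapping window of inputs (in the spirit of spatially-coupled LDPC constructions). The function name `coupled_mask`, the layer sizes, and the window width are illustrative choices, not taken from the paper; the window is picked so that roughly 6% of the weights remain, loosely echoing the 94% reduction quoted above.

```python
# Hypothetical sketch of a band-structured ("spatially coupled") sparsity mask
# applied to a dense layer. The exact construction in the paper is not given in
# the abstract; all sizes and the windowing rule here are illustrative only.
import numpy as np

def coupled_mask(n_in, n_out, window):
    """Binary mask in which output unit j connects only to a local, sliding
    window of inputs, with neighbouring units sharing overlapping windows."""
    mask = np.zeros((n_out, n_in))
    centers = np.linspace(0, n_in - window, n_out).astype(int)
    for j, c in enumerate(centers):
        mask[j, c:c + window] = 1.0
    return mask

n_in, n_out, window = 784, 128, 48            # illustrative layer sizes
mask = coupled_mask(n_in, n_out, window)

# Only the masked-in entries are trainable/stored; the rest are structurally zero.
W = np.random.randn(n_out, n_in) * np.sqrt(2.0 / window)
W_sparse = W * mask

x = np.random.randn(n_in)                     # a dummy input vector
h = np.maximum(W_sparse @ x, 0.0)             # ReLU hidden activation

print(f"fraction of weights kept: {mask.mean():.3f}")   # ~0.06 for these sizes
```

Because the zero pattern is fixed before training rather than sampled per step (as in DropOut/DropConnect), only the masked-in entries need to be stored and updated, which is the storage argument made in the abstract.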