LINEAR NEURAL RECONSTRUCTION FOR DEEP NEURAL NETWORK COMPRESSION

A method and apparatus for performing deep neural network compression of convolutional and fully connected layers using a linear approximation of their outputs with information, such as in matrices representing weights, biases and non-linearities, to iteratively compress a pre-trained deep neural ne...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Robin, David A. R, Jain, Swayambhoo
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A method and apparatus for performing deep neural network compression of convolutional and fully connected layers using a linear approximation of their outputs with information, such as in matrices representing weights, biases and non-linearities, to iteratively compress a pre-trained deep neural network by low displacement rank based approximation of the network layer weight matrices. Extension of the technique enables consecutive layers to be compressed jointly, allowing compression and speeding inference by reducing the number of channels/hidden neurons in the network.