LINEAR NEURAL RECONSTRUCTION FOR DEEP NEURAL NETWORK COMPRESSION
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | A method and apparatus for compressing the convolutional and fully connected layers of a deep neural network. The outputs of these layers are approximated linearly using information such as the matrices representing weights, biases and non-linearities, and a pre-trained deep neural network is compressed iteratively by a low displacement rank approximation of its layer weight matrices. An extension of the technique compresses consecutive layers jointly, which reduces the number of channels/hidden neurons in the network and thereby both shrinks the network and speeds up inference. |
---|
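
The summary names a low displacement rank approximation of layer weight matrices without showing what that looks like. The NumPy sketch below is only an illustration of the general idea, not the patented procedure: it assumes a square weight matrix, a Stein-type displacement operator `D = W - Z W Z^T` with `Z` the lower shift matrix, and a single SVD truncation in place of the iterative, output-matching compression the record describes. All function names are hypothetical.

```python
import numpy as np


def shift_matrix(n):
    """Lower shift matrix Z: ones on the first subdiagonal, zeros elsewhere."""
    return np.eye(n, k=-1)


def displacement(W):
    """Stein-type displacement D = W - Z W Z^T (assumed operator, square W)."""
    Z = shift_matrix(W.shape[0])
    return W - Z @ W @ Z.T


def reconstruct(D):
    """Invert the displacement: W = sum_k Z^k D (Z^T)^k, finite since Z^n = 0."""
    n = D.shape[0]
    Z = shift_matrix(n)
    W = np.zeros((n, n))
    term = np.array(D, dtype=float)
    for _ in range(n):
        W += term
        term = Z @ term @ Z.T
    return W


def ldr_approximate(W, rank):
    """Truncate the displacement of W to the given rank and rebuild W.

    Returns the approximation and the generators G, H (each n x rank),
    so that the truncated displacement equals G @ H.T.
    """
    U, s, Vt = np.linalg.svd(displacement(W))
    G = U[:, :rank] * s[:rank]
    H = Vt[:rank, :].T
    return reconstruct(G @ H.T), G, H


def toeplitz_from_diagonals(t):
    """Build an n x n Toeplitz matrix from its 2n - 1 diagonal values."""
    n = (len(t) + 1) // 2
    idx = np.arange(n)[:, None] - np.arange(n)[None, :] + n - 1
    return np.asarray(t, dtype=float)[idx]
```

Under these assumptions, a Toeplitz matrix has displacement rank at most 2, so `ldr_approximate(W, 2)` recovers it from `2 * 2 * n` stored generator values instead of `n * n` weights. A general trained weight matrix would only be approximated, and the approximation error depends on how quickly the singular values of its displacement decay.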