Compressing a neural network
Main authors: | , |
---|---|
Format: | Patent |
Language: | English |
Summary: | A method of compressing a neural network in which the rows/columns of a matrix representing the coefficients (weights) of a neural network layer are rearranged so that the non-zero values are gathered into sub-matrices, creating blocks with a high density of non-zero values. The reorganised sub-matrices may be used in matrix multiplication; the coefficients may represent filters, and the layers may be convolutional layers. The rearranged matrix may be a block diagonal matrix or a singly bordered block diagonal matrix, and a partitioned hypergraph model may be used to rearrange the rows/columns of the matrix. |
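
The rearrangement described in the summary can be illustrated with a small sketch. It is not the patented method: in place of a partitioned hypergraph model it uses plain connected components (union-find) over rows and columns that share a non-zero entry, which only recovers an exact block diagonal form when one exists. The function names (`find_blocks`, `blocked_matvec`, `dense_matvec`) and the example matrix are hypothetical.

```python
# Sketch only: connected components stand in for the hypergraph partitioning
# named in the summary. All names and values here are illustrative assumptions.

def find_blocks(W):
    """Group rows and columns that share non-zero entries (union-find)."""
    n_rows, n_cols = len(W), len(W[0])
    parent = list(range(n_rows + n_cols))  # row nodes first, then column nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # A non-zero at (i, j) ties row i and column j into the same block.
    for i in range(n_rows):
        for j in range(n_cols):
            if W[i][j] != 0:
                parent[find(i)] = find(n_rows + j)

    groups = {}
    for i in range(n_rows):
        groups.setdefault(find(i), ([], []))[0].append(i)
    for j in range(n_cols):
        groups.setdefault(find(n_rows + j), ([], []))[1].append(j)
    # All-zero rows or columns form groups with one empty side; drop them.
    return [(rows, cols) for rows, cols in groups.values() if rows and cols]


def blocked_matvec(W, x, blocks):
    """Compute W @ x while touching only the entries inside the blocks."""
    y = [0] * len(W)
    for rows, cols in blocks:
        for i in rows:
            y[i] = sum(W[i][j] * x[j] for j in cols)
    return y


def dense_matvec(W, x):
    """Reference product that visits every entry, zeros included."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


# Non-zeros are interleaved; reordering rows and columns as (0, 2, 1, 3)
# would place them in two dense 2x2 blocks on the diagonal.
W = [
    [1, 0, 2, 0],
    [0, 3, 0, 4],
    [5, 0, 6, 0],
    [0, 7, 0, 8],
]
x = [1, 2, 3, 4]

blocks = find_blocks(W)
stored = sum(len(rows) * len(cols) for rows, cols in blocks)
print(blocks, stored, blocked_matvec(W, x, blocks))
```

In this example the two blocks hold 8 coefficients instead of the 16 in the full matrix, and the blocked product matches the dense one. A singly bordered block diagonal form, as mentioned in the summary, would additionally need a border of rows or columns that couple the blocks; this sketch does not handle that case.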