Deep neural network model compression method and device
Main authors: | , , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | An embodiment of the invention provides a compression method and device for a deep neural network model, in the technical field of computers. The method comprises: calculating a norm corresponding to each model channel in the deep neural network model to be compressed, in the current model training period; cutting (pruning) the model channels according to the norm and the corresponding initialization weight threshold to obtain a cut deep neural network model; judging whether the difference between the model precision and the expected precision of the cut deep neural network model is greater than zero; when the difference is greater than zero, obtaining an adaptive weight threshold for each neural network layer according to the difference and the initialization weight threshold; determining a quantized deep neural network model according to the degree to which the quantization result of each parameter in the cut model influences the loss function; and taking the quantize |
---|
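
The first two steps of the summary (computing a norm per channel, then cutting channels against a weight threshold) can be illustrated with a minimal sketch. It is not the patented procedure: the record does not say which norm is used or how the adaptive threshold is derived, so the L2 norm, the "keep if norm >= threshold" rule, the fallback that retains one channel, and the names `channel_l2_norms` and `prune_channels` are all assumptions made for illustration.

```python
import math


def channel_l2_norms(layer):
    """Return the L2 norm of each channel (assumed norm; the record only says "a norm").

    `layer` is a list of channels, each a flat list of float weights.
    """
    return [math.sqrt(sum(w * w for w in channel)) for channel in layer]


def prune_channels(layer, threshold):
    """Keep the channels whose norm is at least `threshold` (hypothetical rule).

    Returns the kept channels and their original indices.
    Assumes `layer` contains at least one channel.
    """
    norms = channel_l2_norms(layer)
    kept = [i for i, n in enumerate(norms) if n >= threshold]
    if not kept:
        # Assumption: never empty a layer; retain the channel with the largest norm.
        kept = [max(range(len(norms)), key=norms.__getitem__)]
    return [layer[i] for i in kept], kept


# Toy layer with three channels of two weights each.
layer = [[3.0, 4.0], [0.1, 0.0], [0.0, 1.0]]
pruned, kept = prune_channels(layer, threshold=0.5)
```

In this toy example the middle channel falls below the threshold and is dropped. The later steps in the summary (comparing precision with the expected precision, adapting the per-layer threshold, and quantizing) are not sketched here because the record does not give their formulas.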