Deep neural network model compression method and device

Bibliographic details
Main authors: LONG XITIAN, LIU RUI, QIAO LEI, CHI YINGYING, ZHENG ZHE, LI JING, CUI WENPENG, NIE YUHU
Format: Patent
Language: chi; eng
Description
Summary: The embodiment of the invention provides a compression method and device for a deep neural network model and belongs to the technical field of computers. The method comprises: calculating, in the current model training period, the norm corresponding to each model channel in the deep neural network model to be compressed; pruning the model channels according to the norm and the corresponding initialization weight threshold to obtain a pruned deep neural network model; judging whether the difference between the model precision and the expected precision of the pruned deep neural network model is greater than zero; when the difference is greater than zero, obtaining an adaptive weight threshold for each layer of the neural network according to the difference and the initialization weight threshold; determining a quantized deep neural network model according to the degree of influence that the quantization result of each parameter in the pruned model has on the loss function; and taking the quantize
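
The first two steps of the summary (computing a norm per channel and cutting channels against a weight threshold) can be illustrated with a minimal Python sketch. It is not the patented implementation. The record does not say which norm is used or which side of the threshold is removed, so the L1 norm, the rule "keep channels whose norm is at least the threshold", the keep-at-least-one-channel safeguard, and the names `channel_norms` and `prune_channels` are all assumptions made for illustration. The adaptive threshold update and the quantization step are left out because the record gives no formula for them.

```python
from typing import List

Channel = List[float]  # flattened weights of one output channel


def channel_norms(channels: List[Channel], p: int = 1) -> List[float]:
    """Return the Lp norm of each channel's weights (p=1 is an assumption)."""
    return [sum(abs(w) ** p for w in ch) ** (1.0 / p) for ch in channels]


def prune_channels(channels: List[Channel], threshold: float, p: int = 1) -> List[int]:
    """Return the indices of channels whose norm is at least `threshold`.

    Assumption: low-norm channels are the ones removed.
    """
    norms = channel_norms(channels, p)
    kept = [i for i, n in enumerate(norms) if n >= threshold]
    # Safeguard added for this sketch: if every channel falls below the
    # threshold, keep the strongest one so the layer is never empty.
    if not kept and norms:
        kept = [max(range(len(norms)), key=norms.__getitem__)]
    return kept


# Toy layer with three channels; the values are made up for the example.
layer = [[0.5, -0.25, 0.25], [0.0, 0.125, -0.125], [1.0, -1.0, 0.5]]
kept = prune_channels(layer, threshold=0.5)
print(kept)  # [0, 2]
```

Here the channel norms are 1.0, 0.25 and 2.5, so only the middle channel falls below the threshold of 0.5 and is removed.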