Neural network model quantification method
Main authors: | , , |
---|---|
Format: | Patent |
Language: | Chinese ; English |
Subjects: | |
Summary: | An embodiment of the invention provides a neural network model quantization method. The method comprises: obtaining a quantization data set and determining initial parameter sensitivity feature information for each neural network layer according to the quantization data set; generating reference parameter information according to the initial parameter sensitivity feature information corresponding to each neural network layer; quantizing each item of reference parameter information based on at least one quantization strategy to generate at least one neural network layer quantization result for each neural network layer; and determining a neural network model quantization result for the neural network model according to a target storage threshold of a target storage device, the initial volume of the neural network model, and the neural network layer quantization results corresponding to each neural network layer. The parameter information is generated accordin |
---|
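
The steps in the summary can be illustrated with a minimal Python sketch. It is not the patented procedure: the record does not say how sensitivity is measured, which quantization strategies are used, or how the final selection is made. The sketch assumes symmetric uniform quantization at hypothetical bit widths (`BIT_WIDTHS`), uses the mean squared quantization error of a layer's weights as a stand-in for sensitivity, and picks per-layer results greedily until the model fits a storage threshold. All function names are invented for illustration.

```python
from typing import Dict, List

# Hypothetical candidate strategies (bit widths), widest first. Assumption:
# the record does not name the actual quantization strategies.
BIT_WIDTHS = [8, 4, 2]


def quantize(weights: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantization, returned in dequantized form."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def layer_error(weights: List[float], bits: int) -> float:
    """Mean squared quantization error, used here as a sensitivity proxy."""
    quantized = quantize(weights, bits)
    return sum((w - q) ** 2 for w, q in zip(weights, quantized)) / len(weights)


def size_bytes(n_params: int, bits: int) -> float:
    """Storage needed for n_params values at the given bit width."""
    return n_params * bits / 8


def choose_bit_widths(
    layers: Dict[str, List[float]], storage_threshold: float
) -> Dict[str, int]:
    """Greedily narrow layers until the total size fits the threshold."""
    choice = {name: 0 for name in layers}  # index into BIT_WIDTHS per layer

    def total_size() -> float:
        return sum(
            size_bytes(len(w), BIT_WIDTHS[choice[name]])
            for name, w in layers.items()
        )

    while total_size() > storage_threshold:
        best_name, best_cost = None, None
        for name, w in layers.items():
            i = choice[name]
            if i + 1 >= len(BIT_WIDTHS):
                continue  # this layer is already at the narrowest width
            added_error = layer_error(w, BIT_WIDTHS[i + 1]) - layer_error(w, BIT_WIDTHS[i])
            saved = size_bytes(len(w), BIT_WIDTHS[i]) - size_bytes(len(w), BIT_WIDTHS[i + 1])
            cost = added_error / saved  # extra error per byte saved
            if best_cost is None or cost < best_cost:
                best_name, best_cost = name, cost
        if best_name is None:
            raise ValueError("threshold cannot be met with the available bit widths")
        choice[best_name] += 1
    return {name: BIT_WIDTHS[i] for name, i in choice.items()}


if __name__ == "__main__":
    model = {"a": [0.5, -0.25, 0.125, 1.0], "b": [0.0, 0.0, 0.0, 0.0]}
    print(choose_bit_widths(model, storage_threshold=6))
```

In this toy model, layer `b` loses nothing when narrowed, so the greedy rule narrows it first and leaves the more sensitive layer `a` at the wider setting. A fuller version would also compare the model's initial volume against the threshold before quantizing anything, as the summary suggests.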