Neural network model training acceleration method and system based on dynamic precision quantization
Main authors: | , , , , , |
---|---|
Format: | Patent |
Language: | Chinese; English |
Subjects: | |
Abstract: | The invention provides a method and system for accelerating neural network model training based on dynamic precision quantization. The method comprises the following steps: before training of the neural network model begins, the data matrices involved in the computation are logically divided into small blocks; during training, the quantization sensitivity of each block is calculated from the block's quantization range and its corresponding gradient values, and this sensitivity is expressed as the optimal relative quantization bit width between blocks; the currently required average quantization bit width target is determined from the network's current training step count; and the absolute quantization bit width of each block of weight and activation data is determined dynamically by combining the relative quantization bit width, the average quantization bit width target, and preset maximum and minimum computation bit width parameters |
---|
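
The abstract names the steps but gives no formulas, so the following is only a minimal Python sketch of how they might fit together. The sensitivity proxy (log2 of a block's value range times its mean absolute gradient), the linear schedule for the average bit width, the default bit widths, and every function name are assumptions made for illustration; none of them is taken from the patent.

```python
import math


def split_into_blocks(matrix, block_size):
    """Logically partition a 2-D list into block_size x block_size tiles.

    Each tile is returned as a flat list; edge tiles may be smaller.
    """
    rows, cols = len(matrix), len(matrix[0])
    blocks = []
    for r0 in range(0, rows, block_size):
        for c0 in range(0, cols, block_size):
            blocks.append([matrix[r][c]
                           for r in range(r0, min(r0 + block_size, rows))
                           for c in range(c0, min(c0 + block_size, cols))])
    return blocks


def relative_bit_widths(value_blocks, grad_blocks, eps=1e-12):
    """Per-block relative bit width, centred so the offsets average to zero.

    Assumed proxy: log2(value range * mean |gradient|). A wider range or a
    larger gradient makes a block more sensitive, so it gets more bits.
    """
    raw = []
    for vals, grads in zip(value_blocks, grad_blocks):
        value_range = max(vals) - min(vals)
        mean_abs_grad = sum(abs(g) for g in grads) / len(grads)
        raw.append(math.log2(value_range * mean_abs_grad + eps))
    mean_raw = sum(raw) / len(raw)
    return [r - mean_raw for r in raw]


def average_bit_target(step, total_steps, start_bits=4.0, end_bits=8.0):
    """Assumed schedule: the average bit width ramps linearly with the step."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start_bits + frac * (end_bits - start_bits)


def absolute_bit_widths(relative, target, min_bits=2, max_bits=8):
    """Shift relative widths by the target, round, and clamp to the limits."""
    return [min(max(int(round(target + r)), min_bits), max_bits)
            for r in relative]


# Hypothetical 2x4 weight matrix and gradients, tiled into two 2x2 blocks.
weights = [[0.1, -0.2, 4.0, -4.0],
           [0.0, 0.3, 2.0, -1.0]]
grads = [[0.01, -0.01, 0.01, -0.01],
         [0.01, 0.01, -0.01, 0.01]]

rel = relative_bit_widths(split_into_blocks(weights, 2),
                          split_into_blocks(grads, 2))
target = average_bit_target(step=500, total_steps=1000)
print(absolute_bit_widths(rel, target))
```

In this sketch, rounding and clamping can move the realized average away from the target; a fuller implementation would presumably redistribute the difference across blocks that are not at a limit.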