Neural network model training acceleration method and system based on dynamic precision quantization

Bibliographic details
Main authors: LIU YONGPAN, LIU RUOYANG, WEI CHENHAN, YANG HUAZHONG, YANG YIXIONG, WANG WENXUN
Format: Patent
Language: Chinese; English
Description
Summary: The invention provides a method and system for accelerating neural network model training based on dynamic precision quantization. Before training starts, the data matrices involved in the computation are logically divided into small blocks. During training, the quantization sensitivity of each block is calculated from the block's quantization range and the gradient values corresponding to that block; the sensitivity is expressed as the optimal relative quantization bit width between blocks. A target average quantization bit width is determined from the network's current training step count. The absolute quantization bit width of each block of weight and activation data is then determined dynamically by combining the relative quantization bit widths, the average bit width target, and preset maximum and minimum computation bit width parameters
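The steps named in the summary can be illustrated with a minimal Python sketch. The summary does not disclose the actual formulas, so everything concrete here is an assumption made for illustration only: the sensitivity proxy (log2 of value range times mean absolute gradient), the linear schedule for the average bit width target, the default bit limits, and all function names are hypothetical and are not taken from the patent.

```python
import math


def split_into_blocks(matrix, block_rows, block_cols):
    """Logically partition a 2-D list into (r0, r1, c0, c1) index ranges."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [
        (r, min(r + block_rows, n_rows), c, min(c + block_cols, n_cols))
        for r in range(0, n_rows, block_rows)
        for c in range(0, n_cols, block_cols)
    ]


def block_values(matrix, block):
    """Collect the entries of one logical block as a flat list."""
    r0, r1, c0, c1 = block
    return [matrix[r][c] for r in range(r0, r1) for c in range(c0, c1)]


def relative_bit_width(data, grad, eps=1e-12):
    """Assumed sensitivity proxy: log2(value range * mean |gradient|).

    A wider range and larger gradients both suggest that the block
    needs more bits relative to other blocks.
    """
    value_range = max(data) - min(data)
    mean_abs_grad = sum(abs(g) for g in grad) / len(grad)
    return math.log2(value_range * mean_abs_grad + eps)


def average_bit_target(step, total_steps, start_bits=4.0, end_bits=8.0):
    """Assumed schedule: the average bit width grows linearly with the step."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start_bits + frac * (end_bits - start_bits)


def absolute_bit_widths(relative, target, min_bits=2, max_bits=8):
    """Shift relative widths so their mean matches the target, then round
    and clamp each one to the preset [min_bits, max_bits] interval."""
    mean_rel = sum(relative) / len(relative)
    return [
        int(min(max(round(r - mean_rel + target), min_bits), max_bits))
        for r in relative
    ]


# Usage: two 2x2 blocks of a small weight matrix with made-up values.
weights = [[0.1, -0.2, 1.5, -1.0], [0.05, 0.0, 2.0, 0.5]]
grads = [[0.01, 0.02, 0.4, 0.3], [0.01, 0.01, 0.5, 0.2]]
blocks = split_into_blocks(weights, 2, 2)
relative = [
    relative_bit_width(block_values(weights, b), block_values(grads, b))
    for b in blocks
]
bits = absolute_bit_widths(relative, average_bit_target(step=500, total_steps=1000))
```

In this sketch the block with the wider range and larger gradients receives more bits than the other block. Because of the final clamping, the mean of the assigned widths can deviate from the target; how the patented method handles that case is not stated in the summary.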