Hybrid precision quantification method of deep convolutional neural network and related equipment
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | The invention provides a hybrid-precision quantization method for a deep convolutional neural network, and related equipment. The method takes a full-precision deep convolutional neural network model that has been trained on samples and scales the weights and biases of its input layer, each convolutional layer and each fully connected layer from their floating-point representation into weights and biases represented by real numbers; the output values of each layer of the model are quantized correspondingly. The quantized model is then tested under different precision combinations, and the optimal quantization precision combination is selected from the test accuracy results. The method preserves network performance while reducing the memory occupied by the network and increasing inference speed as far as possible.
|
---|
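
The workflow in the summary (scale the weights and biases, quantize each layer's output, test different precision combinations, select one from the accuracy results) can be illustrated with a minimal NumPy sketch. It is not the patented procedure, and everything specific in it is an assumption made for illustration: the helper names `quantize`, `forward`, `accuracy` and `search_precisions`, the symmetric uniform quantizer, the candidate bit widths, the use of a small fully connected network in place of a convolutional one, and the selection rule (fewest total bits whose accuracy stays within `max_drop` of the full-precision baseline).

```python
import itertools
import numpy as np


def quantize(x, bits):
    """Assumed scheme: symmetric uniform quantization to signed `bits`-bit
    integer levels, followed by rescaling back to floats."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()
    scale = max_abs / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale


def forward(x, layers, bits_per_layer=None):
    """Run a small ReLU network; if bit widths are given, quantize each
    layer's weights, biases and output with that layer's bit width."""
    for i, (w, b) in enumerate(layers):
        if bits_per_layer is not None:
            w = quantize(w, bits_per_layer[i])
            b = quantize(b, bits_per_layer[i])
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
        if bits_per_layer is not None:
            x = quantize(x, bits_per_layer[i])
    return x


def accuracy(x, y, layers, bits_per_layer=None):
    """Fraction of samples whose arg-max prediction matches the label."""
    pred = np.argmax(forward(x, layers, bits_per_layer), axis=1)
    return float(np.mean(pred == y))


def search_precisions(x, y, layers, candidates=(4, 8, 16), max_drop=0.02):
    """Test every per-layer bit-width combination and keep the one with the
    fewest total bits whose accuracy drop is at most `max_drop`
    (ties broken by higher accuracy). Returns (baseline, best_combo, best_acc)."""
    baseline = accuracy(x, y, layers)
    best_key, best_combo, best_acc = None, None, None
    for combo in itertools.product(candidates, repeat=len(layers)):
        acc = accuracy(x, y, layers, combo)
        if baseline - acc > max_drop:
            continue
        key = (sum(combo), -acc)
        if best_key is None or key < best_key:
            best_key, best_combo, best_acc = key, combo, acc
    return baseline, best_combo, best_acc


# Toy data: random weights stand in for a trained model, and the labels are
# the full-precision model's own predictions, so the baseline accuracy is 1.0.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(8, 16)), rng.normal(size=16)),
    (rng.normal(size=(16, 16)), rng.normal(size=16)),
    (rng.normal(size=(16, 4)), rng.normal(size=4)),
]
x_test = rng.normal(size=(200, 8))
y_test = np.argmax(forward(x_test, layers), axis=1)

baseline, best_combo, best_acc = search_precisions(x_test, y_test, layers)
print("baseline accuracy:", baseline)
print("selected bit widths per layer:", best_combo, "accuracy:", best_acc)
```

The exhaustive search grows exponentially with the number of layers, so it is only practical here because the toy network has three layers and three candidate bit widths.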