MXQN: Mixed quantization for reducing bit-width of weights and activations in deep convolutional neural networks
| Published in: | Applied Intelligence (Dordrecht, Netherlands), 2021-07, Vol. 51 (7), p. 4561-4574 |
|---|---|
| Main authors: | , , |
| Format: | Article |
| Language: | English |
| Subjects: | |
| Online access: | Full text |
Abstract: Quantization, which involves bit-width reduction, is considered one of the most effective approaches for rapidly and energy-efficiently deploying deep convolutional neural networks (DCNNs) on resource-constrained embedded hardware. However, reducing the bit-width of the weights and activations of DCNNs seriously degrades accuracy. To solve this problem, in this paper we propose a mixed hardware-friendly quantization (MXQN) method that applies fixed-point and logarithmic quantization to DCNNs without the need to retrain or fine-tune the network. Our MXQN algorithm is a multi-stage process: first, we employ the signal-to-quantization-noise ratio (SQNR) as the metric to estimate the interplay between the quantization errors of each layer's parameters and the overall model prediction accuracy. Then, we quantize the weights with fixed-point quantization and, depending on the SQNR metric, empirically select either logarithmic or fixed-point quantization for the activations. For improved accuracy, we propose an optimized logarithmic quantization scheme that affords a fine-grained step size. We evaluate the performance of MXQN using the VGG16 network on the MNIST, CIFAR-10, CIFAR-100, and ImageNet datasets, as well as the VGG19 and ResNet (ResNet18, ResNet34, ResNet50) networks on ImageNet, and demonstrate that the MXQN-quantized DCNN, despite not being retrained or fine-tuned, still achieves accuracy close to that of the original DCNN.
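To make the quantization schemes named in the abstract concrete, the following is a minimal, illustrative sketch rather than the authors' implementation: it quantizes a tensor with symmetric fixed-point quantization and with plain base-2 logarithmic quantization, and computes the SQNR in dB for each, the metric MXQN uses to judge per-layer quantization error. The paper's optimized logarithmic scheme with a fine-grained step size is not reproduced here; the function names, bit-widths, and exponent clipping are assumptions made for illustration.

```python
import numpy as np

def fixed_point_quantize(x, bits=8):
    """Symmetric uniform (fixed-point) quantization to the given bit-width."""
    scale = max(np.max(np.abs(x)), 1e-12) / (2 ** (bits - 1) - 1)
    q = np.clip(np.round(x / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

def log_quantize(x, bits=8):
    """Plain base-2 logarithmic quantization: round each value to a signed power of two."""
    sign = np.sign(x)
    mag = np.abs(x)
    exp = np.round(np.log2(np.where(mag > 0, mag, 1.0)))
    # Restrict exponents to a window below the largest one, mimicking a
    # limited exponent code-book; exact zeros stay zero.
    max_exp = np.max(exp)
    exp = np.clip(exp, max_exp - (2 ** (bits - 1) - 1), max_exp)
    return sign * np.where(mag > 0, 2.0 ** exp, 0.0)

def sqnr_db(x, x_q):
    """Signal-to-quantization-noise ratio in dB: 10*log10(signal power / noise power)."""
    noise_power = np.sum((x - x_q) ** 2)
    return 10.0 * np.log10(np.sum(x ** 2) / (noise_power + 1e-12))

# Example: compare the two schemes on a synthetic activation tensor.
rng = np.random.default_rng(0)
acts = rng.standard_normal(10_000).astype(np.float32)
print("fixed-point SQNR (dB):", sqnr_db(acts, fixed_point_quantize(acts, bits=8)))
print("logarithmic SQNR (dB):", sqnr_db(acts, log_quantize(acts, bits=8)))
```

In this sketch, a higher SQNR for one scheme on a given tensor would be the signal to prefer that scheme for the corresponding layer's activations, which mirrors the role SQNR plays in the abstract's per-layer selection step.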
| ISSN: | 0924-669X, 1573-7497 |
|---|---|
| DOI: | 10.1007/s10489-020-02109-0 |