Learning Bilateral Clipping Parametric Activation for Low-Bit Neural Networks

Bibliographic Details
Published in: Mathematics (Basel) 2023-04, Vol. 11 (9), p. 2001
Main authors: Ding, Yunlong; Chen, Di-Rong
Format: Article
Language: English
Abstract: Among various network compression methods, network quantization has developed rapidly due to its superior compression performance. However, naive activation quantization schemes limit its compression performance. Most conventional activation quantization methods quantize models directly with rectified activation functions, yet their unbounded outputs generally cause drastic accuracy degradation. To tackle this problem, we propose a comprehensive activation quantization technique, the Bilateral Clipping Parametric Rectified Linear Unit (BCPReLU), as a generalized version of all rectified activation functions, which limits the quantization range more flexibly during training. Specifically, trainable slopes and thresholds are introduced for both positive and negative inputs to find more flexible quantization scales. We theoretically demonstrate that BCPReLU has approximately the same expressive power as its unbounded counterpart and establish its convergence in low-bit quantization networks. Extensive experiments on a variety of datasets and network architectures demonstrate the effectiveness of our trainable clipping activation function.
ISSN: 2227-7390
DOI: 10.3390/math11092001
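
The abstract describes BCPReLU as introducing trainable slopes and clipping thresholds on both the positive and negative sides of the input, so that the activation range stays bounded during training. The following is a minimal PyTorch sketch of one such bilateral clipping parametric activation, written from the abstract alone; the class name BCPReLUSketch, the parameters alpha_p, alpha_n, beta_p, beta_n, and their default values are illustrative assumptions, not the paper's actual formulation.

import torch
import torch.nn as nn

class BCPReLUSketch(nn.Module):
    """Toy bilateral clipping parametric activation.

    Hypothetical parameterization inferred from the abstract only:
    trainable slopes (alpha_p, alpha_n) and clipping thresholds
    (beta_p, beta_n) for the positive and negative input ranges.
    The paper's exact formulation and initialization may differ.
    """

    def __init__(self, alpha_p=1.0, alpha_n=0.25, beta_p=6.0, beta_n=-6.0):
        super().__init__()
        self.alpha_p = nn.Parameter(torch.tensor(float(alpha_p)))
        self.alpha_n = nn.Parameter(torch.tensor(float(alpha_n)))
        self.beta_p = nn.Parameter(torch.tensor(float(beta_p)))
        self.beta_n = nn.Parameter(torch.tensor(float(beta_n)))

    def forward(self, x):
        # Positive inputs: slope alpha_p, output clipped at alpha_p * beta_p.
        pos = self.alpha_p * torch.minimum(x.clamp(min=0.0), self.beta_p)
        # Negative inputs: slope alpha_n, output clipped at alpha_n * beta_n.
        neg = self.alpha_n * torch.maximum(x.clamp(max=0.0), self.beta_n)
        return pos + neg

if __name__ == "__main__":
    act = BCPReLUSketch()
    x = torch.linspace(-10, 10, 9)
    print(act(x))  # bounded on both sides, unlike a plain ReLU or PReLU

Under this sketch, setting alpha_n = 0 and taking beta_p large recovers a plain (clipped) ReLU, while letting beta_p and |beta_n| grow recovers PReLU-like behavior, which is consistent with the abstract's claim that BCPReLU generalizes the rectified activation functions while keeping the quantization range bounded.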