MixQuantBio: Towards extreme face and periocular recognition model compression with mixed-precision quantization


Bibliographic details
Published in: Engineering Applications of Artificial Intelligence, 2024-11, Vol. 137, Article 109114
Authors: Kolf, Jan Niklas; Elliesen, Jurek; Damer, Naser; Boutros, Fadi
Format: Article
Language: English
Online access: Full text
Description
Abstract: Current periocular and face recognition approaches utilize computationally costly deep neural networks, achieving notable recognition accuracies. Deploying such solutions in applications with limited computational resources requires minimizing their computational demand while maintaining similar recognition accuracies. Model compression techniques like model quantization can be used to reduce the computational costs of deep models. This approach is widely studied and applied to different machine-learning tasks; however, it remains understudied for biometrics. In this work, we propose to reduce the computational cost of face and periocular recognition models using fixed- and mixed-precision model quantization. Specifically, we first quantize the full-precision models to fixed 8- and 6-bit precision, reducing the required memory footprint by 5x while maintaining, to a very large degree, the recognition accuracies. However, our results demonstrated that quantizing the models to extremely low b-bit precision, e.g., below 6 bits, significantly drops the accuracies, which motivated our investigation of mixed-precision quantization. Hence, we propose an iterative mixed-precision quantization scheme. In each iteration, the least important parameters are selected based on their weight magnitude and quantized to low b-bit precision, and the model is fine-tuned. This process is repeated until all parameters are quantized to low b-bit precision, achieving an extreme reduction in memory footprint, e.g., 16x, without significant loss in model accuracy. The effectiveness of mixed- and fixed-precision quantization for biometric recognition models is studied and demonstrated for two modalities, face and periocular, using three different deep network architectures and different b-bit precisions.
Highlights:
• In-depth investigation of quantization for face and periocular recognition models.
• Detailed investigation of mixed- and fixed-precision quantization.
• Proposal to quantize face and periocular recognition models with mixed precision.
• Extensive evaluation using different low b-bit precisions and network architectures.
• Successful reduction of the memory footprint by up to 16x with minimal performance loss.
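The iterative mixed-precision scheme described in the abstract (select the lowest-magnitude parameters, quantize them to low bit width, fine-tune, repeat) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the symmetric uniform quantizer, the fraction quantized per iteration, and all function names are assumptions.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight array to the given bit width."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels for 4 bits
    scale = np.max(np.abs(w)) / levels
    if scale == 0:
        return w.copy()
    return np.round(w / scale) * scale

def iterative_mixed_precision(weights, low_bits=4, frac_per_iter=0.25, fine_tune=None):
    """Iteratively quantize the lowest-magnitude still-full-precision weights
    to low_bits until all parameters are quantized (hypothetical sketch)."""
    w = weights.copy()
    quantized = np.zeros(w.shape, dtype=bool)
    while not quantized.all():
        # indices of parameters still in full precision
        remaining = np.flatnonzero(~quantized)
        # quantize a fixed fraction of all parameters per iteration
        k = max(1, int(frac_per_iter * w.size))
        # least important = smallest weight magnitude, per the abstract
        order = remaining[np.argsort(np.abs(w.flat[remaining]))]
        chosen = order[:k]
        w.flat[chosen] = quantize_uniform(w.flat[chosen], low_bits)
        quantized.flat[chosen] = True
        if fine_tune is not None:
            # in the paper this is a training step updating the
            # not-yet-quantized parameters; here it is a pluggable callback
            w = fine_tune(w, quantized)
    return w

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8)).astype(np.float32)
wq = iterative_mixed_precision(w, low_bits=4)
```

In a real training pipeline the `fine_tune` callback would run gradient updates on the remaining full-precision parameters so the model can compensate for the quantization error introduced in each round.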
ISSN: 0952-1976
DOI: 10.1016/j.engappai.2024.109114