Quantized hashing: enabling resource-efficient deep learning models at the edge

Bibliographic Details
Published in: International journal of information technology (Singapore. Online) 2024-04, Vol. 16 (4), pp. 2353-2361
Main authors: Nazir, Azra; Mir, Roohie Naaz; Qureshi, Shaima
Format: Article
Language: English
Online access: Full text
Description
Abstract: Edge computing is a strong enabler for latency-critical applications: it brings processing closer to the end user and provides a secure platform for the enormous data generated by billions of IoT devices, the vast majority of which are user-centric. Neural networks are well suited to processing this massive data, but their computational budget has always been a concern; model file size and floating-point operation have restricted the adoption of neural networks on edge devices. To address these issues, we explore the quantization of models compressed using the hashing trick, namely HashedNets and FreshNets. The experimental evaluation suggests that quantizing an already-trained HashedNet decreases the model's accuracy significantly. Quantization-aware training, on the other hand, is more complex, but it leads to significant memory and computational savings: while the memory footprint is approximately half that of the original HashedNet, the accuracy drops by 10%. For FreshNets, multiple priority classes with different precisions are employed instead of uniform quantization, resulting in a 33.5% reduction in model size with a 6% decrease in model accuracy.
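To make the combination of techniques in the abstract concrete, the following is a minimal, hypothetical sketch of HashedNets-style weight sharing (a hash function maps each position of a large virtual weight matrix to a small bucket of shared real weights) combined with simple uniform post-training quantization of that bucket. Function names, sizes, and the quantization scheme are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: HashedNets-style hashing trick plus a simple
# uniform quantizer applied to the shared weight bucket. All names and
# parameters here are assumptions for demonstration.
import numpy as np

def hashed_weight_matrix(real_weights, n_in, n_out, seed=0):
    """Expand a small vector of shared real weights into a virtual
    n_in x n_out matrix: entry (i, j) is chosen by hashing (i, j)
    into an index of the shared bucket (the 'hashing trick')."""
    W = np.empty((n_in, n_out))
    for i in range(n_in):
        for j in range(n_out):
            W[i, j] = real_weights[hash((seed, i, j)) % len(real_weights)]
    return W

def quantize_uniform(w, bits=8):
    """Uniform affine quantization: map w onto 2**bits evenly spaced
    levels between its min and max, then dequantize for inference."""
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / (2**bits - 1) or 1.0  # guard against hi == lo
    q = np.round((w - lo) / scale)
    return q * scale + lo

# 64 shared weights, quantized to 4 bits, backing a virtual 256x128 layer.
real = np.random.default_rng(0).normal(size=64)
W = hashed_weight_matrix(quantize_uniform(real, bits=4), 256, 128)
```

Because every virtual entry is drawn from the quantized bucket, the expanded matrix contains at most 2**4 = 16 distinct values while presenting the full 256x128 shape to the layer, which is the source of the memory savings the abstract describes.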
ISSN: 2511-2104, 2511-2112
DOI: 10.1007/s41870-024-01767-4