Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems

As the complexity of Deep Neural Network (DNN) models increases, their deployment on mobile devices becomes increasingly challenging, especially in complex vision tasks such as image classification. Many of recent contributions aim either to produce compact models matching the limited computing capa...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE access 2020, Vol.8, p.212177-212193
Hauptverfasser: Matsubara, Yoshitomo, Callegaro, Davide, Baidya, Sabur, Levorato, Marco, Singh, Sameer
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:As the complexity of Deep Neural Network (DNN) models increases, their deployment on mobile devices becomes increasingly challenging, especially in complex vision tasks such as image classification. Many of recent contributions aim either to produce compact models matching the limited computing capabilities of mobile devices or to offload the execution of such burdensome models to a compute-capable device at the network edge - the edge servers. In this paper, we propose to modify the structure and training process of DNN models for complex image classification tasks to achieve in-network compression in the early network layers. Our training process stems from knowledge distillation, a technique that has been traditionally used to build small - student - models mimicking the output of larger - teacher - models. Here, we adopt this idea to obtain aggressive compression while preserving accuracy. Our results demonstrate that our approach is effective for state-of-the-art models trained over complex datasets, and can extend the parameter region in which edge computing is a viable and advantageous option. Additionally, we demonstrate that in many settings of practical interest we reduce the inference time with respect to specialized models such as MobileNet v2 executed at the mobile device, while improving accuracy.
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2020.3039714