Synthetic data generation method for data-free knowledge distillation in regression neural networks

Bibliographic details
Published in: Expert Systems with Applications, 2023-10, Vol. 227, p. 120327, Article 120327
Main authors: Zhou, Tianxun; Chiam, Keng-Hwee
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Abstract: Knowledge distillation is the technique of compressing a larger neural network, known as the teacher, into a smaller neural network, known as the student, while maintaining the performance of the larger network as much as possible. Existing methods of knowledge distillation are mostly applicable to classification tasks, and many of them also require access to the data used to train the teacher model. To address the problem of knowledge distillation for regression tasks in the absence of the original training data, an existing method uses a generator model trained adversarially against the student model to generate synthetic data for training the student. In this study, we propose a new synthetic data generation strategy that directly optimizes for a large but bounded difference between the student and teacher model. Our results on benchmark experiments demonstrate that the proposed strategy allows the student model to learn better and emulate the performance of the teacher model more closely.
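The abstract only sketches the approach. The following is a minimal, hypothetical PyTorch illustration of the idea it describes: synthetic inputs are optimized directly so that the student-teacher output difference becomes large but stays bounded, and the student is then fitted to the teacher's predictions on those inputs. The network sizes, the bound `delta`, the optimizer settings, and the helper names (`make_mlp`, `generate_batch`) are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

def make_mlp(in_dim, hidden, out_dim):
    # Small fully connected regression networks (sizes are assumptions).
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

teacher = make_mlp(8, 64, 1)   # stand-in for a pretrained teacher model
student = make_mlp(8, 16, 1)
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)    # teacher is fixed; gradients are only needed w.r.t. inputs

delta = 1.0  # assumed bound on the student-teacher discrepancy

def generate_batch(batch_size=128, in_dim=8, steps=50, lr=0.05):
    """Optimize synthetic inputs toward a large but bounded teacher-student gap."""
    x = torch.randn(batch_size, in_dim, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        gap = (student(x) - teacher(x)).abs()
        # Maximize the gap but cap it at `delta`, so the generated points expose
        # student errors without drifting arbitrarily far from useful regions.
        loss = -torch.clamp(gap, max=delta).mean()
        loss.backward()
        opt.step()
    return x.detach()

# Distillation loop: fit the student to the teacher's outputs on synthetic data.
student_opt = torch.optim.Adam(student.parameters(), lr=1e-3)
mse = nn.MSELoss()
for _ in range(200):
    x_syn = generate_batch()
    student_opt.zero_grad()
    loss = mse(student(x_syn), teacher(x_syn))
    loss.backward()
    student_opt.step()
```

In this sketch, clamping the discrepancy at `delta` is what keeps the objective "large but bounded": inputs are pushed toward regions where the student disagrees with the teacher, but the reward saturates once the gap reaches the bound.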
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2023.120327