Student network training method for knowledge distillation in mixed learning


Detailed Description

Bibliographic details
Main authors: WANG RUI, SUN SHANGQUAN, XIAO YAN, CAO XIAOCHUN, REN WENQI
Format: Patent
Language: Chinese; English
Keywords:
Online access: order full text
Description
Abstract: The invention discloses a student network training method for knowledge distillation in mixed learning. The method comprises the following steps: 1) selecting training samples of a target domain from the training data; 2) preprocessing the training samples and inputting the preprocessed samples into a student network and a teacher network, respectively, to obtain the corresponding student network logits and teacher network logits; 3) applying Z-score standardization to each student network logit and each teacher network logit; 4) converting the Z-score-standardized teacher network logits and student network logits into probability form; and 5) randomly selecting a probability corresponding to a teacher network logit and a probability corresponding to a student network logit, computing the KL divergence between the two selected probabilities as the loss function, and performing gradient descent to optimize and distill the student network. According to the invention, th
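The distillation loss described in steps 3–5 can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the function names are illustrative, batched logits of shape (batch, classes) are assumed, and the random pairing of step 5 is omitted in favor of computing the loss for aligned teacher/student pairs.

```python
import numpy as np

def z_score(logits, eps=1e-8):
    # Step 3: standardize each logit vector to zero mean, unit variance.
    mu = logits.mean(axis=-1, keepdims=True)
    sigma = logits.std(axis=-1, keepdims=True)
    return (logits - mu) / (sigma + eps)

def softmax(x):
    # Step 4: convert standardized logits to probability form.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits):
    # Step 5: KL divergence between teacher and student probabilities,
    # both computed from Z-score-standardized logits, averaged over the batch.
    p_t = softmax(z_score(teacher_logits))
    p_s = softmax(z_score(student_logits))
    return float(np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)))
```

In training, this scalar would be minimized by gradient descent on the student's parameters (with the teacher frozen); Z-score standardization removes per-sample scale and shift differences between the two networks' logits before the probabilities are compared.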