Student network training method for knowledge distillation in mixed learning


Detailed Description

Bibliographic details
Main authors: WANG RUI, SUN SHANGQUAN, XIAO YAN, CAO XIAOCHUN, REN WENQI
Format: Patent
Language: Chinese; English
Keywords:
Online access: order full text
Description
Abstract: The invention discloses a student network training method for knowledge distillation in mixed learning. The method comprises the following steps: 1) selecting training samples of a target domain from the training data; 2) preprocessing the training samples and inputting the preprocessed samples into a student network and a teacher network, respectively, to obtain the corresponding student network logits and teacher network logits; 3) applying Z-score standardization to each student network logit and each teacher network logit; 4) converting the Z-score-standardized teacher network logits and student network logits into probability form; and 5) randomly selecting a probability corresponding to a teacher network logit and a probability corresponding to a student network logit, computing the KL divergence between the two selected probabilities as the loss function, and performing gradient descent to optimize and distill the student network. According to the invention, th
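The distillation loss described in steps 3–5 can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the function names are illustrative, batched logits of shape (batch, classes) are assumed, and the random pairing of step 5 is omitted in favor of computing the loss for aligned teacher/student pairs.

```python
import numpy as np

def z_score(logits, eps=1e-8):
    # Step 3: standardize each logit vector to zero mean, unit variance.
    mu = logits.mean(axis=-1, keepdims=True)
    sigma = logits.std(axis=-1, keepdims=True)
    return (logits - mu) / (sigma + eps)

def softmax(x):
    # Step 4: convert standardized logits to probability form.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits):
    # Step 5: KL divergence between teacher and student probabilities,
    # both computed from Z-score-standardized logits, averaged over the batch.
    p_t = softmax(z_score(teacher_logits))
    p_s = softmax(z_score(student_logits))
    return float(np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)))
```

In training, this scalar would be minimized by gradient descent on the student's parameters (with the teacher frozen); Z-score standardization removes per-sample scale and shift differences between the two networks' logits before the probabilities are compared.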