Student network training method for knowledge distillation in mixed learning
Format: Patent
Language: Chinese; English
Abstract: The invention discloses a student network training method for knowledge distillation in mixed learning. The method comprises the following steps: 1) selecting training samples of a target domain from the training data; 2) preprocessing the training samples and inputting them into a student network and a teacher network respectively, to obtain the corresponding student network logits and teacher network logits; 3) applying Z-score standardization to each student network logit and each teacher network logit; 4) converting the standardized teacher network logits and student network logits into probability form; and 5) randomly selecting a probability corresponding to a teacher network logit and a probability corresponding to a student network logit, computing the KL divergence between the two selected probabilities as the loss function, and performing gradient descent to optimize and distill the student network. According to the invention, th
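Steps 3) through 5) of the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of NumPy, and the omission of the sampling and gradient-descent machinery of steps 1), 2), and 5) are all assumptions made for clarity.

```python
import numpy as np

def zscore(logits, eps=1e-6):
    # Step 3: Z-score standardization per sample
    # (zero mean, unit variance across the class dimension)
    mean = logits.mean(axis=-1, keepdims=True)
    std = logits.std(axis=-1, keepdims=True)
    return (logits - mean) / (std + eps)

def softmax(z):
    # Step 4: convert standardized logits into probability form
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits):
    # Step 5 (loss only): KL divergence between the teacher's and the
    # student's probability distributions, averaged over the batch.
    p_teacher = softmax(zscore(teacher_logits))
    p_student = softmax(zscore(student_logits))
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return kl.mean()
```

In a training loop this scalar loss would be minimized by gradient descent on the student network's parameters; the Z-score step removes the scale and offset of each network's logits before the distributions are compared.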