MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition
Main Authors:
Format: Article
Language: English
Subjects:
Online Access: Order full text
Abstract: Unlike conventional Knowledge Distillation (KD), Self-KD allows a network to learn knowledge from itself without any guidance from extra networks. This paper proposes to perform Self-KD from image Mixture (MixSKD), which integrates these two techniques into a unified framework. MixSKD mutually distills feature maps and probability distributions between a random pair of original images and their mixup image in a meaningful way. It thereby guides the network to learn cross-image knowledge by modelling supervisory signals from mixup images. Moreover, we construct a self-teacher network by aggregating multi-stage feature maps to provide soft labels that supervise the backbone classifier, further improving the efficacy of self-boosting. Experiments on image classification and on transfer learning to object detection and semantic segmentation demonstrate that MixSKD outperforms other state-of-the-art Self-KD and data augmentation methods. The code is available at https://github.com/winycg/Self-KD-Lib.
DOI: 10.48550/arxiv.2208.05768
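To make the core idea in the abstract concrete, the sketch below shows, in PyTorch, one way a mixup-based self-distillation loss could look: the prediction on a mixup image is pulled toward the convex combination of the (detached) predictions on the two original images, alongside the usual mixup cross-entropy. This is a minimal, hypothetical illustration, not the authors' implementation (see the linked repository); it omits MixSKD's feature-map distillation and multi-stage self-teacher, and the function name, hyperparameters, and loss weighting are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mixup_self_distill_loss(model, x1, x2, y1, y2, alpha=0.2, temperature=3.0):
    # Hypothetical sketch of mixup-based self-distillation (not the official MixSKD code).
    # Sample a mixing coefficient and build the mixup image (standard mixup).
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x1 + (1.0 - lam) * x2

    # Forward the two original images and their mixup image through the same network.
    logits1, logits2, logits_mix = model(x1), model(x2), model(x_mix)

    # Supervised term: the usual mixup cross-entropy on the mixed image.
    ce = lam * F.cross_entropy(logits_mix, y1) + (1.0 - lam) * F.cross_entropy(logits_mix, y2)

    # Soft target: convex combination of the (detached) predictions on the original pair.
    soft_target = lam * F.softmax(logits1.detach() / temperature, dim=1) \
                + (1.0 - lam) * F.softmax(logits2.detach() / temperature, dim=1)

    # Self-distillation term: align the mixup prediction with the mixed soft target.
    kd = F.kl_div(F.log_softmax(logits_mix / temperature, dim=1),
                  soft_target, reduction="batchmean") * (temperature ** 2)

    return ce + kd
```

In use, `model` would be any classifier returning logits, and `(x1, y1)`, `(x2, y2)` a randomly paired mini-batch and its shuffled copy; the loss is then backpropagated as usual, so no extra teacher network is required.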