Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation
Saved in:
Main authors:
Format: Article
Language: English
Keywords:
Online access: Order full text
Abstract: Knowledge distillation is a method of transferring knowledge from a
pretrained complex teacher model to a student model, so that a smaller network
can replace a large teacher network at the deployment stage. To reduce the
necessity of training a large teacher model, recent literature introduced
self-knowledge distillation, which trains a student network progressively to
distill its own knowledge without a pretrained teacher network. Self-knowledge
distillation is largely divided into a data augmentation based approach and an
auxiliary network based approach. The data augmentation approach loses local
information in the augmentation process, which hinders its applicability to
diverse vision tasks such as semantic segmentation. Moreover, these knowledge
distillation approaches do not exploit refined feature maps, which are
prevalent in the object detection and semantic segmentation communities. This
paper proposes a novel self-knowledge distillation method, Feature Refinement
via Self-Knowledge Distillation (FRSKD), which utilizes an auxiliary
self-teacher network to transfer refined knowledge to the classifier network.
The proposed method, FRSKD, can utilize both soft label and feature-map
distillation for self-knowledge distillation. Therefore, FRSKD can be applied
to classification and to semantic segmentation, which emphasizes preserving
local information. We demonstrate the effectiveness of FRSKD by enumerating its
performance improvements on diverse tasks and benchmark datasets. The
implemented code is available at https://github.com/MingiJi/FRSKD.
DOI: 10.48550/arxiv.2103.08273
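
The abstract notes that FRSKD supervises the classifier with two signals from the auxiliary self-teacher: soft labels and refined feature maps. Below is a minimal sketch of such a combined objective in PyTorch. It is not the paper's exact loss; the function name, the temperature `T`, and the weights `alpha` and `beta` are illustrative assumptions, not values from FRSKD.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits,
                           student_feats, teacher_feats,
                           labels, T=4.0, alpha=1.0, beta=1.0):
    """Sketch of a combined self-knowledge distillation objective:
    cross-entropy on ground truth, KL divergence on softened logits
    (soft-label distillation), and an L2 term on intermediate feature
    maps (feature-map distillation). Hyperparameters are illustrative."""
    # Standard supervised loss on the classifier's own predictions.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label distillation: match the self-teacher's softened output.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    # Feature-map distillation: pull the classifier's features toward the
    # refined features produced by the auxiliary self-teacher network.
    feat = sum(F.mse_loss(s, t.detach())
               for s, t in zip(student_feats, teacher_feats))

    return ce + alpha * kd + beta * feat
```

Detaching the self-teacher outputs keeps the gradient flowing only into the classifier branch for these terms; in practice the self-teacher itself is also trained, and how its refined features are produced and supervised is specific to the paper.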