Balanced self-distillation for long-tailed recognition

Bibliographic Details
Published in: Knowledge-Based Systems, 2024-04, Vol. 290, p. 111504, Article 111504
Main Authors: Ren, Ning; Li, Xiaosong; Wu, Yanxia; Fu, Yan
Format: Article
Language: English
Online Access: Full text
Description
Abstract: In long-tailed recognition tasks, knowledge distillation is widely adopted to improve the performance of deep neural networks. These methods distill knowledge from a pretrained teacher model to a student model, enabling higher long-tailed recognition accuracy. However, the dependence on accompanying assistive models complicates the training of the single network and incurs large memory and time costs. In this work, we present Balanced Self-Distillation (BSD), which distills tail knowledge within a single network, without assistive models. Specifically, BSD distills knowledge between different distortions of the same samples to stimulate the representation-learning potential of the single network, and adopts a balanced class weight to shift the distillation focus from head classes to tail classes. Comprehensive experiments across diverse datasets, including CIFAR-10-LT, CIFAR-100-LT, and TinyImageNet-LT, show that BSD consistently outperforms strong baseline methods. In particular, BSD improves accuracy by 8.13% on CIFAR-100-LT with an imbalance ratio of 100 compared to the cross-entropy baseline. Furthermore, the proposed method integrates seamlessly with contemporary techniques such as re-sampling, meta-learning, and cost-sensitive learning, making it a versatile tool for addressing the challenges of long-tailed scenarios.
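
As a rough illustration of the mechanism the abstract describes (not the authors' released implementation), the PyTorch sketch below shows one plausible form of such a loss: two distortions (augmentations) of the same batch pass through a single network, a temperature-softened KL term distills between their predictions, and per-class weights inversely proportional to class frequency shift the distillation focus toward tail classes. All names here (balanced_self_distillation_loss, class_counts, the alpha and temperature defaults) are hypothetical placeholders, not taken from the paper.

import torch
import torch.nn.functional as F

def balanced_self_distillation_loss(logits_a, logits_b, targets,
                                    class_counts, temperature=4.0, alpha=1.0):
    # logits_a, logits_b: (batch, num_classes) predictions for two
    #   distortions of the same samples, from the same network.
    # targets: (batch,) ground-truth labels.
    # class_counts: (num_classes,) number of training samples per class.

    # Standard classification loss on the first view.
    ce = F.cross_entropy(logits_a, targets)

    # Balanced per-class weights: inversely proportional to class
    # frequency, normalized to mean 1, so tail-class samples dominate
    # the distillation term.
    w = 1.0 / class_counts.float()
    w = w / w.sum() * class_counts.numel()
    sample_w = w[targets]  # (batch,)

    # Self-distillation between the two views: the second view's
    # detached, temperature-softened prediction teaches the first.
    t = temperature
    p_teacher = F.softmax(logits_b.detach() / t, dim=1)
    log_p_student = F.log_softmax(logits_a / t, dim=1)
    kl = F.kl_div(log_p_student, p_teacher, reduction='none').sum(dim=1)
    distill = (sample_w * kl * t * t).mean()

    return ce + alpha * distill

# Usage sketch (model, aug_a, aug_b, images, labels, counts are placeholders):
# logits_a = model(aug_a(images))
# logits_b = model(aug_b(images))
# loss = balanced_self_distillation_loss(logits_a, logits_b, labels, counts)

Under these assumptions, no separate teacher network is trained or stored: one model provides both sides of the distillation, and the class-frequency weights realize the head-to-tail shift of focus that the abstract mentions.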
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2024.111504