A non-negative feedback self-distillation method for salient object detection

Bibliographic Details
Published in: PeerJ Computer Science 2023-06, Vol. 9, p. e1435, Article e1435
Main Authors: Chen, Lei, Cao, Tieyong, Zheng, Yunfei, Yang, Jibin, Wang, Yang, Wang, Yekui, Zhang, Bo
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Self-distillation methods use a Kullback-Leibler (KL) divergence loss to transfer knowledge within the network itself, improving model performance without increasing computational resources or complexity. However, when applied to salient object detection (SOD), KL struggles to transfer knowledge effectively. To improve SOD model performance without additional computational cost, a non-negative feedback self-distillation method is proposed. First, a virtual teacher self-distillation method is introduced to enhance model generalization; it achieves good results on pixel-wise classification tasks but yields smaller gains in SOD. Second, to understand the behavior of the self-distillation loss, the gradient directions of the KL and cross-entropy (CE) losses are analyzed, revealing that in SOD the KL loss can produce gradients pointing in the direction opposite to those of CE. Finally, a non-negative feedback loss is proposed for SOD that computes the distillation loss for the foreground and the background in different ways, ensuring that the teacher network transfers only positive knowledge to the student. Experiments on five datasets show that the proposed self-distillation methods effectively improve the performance of SOD models, with an average gain of about 2.7% over the baseline network.
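The abstract does not give the loss in closed form, so the following is only a minimal sketch of the idea, assuming a per-pixel Bernoulli (foreground/background) formulation, a label-smoothing-style virtual teacher, and a gradient-agreement mask as the non-negative feedback mechanism. The names virtual_teacher and nnf_distill_loss and the parameter alpha are hypothetical, and the paper's exact formulation may differ.

import torch

def virtual_teacher(gt, alpha=0.9):
    # Soft per-pixel foreground probability built from the binary mask:
    # the true class receives probability alpha (a label-smoothing teacher).
    return gt * alpha + (1.0 - gt) * (1.0 - alpha)

def nnf_distill_loss(student_logits, gt, alpha=0.9, eps=1e-7):
    # student_logits: raw logits, shape (B, 1, H, W)
    # gt:             binary saliency mask, same shape, values in {0, 1}
    p = torch.sigmoid(student_logits)  # student foreground probability
    t = virtual_teacher(gt, alpha)     # virtual-teacher foreground probability

    # Per-pixel Bernoulli KL(teacher || student).
    kl = t * torch.log((t + eps) / (p + eps)) \
        + (1.0 - t) * torch.log((1.0 - t + eps) / (1.0 - p + eps))

    # The KL gradient w.r.t. the logit is (p - t), while the CE gradient is
    # (p - gt); they point in opposite directions exactly where the student
    # is already more confident than the soft teacher on the true class.
    # Masking those pixels out keeps only CE-consistent ("positive") feedback.
    fg, bg = gt, 1.0 - gt
    keep = fg * (p < t).float() + bg * (p > t).float()

    # Foreground and background are averaged separately, mirroring the
    # abstract's separate treatment of the two regions.
    fg_loss = (kl * keep * fg).sum() / (fg.sum() + eps)
    bg_loss = (kl * keep * bg).sum() / (bg.sum() + eps)
    return fg_loss + bg_loss

Used alongside an ordinary binary cross-entropy loss, this term vanishes on pixels where the teacher would otherwise push the student away from the ground truth, which is one plausible reading of "transferring only positive knowledge".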
ISSN: 2376-5992
DOI: 10.7717/peerj-cs.1435