LaRW: boosting open-set semi-supervised learning with label-guided re-weighting

Bibliographic details
Published in:	Multimedia tools and applications 2024-05, Vol. 83 (15), pp. 46419-46437
Main authors:	Ouyang, Jihong; Mao, Dong; Meng, Qingyi
Format:	Article
Language:	English
Subjects:	Algorithms; Classification; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Datasets; Labels; Learning disabilities; Machine learning; Methods; Multimedia; Multimedia Information Systems; Propagation; Semi-supervised learning; Special Purpose and Application-Based Systems; Track 6: Computer Vision for Multimedia Applications
Online access:	Full text

Description
Summary:	The superior performance of traditional Semi-Supervised Learning (SSL) methods is generally achieved in strictly data-constrained scenarios, e.g. when the class distributions of labeled and unlabeled data match. In realistic scenarios, however, unlabeled data is gathered from a variety of sources, and it is difficult to ensure that its class distribution is consistent with that of the labeled data. This paper therefore considers a more realistic and widespread paradigm in which the labeled and unlabeled data come from mismatched distributions, dubbed Open-Set Semi-Supervised Learning (OS-SSL). Specifically, the unlabeled data contains out-of-distribution (OOD) samples, i.e. samples that do not fall into any of the labeled categories. Existing research demonstrates that OOD samples can damage classification performance, so OS-SSL methods usually filter out OOD samples during model training. In this work, we propose a simple but effective method, namely LaRW, which takes into account the overconfident predictions of classifiers and the learning difficulty of each category, while attempting to utilize the OOD samples. First, we propose to apply the label propagation algorithm at the feature level to assist in producing pseudo-labels, which improves the quality of the pseudo-labels. Further, we design a novel OOD detection score to better filter OOD samples. Finally, we evaluate our method against existing SSL and OS-SSL methods under several settings. Extensive empirical results demonstrate the effectiveness and expandability of our proposed method.
ISSN:	1380-7501, 1573-7721
DOI:	10.1007/s11042-023-17357-8
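
Illustrative note: the summary mentions applying label propagation at the feature level to help produce pseudo-labels, but this record gives no further detail. The following is a minimal sketch of generic closed-form graph label propagation over feature vectors, not the LaRW procedure itself. The function name `propagate_labels`, the cosine k-nearest-neighbour graph, and the parameters `k` and `alpha` are assumptions made for illustration only.

```python
import numpy as np


def propagate_labels(features, labels, num_classes, k=3, alpha=0.9):
    """Generic graph label propagation (illustrative, not the paper's method).

    features: (n, d) float array of feature vectors.
    labels:   (n,) int array; -1 marks an unlabeled sample.
    Returns (predicted_class, class_scores).
    """
    n = features.shape[0]

    # Non-negative cosine similarity between all pairs of samples.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(sim, 0.0)

    # Keep only the k most similar neighbours in each row, then symmetrize.
    k = min(k, n - 1)
    drop = np.argsort(-sim, axis=1)[:, k:]
    w = sim.copy()
    np.put_along_axis(w, drop, 0.0, axis=1)
    w = np.maximum(w, w.T)

    # Symmetric normalisation: S = D^{-1/2} W D^{-1/2}.
    inv_sqrt = 1.0 / np.sqrt(np.maximum(w.sum(axis=1), 1e-12))
    s = w * inv_sqrt[:, None] * inv_sqrt[None, :]

    # One-hot rows for labeled samples, all-zero rows for unlabeled ones.
    y = np.zeros((n, num_classes))
    known = labels >= 0
    y[known, labels[known]] = 1.0

    # Closed-form solution F = (I - alpha * S)^{-1} Y.
    f = np.clip(np.linalg.solve(np.eye(n) - alpha * s, y), 0.0, None)
    scores = f / np.maximum(f.sum(axis=1, keepdims=True), 1e-12)
    return scores.argmax(axis=1), scores
```

In this sketch, a sample whose graph component contains no labeled point receives all-zero scores; a practical pipeline would need to treat such rows separately rather than trust the default class index.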