Local rough set-based feature selection for label distribution learning with incomplete labels

Label distribution learning, as a new learning paradigm under the machine learning framework, is widely applied to address label ambiguity. However, most existing label distribution learning methods require complete supervised information, which is obtained through costly and laborious efforts to la...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:International journal of machine learning and cybernetics 2022-08, Vol.13 (8), p.2345-2364
Hauptverfasser: Qian, Wenbin, Dong, Ping, Wang, Yinglong, Dai, Shiming, Huang, Jintao
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Label distribution learning, as a new learning paradigm under the machine learning framework, is widely applied to address label ambiguity. However, most existing label distribution learning methods require complete supervised information, which is obtained through costly and laborious efforts to label the data. In reality, the annotation information may be incomplete and traditional methods cannot directly deal with the incomplete data. Hence, a new theoretical framework is proposed to handle the limited labeled data, which is called the local rough set. In addition, label distribution learning also experiences the “curse of dimensionality” problem, and it is essential to adopt some pre-processing methods, such as feature selection, to reduce the data dimensionality. Nevertheless, few feature selection algorithms are designed for handling label distribution data. Motivated by this, a model based on local rough set and neighborhood granularity, which can effectively and efficiently work with incompletely labeled data, is introduced in this paper. Furthermore, a local rough set-based incomplete label distribution feature selection algorithm is proposed to reduce the data dimensionality. Experimental results on 12 real-world label distribution datasets indicate that the proposed method outperforms the global rough set in computational efficiency and achieves better classification performance than the other five methods.
ISSN:1868-8071
1868-808X
DOI:10.1007/s13042-022-01528-4