Safe semi-supervised classification algorithm combined with active learning sampling strategy

Bibliographic details
Published in: Journal of intelligent & fuzzy systems, 2018-01, Vol. 35 (4), p. 4001-4010
Main authors: Zhao, Jianhua; Liu, Ning; Malov, A.
Format: Article
Language: English
Description
Abstract: To improve the performance of semi-supervised learning, a safe semi-supervised classification algorithm based on an active learning sampling strategy is proposed. First, an active learning sampling method based on uncertainty and representativeness is designed. A weighted combination of uncertainty and representativeness is used to select informative and representative unlabeled samples, which are then supplied to semi-supervised learning. Second, a label prediction method based on grouping verification is designed. Each unlabeled sample selected by active learning is pre-labeled: the sample, with a pseudo-label attached, is added to the labeled sample set, which is then grouped, trained on, and tested. The error corresponding to each pseudo-label is calculated, and the pseudo-label that yields the lowest error is selected as the candidate label of the unlabeled sample. Third, a security verification method is designed. The candidate label is accepted as the final label of the unlabeled sample, and the sample is added to the labeled set, only if it makes the error lower than before. These iterations are repeated until a specified precision is met. Finally, the classifier is trained on the final labeled set. Experiments on semi-supervised datasets and UCI datasets show that the proposed algorithm is effective.
ISSN: 1064-1246, 1875-8967
DOI: 10.3233/JIFS-169722
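
The three steps named in the abstract (weighted sampling, grouping verification, security verification) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the nearest-centroid classifier, the distance-ratio uncertainty score, the similarity-based representativeness score, the weight `alpha`, the round-robin grouping, and the choice to discard rejected samples are all assumptions made only for illustration. Every function name here is hypothetical.

```python
import math

def centroids(X, y):
    """Train a nearest-centroid classifier (assumed stand-in for the base learner)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda c: math.dist(model[c], x))

def uncertainty(model, x):
    """Assumed score in [0, 1]: ratio of nearest to second-nearest centroid distance.
    Requires a model with at least two classes."""
    d = sorted(math.dist(m, x) for m in model.values())
    return d[0] / d[1] if d[1] > 0 else 1.0

def representativeness(x, pool):
    """Assumed score: mean similarity 1 / (1 + distance) of x to the unlabeled pool."""
    return sum(1.0 / (1.0 + math.dist(x, p)) for p in pool) / len(pool)

def select_sample(model, pool, alpha=0.5):
    """Index of the pool sample with the highest weighted score."""
    scores = [alpha * uncertainty(model, x)
              + (1 - alpha) * representativeness(x, pool) for x in pool]
    return max(range(len(pool)), key=scores.__getitem__)

def grouped_error(X, y, groups=3):
    """Split the data round-robin into groups, hold each group out once,
    and return the overall held-out error rate."""
    wrong = 0
    for g in range(groups):
        pairs = list(zip(X, y))
        train = [p for i, p in enumerate(pairs) if i % groups != g]
        test = [p for i, p in enumerate(pairs) if i % groups == g]
        model = centroids([x for x, _ in train], [l for _, l in train])
        wrong += sum(predict(model, x) != l for x, l in test)
    return wrong / len(X)

def safe_self_label(X, y, pool, classes, alpha=0.5, groups=3, max_iter=10):
    """Grow the labeled set, accepting a pseudo-label only if it lowers the error."""
    X, y, pool = list(X), list(y), list(pool)
    best = grouped_error(X, y, groups)
    for _ in range(max_iter):
        if not pool:
            break
        x = pool.pop(select_sample(centroids(X, y), pool, alpha))
        # Grouping verification: the candidate is the pseudo-label with the least error.
        errs = {c: grouped_error(X + [x], y + [c], groups) for c in classes}
        cand = min(errs, key=errs.get)
        # Security verification: keep the label only if the error drops (assumed strict).
        if errs[cand] < best:
            X.append(x)
            y.append(cand)
            best = errs[cand]
    return X, y

labeled_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labeled_y = [0, 0, 0, 1, 1, 1]
unlabeled = [[0.5, 0.5], [5.5, 5.5], [1, 1], [6, 6]]
new_X, new_y = safe_self_label(labeled_X, labeled_y, unlabeled, classes=[0, 1])
```

Because acceptance requires a strict drop in error, this sketch adds nothing once the held-out error reaches zero. A final classifier would then be trained on `new_X` and `new_y`, for example with `centroids(new_X, new_y)`.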