A New Data Selection Principle for Semi-Supervised Incremental Learning

Current semi-supervised incremental learning approaches select unlabeled examples with predicted high confidence for model re-training. We show that for many applications this data selection strategy is not correct. This is because the confidence score is primarily a metric to measure the classifica...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Rong Zhang, Rudnicky, A.I.
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Current semi-supervised incremental learning approaches select unlabeled examples with predicted high confidence for model re-training. We show that for many applications this data selection strategy is not correct. This is because the confidence score is primarily a metric to measure the classification correctness on a particular example, rather than one to measure the example's contribution to the training of an improved model, especially in the case that the information used in the confidence annotator is correlated with that generated by the classifier. To address this problem, we propose a performance-driven principle for unlabeled data selection in which only the unlabeled examples that help to improve classification accuracy are selected for semi-supervised learning. Encouraging results are presented for a variety of public benchmark datasets
ISSN:1051-4651
2831-7475
DOI:10.1109/ICPR.2006.115