Robust Semi-Supervised Learning With Multi-Consistency and Data Augmentation
Published in: IEEE Transactions on Consumer Electronics, 2024-02, Vol. 70 (1), p. 414-424
Main authors: , , ,
Format: Article
Language: English
Abstract: In this paper, we address the problem of noisy datasets by proposing a dual screening scheme to improve the performance of models trained on two public noisy datasets: Clothing1M and Animal-10N. Because both datasets were generated by Web crawlers, their label error levels cannot be estimated. We use a warm-up model to separate the data into labeled and unlabeled subsets, which are then classified by multi-model consistency. We select the consistent samples and assign them pseudo-labels for training, while the remaining samples are treated as noisy and excluded from supervised training. This approach reduces the impact of noisy data and mislabeling. To improve the model's robustness, we combine clean data and unlabeled data with strong data augmentation and train them using the Mixup algorithm. Experimental results show that the proposed methods boost classification performance: accuracy on Clothing1M is 0.1% higher and accuracy on Animal-10N is 2% higher than the respective state-of-the-art methods. The main contributions of this paper are: 1) adding strong data augmentation to strengthen the model, 2) using multi-consistency to reduce the impact of noisy data, and 3) boosting performance through semi-supervised learning.
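The pipeline sketched in the abstract (multi-model consistency screening, pseudo-labeling of the consistent samples, and Mixup training on selected plus strongly augmented data) can be illustrated with a short PyTorch sketch. Everything below is an assumption made for illustration, not the paper's implementation: the agreement-and-confidence rule, the 0.9 threshold, the Beta(0.75, 0.75) Mixup coefficient, and the helper names `multi_consistency_split`, `mixup`, and `train_step` are all hypothetical.

```python
# Illustrative sketch only; thresholds, the Mixup alpha, and the use of the
# first model as the "primary" network are assumptions, not the paper's settings.
import torch
import torch.nn.functional as F


def multi_consistency_split(models, images, threshold=0.9):
    """Split a batch by multi-model consistency.

    A sample is kept as "clean" when every model predicts the same class and
    every model's confidence exceeds the threshold; the shared prediction is
    returned as its pseudo-label. The remaining samples are treated as noisy.
    """
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(images), dim=1) for m in models])  # (M, B, C)
        conf, preds = probs.max(dim=2)                                      # both (M, B)
        agree = (preds == preds[0]).all(dim=0)           # all models give the same class
        confident = conf.min(dim=0).values > threshold   # the least confident model is still sure
        keep = agree & confident
    idx = torch.arange(images.size(0), device=images.device)
    return idx[keep], preds[0][keep], idx[~keep]


def mixup(x, y, num_classes, alpha=0.75):
    """Standard Mixup: convex combination of inputs and one-hot targets."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    y_onehot = F.one_hot(y, num_classes).float()
    return lam * x + (1 - lam) * x[perm], lam * y_onehot + (1 - lam) * y_onehot[perm]


def train_step(models, optimizer, images, strong_aug_images, num_classes):
    """One training step: screen the batch, pseudo-label the consistent part,
    then train the primary model on a Mixup of the selected samples and their
    strongly augmented views."""
    clean_idx, pseudo, _noisy_idx = multi_consistency_split(models, images)
    if clean_idx.numel() < 2:
        return None  # nothing reliable enough to train on in this batch
    x = torch.cat([images[clean_idx], strong_aug_images[clean_idx]])
    y = torch.cat([pseudo, pseudo])
    x_mix, y_mix = mixup(x, y, num_classes)
    logits = models[0](x_mix)
    # Cross-entropy with soft (mixed) targets.
    loss = -(y_mix * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the setting described by the abstract, the screening models would be the warm-up networks trained on the noisy labels; in this sketch any list of classifiers sharing the same label space would work.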
ISSN: 0098-3063, 1558-4127
DOI: 10.1109/TCE.2023.3331700