Binary cross-entropy with dynamical clipping

We investigate the adverse effect of noisy labels in a training dataset on a neural network’s precision in an image classification task. The importance of this research lies in the fact that most datasets include noisy labels. To reduce the impact of noisy labels, we propose to extend the binary cro...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Neural computing & applications 2022-07, Vol.34 (14), p.12029-12041
Hauptverfasser: Hurtik, Petr, Tomasiello, Stefania, Hula, Jan, Hynar, David
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We investigate the adverse effect of noisy labels in a training dataset on a neural network’s precision in an image classification task. The importance of this research lies in the fact that most datasets include noisy labels. To reduce the impact of noisy labels, we propose to extend the binary cross-entropy by dynamical clipping, which clips all samples’ loss values in a mini-batch by a clipping constant. Such a constant is dynamically determined for every single mini-batch using its statistics. The advantage is the dynamic adaptation to any number of noisy labels in a training dataset. Thanks to that, the proposed binary cross-entropy with dynamical clipping can be used in any model utilizing cross-entropy or focal loss, including pre-trained models. We prove that the proposed loss function is an α -calibrated classification loss, implying consistency and robustness to noise misclassification in more general asymmetric problems. We demonstrate our loss function’s usefulness on Fashion MNIST, CIFAR-10, CIFAR-100 datasets, where we heuristically create training data with noisy labels and achieve a nice performance boost compared to the standard binary cross-entropy. These results are also confirmed in the second experiment, where we use a trained model on Google Images to classify the ImageWoof dataset, and the third experiment, where we deal with the WebVision and ANIMAL-10N datasets. We also show that the proposed technique yields significantly better performance than the gradient clipping. Code: gitlab.com/irafm-ai/clipping_cross_entropy
ISSN:0941-0643
1433-3058
DOI:10.1007/s00521-022-07091-x