An error analysis for deep binary classification with sigmoid loss

Bibliographic details
Published in: Information sciences 2024-10, Vol. 681, p. 121166, Article 121166
Authors: Li, Changshi; Jiao, Yuling; Yang, Jerry Zhijian
Format: Article
Language: English
Online access: Full text
Description
Abstract: Deep neural networks have demonstrated remarkable efficacy in diverse classification tasks. In this paper, we specifically focus on the predictive performance in deep binary classification problems with the sigmoid loss. Since the sigmoid loss is a non-convex and bounded loss function, it exhibits potential resilience against the disruptive impact of outlier noise. We first derive the convergence rate of the excess misclassification risk for deep ReLU neural networks with the sigmoid loss, a result that attains minimax optimality. To the best of our knowledge, we are the first to derive the convergence rate for the sigmoid loss. Moreover, we extend our analysis to derive a faster convergence rate under margin assumptions. This renders our findings comparable to those of commonly employed convex loss functions operating under analogous assumptions. Lastly, we undertake a comprehensive validation of the robustness inherent in the sigmoid loss across diverse datasets.
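The sigmoid loss discussed in the abstract is commonly defined on the margin t = y·f(x) as φ(t) = 1/(1 + eᵗ), which is bounded in (0, 1) and non-convex. The sketch below (an illustration under that common definition; the exact form used in the paper may differ) contrasts its bounded penalty with the unbounded logistic loss on a badly misclassified outlier, which is the mechanism behind the robustness claim:

```python
import math

def sigmoid_loss(margin):
    # Sigmoid loss phi(t) = 1 / (1 + e^t): bounded in (0, 1), non-convex.
    # A large negative margin (strong misclassification) yields a loss near 1,
    # so a single outlier contributes at most a bounded penalty.
    return 1.0 / (1.0 + math.exp(margin))

def logistic_loss(margin):
    # Logistic loss log(1 + e^{-t}): convex, but grows without bound
    # as the margin goes to -infinity, so outliers can dominate training.
    return math.log(1.0 + math.exp(-margin))

# A badly misclassified outlier: large negative margin y * f(x).
outlier_margin = -20.0
print(sigmoid_loss(outlier_margin))   # near 1: penalty saturates
print(logistic_loss(outlier_margin))  # near 20: penalty keeps growing
```

Running this shows the sigmoid loss saturating near 1 while the logistic loss scales with the magnitude of the negative margin, illustrating why a bounded, non-convex surrogate can dampen the influence of outlier noise.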
ISSN: 0020-0255; 1872-6291
DOI: 10.1016/j.ins.2024.121166