Evaluating Classification Model Against Bayes Error Rate

Bibliographic details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-08, Vol.45 (8), p.1-16
Main authors:	Chen, Qingqiang; Cao, Fuyuan; Xing, Ying; Liang, Jiye
Format:	Article
Language:	English
Subjects:	Algorithms; Bayes Error Rate; Bayes Theorem; Classification; Classifiers; Data models; Datasets; Error analysis; Estimation; Estimators; Label Propagation; Lower bounds; Model Evaluation; Noise measurement; Percolation Theory; Reliability theory; Support vector machines; Task analysis

Description
Abstract:	For a classification task, we usually select an appropriate classifier via model selection. But how can we evaluate whether the chosen classifier is optimal? One can answer this question via the Bayes error rate (BER). Unfortunately, estimating the BER is a fundamental conundrum. Most existing BER estimators focus on giving upper and lower bounds on the BER. However, evaluating whether the selected classifier is optimal based on these bounds is hard. In this paper, we aim to learn the exact BER instead of bounds on the BER. The core of our method is to transform the BER calculation problem into a noise recognition problem. Specifically, we define a type of noise called Bayes noise and prove that the proportion of Bayes noisy samples in a data set is statistically consistent with the BER of the data set. To recognize the Bayes noisy samples, we present a method consisting of two parts: selecting reliable samples based on percolation theory, and then employing a label propagation algorithm to recognize the Bayes noisy samples based on the selected reliable samples. The superiority of the proposed method over existing BER estimators is verified on extensive synthetic, benchmark, and image data sets.
ISSN:	0162-8828, 1939-3539, 2160-9292
DOI:	10.1109/TPAMI.2023.3240194
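
Illustrative note: the abstract's central claim, that the proportion of Bayes noisy samples is consistent with the BER, can be checked on a toy problem where the BER is known in closed form. The sketch below is not the paper's method. It assumes two equiprobable one-dimensional Gaussian classes, N(-1, 1) and N(+1, 1), for which the Bayes-optimal rule is to predict class 1 when x > 0 and the BER is Φ(-1). It also assumes that a "Bayes noisy" sample is one whose label disagrees with the Bayes-optimal prediction; the abstract does not give the formal definition, so this reading is an assumption. The function names are hypothetical, and the true distribution is used directly, which the paper's percolation and label propagation steps are designed to avoid needing.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def analytic_ber(mu: float) -> float:
    """BER for two equiprobable classes N(-mu, 1) and N(+mu, 1).

    The Bayes rule thresholds at x = 0, so each class is misclassified
    with probability Phi(-mu), and the priors average to the same value.
    """
    return normal_cdf(-mu)


def bayes_noise_proportion(n: int, mu: float, seed: int = 0) -> float:
    """Fraction of sampled points whose label disagrees with the Bayes rule.

    Assumption: this disagreement is what "Bayes noisy" means here.
    """
    rng = random.Random(seed)
    noisy = 0
    for _ in range(n):
        label = rng.randint(0, 1)                      # equal class priors
        x = rng.gauss(mu if label == 1 else -mu, 1.0)  # class-conditional draw
        bayes_prediction = 1 if x > 0.0 else 0         # Bayes-optimal decision
        if bayes_prediction != label:
            noisy += 1
    return noisy / n


ber = analytic_ber(1.0)
proportion = bayes_noise_proportion(20000, 1.0)
print(f"analytic BER:           {ber:.4f}")
print(f"Bayes-noisy proportion: {proportion:.4f}")
```

With 20,000 samples the empirical proportion should land close to the analytic value of about 0.159, and the gap shrinks as the sample size grows. On real data the Bayes-optimal prediction is unknown, which is why the noisy samples have to be recognized indirectly.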