mlscorecheck: Testing the consistency of reported performance scores and experiments in machine learning

Bibliographic details
Published in:	Neurocomputing (Amsterdam) 2024-05, Vol.583, p.127556, Article 127556
Main authors:	Kovács, György; Fazekas, Attila
Format:	Article
Language:	English
Keywords:	Binary classification; Consistency testing; Multiclass classification; Open source; Performance scores; Regression
Online access:	Full text

Description
Abstract:	Addressing the reproducibility crisis in artificial intelligence through the validation of reported experimental results is a challenging task. It necessitates either the reimplementation of techniques or a meticulous assessment of papers for deviations from the scientific method and best statistical practices. To facilitate the validation of reported results, we have developed numerical techniques capable of identifying inconsistencies between reported performance scores and various experimental setups in machine learning problems, including binary/multiclass classification and regression. These consistency tests are integrated into the open-source package mlscorecheck, which also provides specific test bundles designed to detect systematically recurring flaws in various fields, such as retina image processing and synthetic minority oversampling.
ISSN:	0925-2312, 1872-8286
DOI:	10.1016/j.neucom.2024.127556
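
The idea of a consistency test mentioned in the abstract can be pictured with a small binary-classification example: given the number of positive and negative test samples, reported scores are feasible only if some integer pair of true positives and true negatives reproduces all of them within the rounding tolerance. The following is a minimal brute-force sketch of that idea only. It is not the mlscorecheck API and not necessarily the numerical technique used in the article; the function name `is_consistent`, the score keys `acc`, `sens`, `spec`, and the tolerance parameter `eps` are illustrative assumptions.

```python
from itertools import product


def is_consistent(p, n, scores, eps):
    """Return True if some integer (tp, tn) reproduces every reported score.

    p, n   -- numbers of positive and negative test samples (both > 0)
    scores -- reported values, keyed by 'acc', 'sens' and/or 'spec'
    eps    -- numerical uncertainty of the reported values (e.g. rounding)
    """
    # Exhaustively try every possible confusion-matrix outcome.
    for tp, tn in product(range(p + 1), range(n + 1)):
        computed = {
            "acc": (tp + tn) / (p + n),   # accuracy
            "sens": tp / p,               # sensitivity (recall)
            "spec": tn / n,               # specificity
        }
        if all(abs(computed[name] - value) <= eps
               for name, value in scores.items()):
            return True
    return False


# Hypothetical reported results on a test set with 100 positives, 200 negatives.
feasible = is_consistent(100, 200, {"acc": 0.9, "sens": 0.9, "spec": 0.9}, eps=1e-3)
infeasible = is_consistent(100, 200, {"acc": 0.95, "sens": 0.9, "spec": 0.9}, eps=1e-3)
```

In the second call the sensitivity and specificity pin down tp = 90 and tn = 180, which gives an accuracy of 0.9, so a reported accuracy of 0.95 cannot come from the same experiment. The exhaustive search costs on the order of p × n evaluations, which is acceptable for a sketch but would need a smarter method for large test sets or for cross-validation setups.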