Why rankings of biomedical image analysis competitions should be interpreted with care


Bibliographic details
Published in: Nature Communications 2018-12, Vol. 9 (1), p.5217-13, Article 5217
Main authors: Maier-Hein, Lena; Eisenmann, Matthias; Reinke, Annika; Onogur, Sinan; Stankovic, Marko; Scholz, Patrick; Arbel, Tal; Bogunovic, Hrvoje; Bradley, Andrew P.; Carass, Aaron; Feldmann, Carolin; Frangi, Alejandro F.; Full, Peter M.; van Ginneken, Bram; Hanbury, Allan; Honauer, Katrin; Kozubek, Michal; Landman, Bennett A.; März, Keno; Maier, Oskar; Maier-Hein, Klaus; Menze, Bjoern H.; Müller, Henning; Neher, Peter F.; Niessen, Wiro; Rajpoot, Nasir; Sharp, Gregory C.; Sirinukunwattana, Korsuk; Speidel, Stefanie; Stock, Christian; Stoyanov, Danail; Taha, Abdel Aziz; van der Sommen, Fons; Wang, Ching-Wei; Weber, Marc-André; Zheng, Guoyan; Jannin, Pierre; Kopp-Schneider, Annette
Format: Article
Language: English
Description
Summary: International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted to date. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results are often hampered, as only a fraction of the relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables, such as the test data used for validation, the ranking scheme applied and the observers who make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future. The number of biomedical image analysis challenges has increased in the last ten years, but common practices have not yet been established. Here the authors analyze 150 recent challenges and demonstrate that the outcome varies with the metrics used and that limited reporting of information hampers reproducibility.
ISSN:2041-1723
DOI:10.1038/s41467-018-07619-7