Reproducibility in Machine Learning-based Research: Overview, Barriers and Drivers
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Research in various fields is currently experiencing challenges regarding the
reproducibility of results. This problem is also prevalent in machine learning
(ML) research. The issue arises, for example, due to unpublished data and/or
source code and the sensitivity of ML training conditions. Although different
solutions have been proposed to address this issue, such as using ML platforms,
the level of reproducibility in ML-driven research remains unsatisfactory.
Therefore, in this article, we discuss the reproducibility of ML-driven
research with three main aims: (i) identifying the barriers to reproducibility
when applying ML in research and categorizing them by the type of
reproducibility they affect (description, code, data, and experiment
reproducibility), (ii) discussing potential drivers such as tools, practices,
and interventions that support ML reproducibility, and distinguishing
between technology-driven drivers, procedural drivers, and drivers related to
awareness and education, and (iii) mapping the drivers to the barriers. With
this work, we hope to provide insights and to contribute to the decision-making
process regarding the adoption of different solutions to support ML
reproducibility. |
---|---|
DOI: | 10.48550/arxiv.2406.14325 |