Advances in Biomedical Missing Data Imputation: A Survey

Published in: IEEE Access, 2024-12, p. 1-1
Main authors: Barrabes, Miriam; Perera, Maria; Moriano, Victor Novelle; Giro-I-Nieto, Xavier; Montserrat, Daniel Mas; Ioannidis, Alexander G.
Format: Article
Language: English
Abstract: Ensuring good data quality in biomedical sciences is crucial for reliable research outcomes, particularly as precision medicine continues to gain prominence. Missing values compromise data quality and can make data-based studies difficult to perform. The origins of missing values in biomedical datasets are diverse, including experimental errors, equipment malfunctions, and variations in data collection protocols tailored to individual patient conditions. To address the complex nature of missing values and the unique characteristics of biomedical data, a diverse spectrum of computational imputation techniques has emerged. These methods range from traditional statistical analysis to more modern approaches such as discriminative machine learning models and deep generative networks. This survey paper provides a comprehensive overview of the extensive literature on missing data imputation techniques, with a specific focus on applications in genomics, single-cell RNA sequencing, health records, and medical imaging. We outline the fundamental principles underlying each imputation technique and present a detailed analysis of their advantages and disadvantages, categorized by missing data patterns. To aid practitioners in method selection, we offer practical recommendations based on critical factors such as dataset size, data type, and missingness rate. By synthesizing insights from existing literature, we provide a holistic perspective on the effectiveness of various imputation methods in different biomedical contexts, thereby helping researchers and practitioners make informed decisions when applying imputation techniques to biomedical data processing.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3516506