What Is a Good Imputation Under MAR Missingness?
Format: | Article |
---|---|
Language: | English |
Summary: | Missing values pose a persistent challenge in modern data science. Consequently, there is an ever-growing number of publications introducing new imputation methods in various fields. The present paper attempts to take a step back and provide a more systematic analysis. Starting from an in-depth discussion of the Missing at Random (MAR) condition for nonparametric imputation, we first develop an identification result, showing that the widely used Multiple Imputation by Chained Equations (MICE) approach indeed identifies the right conditional distributions. Building on this analysis, we propose three essential properties a successful imputation method should meet, thus enabling a more principled evaluation of existing methods and more targeted development of new methods. In particular, we introduce a new imputation method, denoted mice-DRF, that meets two out of the three criteria. We then discuss and refine ways to rank imputation methods, developing a powerful, easy-to-use scoring algorithm to rank missing value imputations. |
DOI: | 10.48550/arxiv.2403.19196 |