Uncertainty management in model-based imputation for missing data


Full description

Saved in:
Bibliographic details
Main authors: Azarkhail, M., Woytowitz, P.
Format: Conference proceedings
Language: English
Subjects:
Online access: Order full text
Description
Summary: In the semiconductor industry, as in many other applications, failure data is rarely available in complete form and is often flawed by missing records. When the missing process is random, the missing data can be safely ignored without major conceptual impact on the statistics of the experiment. The potential flaw with ignoring the missing data, however, is that the remaining complete observations may not carry enough statistical power, owing to the small sample size of the remaining population of complete failures. In some cases, the modeler may be able to describe the missing records as a function of other independent information that is available. Imputation of missing records from such an empirical model is a typical way to leverage this lateral information. These models often carry considerable uncertainty that must be effectively incorporated into the data analysis in order to avoid false overconfidence in the estimated reliability measures. This article discusses uncertainty management during model-based imputation of missing data. The case study consists of a Weibull analysis for a reliability-critical component for which a simple linear model is available for the missing records. Ignoring the missing records results in relatively large uncertainty over the calculated reliability measures. Single imputation from the correlation model marks the other end of the spectrum, due to an expected artificial boost in the statistical significance of the results. Multiple imputation and Bayesian likelihood averaging appear to be the most viable options for managing uncertainty in this problem; there are some differences between them, however, which are explained in detail.
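The contrast the abstract draws between single and multiple imputation can be sketched in a few lines. The following is a minimal illustration, not the paper's method: the data, the regression coefficients, and the noise level are all invented for the example. The key point it shows is that each of the M completed datasets draws the missing failure times from the linear model's predictive distribution (prediction plus residual noise), so the spread of the M Weibull fits reflects the imputation-model uncertainty that a single plug-in imputation would hide.

```python
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(42)

# Hypothetical data: 30 observed failure times, plus 10 units whose
# failure time is missing but for which a covariate x is available.
# The linear model t = a + b*x + eps stands in for the paper's
# "simple linear model"; a, b, sigma are assumed, not estimated here.
observed_t = rng.weibull(2.0, size=30) * 100.0
x_missing = rng.uniform(20.0, 80.0, size=10)
a, b, sigma = 5.0, 1.2, 8.0

M = 20  # number of imputations
shapes, scales = [], []
for _ in range(M):
    # Draw missing failure times from the model's predictive
    # distribution rather than plugging in the point prediction --
    # this is what distinguishes multiple from single imputation.
    imputed_t = a + b * x_missing + rng.normal(0.0, sigma, size=x_missing.size)
    completed = np.concatenate([observed_t, np.clip(imputed_t, 1e-6, None)])
    # Fit a two-parameter Weibull (location fixed at zero).
    shape, _, scale = weibull_min.fit(completed, floc=0)
    shapes.append(shape)
    scales.append(scale)

# The between-imputation spread is the extra uncertainty contributed
# by the imputation model; single imputation would report zero here.
print(f"shape: {np.mean(shapes):.2f} +/- {np.std(shapes):.2f}")
print(f"scale: {np.mean(scales):.2f} +/- {np.std(scales):.2f}")
```

Setting sigma to zero collapses all M datasets to the same single imputation, reproducing the artificial confidence boost the abstract warns about.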
ISSN: 0149-144X, 2577-0993
DOI: 10.1109/RAMS.2013.6517697