Robust Estimation of Mixture Complexity
Published in: | Journal of the American Statistical Association 2006-12, Vol.101 (476), p.1475-1486 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | In many applications, it is important to find the mixture with the fewest components that provides a satisfactory fit to the data; this number of components is known as the mixture complexity. This article focuses on developing an estimator of mixture complexity that is consistent when the form of the component densities is unknown but is postulated to belong to some parametric family, and that is simultaneously robust against model misspecification. We treat the estimation of mixture complexity as a model selection problem and construct an estimator of mixture complexity as a byproduct of minimizing a Hellinger information criterion. This estimator is shown to be consistent for any parametric family of mixtures. When the model is correctly specified, Monte Carlo simulations for a wide variety of normal mixtures show that our estimator is very competitive with several others in the literature in correctly identifying the true mixture complexity. Because the construction is firmly rooted in the minimum Hellinger distance approach, our estimator naturally inherits the property of robustness, which is examined through simulations under symmetric departures from postulated component normality. In terms of correctly identifying the mixture complexity under model misspecification, our estimator performs much better than the Kullback-Leibler distance-based estimator of James, Priebe, and Marchette. An example concerning hypertension is revisited to further illustrate the performance of our estimator. |
---|---|
ISSN: | 0162-1459 1537-274X |
DOI: | 10.1198/016214506000000555 |
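
The abstract's central quantity, the Hellinger distance between a target density and a fitted mixture, can be illustrated with a small numerical sketch. The code below is not the article's estimator: the target density, the candidate fits, the grid, the `threshold` value, and the helper names (`mixture_pdf`, `squared_hellinger`, `smallest_adequate`) are all assumptions chosen for illustration, and the stopping rule is a hypothetical stand-in for the article's Hellinger information criterion. It uses one common convention, H^2(f, g) = integral of (sqrt(f) - sqrt(g))^2, and picks the smallest number of components beyond which adding another component no longer reduces the distance by more than the threshold.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def mixture_pdf(x, weights, means, sds):
    """Density of a finite normal mixture evaluated on the array x."""
    x = np.asarray(x, dtype=float)
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def squared_hellinger(f, g, dx):
    """Riemann-sum approximation of the integral of (sqrt(f) - sqrt(g))^2."""
    return float(np.sum((np.sqrt(f) - np.sqrt(g)) ** 2) * dx)


def smallest_adequate(distances, threshold):
    """Hypothetical rule: distances[m - 1] is the distance of the m-component fit.

    Returns the smallest m for which moving to m + 1 components improves the
    distance by no more than `threshold`.
    """
    for m in range(1, len(distances)):
        if distances[m - 1] - distances[m] <= threshold:
            return m
    return len(distances)


# Illustrative target: an equal-weight mixture of N(-2, 1) and N(2, 1).
grid = np.linspace(-12.0, 12.0, 4801)
dx = grid[1] - grid[0]
target = mixture_pdf(grid, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])

# Candidate fits (assumed, not estimated from data):
# one component with matched mean and variance, then the exact two-component
# mixture, then a three-component mixture with a redundant component.
fit_1 = mixture_pdf(grid, [1.0], [0.0], [np.sqrt(5.0)])
fit_2 = mixture_pdf(grid, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
fit_3 = mixture_pdf(grid, [0.5, 0.25, 0.25], [-2.0, 2.0, 2.0], [1.0, 1.0, 1.0])

distances = [squared_hellinger(target, fit, dx) for fit in (fit_1, fit_2, fit_3)]
chosen = smallest_adequate(distances, threshold=0.01)
print(distances, chosen)
```

In practice the target would be a nonparametric density estimate computed from data and each candidate would be a fitted mixture, so the distances would be noisy and the choice of threshold would matter; the fixed value here is only a placeholder.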