Testing for Group Structure in High-Dimensional Data
Published in: | Journal of biopharmaceutical statistics 2011-11, Vol.21 (6), p.1113-1125 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | With the use of finite mixture models for the clustering of a data set, the crucial question of how many clusters there are in the data can be addressed by testing for the smallest number of components in the mixture model compatible with the data. We investigate the performance of a resampling approach to this latter problem in the context of high-dimensional data, where the number of variables p is extremely large relative to the number of observations n. In order to be able to fit normal mixture models to such data, some form of dimension reduction has to be performed. This raises the question of whether a practically significant bias results if the bootstrapping is undertaken solely on the basis of the reduced dimensional form of the data, rather than using the full data from which to draw the bootstrap sample replications. |
---|---|
ISSN: | 1054-3406 1520-5711 |
DOI: | 10.1080/10543406.2011.608342 |
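
The kind of resampling test the abstract refers to can be illustrated with a minimal sketch. It is not the authors' procedure; it only shows one common form of the idea, a parametric bootstrap of the likelihood ratio statistic for one versus two normal components. Several things here are assumptions made for brevity: the data are taken to be already reduced to a single score per observation (so the bootstrap is carried out entirely on the reduced form), the two components share a common variance, and all function names (`fit_one_component`, `fit_two_components`, `bootstrap_lrt`) are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def fit_one_component(xs):
    """Maximum-likelihood fit of a single normal; returns (loglik, mean, var)."""
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-8)
    return sum(log_normal_pdf(x, mean, var) for x in xs), mean, var


def mixture_loglik(xs, w, m1, m2, var):
    """Log-likelihood of a two-component normal mixture with common variance."""
    total = 0.0
    for x in xs:
        a = math.log(w) + log_normal_pdf(x, m1, var)
        b = math.log(1.0 - w) + log_normal_pdf(x, m2, var)
        top = max(a, b)  # log-sum-exp for numerical stability
        total += top + math.log(math.exp(a - top) + math.exp(b - top))
    return total


def fit_two_components(xs, max_iter=50, tol=1e-6):
    """EM fit of a two-component mixture (assumed common variance); returns loglik."""
    n = len(xs)
    ordered = sorted(xs)
    lower, upper = ordered[: n // 2], ordered[n // 2:]
    # Start from a split at the median (an arbitrary but simple choice).
    w = 0.5
    m1, m2 = sum(lower) / len(lower), sum(upper) / len(upper)
    var = max((sum((x - m1) ** 2 for x in lower)
               + sum((x - m2) ** 2 for x in upper)) / n, 1e-8)
    loglik = mixture_loglik(xs, w, m1, m2, var)
    for _ in range(max_iter):
        # E-step: posterior probability that each point belongs to component 1.
        taus = []
        for x in xs:
            a = math.log(w) + log_normal_pdf(x, m1, var)
            b = math.log(1.0 - w) + log_normal_pdf(x, m2, var)
            taus.append(1.0 / (1.0 + math.exp(min(b - a, 700.0))))
        # M-step: update mixing proportion, means and common variance.
        s = min(max(sum(taus), 1e-6), n - 1e-6)
        w = s / n
        m1 = sum(t * x for t, x in zip(taus, xs)) / s
        m2 = sum((1.0 - t) * x for t, x in zip(taus, xs)) / (n - s)
        var = max(sum(t * (x - m1) ** 2 + (1.0 - t) * (x - m2) ** 2
                      for t, x in zip(taus, xs)) / n, 1e-8)
        new_loglik = mixture_loglik(xs, w, m1, m2, var)
        converged = abs(new_loglik - loglik) < tol
        loglik = new_loglik
        if converged:
            break
    return loglik


def lrt_statistic(xs):
    """-2 log(lambda) for one versus two components, floored at zero."""
    loglik_one, _, _ = fit_one_component(xs)
    return max(0.0, 2.0 * (fit_two_components(xs) - loglik_one))


def bootstrap_lrt(scores, n_boot=99, seed=0):
    """Parametric bootstrap p-value for H0: one component, on reduced scores."""
    rng = random.Random(seed)
    observed = lrt_statistic(scores)
    _, mean, var = fit_one_component(scores)
    sd = math.sqrt(var)
    exceed = 0
    for _ in range(n_boot):
        # Draw a replicate from the fitted null model and refit both models.
        replicate = [rng.gauss(mean, sd) for _ in scores]
        if lrt_statistic(replicate) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)
```

Calling `bootstrap_lrt(scores)` on a list of reduced scores returns the observed statistic and a bootstrap p-value; a small p-value is evidence against a single component. Drawing the replicates in the full p-dimensional space and repeating the dimension reduction on each replicate would be the alternative the abstract contrasts this with.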