ACCURACY OF NONPARAMETRIC DENSITY ESTIMATION FOR UNIVARIATE GAUSSIAN MIXTURE MODELS: A COMPARATIVE STUDY
Published in: | Mathematical Modelling and Analysis 2020-10, Vol.25 (4), p.622-641 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Flexible and reliable probability density estimation is fundamental in unsupervised learning and classification. Finite Gaussian
mixture models are commonly used for this purpose. However, the parametric form of the distribution is not always known. In this case, nonparametric
density estimation methods are used. Usually, these methods become computationally demanding as the number of components increases. In this paper,
the accuracy of several nonparametric density estimators is compared by means of simulation. The following approaches are considered:
an adaptive bandwidth kernel estimator, a projection pursuit estimator, a logspline estimator, and a k-nearest neighbor estimator. The study concludes that
data clustering as a pre-processing step improves the estimation of mixture densities. However, when the data do not have clearly defined clusters,
the pre-processing step offers little advantage. The application of the density estimators is illustrated using municipal solid waste data
collected in Kaunas (Lithuania). The data distribution is similar to the kurtotic unimodal benchmark density introduced by
Marron and Wand. Based on homogeneity tests, it can be concluded that the distributions of the municipal solid waste fractions in Kutaisi (Georgia),
Saint-Petersburg (Russia), and Boryspil (Ukraine) do not differ statistically from the distribution of waste fractions in Kaunas. The distribution
of the waste data collected in Kaunas thus follows the general pattern described by Marron and Wand (i.e., it has one mode and a certain degree of kurtosis). |
---|---|
ISSN: | 1392-6292 1648-3510 |
DOI: | 10.3846/mma.2020.10505 |
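
The kind of simulation the summary describes can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes the kurtotic unimodal benchmark is the mixture (2/3) N(0, 1) + (1/3) N(0, 0.1^2), it substitutes a fixed-bandwidth Gaussian kernel estimator with a normal-reference bandwidth for the adaptive one named in the summary, and the sample size, the choice k = sqrt(n), the evaluation grid, and all function names are hypothetical choices made for the example.

```python
import math
import random
import statistics


def mixture_pdf(x, weights, means, sds):
    """True density of a univariate Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, sds)
    )


def sample_mixture(n, weights, means, sds, rng):
    """Draw n points: pick a component by weight, then sample from it."""
    comps = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[c], sds[c]) for c in comps]


def kde(x, data, h):
    """Fixed-bandwidth Gaussian kernel density estimate at x."""
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
    return total / (len(data) * h * math.sqrt(2 * math.pi))


def knn_density(x, data, k):
    """k-nearest neighbor estimate: k / (2 * n * distance to k-th neighbor)."""
    d_k = sorted(abs(x - xi) for xi in data)[k - 1]
    return k / (2 * len(data) * d_k)


def mean_abs_error(estimator, grid, weights, means, sds):
    """Average absolute gap between an estimate and the true density."""
    return statistics.fmean(
        abs(estimator(x) - mixture_pdf(x, weights, means, sds)) for x in grid
    )


# Assumed benchmark: (2/3) N(0, 1) + (1/3) N(0, 0.1^2).
weights, means, sds = [2 / 3, 1 / 3], [0.0, 0.0], [1.0, 0.1]

rng = random.Random(0)  # fixed seed so the run is repeatable
data = sample_mixture(500, weights, means, sds, rng)

# Normal-reference rule-of-thumb bandwidth (an assumption, not adaptive).
h = 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)
k = round(math.sqrt(len(data)))  # hypothetical choice of k

grid = [-3 + 6 * i / 200 for i in range(201)]
err_kde = mean_abs_error(lambda x: kde(x, data, h), grid, weights, means, sds)
err_knn = mean_abs_error(lambda x: knn_density(x, data, k), grid, weights, means, sds)

print(f"mean absolute error, kernel estimator: {err_kde:.4f}")
print(f"mean absolute error, k-NN estimator:   {err_knn:.4f}")
```

A single run like this says little on its own; a comparison in the spirit of the summary would repeat it over many samples, sample sizes, and mixture shapes before drawing conclusions.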