SOM: Stochastic initialization versus principal components

Bibliographic details
Published in: Information Sciences 2016-10, Vol. 364-365, pp. 213-221
Main authors: Akinduko, Ayodeji A., Mirkes, Evgeny M., Gorban, Alexander N.
Format: Article
Language: English
Online access: Full text
Description
Abstract: Selection of a good initial approximation is a well-known problem for all iterative methods of data approximation, from k-means to Self-Organizing Maps (SOM) and manifold learning. The quality of the resulting data approximation depends on the initial approximation. Principal components are popular as an initial approximation for many methods of nonlinear dimensionality reduction because of their convenience and the exact reproducibility of the results. Nevertheless, reports on the results of principal component initialization are contradictory. In this work, we separate datasets into two classes: quasilinear and essentially nonlinear datasets. We demonstrate, on the learning of one-dimensional SOMs (models of principal curves), that for quasilinear datasets principal component initialization of the self-organizing maps is systematically better than random initialization, whereas for essentially nonlinear datasets random initialization may perform better. Performance is evaluated by the fraction of variance unexplained in numerical experiments.
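
To make the compared setup concrete, below is a minimal sketch, not the authors' code: a one-dimensional SOM (a chain of nodes) trained on a quasilinear toy dataset, once from a random initialization and once from nodes spaced along the first principal component, with both results scored by the fraction of variance unexplained. The synthetic data, node count, and learning schedule are illustrative assumptions.

```python
# Sketch only: 1-D SOM with random vs. PCA initialization, scored by
# fraction of variance unexplained (FVU). All hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def train_som_1d(X, codebook, n_epochs=50, sigma0=3.0):
    """Online training of a 1-D SOM with a shrinking Gaussian
    neighborhood and a decaying learning rate."""
    k = len(codebook)
    idx = np.arange(k)
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = 0.5 * (1.0 - frac) + 0.01        # learning rate decays toward 0.01
        sigma = sigma0 * (1.0 - frac) + 0.5   # neighborhood radius shrinks
        for x in X[rng.permutation(len(X))]:
            bmu = np.argmin(np.linalg.norm(codebook - x, axis=1))
            h = np.exp(-((idx - bmu) ** 2) / (2.0 * sigma ** 2))
            codebook += lr * h[:, None] * (x - codebook)
    return codebook

def fvu(X, codebook):
    """Fraction of variance unexplained: squared distance to the nearest
    node, summed over the data, over total variance around the mean."""
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(-1).min(1)
    return d2.sum() / ((X - X.mean(0)) ** 2).sum()

# Quasilinear toy data: points along a line plus Gaussian noise.
t = rng.uniform(-1, 1, 500)
X = np.c_[t, 0.3 * t] + 0.05 * rng.normal(size=(500, 2))

k = 10
# Random initialization: k nodes drawn uniformly from the bounding box.
rand_init = rng.uniform(X.min(0), X.max(0), size=(k, 2))

# PCA initialization: k nodes spaced along the first principal component.
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
span = np.linspace(-2, 2, k) * Xc.std()
pca_init = X.mean(0) + span[:, None] * Vt[0]

print("FVU, random init:", fvu(X, train_som_1d(X, rand_init.copy())))
print("FVU, PCA init:   ", fvu(X, train_som_1d(X, pca_init.copy())))
```

On such a quasilinear dataset the PCA-initialized chain typically starts near the final solution, which is consistent with the paper's finding that the advantage of either initialization depends on whether the data are quasilinear or essentially nonlinear.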
ISSN: 0020-0255
ISSN (online): 1872-6291
DOI: 10.1016/j.ins.2015.10.013