Data clustering method based on dimensionality reduction and sampling

Bibliographic details
Main authors: Li Zhong, Zhang Tiefeng, Gu Mingdi
Format: Patent
Language: Chinese; English
Description
Summary: The invention discloses a data clustering method based on dimensionality reduction and sampling. The method comprises the following steps: reducing the dimensionality of the data set with a piecewise mean value algorithm; constructing a random function and using it to draw a random sample from the large-scale clustering data set, which yields a relatively small working set; performing k-means clustering on the working set to obtain a sampled clustering result; and classifying the remaining samples by measuring the relation between each of them and the sampled clustering result. By combining dimensionality reduction with sampling, the method reduces both the number and the dimensionality of the data samples that participate in the iterations. This greatly reduces the complexity of the k-means algorithm while maintaining a good clustering effect, and enables efficient clustering of large-scale data.
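The steps named in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation; several details are assumptions, because the summary does not specify them: the piecewise mean is taken over contiguous, roughly equal blocks of features, the "random function" is NumPy's random generator, k-means is plain Lloyd iteration, and the "relation" used to classify the remaining samples is Euclidean distance to the nearest centroid. All function and parameter names (`piecewise_mean`, `kmeans`, `cluster_reduced_sampled`, `n_segments`, `sample_frac`) are hypothetical.

```python
import numpy as np

def piecewise_mean(X, n_segments):
    """Reduce each row to n_segments values by averaging contiguous feature blocks."""
    # Assumption: blocks are contiguous and as equal in size as possible.
    blocks = np.array_split(np.arange(X.shape[1]), n_segments)
    return np.column_stack([X[:, idx].mean(axis=1) for idx in blocks])

def nearest_centroid(X, centroids):
    """Index of the closest centroid (Euclidean distance) for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd's k-means on X; returns the k centroids."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_centroid(X, centroids)
        # Keep the old centroid if a cluster happens to become empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids

def cluster_reduced_sampled(X, k, n_segments, sample_frac=0.1, seed=0):
    """Reduce dimensionality, cluster a random working set, then label all samples."""
    rng = np.random.default_rng(seed)
    Z = piecewise_mean(np.asarray(X, dtype=float), n_segments)
    # Working set: a random subset, at least k points and at most the full data set.
    n_work = min(len(Z), max(k, int(round(sample_frac * len(Z)))))
    work_idx = rng.choice(len(Z), size=n_work, replace=False)
    centroids = kmeans(Z[work_idx], k, rng)
    # Assumption: remaining samples join the cluster of their nearest centroid.
    return nearest_centroid(Z, centroids), centroids
```

Only the working set takes part in the k-means iterations, so the per-iteration cost scales with the sample size and the reduced dimensionality rather than with the full data set; every other sample is labeled in a single pass.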