Data clustering method based on dimensionality reduction and sampling
Main authors: | , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Abstract: | The invention discloses a data clustering method based on dimensionality reduction and sampling. The method comprises the steps of: reducing the dimensionality of a data set with a piecewise mean algorithm; constructing a random function and using it to randomly sample a relatively small working set from the large-scale clustering data set; running k-means clustering on the working set to obtain a sampled clustering result; and classifying the residual samples by measuring the relation between each residual sample and the sampled clustering result. By combining dimensionality reduction with sampling, the method reduces both the number and the dimensionality of the data samples that participate in the iterations, greatly reduces the complexity of the k-means algorithm while maintaining a good clustering effect, and achieves efficient clustering of large-scale data. |
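
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the record gives no details, so the equal-width segmentation in `piecewise_mean`, the plain Lloyd-style `kmeans`, and the nearest-centroid rule used to classify residual samples are all assumptions, and every function and parameter name is hypothetical. Only the Python standard library is used so the example stays self-contained.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def piecewise_mean(vec, n_segments):
    """Reduce vec to n_segments values by averaging contiguous segments.

    Assumes 1 <= n_segments <= len(vec), so no segment is empty.
    """
    n = len(vec)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [sum(vec[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda j: dist(point, centroids[j]))


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd iteration; returns the k centroids."""
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Mean of each group; an empty group keeps its previous centroid.
        new = [[sum(col) / len(g) for col in zip(*g)] if g else centroids[j]
               for j, g in enumerate(groups)]
        if new == centroids:
            break
        centroids = new
    return centroids


def cluster(data, k, n_segments, sample_size, seed=0):
    """Dimensionality reduction, sampling, k-means, then labelling."""
    rng = random.Random(seed)
    # Step 1: piecewise-mean dimensionality reduction of every sample.
    reduced = [piecewise_mean(row, n_segments) for row in data]
    # Step 2: random sampling of a small working set.
    working_idx = rng.sample(range(len(reduced)), sample_size)
    working_set = [reduced[i] for i in working_idx]
    # Step 3: k-means on the working set only.
    centroids = kmeans(working_set, k, rng)
    # Step 4 (assumed rule): label every sample, including the residual
    # ones, by its nearest centroid.
    labels = [nearest(p, centroids) for p in reduced]
    return labels, centroids
```

In this sketch the iterative part of k-means only touches `sample_size` points of dimension `n_segments`, while the remaining points cost a single pass of distance computations. That is the source of the complexity reduction the abstract claims.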