Fuzzy Relational Scattered Distance Based Clustering Method for Sparsely Distributed High Dimensional Data Objects
Published in: | International journal of recent technology and engineering 2020-01, Vol.8 (5), p.4044-4049 |
---|---|
Format: | Article |
Language: | English |
Abstract: | Clustering is one of the most significant ideas in data mining and is an unsupervised learning model. Clustering high-dimensional data is more complex because of the intrinsic sparsity of such data. Existing methods for reducing immaterial clusters are based on spectral clustering and graph-based learning algorithms, whose lack of sparsity and polynomial time complexity compromise their efficiency when applied to sparse high-dimensional data. This paper concentrates on clustering sparsely distributed high-dimensional data objects. The Fuzzy Relational Scattered Distance Based Clustering (FRSDBC) method is developed with three models: a Geometric Median Based Fuzzy model, a Scattered Distance measure model, and a Grid based clustered sparse data representation model. The Geometric Median Based Fuzzy model calculates the geometric median of similar sparse data objects and then of non-similar sparse data objects in order to fit the relational fuzziness across data points; it is involved in the subspace reduction of data objects. The Scattered Distance measure model measures the distance between the inner and outer objects. Grid based clustering is used to calculate the area of the cluster in the FRSDBC method. The main idea of the FRSDBC method is to cluster data points over sparsely distributed data within limited processing time. The clustering time, clustering accuracy and space complexity of each method are analyzed. Compared with other techniques, the results of the FRSDBC method are more accurate and easier to understand, and its clustering time is substantially lower. It is widely used in many practical applications such as weather forecasting, share trading, medical data analysis and aerial data analysis. |
---|---|
ISSN: | 2277-3878 |
DOI: | 10.35940/ijrte.E6633.018520 |
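
The abstract refers to calculating the geometric median of a group of data objects but does not say how the paper computes it. As a rough illustration only, the sketch below approximates a geometric median with Weiszfeld's iteration, a standard technique that is swapped in here and is not taken from the paper. The function name `geometric_median` and the parameters `tol` and `max_iter` are hypothetical choices for this example.

```python
import math


def geometric_median(points, tol=1e-9, max_iter=1000):
    """Approximate the geometric median of `points` with Weiszfeld's iteration.

    Assumption: this is a generic textbook procedure, not the FRSDBC model
    itself. `points` is a non-empty sequence of equal-length numeric tuples.
    """
    dim = len(points[0])
    # Start from the centroid (arithmetic mean) of the points.
    current = [sum(p[d] for p in points) / len(points) for d in range(dim)]

    for _ in range(max_iter):
        weighted = [0.0] * dim
        weight_sum = 0.0
        for p in points:
            dist = math.dist(p, current)
            if dist < tol:
                # The estimate coincides with a data point; skip that point
                # to avoid dividing by zero (a common simplification).
                continue
            w = 1.0 / dist
            weight_sum += w
            for d in range(dim):
                weighted[d] += w * p[d]

        if weight_sum == 0.0:
            # Every point coincides with the current estimate.
            return current

        updated = [v / weight_sum for v in weighted]
        if math.dist(updated, current) < tol:
            return updated
        current = updated

    return current


if __name__ == "__main__":
    square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    print(geometric_median(square))
```

For the four corners of a square, the result is the center of the square. Unlike the arithmetic mean, the geometric median minimizes the sum of distances to the points, so a single distant outlier shifts it far less.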