Proposing a Dimensionality Reduction Technique With an Inequality for Unsupervised Learning from High-Dimensional Big Data
Published in: | IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023-06, Vol. 53 (6), p. 1-10 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Data clustering can be considered one of the most important unsupervised learning tasks. For almost all clustering algorithms, finding the nearest neighbors of a point within a certain radius r (NN-r) is a critical step. For a high-dimensional dataset, this step becomes too time-consuming. This article proposes a simple dimensionality reduction (DR) technique. For a point p in d-dimensional space, it produces a point p' in d'-dimensional space, where d' < d. For any pair of points p and q, the distance between their images p' and q' does not exceed the distance between p and q; hence, if the distance between p' and q' is greater than r, then q cannot be in NN-r of p. Using this test, the task of finding the NN-r is sped up. Then, as a case study, the technique is applied to accelerate k-means, one of the most famous unsupervised learning algorithms, where it can automatically determine d'. The proposed NN-r method and the accelerated k-means are compared with recent state-of-the-art methods, and both yield favorable results. |
---|---|
ISSN: | 2168-2216 2168-2232 |
DOI: | 10.1109/TSMC.2023.3234227 |
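
The filtering idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the mapping proposed in the article (which this record does not reproduce). As a stand-in it uses a block-norm mapping: the d coordinates are split into d' blocks and each block is replaced by its Euclidean norm. By the reverse triangle inequality, distances in the reduced space never exceed the original distances, which is the property the filter needs. The function names `reduce_blocks` and `radius_neighbors`, the data, and the radius are all hypothetical.

```python
import numpy as np

def reduce_blocks(points, d_reduced):
    """Map (n, d) points to (n, d_reduced): one Euclidean norm per coordinate block.

    Stand-in mapping (an assumption, not the article's method). It guarantees
    ||p' - q'|| <= ||p - q|| for every pair of points.
    """
    blocks = np.array_split(np.arange(points.shape[1]), d_reduced)
    return np.stack([np.linalg.norm(points[:, b], axis=1) for b in blocks], axis=1)

def radius_neighbors(points, reduced, i, r):
    """Indices j != i with ||points[j] - points[i]|| <= r."""
    # Cheap test in d' dimensions: a reduced distance above r rules the point out.
    low = np.linalg.norm(reduced - reduced[i], axis=1)
    candidates = np.flatnonzero(low <= r)
    # Exact test in d dimensions, run only on the surviving candidates.
    exact = np.linalg.norm(points[candidates] - points[i], axis=1)
    keep = candidates[exact <= r]
    return keep[keep != i]

# Hypothetical data: 500 points in 64 dimensions, reduced to 8 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))
X_red = reduce_blocks(X, 8)

fast = radius_neighbors(X, X_red, 0, r=10.0)

# Brute-force reference for comparison.
brute = np.flatnonzero(np.linalg.norm(X - X[0], axis=1) <= 10.0)
brute = brute[brute != 0]
```

Because the reduced distance is a lower bound, the filter can only discard points that are truly outside the radius, so `fast` and `brute` should contain the same indices. How much work is saved depends on how tight the bound is for the data at hand.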