Proposing a Dimensionality Reduction Technique With an Inequality for Unsupervised Learning from High-Dimensional Big Data

Bibliographic Details
Published in: IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023-06, Vol. 53 (6), p. 1-10
Main authors: Ismkhan, Hassan; Izadi, Mohammad
Format: Article
Language: English
Description
Summary: Data clustering can be considered the most important unsupervised learning task. For nearly all clustering algorithms, finding the nearest neighbors of a point within a certain radius $r$ (NN-$r$) is a critical step. For a high-dimensional dataset, this step becomes too time consuming. This article proposes a simple dimensionality reduction (DR) technique. For a point $p$ in $d$-dimensional space, it produces a point $p'$ in $d'$-dimensional space, where $d' < d$, such that the distance between $p'$ and $q'$ never exceeds the distance between $p$ and $q$. Hence, if the distance between $p'$ and $q'$ is greater than $r$, then $q$ cannot be in NN-$r$ of $p$. Using this trick, the task of finding the NN-$r$ is sped up. Then, as a case study, the technique is applied to accelerate $k$-means, one of the most famous unsupervised learning algorithms, where it can automatically determine $d'$. The proposed NN-$r$ method and the accelerated $k$-means are compared with recent state-of-the-art methods, and both yield favorable results.
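The pruning idea in the summary can be illustrated with a minimal sketch. This record does not describe the article's actual mapping, so the code below uses a stand-in that is only an assumption: each point is split into $d'$ contiguous blocks of coordinates and each block is replaced by its Euclidean norm. By the reverse triangle inequality applied per block, this stand-in satisfies dist(p', q') <= dist(p, q), which is the only property the pruning step relies on. The names `reduce_point`, `dist`, and `neighbors_within_r` are hypothetical and chosen for illustration.

```python
import math


def reduce_point(p, d_prime):
    """Map a d-dimensional point to d_prime dimensions.

    Stand-in mapping (an assumption, not the article's method): replace each
    contiguous block of coordinates by the Euclidean norm of that block.
    """
    d = len(p)
    if not 1 <= d_prime <= d:
        raise ValueError("d_prime must satisfy 1 <= d_prime <= d")
    # Boundaries that split d coordinates into d_prime non-empty blocks.
    bounds = [(i * d) // d_prime for i in range(d_prime + 1)]
    return [math.sqrt(sum(x * x for x in p[lo:hi]))
            for lo, hi in zip(bounds, bounds[1:])]


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbors_within_r(points, reduced, i, r):
    """Return (indices within radius r of points[i], number of full checks)."""
    result, full_checks = [], 0
    for j, q in enumerate(points):
        if j == i:
            continue
        # Cheap test in d' dimensions: since dist(p', q') <= dist(p, q),
        # a reduced distance above r rules q out without a full computation.
        if dist(reduced[i], reduced[j]) > r:
            continue
        full_checks += 1
        if dist(points[i], q) <= r:
            result.append(j)
    return result, full_checks
```

In use, the reduced points would be computed once (for example `reduced = [reduce_point(p, 4) for p in points]`) and reused for every query, so each pruned candidate costs a $d'$-dimensional distance instead of a $d$-dimensional one. How much is pruned depends on the data and on $d'$; the sketch makes no claim about the speedups reported in the article.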
ISSN: 2168-2216, 2168-2232
DOI:10.1109/TSMC.2023.3234227