Proposing a Dimensionality Reduction Technique With an Inequality for Unsupervised Learning from High-Dimensional Big Data

Bibliographic details
Published in:	IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023-06, Vol. 53 (6), p. 1-10
Main authors:	Ismkhan, Hassan; Izadi, Mohammad
Format:	Article
Language:	English
Subjects:	Algorithms; Big Data; Clustering; Clustering algorithms; Dimensionality reduction; dimensionality reduction (DR); Euclidean geometry; Feature extraction; high-dimensional data; k-means; Machine learning; nearest neighbor (NN); Reduction; Task analysis; Transforms; Unsupervised learning

Description
Abstract:	Data clustering can be considered the most important unsupervised learning task. For nearly all clustering algorithms, finding the nearest neighbors of a point within a certain radius r (NN-r) is a critical step. For a high-dimensional dataset, this step becomes too time-consuming. This article proposes a simple dimensionality reduction (DR) technique. For a point p in d-dimensional space, it produces a point p′ in d′-dimensional space, where d′ < d. The technique comes with an inequality: for a second point q with reduced point q′, if the distance between p′ and q′ is greater than r, then q cannot be in NN-r of p. Using this trick, the task of finding the NN-r is sped up. Then, as a case study, the technique is applied to accelerate k-means, one of the most famous unsupervised learning algorithms, where it can automatically determine d′. The proposed NN-r method and the accelerated k-means are compared with recent state-of-the-art methods, and both yield favorable results.
ISSN:	2168-2216, 2168-2232
DOI:	10.1109/TSMC.2023.3234227
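
Illustrative note: the pruning idea described in the abstract can be sketched as below. The record does not specify the article's actual mapping, so this sketch substitutes a stand-in that is known to be distance-contracting: the d coordinates are split into d′ blocks and each block is replaced by its Euclidean norm. By the reverse triangle inequality, the distance between two reduced points never exceeds the distance between the original points. The function names `reduce_blocks` and `nn_r`, the block-norm mapping, and the random test data are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def reduce_blocks(points, d_reduced):
    """Map each d-dimensional row to d_reduced dimensions.

    Stand-in mapping (an assumption, not the article's technique):
    each output coordinate is the Euclidean norm of one block of
    input coordinates, which cannot increase pairwise distances.
    """
    blocks = np.array_split(np.arange(points.shape[1]), d_reduced)
    return np.stack(
        [np.linalg.norm(points[:, b], axis=1) for b in blocks], axis=1
    )

def nn_r(points, reduced, query_idx, r):
    """Return indices of points within distance r of points[query_idx]."""
    # Cheap filter in d' dimensions: a reduced distance above r
    # implies the full distance is above r, so the point is skipped.
    reduced_dist = np.linalg.norm(reduced - reduced[query_idx], axis=1)
    candidates = np.flatnonzero(reduced_dist <= r)
    # Exact distance check in d dimensions, only for the survivors.
    full_dist = np.linalg.norm(points[candidates] - points[query_idx], axis=1)
    return candidates[full_dist <= r]

# Hypothetical data: 500 random points in 64 dimensions, reduced to 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))
X_reduced = reduce_blocks(X, 8)

neighbors = nn_r(X, X_reduced, 0, r=10.0)
# Brute-force reference that computes every full-dimensional distance.
brute = np.flatnonzero(np.linalg.norm(X - X[0], axis=1) <= 10.0)
```

Because the filter only discards points whose reduced distance already exceeds r, the pruned search returns the same neighbor set as the brute-force scan. How much work it saves depends on how tight the lower bound is for the data at hand.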