Detection of pre-cluster nano-tendency through multi-viewpoints cosine-based similarity approach

Bibliographic details
Published in: Nanotechnology for environmental engineering 2022-03, Vol.7 (1), p.259-268
Main authors: Basha, M. Suleman; Mouleeswaran, S. K.; Prasad, K. Rajendra
Format: Article
Language: English
Online access: Full text
Description
Summary: Pre-cluster assessment is a significant problem in data clustering. Visual cluster tendency assessment (VAT) has been found to focus mainly on this problem. This visual technique first derives the similarity features of data objects using either the cosine or the Euclidean distance metric. The cosine metric considers both the magnitude and direction of the vectors; thus, it has been highly successful in data clustering applications. However, the cosine metric uses only a single viewpoint (i.e., the origin). Finding the similarity features using multiple viewpoints is more accurate than using the single-viewpoint cosine metric. This paper presents the multi-viewpoints cosine-based similarity VAT (MVS-VAT), which considers multiple viewpoints for an effective assessment of nano-pre-clusters (or nano-cluster tendency). Clustering accuracy (CA) and normalized mutual information (NMI) are used to measure the performance of the existing and proposed methods. The proposed MVS-VAT is reported to improve on VAT and cVAT by 20 to 40% in terms of CA and NMI. Therefore, the proposed MVS-VAT technique yields quality data clusters. Experiments are conducted on several benchmark datasets to illustrate an empirical study of the existing and proposed techniques.
ISSN: 2365-6379, 2365-6387
DOI: 10.1007/s41204-022-00222-8
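
The summary contrasts a cosine metric measured from a single viewpoint (the origin) with a similarity measured from multiple viewpoints, followed by VAT reordering. The record gives no formulas, so the following is only a minimal sketch under stated assumptions: every other data object serves as a viewpoint, the multi-viewpoint similarity is the average cosine between the two difference vectors seen from each viewpoint, and the ordering step is the commonly described nearest-neighbour (Prim-like) VAT reordering. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine_similarity_origin(X):
    """Single-viewpoint cosine: angles between vectors as seen from the origin."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return unit @ unit.T

def multi_viewpoint_similarity(X):
    """Assumed multi-viewpoint variant: for each pair (i, j), average the cosine
    between (x_i - x_h) and (x_j - x_h) over all other objects h.
    Assumes at least three distinct points (no zero difference vectors)."""
    n = X.shape[0]
    S = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            views = [h for h in range(n) if h != i and h != j]
            a = X[i] - X[views]          # x_i seen from each viewpoint
            b = X[j] - X[views]          # x_j seen from each viewpoint
            num = np.sum(a * b, axis=1)
            den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
            S[i, j] = S[j, i] = np.mean(num / den)
    return S

def vat_order(D):
    """VAT-style reordering of a dissimilarity matrix: start from an object in
    the most dissimilar pair, then repeatedly append the unvisited object
    closest to any already visited object."""
    n = D.shape[0]
    first = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [first]
    remaining = [k for k in range(n) if k != first]
    while remaining:
        sub = D[np.ix_(order, remaining)]
        nxt = remaining[int(np.argmin(sub.min(axis=0)))]
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical toy data: two groups of three points, interleaved row by row.
X = np.array([[1.0, 0.1], [0.1, 1.0], [1.0, 0.2],
              [0.2, 1.0], [0.9, 0.1], [0.1, 0.9]])
S = multi_viewpoint_similarity(X)
D = 1.0 - S                          # turn similarity into dissimilarity
order = vat_order(D)
D_reordered = D[np.ix_(order, order)]  # matrix that would be shown as the VAT image
print(order)
```

In a VAT image, dark blocks along the diagonal of the reordered matrix suggest the number of clusters; here `D_reordered` could be passed to any image plotting routine to inspect that structure.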