Pre-Cutoff Value Calculation Method for Accelerating Metric Space Outlier Detection

Bibliographic details
Published in: International journal of grid and high performance computing 2024-01, Vol.16 (1), p.1-17
Main authors: Xu, Honglong, Liang, Zhonghao, Huang, Kaide, Huang, Guoshun, He, Yan
Format: Article
Language: English
Description
Summary: Outlier detection is an important data mining technique. This article uses the triangle inequality of distances to design a pre-cutoff value (PCV) algorithm that calculates a pre-threshold for the outlier degree without additional distance computations. The algorithm can be used to accelerate a variety of metric space outlier detection algorithms. Experimental results on multiple real datasets show that the PCV algorithm reduces the runtime and the number of distance computations of the iORCA algorithm by 14.59% and 15.73%, respectively. Even compared with the new high-performance algorithm ADPOD, the PCV algorithm achieves reductions of 1.41% and 0.45%, respectively. Notably, non-outlier exclusion for the first data block in the dataset improves significantly, with an exclusion rate of up to 36.5%, which leads to a 23.54% reduction in detection time for that data block. While delivering these results, the PCV algorithm retains the data type generality of metric space algorithms.
ISSN: 1938-0259, 1938-0267
DOI: 10.4018/IJGHPC.334125
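
The summary above does not spell out the PCV procedure, so the following Python sketch only illustrates the general idea it names: deriving a preliminary cutoff from the triangle inequality using distances that are already available. It is not the authors' algorithm. The sketch assumes that the outlier degree of a point is the distance to its k-th nearest neighbour, that the distances from every point to a single pivot are already known, and that the top `n_outliers` points are sought; all function and parameter names are hypothetical. Since d(x, y) >= |d(p, x) - d(p, y)| for any pivot p, each point's outlier degree can be bounded from below without new distance computations, and the n-th largest of these bounds is a valid lower bound on the final cutoff.

```python
import heapq


def knn_lower_bounds(pivot_dists, k):
    """Lower-bound each point's k-th nearest neighbour distance.

    pivot_dists[i] is the (already computed) distance from point i to one
    pivot p. The triangle inequality gives d(x, y) >= |d(p, x) - d(p, y)|,
    so the k-th smallest gap is a lower bound on the true k-NN distance.
    """
    n = len(pivot_dists)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    bounds = []
    for i in range(n):
        gaps = [abs(pivot_dists[i] - pivot_dists[j]) for j in range(n) if j != i]
        bounds.append(heapq.nsmallest(k, gaps)[-1])
    return bounds


def pre_cutoff(pivot_dists, k, n_outliers):
    """Return a lower bound on the score of the n-th strongest outlier.

    If n points each have a true score of at least L, the n-th largest
    true score is also at least L.
    """
    if not 1 <= n_outliers <= len(pivot_dists):
        raise ValueError("n_outliers must be between 1 and the number of points")
    bounds = knn_lower_bounds(pivot_dists, k)
    return heapq.nlargest(n_outliers, bounds)[-1]


def exact_cutoff(points, dist, k, n_outliers):
    """Brute-force reference: the true n-th largest k-NN distance."""
    scores = []
    for i, x in enumerate(points):
        ds = [dist(x, y) for j, y in enumerate(points) if j != i]
        scores.append(heapq.nsmallest(k, ds)[-1])
    return heapq.nlargest(n_outliers, scores)[-1]
```

A detector in the ORCA family could start from such a value instead of a cutoff of zero, so that a point is discarded as soon as its running k-NN distance estimate drops below it. How tight the bound is depends on the choice of pivot; with a poorly placed pivot it may be close to zero and exclude nothing.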