Outlier Detection in Wellness Data using Probabilistic Mapped Mean-Shift Algorithms

In this paper, the Probabilistic Mapped Mean-Shift Algorithm is proposed to detect anomalous data in public datasets and local hospital children’s wellness clinic databases. The proposed framework consists of two main parts. First, the Probabilistic Mapping step consists of k-NN instance acquisition...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:ECTI Transactions on Computer and Information Technology 2021-08, Vol.15 (2), p.258-266
Hauptverfasser: Phongsasiri, Siriwan, Rasmequan, Suwanna
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In this paper, the Probabilistic Mapped Mean-Shift Algorithm is proposed to detect anomalous data in public datasets and local hospital children’s wellness clinic databases. The proposed framework consists of two main parts. First, the Probabilistic Mapping step consists of k-NN instance acquisition, data distribution calculation, and data point reposition.  Truncated Gaussian Distribution (TGD) was used for controlling the boundary of the mapped points. Second, the Outlier Detection step consists of outlier score calculation and outlier selection.  Experimental results show that the proposed algorithm outperformed the existing algorithms with real-world benchmark datasets and  a Children’s Wellness Clinic dataset (CWD). Outlier detection accuracy obtained from the proposed algorithm based on Wellness, Stamps, Arrhythmia, Pima, and Parkinson datasets was 93%, 94%, 80%, 75%, and 72%, respectively.
ISSN:2286-9131
2286-9131
DOI:10.37936/ecti-cit.2021152.244971