A novel approach for detecting anomalies in clusters using soft computing techniques

Bibliographic details
Main authors: Prasad, Padavala Sai; Sangeetha, Tamilarasu; Sree, Pokkuluri Kiran; Reddy, Vuyyuru Krishna
Format: Conference proceeding
Language: English
Online access: Full text
Description
Summary: Data mining techniques are used to extract patterns and meaningful information from large databases. Classification, clustering, and outlier analysis are among the best-known data mining tasks. Outliers are items that deviate from other objects despite being grouped under the same category. Researchers examine outliers for a particular or specialized quality. Human errors, instrumental errors in taking measurements or conducting experiments, and novel patterns in the dataset are all causes of outliers. Real-world data is ambiguous and uncertain. To deal with uncertain data, a sophisticated mathematical technique called rough set theory is required. Rough set theory is built on the concept of approximation. The proposed approach, a rough entropy-based weighted density method, is applied to individual clusters to identify outliers. The approach works with unsupervised data: it computes weighted density values for both objects and conditional attributes (excluding decision attributes) to identify outliers that existing methods fail to detect. The benchmark breast cancer dataset from the UCI repository was used for analysis, and purity measures were computed for the individual clusters.
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0123212
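
The summary names an entropy-based weighted density method applied per cluster but gives no formulas. The following is a minimal sketch of one formulation found in the rough set literature, not necessarily the authors' method. It assumes categorical attributes, complement entropy as the uncertainty measure, and attribute weights proportional to one minus that entropy. The function names, the toy cluster, and the threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Hashable, List, Sequence

Row = Sequence[Hashable]


def complement_entropy(column: Sequence[Hashable]) -> float:
    """Complement entropy of the partition induced by one attribute.

    Assumed form: E(a) = sum_i p_i * (1 - p_i), where p_i is the share of
    objects in the i-th equivalence class (objects with equal value on a).
    """
    n = len(column)
    return sum((c / n) * (1 - c / n) for c in Counter(column).values())


def attribute_weights(rows: List[Row]) -> List[float]:
    """Weight each attribute by (1 - E(a)), normalised to sum to 1."""
    n_attrs = len(rows[0])
    scores = [1.0 - complement_entropy([r[j] for r in rows]) for j in range(n_attrs)]
    total = sum(scores)  # always positive because E(a) < 1
    return [s / total for s in scores]


def weighted_density(rows: List[Row]) -> List[float]:
    """Weighted density of every object in one cluster.

    For each attribute, an object's density is the relative size of its
    equivalence class; the densities are combined using the entropy weights.
    """
    n = len(rows)
    n_attrs = len(rows[0])
    weights = attribute_weights(rows)
    counts = [Counter(r[j] for r in rows) for j in range(n_attrs)]
    return [
        sum(weights[j] * counts[j][r[j]] / n for j in range(n_attrs))
        for r in rows
    ]


def find_outliers(rows: List[Row], threshold: float) -> List[int]:
    """Indices of objects whose weighted density is below the threshold."""
    return [i for i, d in enumerate(weighted_density(rows)) if d < threshold]


# Hypothetical toy cluster with two conditional attributes (no decision attribute).
cluster = [("low", "round")] * 4 + [("high", "irregular")]
print(find_outliers(cluster, threshold=0.5))  # prints [4]
```

Objects that share attribute values with most of their cluster receive a high density, while an object with rare values on every attribute scores low and is flagged. In practice the threshold would have to be tuned per dataset, and the scoring would be run separately on each cluster.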