Outlier Recognition via Linguistic Aggregation of Graph Databases

Detailed description

Bibliographic details
Published in:	Applied sciences 2021-08, Vol.11 (16), p.7434
Main authors:	Niewiadomski, Adam; Duraj, Agnieszka; Bartczak, Monika
Format:	Article
Language:	English
Subjects:	Algorithms; Customer relationship management; Datasets; Fuzzy logic; Fuzzy sets; Information management; Linguistics; Noise; outlier recognition; Outliers (statistics); outliers in graph datasets; outliers in terms of linguistic quantification; Recognition
Online access:	Full text

Description
Abstract:	Datasets frequently contain uncertain data that, if not interpreted with care, may affect information analysis negatively. Such rare, strange, or imperfect data, here called “outliers” or “exceptions”, can be ignored in further processing or, on the other hand, handled by dedicated algorithms to decide if they contain valuable, though very rare, information. There are different definitions and methods for handling outliers, and here, we are interested, in particular, in those based on linguistic quantification and fuzzy logic. In this paper, for the first time, we apply definitions of outliers and methods for recognizing them based on fuzzy sets and linguistically quantified statements to find outliers in non-relational, here graph-oriented, databases. These methods are proposed and exemplified to identify objects being outliers (e.g., to exclude them from processing). The novelty of this paper lies in the definitions and recognition algorithms for outliers using fuzzy logic and linguistic quantification when traditional quantitative and/or measurable information is inaccessible, which frequently happens because of the graph nature of the considered datasets.
ISSN:	2076-3417
DOI:	10.3390/app11167434