A one-class classification decision tree based on kernel density estimation

Bibliographic details
Published in: Applied soft computing 2020-06, Vol.91, p.106250, Article 106250
Main authors: Itani, Sarah; Lecron, Fabian; Fortemps, Philippe
Format: Article
Language: English
Description
Summary: One-class Classification (OCC) is an important field of machine learning which aims at predicting a single class on the basis of its lonely representatives and potentially some additional counter-examples. OCC is thus opposed to traditional classification problems involving two or more classes, and addresses the issue of class unbalance. There is a wide range of one-class models which give satisfaction in terms of performance. But at the time of explainable artificial intelligence, there is an increasing need for interpretable models. The present work advocates a novel one-class model which tackles this challenge. Within a greedy and recursive approach, our proposal for an explainable One-Class decision Tree (OC-Tree) rests on kernel density estimation to split a data subset on the basis of one or several intervals of interest. Thus, the OC-Tree encloses data within hyper-rectangles of interest which can be described by a set of rules. Against state-of-the-art methods such as Cluster Support Vector Data Description (ClusterSVDD), One-Class Support Vector Machine (OCSVM) and isolation Forest (iForest), the OC-Tree performs favorably on a range of benchmark datasets. Furthermore, we propose a real medical application for which the OC-Tree has demonstrated effectiveness, through the ability to tackle interpretable medical diagnosis aid based on unbalanced datasets.
• One-Class Classification (OCC) addresses the challenging issue of class unbalance.
• OCC models are trained on the instances of a class and some few potential outliers.
• A new One-Class Tree (OC-Tree) is proposed for explainable and accurate decisions.
• The tree induction is driven by density estimation to isolate target groupings.
• The model proved efficient to diagnose ADHD and is promising for clinical practice.
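
The summary describes splitting a data subset on "one or several intervals of interest" obtained from kernel density estimation. The following is a minimal sketch of that general idea on a single attribute, not the authors' OC-Tree algorithm: the Gaussian kernel, the fixed bandwidth, the rule "keep regions where the density exceeds a fraction of its peak", and the names `gaussian_kde`, `intervals_of_interest` and `threshold_ratio` are all assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def intervals_of_interest(samples, bandwidth, threshold_ratio=0.1, grid_size=200):
    """Return (low, high) intervals where the estimated density is at least
    threshold_ratio times its maximum on an evenly spaced grid.

    The thresholding rule is an illustrative assumption, not the paper's criterion.
    """
    density = gaussian_kde(samples, bandwidth)
    lo = min(samples) - 3 * bandwidth
    hi = max(samples) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    values = [density(x) for x in grid]
    cutoff = threshold_ratio * max(values)

    intervals = []
    start = None  # index of the grid point where the current interval began
    for i, v in enumerate(values):
        if v >= cutoff and start is None:
            start = i
        elif v < cutoff and start is not None:
            intervals.append((grid[start], grid[i - 1]))
            start = None
    if start is not None:
        intervals.append((grid[start], grid[-1]))
    return intervals


# Two well-separated groups of target instances on one attribute.
samples = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
intervals = intervals_of_interest(samples, bandwidth=0.2)
for low, high in intervals:
    print(f"rule: {low:.2f} <= x <= {high:.2f}")
```

Each interval found this way reads as a rule on one attribute. Applying the same step recursively to the data falling inside each interval, one attribute at a time, would yield hyper-rectangles of the kind the summary mentions.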
ISSN: 1568-4946, 1872-9681
DOI: 10.1016/j.asoc.2020.106250