Data Fusion for Outlier Detection through Pseudo-ROC Curves and Rank Distributions
This paper proposes a novel method of fusing models for classification of unbalanced data. The unbalanced data contains a majority of healthy (negative) instances, and a minority of unhealthy (positive) instances. The applicability of this type of classification problem with security applications in...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Tagungsbericht |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper proposes a novel method of fusing models for classification of unbalanced data. The unbalanced data contains a majority of healthy (negative) instances, and a minority of unhealthy (positive) instances. The applicability of this type of classification problem with security applications inspired the naming of such problems as security classification problems (SCP). The area under the ROC curve (AUC) is the metric utilized to measure classifier performance, and in order to better understand AUC and ROC behavior, pseudo-ROC curves created from simulated data are introduced. ROC curves depend entirely upon the rankings created by classifiers. The rank distributions discussed in this paper display classifier performance in a novel form, and the behavior of these rank distributions provides insight into classifier fusion for the SCP. Rank distributions, which illustrate the probability of a particular rank containing a positive or negative instance, will be introduced and used to explain why synergistic classifier fusion occurs. |
---|---|
ISSN: | 2161-4393 2161-4407 |
DOI: | 10.1109/IJCNN.2006.246989 |