DBIG-US: A two-stage under-sampling algorithm to face the class imbalance problem
Published in: | Expert systems with applications 2021-04, Vol.168, p.114301, Article 114301 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | The class imbalance problem occurs when one class far outnumbers the other classes, causing most traditional classifiers to perform poorly on the minority classes. To tackle this problem, a plethora of techniques have been proposed, most of them centered on resampling methods. This paper introduces a two-stage method that combines the DBSCAN clustering algorithm, used to filter noisy majority class instances, with a graph-based procedure to overcome the class imbalance. We then experimentally evaluate the behavior of the proposed method on a collection of two-class imbalanced data sets. The experimental results show an improvement in classification performance, measured by the geometric mean of the accuracies on each class, and also a greater reduction in the imbalance ratio when compared to several state-of-the-art under-sampling techniques.
Highlights: • We introduce a new under-sampling algorithm for class-imbalanced data. • This is a hybrid method based on DBSCAN clustering and a graph-based procedure. • Experiments compare the method with other state-of-the-art under-sampling algorithms. • The method achieves the best classification performance and reduction in imbalance. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2020.114301 |
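
The abstract names three ingredients that are easy to make concrete: DBSCAN-based filtering of noisy majority instances, the imbalance ratio, and the geometric mean of the per-class accuracies. The following is a minimal Python sketch of those three pieces only, not the authors' implementation. The function names, the choice to run the noise test on the majority class alone, and the `eps` and `min_samples` parameters are all assumptions made for illustration. The record does not describe the graph-based second stage, so it is not sketched here.

```python
from math import dist, sqrt


def noise_mask(points, eps, min_samples):
    """Return True for every point that DBSCAN would label as noise.

    Hypothetical helper: a point is noise when it is not a core point
    and has no core point within distance eps.
    """
    n = len(points)
    # Each neighbourhood includes the point itself.
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_samples for nb in neighbours]
    return [
        not core[i] and not any(core[j] for j in neighbours[i])
        for i in range(n)
    ]


def filter_majority(X, y, majority_label, eps, min_samples):
    """Drop majority instances flagged as noise; keep all other instances.

    Assumption: the noise test is run on the majority class only.
    """
    maj_idx = [i for i, label in enumerate(y) if label == majority_label]
    mask = noise_mask([X[i] for i in maj_idx], eps, min_samples)
    dropped = {i for i, is_noise in zip(maj_idx, mask) if is_noise}
    keep = [i for i in range(len(y)) if i not in dropped]
    return [X[i] for i in keep], [y[i] for i in keep]


def imbalance_ratio(y, majority_label):
    """Majority count divided by minority count (two-class data)."""
    n_maj = sum(1 for label in y if label == majority_label)
    return n_maj / (len(y) - n_maj)


def g_mean(y_true, y_pred, positive_label):
    """Geometric mean of the accuracies on the positive and negative class."""
    pairs = list(zip(y_true, y_pred))
    pos = [p == positive_label for t, p in pairs if t == positive_label]
    neg = [p != positive_label for t, p in pairs if t != positive_label]
    sensitivity = sum(pos) / len(pos)
    specificity = sum(neg) / len(neg)
    return sqrt(sensitivity * specificity)
```

As a usage idea, one could compute `imbalance_ratio` before and after `filter_majority` to see how much the filtering step alone shifts the class balance, and then score a classifier trained on the reduced data with `g_mean`. The pairwise distance computation is quadratic in the number of points, which is acceptable for a small illustration but not for large data sets.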