A new approach for imbalanced data classification based on data gravitation


Bibliographic details
Published in: Information Sciences 2014-12, Vol. 288, p. 347-373
Main authors: Peng, Lizhi; Zhang, Hongli; Yang, Bo; Chen, Yuehui
Format: Article
Language: English
Description
Summary: Imbalanced classification is an important machine learning research topic: skewed class distributions degrade the performance of most general-purpose classification models. A recently developed physics-inspired method, the data gravitation-based classification (DGC) model, performs well on many general classification problems. Like other general classifiers, however, DGC suffers on imbalanced tasks. In this study, we develop a DGC variant for imbalanced problems, the Imbalanced DGC (IDGC) model. An amplified gravitation coefficient (AGC) is introduced into the gravitation computation. The AGC is a coefficient that carries class imbalance information and can strengthen and weaken the gravitational fields of the minority and majority classes, respectively. We also design a fitness evaluation function for the weight optimization procedure of the data distribution, so that the model parameters adapt to imbalanced class distributions. A total of 44 binary-class and 15 multiclass imbalanced data sets are used to test the proposed method. Experimental results show that the adapted DGC model is effective for imbalanced problems.
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2014.04.046
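
The summary describes scaling each class's gravitational field by a coefficient that reflects class imbalance. The record does not give the article's formulas, so the following Python sketch is only an illustration of the general idea and not the authors' IDGC model. Everything specific in it is an assumption: the function names `class_coefficients` and `gravitation_predict`, unit mass per training point, an inverse-square pull over a feature-weighted distance, and the coefficient N / (k * N_c), where N is the number of training points, k the number of classes, and N_c the size of class c.

```python
import numpy as np

def class_coefficients(y):
    """Hypothetical imbalance coefficient per class: N / (k * N_c).

    Smaller classes get a coefficient above 1, larger classes below 1.
    This is an illustrative choice, not the AGC defined in the article.
    """
    classes, counts = np.unique(y, return_counts=True)
    return classes, counts.sum() / (len(classes) * counts)

def gravitation_predict(X_train, y_train, x, weights=None,
                        use_coeff=True, eps=1e-12):
    """Assign x to the class whose training points pull on it hardest."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    x = np.asarray(x, dtype=float)
    if weights is None:
        # Assumed: equal feature weights when none are supplied.
        weights = np.ones(X_train.shape[1])
    classes, coeffs = class_coefficients(y_train)
    if not use_coeff:
        coeffs = np.ones(len(classes))
    # Feature-weighted squared distance; eps avoids division by zero.
    d2 = ((X_train - x) ** 2 * weights).sum(axis=1) + eps
    pull = 1.0 / d2  # unit mass per point, inverse-square law (assumed)
    totals = np.array([c * pull[y_train == k].sum()
                       for k, c in zip(classes, coeffs)])
    return classes[np.argmax(totals)], totals

# Toy data: 8 majority points (class 0) at 0.0, 2 minority points (class 1) at 2.0.
X = [[0.0]] * 8 + [[2.0]] * 2
y = [0] * 8 + [1] * 2
query = [1.2]  # closer to the minority points

plain_label, _ = gravitation_predict(X, y, query, use_coeff=False)
adjusted_label, _ = gravitation_predict(X, y, query, use_coeff=True)
print(plain_label, adjusted_label)  # prints: 0 1
```

In this toy case the query lies nearer the minority points, yet the unscaled sum of pulls favours the majority class because it has four times as many points. Applying the assumed coefficient reverses the decision, which is the kind of effect the summary attributes to the AGC.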