Artificial bee colony optimization-based weighted extreme learning machine for imbalanced data learning

Bibliographic details
Published in: Cluster computing 2019-05, Vol. 22 (Suppl 3), p. 6937-6952
Main authors: Tang, Xiaofen; Chen, Li
Format: Article
Language: English
Description
Summary: Imbalanced datasets are common in real-world applications, and class imbalance degrades the classification performance of many standard learning approaches. To address imbalanced datasets, a weighted extreme learning machine (WELM), which solves the L2-regularized weighted least squares problem, has been presented to avoid generating an over-fitted model and to obtain better generalization ability than ELM. However, weights generated according to the class distribution of the training data are not necessarily the optimal weights with good generalization performance, and the randomness of the input weights and hidden biases of the network makes the algorithm produce a suboptimal classification model. In this paper, a weighted extreme learning machine based on a hybrid artificial bee colony (HABC) algorithm is proposed to obtain better performance than WELM; the input weights and hidden biases of WELM and the weights assigned to the training samples are optimized by the HABC algorithm. HABC effectively combines the diversity of the perturbed parameter vectors of differential evolution with the best-solution information of the artificial bee colony. In the empirical study, the proposed method is compared with different class-imbalance handling methods, including four WELM-based methods, a weighted support vector machine, and four ensemble methods that combine data sampling with Bagging or Boosting. The experimental results on 15 imbalanced datasets show that the proposed method outperforms most of the compared methods, indicating its superiority.
ISSN: 1386-7857, 1573-7543
DOI: 10.1007/s10586-018-1808-9
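
Illustrative sketch (not part of the catalog record and not the authors' code): the summary describes WELM as solving an L2-regularized weighted least squares problem, with sample weights derived from the class distribution and with random input weights and hidden biases. A minimal NumPy version of that baseline might look as follows. The function names, the sigmoid activation, the uniform initialization range, the 1/(class size) weighting rule, and the parameters `n_hidden`, `C`, and `seed` are all assumptions made for illustration. The closed form used is the standard weighted ridge solution beta = (I/C + H^T W H)^(-1) H^T W T.

```python
import numpy as np


def class_balanced_weights(y):
    """Assumed heuristic: weight each sample by 1 / (number of samples in its class)."""
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    return 1.0 / counts[inverse]


def _hidden_layer(X, W_in, b):
    """Sigmoid hidden-layer output H for inputs X (activation choice is an assumption)."""
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))


def train_welm(X, y, n_hidden=20, C=1.0, sample_weights=None, seed=None):
    """Fit a weighted ELM: random hidden layer, closed-form weighted ridge output layer."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # One-vs-all targets coded as +1 / -1, one column per class.
    T = np.where(y[:, None] == classes[None, :], 1.0, -1.0)
    # Random input weights and hidden biases. According to the summary, the
    # proposed method searches over these (and the sample weights) with HABC
    # instead of leaving them random; that search is not shown here.
    W_in = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = _hidden_layer(X, W_in, b)
    if sample_weights is None:
        w = class_balanced_weights(y)
    else:
        w = np.asarray(sample_weights, dtype=float)
    HtW = H.T * w  # equals H^T W for the diagonal weight matrix W = diag(w)
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return {"W_in": W_in, "b": b, "beta": beta, "classes": classes}


def predict_welm(model, X):
    """Predict the class whose output node has the largest score."""
    H = _hidden_layer(np.asarray(X, dtype=float), model["W_in"], model["b"])
    return model["classes"][np.argmax(H @ model["beta"], axis=1)]
```

Because `sample_weights`, `W_in`, and `b` are the quantities the summary says are optimized, an external optimizer could in principle score candidate values by validation performance. The hybrid artificial bee colony search itself is not specified in this record and is therefore left out of the sketch.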