Bankruptcy prediction using optimal ensemble models under balanced and imbalanced data


Bibliographic details
Published in: Expert systems 2024-08, Vol. 41 (8), p. n/a
Main authors: Amirshahi, Bahareh; Lahmiri, Salim
Format: Article
Language: English
Description
Summary: This study explores the performance of gradient boosting methods in bankruptcy prediction for a highly imbalanced dataset. We developed different heterogeneous ensemble models based on three popular gradient boosting methods: XGBoost, LightGBM, and CatBoost. Our ensemble models were optimized using the cross‐validation method and the results of the hold‐out test sets showed that the optimized ensemble models not only outperform their base learners, but also improve the state‐of‐the‐art benchmark results on the same dataset. Interestingly, we observed that the data oversampling technique that is commonly used to address the class imbalance issue had an adverse impact on our ensemble models' performance. This indicates that our models are robust to the imbalanced dataset problem that typically degrades the classification performance of machine learning models.
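
The summary does not say how the base learners are combined or which metric drives the optimization. As a rough illustration only, the sketch below assumes weighted soft voting over the positive-class probabilities of three base models, with the weights chosen by a small grid search that maximizes F1 on validation predictions. The function names, the weight grid, the 0.5 threshold, and the toy probabilities are all hypothetical and are not taken from the article.

```python
from itertools import product


def soft_vote(prob_lists, weights):
    """Weighted average of per-model positive-class probabilities."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total
        for probs in zip(*prob_lists)
    ]


def f1_score(y_true, y_prob, threshold=0.5):
    """F1 for the positive (bankrupt) class; assumed metric, threshold is illustrative."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_weights(prob_lists, y_true, grid=(0, 1, 2, 3)):
    """Grid-search integer voting weights that maximize F1 on validation data."""
    best, best_score = None, -1.0
    for w in product(grid, repeat=len(prob_lists)):
        if sum(w) == 0:
            continue  # at least one model must take part in the vote
        score = f1_score(y_true, soft_vote(prob_lists, w))
        if score > best_score:
            best, best_score = w, score
    return best, best_score


# Hypothetical validation labels (1 = bankrupt) and probabilities from
# three base models; in practice these would be out-of-fold predictions.
y_val = [0, 0, 0, 0, 0, 0, 1, 1]
model_a = [0.1, 0.2, 0.6, 0.1, 0.3, 0.2, 0.7, 0.4]
model_b = [0.2, 0.1, 0.3, 0.2, 0.6, 0.1, 0.4, 0.8]
model_c = [0.1, 0.3, 0.2, 0.1, 0.2, 0.4, 0.6, 0.6]

weights, score = best_weights([model_a, model_b, model_c], y_val)
print("chosen weights:", weights, "validation F1:", round(score, 3))
```

Because the grid includes weight vectors that select a single model, the chosen combination can never score below the best base learner on the validation data. Whether that carries over to a hold-out test set is an empirical question.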
ISSN: 0266-4720, 1468-0394
DOI: 10.1111/exsy.13599