Business Failure Prediction Based on a Cost-Sensitive Extreme Gradient Boosting Machine

Bibliographic Details
Published in: IEEE Access 2022, Vol. 10, p. 42623-42639
Main Authors: Zou, Yao; Gao, Changchun; Gao, Han
Format: Article
Language: English
Description
Summary: Business failure prediction is very important for the sustainable development of enterprises. Machine learning algorithms, especially ensemble algorithms, have shown great economic benefits in enterprise financial early warning. However, the highly imbalanced class distribution of financial risk data and the lack of explainability of most machine learning-based early distress warning models limit their commercial application. To address these limitations, we enhance business failure prediction performance with a tree ensemble trained in a boosting manner. To handle the class imbalance in business failure datasets, a weighted objective function, weighted cross-entropy, is embedded into the boosted tree framework, making the weighted XGBoost a cost-sensitive business failure prediction model. To tackle the second issue, we explore the intrinsic interpretability of the proposed method by visualizing feature importance and incorporating a partial dependence plot technique to locally interpret individual business failure events. Experimental results on business failure datasets with different predictive horizons, collected from the China Security Market Accounting Research (CSMAR) database, show that the proposed weighted XGBoost is a good solution for reducing errors in recognizing firms in business failure. Furthermore, the visualized feature importance scores and partial dependence plot results both demonstrate that the cost-sensitive tree-based ensemble can be a good tool to guide investors in making rational decisions and to provide interpretable business failure prediction results as a reference for regulators' policy-making.
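The summary names a weighted cross-entropy objective embedded in a boosted tree framework but does not give its formula. As a rough illustration only, the sketch below assumes the common form L = -[w·y·log(p) + (1 - y)·log(1 - p)] with p = sigmoid(z), where w up-weights the minority (failed) class. Gradient boosting with a custom objective generally needs the first and second derivatives of the loss with respect to the raw score z, so that is what the function returns. The names `pos_weight` and `weighted_logloss_grad_hess` are hypothetical and are not taken from the article.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_logloss_grad_hess(raw_scores, labels, pos_weight):
    """Per-sample gradient and Hessian of an assumed weighted cross-entropy.

    Loss per sample (assumption, not quoted from the article):
        L = -[pos_weight * y * log(p) + (1 - y) * log(1 - p)],  p = sigmoid(z)
    Derivatives with respect to the raw score z:
        dL/dz   = scale * p - pos_weight * y
        d2L/dz2 = scale * p * (1 - p)
    where scale = pos_weight for positives (y = 1) and 1 for negatives (y = 0).
    """
    grads, hesses = [], []
    for z, y in zip(raw_scores, labels):
        p = sigmoid(z)
        scale = pos_weight * y + (1 - y)
        grads.append(scale * p - pos_weight * y)
        hesses.append(scale * p * (1.0 - p))
    return grads, hesses


# One failed firm (y = 1) and one healthy firm (y = 0), both at raw score 0.
grads, hesses = weighted_logloss_grad_hess([0.0, 0.0], [1, 0], pos_weight=5.0)
```

With `pos_weight` greater than 1, a misclassified failed firm contributes a larger gradient than a misclassified healthy firm, which is the cost-sensitive effect the summary describes. Setting `pos_weight` to 1 recovers the ordinary cross-entropy derivatives.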
ISSN:2169-3536
DOI:10.1109/ACCESS.2022.3168857