Bankruptcy prediction using optimal ensemble models under balanced and imbalanced data
This study explores the performance of gradient boosting methods in bankruptcy prediction for a highly imbalanced dataset. We developed different heterogeneous ensemble models based on three popular gradient boosting methods: XGBoost, LightGBM, and CatBoost. Our ensemble models were optimized using th...
Saved in:
Published in: | Expert systems 2024-08, Vol.41 (8), p.n/a |
---|---|
Main authors: | Amirshahi, Bahareh ; Lahmiri, Salim |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | n/a |
---|---|
container_issue | 8 |
container_start_page | |
container_title | Expert systems |
container_volume | 41 |
creator | Amirshahi, Bahareh ; Lahmiri, Salim |
description | This study explores the performance of gradient boosting methods in bankruptcy prediction for a highly imbalanced dataset. We developed different heterogeneous ensemble models based on three popular gradient boosting methods: XGBoost, LightGBM, and CatBoost. Our ensemble models were optimized using the cross‐validation method, and the results of the hold‐out test sets showed that the optimized ensemble models not only outperform their base learners but also improve the state‐of‐the‐art benchmark results on the same dataset. Interestingly, we observed that the data oversampling technique that is commonly used to address the class imbalance issue had an adverse impact on our ensemble models' performance. This indicates that our models are robust to the imbalanced dataset problem that typically degrades the classification performance of machine learning models. |
doi_str_mv | 10.1111/exsy.13599 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3075437005</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3075437005</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2969-22c4e10f4a26eafbcb8d96808b746f11e7795ccf70e734cdeebbfd1cd41c07f23</originalsourceid><addsrcrecordid>eNp9kE1LxDAQhoMoWFcv_oKAN6Fr0qZJe9Rl_YAFD36gp5AmE-napjVp0f57u1Y8Opdh4JkZ3gehU0qWdKoL-ArjkqZZUeyhiDKexyQt2D6KSMJ5zERCDtFRCFtCCBWCR-j5Srl3P3S9HnHnwVS6r1qHh1C5N9x2fdWoGoML0JQ14KY1UAc8OAMel6pWToPByhlcNX-jUb06RgdW1QFOfvsCPV2vH1e38eb-5m51uYl1UvAiThLNgBLLVMJB2VKXuSl4TvJSMG4pBSGKTGsrCIiUaQNQltZQbRjVRNgkXaCz-W7n248BQi-37eDd9FKmRGQsFYRkE3U-U9q3IXiwsvNTMD9KSuTOm9x5kz_eJpjO8GdVw_gPKdcvD6_zzjfKKXI6</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3075437005</pqid></control><display><type>article</type><title>Bankruptcy prediction using optimal ensemble models under balanced and imbalanced data</title><source>Access via Wiley Online Library</source><creator>Amirshahi, Bahareh ; Lahmiri, Salim</creator><creatorcontrib>Amirshahi, Bahareh ; Lahmiri, Salim</creatorcontrib><description>This study explores the performance of gradient boosting methods in bankruptcy prediction for a highly imbalanced dataset. We developed different heterogenous ensemble models based on three popular gradient boosting methods—XGBoost, LightGBM, and CatBoost. Our ensemble models were optimized using the cross‐validation method and the results of the hold‐out test sets showed that the optimized ensemble models not only outperform their base learners, but also improve the state‐of‐the‐art benchmark results on the same dataset. Interestingly, we observed that the data oversampling technique that is commonly used to address the class imbalance issue had an adverse impact on our ensemble models' performance. 
This indicates that our models are robust to the imbalanced dataset problem that typically degrades the classification performance of machine learning models.</description><identifier>ISSN: 0266-4720</identifier><identifier>EISSN: 1468-0394</identifier><identifier>DOI: 10.1111/exsy.13599</identifier><language>eng</language><publisher>Oxford: Blackwell Publishing Ltd</publisher><subject>Bankruptcy ; bankruptcy prediction ; Datasets ; gradient boosting methods ; imbalanced dataset ; Machine learning ; optimal ensemble models ; Oversampling</subject><ispartof>Expert systems, 2024-08, Vol.41 (8), p.n/a</ispartof><rights>2024 The Authors. published by John Wiley & Sons Ltd.</rights><rights>2024. This article is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c2969-22c4e10f4a26eafbcb8d96808b746f11e7795ccf70e734cdeebbfd1cd41c07f23</cites><orcidid>0000-0002-9237-4100</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1111%2Fexsy.13599$$EPDF$$P50$$Gwiley$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1111%2Fexsy.13599$$EHTML$$P50$$Gwiley$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,1417,27924,27925,45574,45575</link.rule.ids></links><search><creatorcontrib>Amirshahi, Bahareh</creatorcontrib><creatorcontrib>Lahmiri, Salim</creatorcontrib><title>Bankruptcy prediction using optimal ensemble models under balanced and imbalanced data</title><title>Expert systems</title><description>This study explores the performance of gradient boosting methods in bankruptcy 
prediction for a highly imbalanced dataset. We developed different heterogenous ensemble models based on three popular gradient boosting methods—XGBoost, LightGBM, and CatBoost. Our ensemble models were optimized using the cross‐validation method and the results of the hold‐out test sets showed that the optimized ensemble models not only outperform their base learners, but also improve the state‐of‐the‐art benchmark results on the same dataset. Interestingly, we observed that the data oversampling technique that is commonly used to address the class imbalance issue had an adverse impact on our ensemble models' performance. This indicates that our models are robust to the imbalanced dataset problem that typically degrades the classification performance of machine learning models.</description><subject>Bankruptcy</subject><subject>bankruptcy prediction</subject><subject>Datasets</subject><subject>gradient boosting methods</subject><subject>imbalanced dataset</subject><subject>Machine learning</subject><subject>optimal ensemble models</subject><subject>Oversampling</subject><issn>0266-4720</issn><issn>1468-0394</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>24P</sourceid><sourceid>WIN</sourceid><recordid>eNp9kE1LxDAQhoMoWFcv_oKAN6Fr0qZJe9Rl_YAFD36gp5AmE-napjVp0f57u1Y8Opdh4JkZ3gehU0qWdKoL-ArjkqZZUeyhiDKexyQt2D6KSMJ5zERCDtFRCFtCCBWCR-j5Srl3P3S9HnHnwVS6r1qHh1C5N9x2fdWoGoML0JQ14KY1UAc8OAMel6pWToPByhlcNX-jUb06RgdW1QFOfvsCPV2vH1e38eb-5m51uYl1UvAiThLNgBLLVMJB2VKXuSl4TvJSMG4pBSGKTGsrCIiUaQNQltZQbRjVRNgkXaCz-W7n248BQi-37eDd9FKmRGQsFYRkE3U-U9q3IXiwsvNTMD9KSuTOm9x5kz_eJpjO8GdVw_gPKdcvD6_zzjfKKXI6</recordid><startdate>202408</startdate><enddate>202408</enddate><creator>Amirshahi, Bahareh</creator><creator>Lahmiri, Salim</creator><general>Blackwell Publishing 
Ltd</general><scope>24P</scope><scope>WIN</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7TB</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-9237-4100</orcidid></search><sort><creationdate>202408</creationdate><title>Bankruptcy prediction using optimal ensemble models under balanced and imbalanced data</title><author>Amirshahi, Bahareh ; Lahmiri, Salim</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2969-22c4e10f4a26eafbcb8d96808b746f11e7795ccf70e734cdeebbfd1cd41c07f23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Bankruptcy</topic><topic>bankruptcy prediction</topic><topic>Datasets</topic><topic>gradient boosting methods</topic><topic>imbalanced dataset</topic><topic>Machine learning</topic><topic>optimal ensemble models</topic><topic>Oversampling</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Amirshahi, Bahareh</creatorcontrib><creatorcontrib>Lahmiri, Salim</creatorcontrib><collection>Wiley Online Library Open Access</collection><collection>Wiley Online Library (Open Access Collection)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Expert 
systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Amirshahi, Bahareh</au><au>Lahmiri, Salim</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Bankruptcy prediction using optimal ensemble models under balanced and imbalanced data</atitle><jtitle>Expert systems</jtitle><date>2024-08</date><risdate>2024</risdate><volume>41</volume><issue>8</issue><epage>n/a</epage><issn>0266-4720</issn><eissn>1468-0394</eissn><abstract>This study explores the performance of gradient boosting methods in bankruptcy prediction for a highly imbalanced dataset. We developed different heterogenous ensemble models based on three popular gradient boosting methods—XGBoost, LightGBM, and CatBoost. Our ensemble models were optimized using the cross‐validation method and the results of the hold‐out test sets showed that the optimized ensemble models not only outperform their base learners, but also improve the state‐of‐the‐art benchmark results on the same dataset. Interestingly, we observed that the data oversampling technique that is commonly used to address the class imbalance issue had an adverse impact on our ensemble models' performance. This indicates that our models are robust to the imbalanced dataset problem that typically degrades the classification performance of machine learning models.</abstract><cop>Oxford</cop><pub>Blackwell Publishing Ltd</pub><doi>10.1111/exsy.13599</doi><tpages>25</tpages><orcidid>https://orcid.org/0000-0002-9237-4100</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0266-4720 |
ispartof | Expert systems, 2024-08, Vol.41 (8), p.n/a |
issn | 0266-4720 1468-0394 |
language | eng |
recordid | cdi_proquest_journals_3075437005 |
source | Access via Wiley Online Library |
subjects | Bankruptcy ; bankruptcy prediction ; Datasets ; gradient boosting methods ; imbalanced dataset ; Machine learning ; optimal ensemble models ; Oversampling |
title | Bankruptcy prediction using optimal ensemble models under balanced and imbalanced data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-23T05%3A14%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Bankruptcy%20prediction%20using%20optimal%20ensemble%20models%20under%20balanced%20and%20imbalanced%20data&rft.jtitle=Expert%20systems&rft.au=Amirshahi,%20Bahareh&rft.date=2024-08&rft.volume=41&rft.issue=8&rft.epage=n/a&rft.issn=0266-4720&rft.eissn=1468-0394&rft_id=info:doi/10.1111/exsy.13599&rft_dat=%3Cproquest_cross%3E3075437005%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3075437005&rft_id=info:pmid/&rfr_iscdi=true |
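
The abstract in this record describes combining several base learners into a heterogeneous ensemble and tuning that ensemble on held-back data. The record does not say how the authors combine the base learners, so the following is only a minimal sketch under stated assumptions: weighted soft voting over predicted probabilities, balanced accuracy as the selection metric, a single validation split in place of full cross-validation, and made-up probabilities standing in for the outputs of three fitted models. All function names and numbers are hypothetical.

```python
from itertools import product


def soft_vote(probs, weights):
    """Weighted average of per-model positive-class probabilities."""
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, probs)) / total
        for i in range(len(probs[0]))
    ]


def balanced_accuracy(y_true, y_prob, threshold=0.5):
    """Mean of recall on the positive class and recall on the negative class."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def best_weights(y_true, probs, grid=(0, 1, 2, 3)):
    """Grid-search integer voting weights, skipping the all-zero combination."""
    best = None
    for weights in product(grid, repeat=len(probs)):
        if sum(weights) == 0:
            continue
        score = balanced_accuracy(y_true, soft_vote(probs, weights))
        if best is None or score > best[0]:
            best = (score, weights)
    return best


# Hypothetical validation labels (imbalanced: 2 positives out of 8) and
# hypothetical probabilities from three base learners. Not data from the paper.
y_val = [0, 0, 0, 0, 0, 0, 1, 1]
p_a = [0.1, 0.2, 0.6, 0.3, 0.1, 0.2, 0.7, 0.4]
p_b = [0.2, 0.1, 0.3, 0.6, 0.2, 0.1, 0.4, 0.8]
p_c = [0.3, 0.3, 0.4, 0.4, 0.3, 0.2, 0.6, 0.6]

score, weights = best_weights(y_val, [p_a, p_b, p_c])
print("best balanced accuracy:", score, "weights:", weights)
```

Because the grid includes weight vectors that select a single model, the chosen ensemble can never score below its best base learner on the split used for selection. Whether that advantage carries over to unseen data is what a separate hold-out test set is meant to check.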