A novel tree-based dynamic heterogeneous ensemble method for credit scoring

• A tree-based heterogeneous ensemble credit scoring model is proposed.
• Advanced GBDT-based methods function as components of our proposal.
• An overfitting-cautious ensemble selection strategy is developed.
• Our proposal outperforms the benchmark models significantly in most cases.
• Our proposal is robu...

Detailed description

Bibliographic details
Published in: Expert systems with applications 2020-11, Vol.159, p.113615, Article 113615
Main authors: Xia, Yufei; Zhao, Junhao; He, Lingyun; Li, Yinguo; Niu, Mengyi
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page 113615
container_title Expert systems with applications
container_volume 159
creator Xia, Yufei
Zhao, Junhao
He, Lingyun
Li, Yinguo
Niu, Mengyi
description • A tree-based heterogeneous ensemble credit scoring model is proposed.
• Advanced GBDT-based methods function as components of our proposal.
• An overfitting-cautious ensemble selection strategy is developed.
• Our proposal outperforms the benchmark models significantly in most cases.
• Our proposal is robust to slight modification on base model and fitness function.
Ensemble models have been extensively applied to credit scoring. However, advanced tree-based classifiers have been seldom utilized as components of ensemble models. Moreover, few studies have considered dynamic ensemble selection. To fill the research gap, this paper aims to develop a novel tree-based overfitting-cautious heterogeneous ensemble model (i.e., OCHE) for credit scoring which departs from existing literature on base models and ensemble selection strategy. Regarding base models, tree-based techniques are employed to acquire a balance between predictive accuracy and computational cost. In terms of ensemble selection, the proposed method can assign weights to base models dynamically according to the overfitting measure. Validated on five public datasets, the proposed approach is compared with several popular benchmark models and selection strategies on predictive accuracy and computational cost measures. For predictive accuracy, the proposed approach outperforms the benchmark models significantly in most cases based on the non-parametric significance test. It also performs marginally better than several state-of-the-art studies. Our proposal remains robust in several scenarios. In terms of computational cost, the proposed method provides acceptable performance and benefits from GPU acceleration considerably.
doi_str_mv 10.1016/j.eswa.2020.113615
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2454516316</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0957417420304395</els_id><sourcerecordid>2454516316</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-cc784aec443bc0e33d2f7100a27b5788f0e59edf61a96198dc13b8ab4f0796473</originalsourceid><addsrcrecordid>eNp9kEtLw0AQgBdRsFb_gKcFz6mzj2QT8FKKLyx40fOy2Z20CU227m4r_ntT4lkYGBjmm8dHyC2DBQNW3HcLjN9mwYGPBSYKlp-RGSuVyApViXMygypXmWRKXpKrGDsApgDUjLwt6eCPuKMpIGa1ieio-xlM31q6xYTBb3BAf4gUh4h9vUPaY9p6RxsfqA3o2kSj9aEdNtfkojG7iDd_eU4-nx4_Vi_Z-v35dbVcZ1bwMmXWqlIatFKK2gIK4XijGIDhqs5VWTaAeYWuKZipClaVzjJRl6aWDaiqkErMyd00dx_81wFj0p0_hGFcqbnMZc4KMcac8KnLBh9jwEbvQ9ub8KMZ6JM03emTNH2SpidpI_QwQTjef2wx6GhbHOz4ZkCbtPPtf_gvWGt1MQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2454516316</pqid></control><display><type>article</type><title>A novel tree-based dynamic heterogeneous ensemble method for credit scoring</title><source>ScienceDirect Journals (5 years ago - present)</source><creator>Xia, Yufei ; Zhao, Junhao ; He, Lingyun ; Li, Yinguo ; Niu, Mengyi</creator><creatorcontrib>Xia, Yufei ; Zhao, Junhao ; He, Lingyun ; Li, Yinguo ; Niu, Mengyi</creatorcontrib><description>•A tree-based heterogeneous ensemble credit scoring model is proposed.•Advanced GBDT-based methods function as components of our proposal.•An overfitting-cautious ensemble selection strategy is developed.•Our proposal outperforms the benchmark models significantly in most cases.•Our proposal is robust to slight modification on base model and fitness function. Ensemble models have been extensively applied to credit scoring. However, advanced tree-based classifiers have been seldom utilized as components of ensemble models. Moreover, few studies have considered dynamic ensemble selection. 
To fill the research gap, this paper aims to develop a novel tree-based overfitting-cautious heterogeneous ensemble model (i.e., OCHE) for credit scoring which departs from existing literature on base models and ensemble selection strategy. Regarding base models, tree-based techniques are employed to acquire a balance between predictive accuracy and computational cost. In terms of ensemble selection, the proposed method can assign weights to base models dynamically according to the overfitting measure. Validated on five public datasets, the proposed approach is compared with several popular benchmark models and selection strategies on predictive accuracy and computational cost measures. For predictive accuracy, the proposed approach outperforms the benchmark models significantly in most cases based on the non-parametric significance test. It also performs marginally better than several state-of-the-art studies. Our proposal remains robust in several scenarios. In terms of computational cost, the proposed method provides acceptable performance and benefits from GPU acceleration considerably.</description><identifier>ISSN: 0957-4174</identifier><identifier>EISSN: 1873-6793</identifier><identifier>DOI: 10.1016/j.eswa.2020.113615</identifier><language>eng</language><publisher>New York: Elsevier Ltd</publisher><subject>Accuracy ; Benchmarks ; Computational efficiency ; Computing costs ; Credit scoring ; Gradient boosting decision tree ; Machine learning ; Random forests ; Selective ensemble ; State-of-the-art reviews</subject><ispartof>Expert systems with applications, 2020-11, Vol.159, p.113615, Article 113615</ispartof><rights>2020 Elsevier Ltd</rights><rights>Copyright Elsevier BV Nov 30, 
2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-cc784aec443bc0e33d2f7100a27b5788f0e59edf61a96198dc13b8ab4f0796473</citedby><cites>FETCH-LOGICAL-c328t-cc784aec443bc0e33d2f7100a27b5788f0e59edf61a96198dc13b8ab4f0796473</cites><orcidid>0000-0001-7805-8091</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.eswa.2020.113615$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3549,27923,27924,45994</link.rule.ids></links><search><creatorcontrib>Xia, Yufei</creatorcontrib><creatorcontrib>Zhao, Junhao</creatorcontrib><creatorcontrib>He, Lingyun</creatorcontrib><creatorcontrib>Li, Yinguo</creatorcontrib><creatorcontrib>Niu, Mengyi</creatorcontrib><title>A novel tree-based dynamic heterogeneous ensemble method for credit scoring</title><title>Expert systems with applications</title><description>•A tree-based heterogeneous ensemble credit scoring model is proposed.•Advanced GBDT-based methods function as components of our proposal.•An overfitting-cautious ensemble selection strategy is developed.•Our proposal outperforms the benchmark models significantly in most cases.•Our proposal is robust to slight modification on base model and fitness function. Ensemble models have been extensively applied to credit scoring. However, advanced tree-based classifiers have been seldom utilized as components of ensemble models. Moreover, few studies have considered dynamic ensemble selection. To fill the research gap, this paper aims to develop a novel tree-based overfitting-cautious heterogeneous ensemble model (i.e., OCHE) for credit scoring which departs from existing literature on base models and ensemble selection strategy. 
Regarding base models, tree-based techniques are employed to acquire a balance between predictive accuracy and computational cost. In terms of ensemble selection, the proposed method can assign weights to base models dynamically according to the overfitting measure. Validated on five public datasets, the proposed approach is compared with several popular benchmark models and selection strategies on predictive accuracy and computational cost measures. For predictive accuracy, the proposed approach outperforms the benchmark models significantly in most cases based on the non-parametric significance test. It also performs marginally better than several state-of-the-art studies. Our proposal remains robust in several scenarios. In terms of computational cost, the proposed method provides acceptable performance and benefits from GPU acceleration considerably.</description><subject>Accuracy</subject><subject>Benchmarks</subject><subject>Computational efficiency</subject><subject>Computing costs</subject><subject>Credit scoring</subject><subject>Gradient boosting decision tree</subject><subject>Machine learning</subject><subject>Random forests</subject><subject>Selective ensemble</subject><subject>State-of-the-art reviews</subject><issn>0957-4174</issn><issn>1873-6793</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLw0AQgBdRsFb_gKcFz6mzj2QT8FKKLyx40fOy2Z20CU227m4r_ntT4lkYGBjmm8dHyC2DBQNW3HcLjN9mwYGPBSYKlp-RGSuVyApViXMygypXmWRKXpKrGDsApgDUjLwt6eCPuKMpIGa1ieio-xlM31q6xYTBb3BAf4gUh4h9vUPaY9p6RxsfqA3o2kSj9aEdNtfkojG7iDd_eU4-nx4_Vi_Z-v35dbVcZ1bwMmXWqlIatFKK2gIK4XijGIDhqs5VWTaAeYWuKZipClaVzjJRl6aWDaiqkErMyd00dx_81wFj0p0_hGFcqbnMZc4KMcac8KnLBh9jwEbvQ9ub8KMZ6JM03emTNH2SpidpI_QwQTjef2wx6GhbHOz4ZkCbtPPtf_gvWGt1MQ</recordid><startdate>20201130</startdate><enddate>20201130</enddate><creator>Xia, Yufei</creator><creator>Zhao, Junhao</creator><creator>He, Lingyun</creator><creator>Li, 
Yinguo</creator><creator>Niu, Mengyi</creator><general>Elsevier Ltd</general><general>Elsevier BV</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-7805-8091</orcidid></search><sort><creationdate>20201130</creationdate><title>A novel tree-based dynamic heterogeneous ensemble method for credit scoring</title><author>Xia, Yufei ; Zhao, Junhao ; He, Lingyun ; Li, Yinguo ; Niu, Mengyi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-cc784aec443bc0e33d2f7100a27b5788f0e59edf61a96198dc13b8ab4f0796473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Accuracy</topic><topic>Benchmarks</topic><topic>Computational efficiency</topic><topic>Computing costs</topic><topic>Credit scoring</topic><topic>Gradient boosting decision tree</topic><topic>Machine learning</topic><topic>Random forests</topic><topic>Selective ensemble</topic><topic>State-of-the-art reviews</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xia, Yufei</creatorcontrib><creatorcontrib>Zhao, Junhao</creatorcontrib><creatorcontrib>He, Lingyun</creatorcontrib><creatorcontrib>Li, Yinguo</creatorcontrib><creatorcontrib>Niu, Mengyi</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Expert systems with applications</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Xia, Yufei</au><au>Zhao, Junhao</au><au>He, Lingyun</au><au>Li, Yinguo</au><au>Niu, Mengyi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A novel tree-based dynamic heterogeneous ensemble method for credit scoring</atitle><jtitle>Expert systems with applications</jtitle><date>2020-11-30</date><risdate>2020</risdate><volume>159</volume><spage>113615</spage><pages>113615-</pages><artnum>113615</artnum><issn>0957-4174</issn><eissn>1873-6793</eissn><abstract>•A tree-based heterogeneous ensemble credit scoring model is proposed.•Advanced GBDT-based methods function as components of our proposal.•An overfitting-cautious ensemble selection strategy is developed.•Our proposal outperforms the benchmark models significantly in most cases.•Our proposal is robust to slight modification on base model and fitness function. Ensemble models have been extensively applied to credit scoring. However, advanced tree-based classifiers have been seldom utilized as components of ensemble models. Moreover, few studies have considered dynamic ensemble selection. To fill the research gap, this paper aims to develop a novel tree-based overfitting-cautious heterogeneous ensemble model (i.e., OCHE) for credit scoring which departs from existing literature on base models and ensemble selection strategy. Regarding base models, tree-based techniques are employed to acquire a balance between predictive accuracy and computational cost. In terms of ensemble selection, the proposed method can assign weights to base models dynamically according to the overfitting measure. Validated on five public datasets, the proposed approach is compared with several popular benchmark models and selection strategies on predictive accuracy and computational cost measures. 
For predictive accuracy, the proposed approach outperforms the benchmark models significantly in most cases based on the non-parametric significance test. It also performs marginally better than several state-of-the-art studies. Our proposal remains robust in several scenarios. In terms of computational cost, the proposed method provides acceptable performance and benefits from GPU acceleration considerably.</abstract><cop>New York</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.eswa.2020.113615</doi><orcidid>https://orcid.org/0000-0001-7805-8091</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0957-4174
ispartof Expert systems with applications, 2020-11, Vol.159, p.113615, Article 113615
issn 0957-4174
1873-6793
language eng
recordid cdi_proquest_journals_2454516316
source ScienceDirect Journals (5 years ago - present)
subjects Accuracy
Benchmarks
Computational efficiency
Computing costs
Credit scoring
Gradient boosting decision tree
Machine learning
Random forests
Selective ensemble
State-of-the-art reviews
title A novel tree-based dynamic heterogeneous ensemble method for credit scoring
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T10%3A26%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20novel%20tree-based%20dynamic%20heterogeneous%20ensemble%20method%20for%20credit%20scoring&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Xia,%20Yufei&rft.date=2020-11-30&rft.volume=159&rft.spage=113615&rft.pages=113615-&rft.artnum=113615&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2020.113615&rft_dat=%3Cproquest_cross%3E2454516316%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2454516316&rft_id=info:pmid/&rft_els_id=S0957417420304395&rfr_iscdi=true