Optimization of cluster-based evolutionary undersampling for the artificial neural networks in corporate bankruptcy prediction

• We examined the effectiveness of an optimized cluster-based undersampling technique.
• We used a GA-based optimization approach for selecting the appropriate instances.
• A critical issue of real-world knowledge extraction is the data imbalance problem.
• The proposed method is successfully applied to the b...

Bibliographic details
Published in:	Expert systems with applications 2016-10, Vol.59, p.226-234
Main authors:	Kim, Hyun-Jung; Jo, Nam-Ok; Shin, Kyung-Shik
Format:	Article
Language:	eng
Subjects:	Artificial neural networks; Bankruptcies; Classification; Cluster-based undersampling technique; Clustering; Corporate bankruptcy prediction; Data mining; Data structures; Evolutionary; Genetic algorithms; Imbalance data; Optimization
Online access:	Full text

container_end_page	234
container_issue
container_start_page	226
container_title	Expert systems with applications
container_volume	59
creator	Kim, Hyun-Jung; Jo, Nam-Ok; Shin, Kyung-Shik
description	• We examined the effectiveness of an optimized cluster-based undersampling technique. • We used a GA-based optimization approach for selecting the appropriate instances. • A critical issue of real-world knowledge extraction is the data imbalance problem. • The proposed method is successfully applied to the bankruptcy prediction problem. We suggest an optimization approach of cluster-based undersampling to select appropriate instances. This approach can solve the data imbalance problem, which can lead to knowledge extraction for improving the performance of existing data mining techniques. Although data mining techniques among various big data analytics technologies have been successfully applied and proven in terms of classification performance in various domains, such as marketing, accounting and finance, the data imbalance problem has been regarded as one of the most important issues to be considered. We examined the effectiveness of a hybrid method using a clustering technique and genetic algorithms based on the artificial neural networks model to balance the proportion between the minority class and majority class. The objective of this paper is to constitute the most suitable training dataset for both decreasing data imbalance and improving the classification accuracy. We extracted the properly balanced dataset composed of optimal or near-optimal instances for the artificial neural networks model. The main contribution of the proposed method is that we extract explorative knowledge based on recognition of the data structure and categorize instances through the clustering technique while performing simultaneous optimization for the artificial neural networks modeling. In addition, we can easily understand why the instances are selected, because the rule-format knowledge representation increases the expressive power of the instance selection criteria. The proposed method is successfully applied to the bankruptcy prediction problem using financial data for which the proportion of small- and medium-sized bankruptcy firms in the manufacturing industry is extremely small compared to that of non-bankruptcy firms.
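The hybrid idea summarized in the description (cluster the majority class, then let a genetic algorithm decide which instances to keep) can be illustrated with a rough sketch. The Python code below is not the authors' implementation, and the record does not give their encoding or parameters. Everything in it is an assumption made for illustration: the synthetic two-dimensional data, the plain k-means routine, the chromosome that stores one "keep count" per cluster, the penalty weight `lam`, and a nearest-centroid classifier used as a cheap stand-in for the artificial neural network.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k, rng, iters=20):
    """Plain k-means; returns non-empty clusters, most central points first."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[idx].append(p)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return [sorted(c, key=lambda p, ctr=ctr: dist2(p, ctr))
            for c, ctr in zip(clusters, centers) if c]

def select(clusters, genes):
    """Keep the genes[i] most central points of cluster i."""
    return [p for c, g in zip(clusters, genes) for p in c[:g]]

def balanced_accuracy(train_maj, train_min, val_maj, val_min):
    """Nearest-centroid classifier (stand-in for the ANN) scored on validation data."""
    c_maj, c_min = mean(train_maj), mean(train_min)
    hit_maj = sum(dist2(p, c_maj) <= dist2(p, c_min) for p in val_maj)
    hit_min = sum(dist2(p, c_min) < dist2(p, c_maj) for p in val_min)
    return 0.5 * (hit_maj / len(val_maj) + hit_min / len(val_min))

def evolve(clusters, train_min, val_maj, val_min, rng,
           pop_size=20, generations=30, lam=0.5):
    """Toy GA: one integer gene per cluster = number of instances to keep."""
    n_maj = sum(len(c) for c in clusters)

    def fitness(genes):
        sel = select(clusters, genes)
        imbalance = abs(len(sel) - len(train_min)) / n_maj
        return balanced_accuracy(sel, train_min, val_maj, val_min) - lam * imbalance

    pop = [[rng.randint(1, len(c)) for c in clusters] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(len(child))                     # mutate one gene
            child[i] = rng.randint(1, len(clusters[i]))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

def make_blob(rng, center, n, spread=1.0):
    """Synthetic Gaussian blob; purely illustrative data."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]

rng = random.Random(0)
majority = make_blob(rng, (0.0, 0.0), 200)   # e.g. non-bankrupt firms
minority = make_blob(rng, (3.0, 3.0), 20)    # e.g. bankrupt firms
train_maj, val_maj = majority[:150], majority[150:]
train_min, val_min = minority[:15], minority[15:]

clusters = kmeans(train_maj, 4, rng)
best = evolve(clusters, train_min, val_maj, val_min, rng)
balanced = select(clusters, best)
print(len(train_maj), "majority instances reduced to", len(balanced),
      "against", len(train_min), "minority instances")
```

Encoding one keep count per cluster keeps the chromosome short and guarantees that every cluster contributes at least one instance. A bit mask over individual instances would be a finer-grained alternative at the cost of a much larger search space.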
doi_str_mv	10.1016/j.eswa.2016.04.027
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1825496340</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0957417416301932</els_id><sourcerecordid>1825496340</sourcerecordid><originalsourceid>FETCH-LOGICAL-c447t-cbfbc2440bbf37eaa8a2bf9347b0fcc907004187b4b66ae3c651cb17759f368d3</originalsourceid><addsrcrecordid>eNp9kEFP3DAQha2qlboF_gAnH3tJsBNvvJF6QQjaSkhcytmyJ-PiJRuHsQOiB357Hbbnnt5I896TvsfYuRS1FLK72NeYXmzdlLsWqhaN_sA2cqfbqtN9-5FtRL_VlZJafWZfUtoLIbUQesPe7uYcDuGPzSFOPHoO45IyUuVswoHjcxyX9WXplS_TgJTsYR7D9Jv7SDw_ILeUgw8Q7MgnXOhd8kukx8TDxCHSHMlm5M5Oj7TMGV75TDgEWGtP2Sdvx4Rn__SE3d9c_7r6Ud3eff95dXlbgVI6V-C8g0Yp4ZxvNVq7s43zfau0Ex6gFwVGFVynXNdZbKHbSnBS623v2243tCfs67F3pvi0YMrmEBLgONoJ45KM3DVb1XetEsXaHK1AMSVCb2YKh8JvpDDr2GZv1rHNOrYRypSxS-jbMYQF4jkgmQQBJyichJDNEMP_4n8Br-WNHg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1825496340</pqid></control><display><type>article</type><title>Optimization of cluster-based evolutionary undersampling for the artificial neural networks in corporate bankruptcy prediction</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Kim, Hyun-Jung ; Jo, Nam-Ok ; Shin, Kyung-Shik</creator><creatorcontrib>Kim, Hyun-Jung ; Jo, Nam-Ok ; Shin, Kyung-Shik</creatorcontrib><description>•We examined the effectiveness an optimized cluster-based undersampling technique.•We used a GA-based optimization approach for selecting the appropriate instances.•A critical issue of real-world knowledge extraction is the data imbalance problem.•The proposed method is successfully applied to the bankruptcy prediction problem. We suggest an optimization approach of cluster-based undersampling to select appropriate instances. This approach can solve the data imbalance problem, which can lead to knowledge extraction for improving the performance of existing data mining techniques. 
Although data mining techniques among various big data analytics technologies have been successfully applied and proven in terms of classification performance in various domains, such as marketing, accounting and finance areas, the data imbalance problem has been regarded as one of the most important issues to be considered. We examined the effectiveness of a hybrid method using a clustering technique and genetic algorithms based on the artificial neural networks model to balance the proportion between the minority class and majority class. The objective of this paper is to constitute the best suitable training dataset for both decreasing data imbalance and improving the classification accuracy. We extracted the properly balanced dataset composed of optimal or near-optimal instances for the artificial neural networks model. The main contribution of the proposed method is that we extract explorative knowledge based on recognition of the data structure and categorize instances through the clustering technique while performing simultaneous optimization for the artificial neural networks modeling. In addition, we can easily understand why the instances are selected by the rule-format knowledge representation increasing the expressive power of the criteria of selecting instances. 
The proposed method is successfully applied to the bankruptcy prediction problem using financial data for which the proportion of small- and medium-sized bankruptcy firms in the manufacturing industry is extremely small compared to that of non-bankruptcy firms.</description><identifier>ISSN: 0957-4174</identifier><identifier>EISSN: 1873-6793</identifier><identifier>DOI: 10.1016/j.eswa.2016.04.027</identifier><language>eng</language><publisher>Elsevier Ltd</publisher><subject>Artificial neural networks ; Bankruptcies ; Classification ; Cluster-based undersampling technique ; Clustering ; Corporate bankruptcy prediction ; Data mining ; Data structures ; Evolutionary ; Genetic algorithms ; Imbalance data ; Optimization</subject><ispartof>Expert systems with applications, 2016-10, Vol.59, p.226-234</ispartof><rights>2016 Elsevier Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c447t-cbfbc2440bbf37eaa8a2bf9347b0fcc907004187b4b66ae3c651cb17759f368d3</citedby><cites>FETCH-LOGICAL-c447t-cbfbc2440bbf37eaa8a2bf9347b0fcc907004187b4b66ae3c651cb17759f368d3</cites><orcidid>0000-0002-2312-5274</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.eswa.2016.04.027$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Kim, Hyun-Jung</creatorcontrib><creatorcontrib>Jo, Nam-Ok</creatorcontrib><creatorcontrib>Shin, Kyung-Shik</creatorcontrib><title>Optimization of cluster-based evolutionary undersampling for the artificial neural networks in corporate bankruptcy prediction</title><title>Expert systems with applications</title><description>•We examined the effectiveness an optimized cluster-based undersampling technique.•We used a GA-based optimization approach for selecting 
the appropriate instances.•A critical issue of real-world knowledge extraction is the data imbalance problem.•The proposed method is successfully applied to the bankruptcy prediction problem. We suggest an optimization approach of cluster-based undersampling to select appropriate instances. This approach can solve the data imbalance problem, which can lead to knowledge extraction for improving the performance of existing data mining techniques. Although data mining techniques among various big data analytics technologies have been successfully applied and proven in terms of classification performance in various domains, such as marketing, accounting and finance areas, the data imbalance problem has been regarded as one of the most important issues to be considered. We examined the effectiveness of a hybrid method using a clustering technique and genetic algorithms based on the artificial neural networks model to balance the proportion between the minority class and majority class. The objective of this paper is to constitute the best suitable training dataset for both decreasing data imbalance and improving the classification accuracy. We extracted the properly balanced dataset composed of optimal or near-optimal instances for the artificial neural networks model. The main contribution of the proposed method is that we extract explorative knowledge based on recognition of the data structure and categorize instances through the clustering technique while performing simultaneous optimization for the artificial neural networks modeling. In addition, we can easily understand why the instances are selected by the rule-format knowledge representation increasing the expressive power of the criteria of selecting instances. 
The proposed method is successfully applied to the bankruptcy prediction problem using financial data for which the proportion of small- and medium-sized bankruptcy firms in the manufacturing industry is extremely small compared to that of non-bankruptcy firms.</description><subject>Artificial neural networks</subject><subject>Bankruptcies</subject><subject>Classification</subject><subject>Cluster-based undersampling technique</subject><subject>Clustering</subject><subject>Corporate bankruptcy prediction</subject><subject>Data mining</subject><subject>Data structures</subject><subject>Evolutionary</subject><subject>Genetic algorithms</subject><subject>Imbalance data</subject><subject>Optimization</subject><issn>0957-4174</issn><issn>1873-6793</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNp9kEFP3DAQha2qlboF_gAnH3tJsBNvvJF6QQjaSkhcytmyJ-PiJRuHsQOiB357Hbbnnt5I896TvsfYuRS1FLK72NeYXmzdlLsWqhaN_sA2cqfbqtN9-5FtRL_VlZJafWZfUtoLIbUQesPe7uYcDuGPzSFOPHoO45IyUuVswoHjcxyX9WXplS_TgJTsYR7D9Jv7SDw_ILeUgw8Q7MgnXOhd8kukx8TDxCHSHMlm5M5Oj7TMGV75TDgEWGtP2Sdvx4Rn__SE3d9c_7r6Ud3eff95dXlbgVI6V-C8g0Yp4ZxvNVq7s43zfau0Ex6gFwVGFVynXNdZbKHbSnBS623v2243tCfs67F3pvi0YMrmEBLgONoJ45KM3DVb1XetEsXaHK1AMSVCb2YKh8JvpDDr2GZv1rHNOrYRypSxS-jbMYQF4jkgmQQBJyichJDNEMP_4n8Br-WNHg</recordid><startdate>20161015</startdate><enddate>20161015</enddate><creator>Kim, Hyun-Jung</creator><creator>Jo, Nam-Ok</creator><creator>Shin, Kyung-Shik</creator><general>Elsevier Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2312-5274</orcidid></search><sort><creationdate>20161015</creationdate><title>Optimization of cluster-based evolutionary undersampling for the artificial neural networks in corporate bankruptcy prediction</title><author>Kim, Hyun-Jung ; Jo, Nam-Ok ; Shin, 
Kyung-Shik</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c447t-cbfbc2440bbf37eaa8a2bf9347b0fcc907004187b4b66ae3c651cb17759f368d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Artificial neural networks</topic><topic>Bankruptcies</topic><topic>Classification</topic><topic>Cluster-based undersampling technique</topic><topic>Clustering</topic><topic>Corporate bankruptcy prediction</topic><topic>Data mining</topic><topic>Data structures</topic><topic>Evolutionary</topic><topic>Genetic algorithms</topic><topic>Imbalance data</topic><topic>Optimization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kim, Hyun-Jung</creatorcontrib><creatorcontrib>Jo, Nam-Ok</creatorcontrib><creatorcontrib>Shin, Kyung-Shik</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Expert systems with applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kim, Hyun-Jung</au><au>Jo, Nam-Ok</au><au>Shin, Kyung-Shik</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimization of cluster-based evolutionary undersampling for the artificial neural networks in corporate bankruptcy prediction</atitle><jtitle>Expert systems with 
applications</jtitle><date>2016-10-15</date><risdate>2016</risdate><volume>59</volume><spage>226</spage><epage>234</epage><pages>226-234</pages><issn>0957-4174</issn><eissn>1873-6793</eissn><abstract>•We examined the effectiveness an optimized cluster-based undersampling technique.•We used a GA-based optimization approach for selecting the appropriate instances.•A critical issue of real-world knowledge extraction is the data imbalance problem.•The proposed method is successfully applied to the bankruptcy prediction problem. We suggest an optimization approach of cluster-based undersampling to select appropriate instances. This approach can solve the data imbalance problem, which can lead to knowledge extraction for improving the performance of existing data mining techniques. Although data mining techniques among various big data analytics technologies have been successfully applied and proven in terms of classification performance in various domains, such as marketing, accounting and finance areas, the data imbalance problem has been regarded as one of the most important issues to be considered. We examined the effectiveness of a hybrid method using a clustering technique and genetic algorithms based on the artificial neural networks model to balance the proportion between the minority class and majority class. The objective of this paper is to constitute the best suitable training dataset for both decreasing data imbalance and improving the classification accuracy. We extracted the properly balanced dataset composed of optimal or near-optimal instances for the artificial neural networks model. The main contribution of the proposed method is that we extract explorative knowledge based on recognition of the data structure and categorize instances through the clustering technique while performing simultaneous optimization for the artificial neural networks modeling. 
In addition, we can easily understand why the instances are selected by the rule-format knowledge representation increasing the expressive power of the criteria of selecting instances. The proposed method is successfully applied to the bankruptcy prediction problem using financial data for which the proportion of small- and medium-sized bankruptcy firms in the manufacturing industry is extremely small compared to that of non-bankruptcy firms.</abstract><pub>Elsevier Ltd</pub><doi>10.1016/j.eswa.2016.04.027</doi><tpages>9</tpages><orcidid>https://orcid.org/0000-0002-2312-5274</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 0957-4174
ispartof	Expert systems with applications, 2016-10, Vol.59, p.226-234
issn	0957-4174 1873-6793
language	eng
recordid	cdi_proquest_miscellaneous_1825496340
source	Elsevier ScienceDirect Journals Complete
subjects	Artificial neural networks; Bankruptcies; Classification; Cluster-based undersampling technique; Clustering; Corporate bankruptcy prediction; Data mining; Data structures; Evolutionary; Genetic algorithms; Imbalance data; Optimization
title	Optimization of cluster-based evolutionary undersampling for the artificial neural networks in corporate bankruptcy prediction
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T23%3A34%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimization%20of%20cluster-based%20evolutionary%20undersampling%20for%20the%20artificial%20neural%20networks%20in%20corporate%20bankruptcy%20prediction&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Kim,%20Hyun-Jung&rft.date=2016-10-15&rft.volume=59&rft.spage=226&rft.epage=234&rft.pages=226-234&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2016.04.027&rft_dat=%3Cproquest_cross%3E1825496340%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1825496340&rft_id=info:pmid/&rft_els_id=S0957417416301932&rfr_iscdi=true