Software Defect Prediction Based on Gated Hierarchical LSTMs

Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic...

Bibliographic details
Published in:	IEEE transactions on reliability 2021-06, Vol.70 (2), p.711-727
Main authors:	Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang
Format:	Article
Language:	eng
Subjects:	Abstract syntax tree (AST) ; Feature extraction ; hierarchical model ; Logic gates ; long short-term memory networks (LSTM) ; Networks ; Neurons ; Prediction models ; Predictive models ; Recurrent neural networks ; Semantics ; Software ; software defect prediction ; Source code

container_end_page	727
container_issue	2
container_start_page	711
container_title	IEEE transactions on reliability
container_volume	70
creator	Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang
description	Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.
doi_str_mv	10.1109/TR.2020.3047396
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_9326336</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9326336</ieee_id><sourcerecordid>2539351831</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</originalsourceid><addsrcrecordid>eNo9kEFLAzEQhYMoWKtnD14WPG-byWw2CXjRqlWoKO16DjE7i1tqtyZbxH9vSouneQPvzTw-xi6BjwC4GVfzkeCCj5AXCk15xAYgpc5BCThmA85B50YKc8rOYlymtSiMHrCbRdf0Py5Qdk8N-T57C1S3vm-7dXbnItVZElPXJ_HUUnDBf7berbLZonqJ5-ykcatIF4c5ZO-PD9XkKZ-9Tp8nt7PcC236HDWUQKqRXtYgGq04odJSKY-pqXdN4QswghsO8gOF0Nw7VZfKGcOl8g6H7Hp_dxO67y3F3i67bVinl1ZINChBIyTXeO_yoYsxUGM3of1y4dcCtztEtprbHSJ7QJQSV_tES0T_boOiRCzxD6LJXu8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2539351831</pqid></control><display><type>article</type><title>Software Defect Prediction Based on Gated Hierarchical LSTMs</title><source>IEEE Electronic Library (IEL)</source><creator>Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang</creator><creatorcontrib>Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang</creatorcontrib><description>Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. 
In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.</description><identifier>ISSN: 0018-9529</identifier><identifier>EISSN: 1558-1721</identifier><identifier>DOI: 10.1109/TR.2020.3047396</identifier><identifier>CODEN: IERQAD</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Abstract syntax tree (AST) ; Feature extraction ; hierarchical model ; Logic gates ; long short-term memory networks (LSTM) ; Networks ; Neurons ; Prediction models ; Predictive models ; Recurrent neural networks ; Semantics ; Software ; software defect prediction ; Source code</subject><ispartof>IEEE transactions on reliability, 2021-06, Vol.70 (2), p.711-727</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</citedby><cites>FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</cites><orcidid>0000-0002-8667-0456</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9326336$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9326336$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Wang, Hao</creatorcontrib><creatorcontrib>Zhuang, Weiyuan</creatorcontrib><creatorcontrib>Zhang, Xiaofang</creatorcontrib><title>Software Defect Prediction Based on Gated Hierarchical LSTMs</title><title>IEEE transactions on reliability</title><addtitle>TR</addtitle><description>Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. 
In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.</description><subject>Abstract syntax tree (AST)</subject><subject>Feature extraction</subject><subject>hierarchical model</subject><subject>Logic gates</subject><subject>long short-term memory networks (LSTM)</subject><subject>Networks</subject><subject>Neurons</subject><subject>Prediction models</subject><subject>Predictive models</subject><subject>Recurrent neural networks</subject><subject>Semantics</subject><subject>Software</subject><subject>software defect prediction</subject><subject>Source code</subject><issn>0018-9529</issn><issn>1558-1721</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kEFLAzEQhYMoWKtnD14WPG-byWw2CXjRqlWoKO16DjE7i1tqtyZbxH9vSouneQPvzTw-xi6BjwC4GVfzkeCCj5AXCk15xAYgpc5BCThmA85B50YKc8rOYlymtSiMHrCbRdf0Py5Qdk8N-T57C1S3vm-7dXbnItVZElPXJ_HUUnDBf7berbLZonqJ5-ykcatIF4c5ZO-PD9XkKZ-9Tp8nt7PcC236HDWUQKqRXtYgGq04odJSKY-pqXdN4QswghsO8gOF0Nw7VZfKGcOl8g6H7Hp_dxO67y3F3i67bVinl1ZINChBIyTXeO_yoYsxUGM3of1y4dcCtztEtprbHSJ7QJQSV_tES0T_boOiRCzxD6LJXu8</recordid><startdate>202106</startdate><enddate>202106</enddate><creator>Wang, Hao</creator><creator>Zhuang, Weiyuan</creator><creator>Zhang, Xiaofang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0002-8667-0456</orcidid></search><sort><creationdate>202106</creationdate><title>Software Defect Prediction Based on Gated Hierarchical LSTMs</title><author>Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Abstract syntax tree (AST)</topic><topic>Feature extraction</topic><topic>hierarchical model</topic><topic>Logic gates</topic><topic>long short-term memory networks (LSTM)</topic><topic>Networks</topic><topic>Neurons</topic><topic>Prediction models</topic><topic>Predictive models</topic><topic>Recurrent neural networks</topic><topic>Semantics</topic><topic>Software</topic><topic>software defect prediction</topic><topic>Source code</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Hao</creatorcontrib><creatorcontrib>Zhuang, Weiyuan</creatorcontrib><creatorcontrib>Zhang, Xiaofang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on reliability</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Hao</au><au>Zhuang, 
Weiyuan</au><au>Zhang, Xiaofang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Software Defect Prediction Based on Gated Hierarchical LSTMs</atitle><jtitle>IEEE transactions on reliability</jtitle><stitle>TR</stitle><date>2021-06</date><risdate>2021</risdate><volume>70</volume><issue>2</issue><spage>711</spage><epage>727</epage><pages>711-727</pages><issn>0018-9529</issn><eissn>1558-1721</eissn><coden>IERQAD</coden><abstract>Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TR.2020.3047396</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-8667-0456</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9529
ispartof	IEEE transactions on reliability, 2021-06, Vol.70 (2), p.711-727
issn	0018-9529 1558-1721
language	eng
recordid	cdi_ieee_primary_9326336
source	IEEE Electronic Library (IEL)
subjects	Abstract syntax tree (AST) ; Feature extraction ; hierarchical model ; Logic gates ; long short-term memory networks (LSTM) ; Networks ; Neurons ; Prediction models ; Predictive models ; Recurrent neural networks ; Semantics ; Software ; software defect prediction ; Source code
title	Software Defect Prediction Based on Gated Hierarchical LSTMs
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-20T03%3A43%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Software%20Defect%20Prediction%20Based%20on%20Gated%20Hierarchical%20LSTMs&rft.jtitle=IEEE%20transactions%20on%20reliability&rft.au=Wang,%20Hao&rft.date=2021-06&rft.volume=70&rft.issue=2&rft.spage=711&rft.epage=727&rft.pages=711-727&rft.issn=0018-9529&rft.eissn=1558-1721&rft.coden=IERQAD&rft_id=info:doi/10.1109/TR.2020.3047396&rft_dat=%3Cproquest_RIE%3E2539351831%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2539351831&rft_id=info:pmid/&rft_ieee_id=9326336&rfr_iscdi=true