Software Defect Prediction Based on Gated Hierarchical LSTMs

Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic...

Full description

Bibliographic details
Published in: IEEE transactions on reliability 2021-06, Vol.70 (2), p.711-727
Main authors: Wang, Hao; Zhuang, Weiyuan; Zhang, Xiaofang
Format: Article
Language: eng
container_end_page 727
container_issue 2
container_start_page 711
container_title IEEE transactions on reliability
container_volume 70
creator Wang, Hao
Zhuang, Weiyuan
Zhang, Xiaofang
description Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.
doi_str_mv 10.1109/TR.2020.3047396
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_9326336</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9326336</ieee_id><sourcerecordid>2539351831</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</originalsourceid><addsrcrecordid>eNo9kEFLAzEQhYMoWKtnD14WPG-byWw2CXjRqlWoKO16DjE7i1tqtyZbxH9vSouneQPvzTw-xi6BjwC4GVfzkeCCj5AXCk15xAYgpc5BCThmA85B50YKc8rOYlymtSiMHrCbRdf0Py5Qdk8N-T57C1S3vm-7dXbnItVZElPXJ_HUUnDBf7berbLZonqJ5-ykcatIF4c5ZO-PD9XkKZ-9Tp8nt7PcC236HDWUQKqRXtYgGq04odJSKY-pqXdN4QswghsO8gOF0Nw7VZfKGcOl8g6H7Hp_dxO67y3F3i67bVinl1ZINChBIyTXeO_yoYsxUGM3of1y4dcCtztEtprbHSJ7QJQSV_tES0T_boOiRCzxD6LJXu8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2539351831</pqid></control><display><type>article</type><title>Software Defect Prediction Based on Gated Hierarchical LSTMs</title><source>IEEE Electronic Library (IEL)</source><creator>Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang</creator><creatorcontrib>Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang</creatorcontrib><description>Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. 
In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.</description><identifier>ISSN: 0018-9529</identifier><identifier>EISSN: 1558-1721</identifier><identifier>DOI: 10.1109/TR.2020.3047396</identifier><identifier>CODEN: IERQAD</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Abstract syntax tree (AST) ; Feature extraction ; hierarchical model ; Logic gates ; long short-term memory networks (LSTM) ; Networks ; Neurons ; Prediction models ; Predictive models ; Recurrent neural networks ; Semantics ; Software ; software defect prediction ; Source code</subject><ispartof>IEEE transactions on reliability, 2021-06, Vol.70 (2), p.711-727</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</citedby><cites>FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</cites><orcidid>0000-0002-8667-0456</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9326336$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9326336$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Wang, Hao</creatorcontrib><creatorcontrib>Zhuang, Weiyuan</creatorcontrib><creatorcontrib>Zhang, Xiaofang</creatorcontrib><title>Software Defect Prediction Based on Gated Hierarchical LSTMs</title><title>IEEE transactions on reliability</title><addtitle>TR</addtitle><description>Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. 
In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.</description><subject>Abstract syntax tree (AST)</subject><subject>Feature extraction</subject><subject>hierarchical model</subject><subject>Logic gates</subject><subject>long short-term memory networks (LSTM)</subject><subject>Networks</subject><subject>Neurons</subject><subject>Prediction models</subject><subject>Predictive models</subject><subject>Recurrent neural networks</subject><subject>Semantics</subject><subject>Software</subject><subject>software defect prediction</subject><subject>Source code</subject><issn>0018-9529</issn><issn>1558-1721</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kEFLAzEQhYMoWKtnD14WPG-byWw2CXjRqlWoKO16DjE7i1tqtyZbxH9vSouneQPvzTw-xi6BjwC4GVfzkeCCj5AXCk15xAYgpc5BCThmA85B50YKc8rOYlymtSiMHrCbRdf0Py5Qdk8N-T57C1S3vm-7dXbnItVZElPXJ_HUUnDBf7berbLZonqJ5-ykcatIF4c5ZO-PD9XkKZ-9Tp8nt7PcC236HDWUQKqRXtYgGq04odJSKY-pqXdN4QswghsO8gOF0Nw7VZfKGcOl8g6H7Hp_dxO67y3F3i67bVinl1ZINChBIyTXeO_yoYsxUGM3of1y4dcCtztEtprbHSJ7QJQSV_tES0T_boOiRCzxD6LJXu8</recordid><startdate>202106</startdate><enddate>202106</enddate><creator>Wang, Hao</creator><creator>Zhuang, Weiyuan</creator><creator>Zhang, Xiaofang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0002-8667-0456</orcidid></search><sort><creationdate>202106</creationdate><title>Software Defect Prediction Based on Gated Hierarchical LSTMs</title><author>Wang, Hao ; Zhuang, Weiyuan ; Zhang, Xiaofang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-38161e7f5c5d12f870e378577c3739caf4c419209015b32280ca7d67a99057ca3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Abstract syntax tree (AST)</topic><topic>Feature extraction</topic><topic>hierarchical model</topic><topic>Logic gates</topic><topic>long short-term memory networks (LSTM)</topic><topic>Networks</topic><topic>Neurons</topic><topic>Prediction models</topic><topic>Predictive models</topic><topic>Recurrent neural networks</topic><topic>Semantics</topic><topic>Software</topic><topic>software defect prediction</topic><topic>Source code</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Hao</creatorcontrib><creatorcontrib>Zhuang, Weiyuan</creatorcontrib><creatorcontrib>Zhang, Xiaofang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on reliability</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Hao</au><au>Zhuang, 
Weiyuan</au><au>Zhang, Xiaofang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Software Defect Prediction Based on Gated Hierarchical LSTMs</atitle><jtitle>IEEE transactions on reliability</jtitle><stitle>TR</stitle><date>2021-06</date><risdate>2021</risdate><volume>70</volume><issue>2</issue><spage>711</spage><epage>727</epage><pages>711-727</pages><issn>0018-9529</issn><eissn>1558-1721</eissn><coden>IERQAD</coden><abstract>Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TR.2020.3047396</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-8667-0456</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9529
ispartof IEEE transactions on reliability, 2021-06, Vol.70 (2), p.711-727
issn 0018-9529
1558-1721
language eng
recordid cdi_ieee_primary_9326336
source IEEE Electronic Library (IEL)
subjects Abstract syntax tree (AST)
Feature extraction
hierarchical model
Logic gates
long short-term memory networks (LSTM)
Networks
Neurons
Prediction models
Predictive models
Recurrent neural networks
Semantics
Software
software defect prediction
Source code
title Software Defect Prediction Based on Gated Hierarchical LSTMs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-20T03%3A43%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Software%20Defect%20Prediction%20Based%20on%20Gated%20Hierarchical%20LSTMs&rft.jtitle=IEEE%20transactions%20on%20reliability&rft.au=Wang,%20Hao&rft.date=2021-06&rft.volume=70&rft.issue=2&rft.spage=711&rft.epage=727&rft.pages=711-727&rft.issn=0018-9529&rft.eissn=1558-1721&rft.coden=IERQAD&rft_id=info:doi/10.1109/TR.2020.3047396&rft_dat=%3Cproquest_RIE%3E2539351831%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2539351831&rft_id=info:pmid/&rft_ieee_id=9326336&rfr_iscdi=true
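
The description field above says that a gated fusion strategy combines the outputs of the two hierarchical networks (semantic features and traditional features). The record does not give the gate's formula, so the following is only a minimal sketch of one common form of gated fusion, not the authors' implementation. The function name gated_fusion, the parameters weights and bias, and the choice of an element-wise sigmoid gate over the concatenated inputs are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(semantic, traditional, weights, bias):
    """Blend two equal-length feature vectors with an element-wise gate.

    Hypothetical formulation (an assumption, not taken from the record):
        gate[i]  = sigmoid(weights[i] . concat(semantic, traditional) + bias[i])
        fused[i] = gate[i] * semantic[i] + (1 - gate[i]) * traditional[i]
    """
    if len(semantic) != len(traditional):
        raise ValueError("both feature vectors must have the same length")
    joint = list(semantic) + list(traditional)
    fused = []
    for i, (s, t) in enumerate(zip(semantic, traditional)):
        # Each output dimension gets its own gate computed from both inputs.
        z = sum(w * x for w, x in zip(weights[i], joint)) + bias[i]
        g = sigmoid(z)
        fused.append(g * s + (1.0 - g) * t)
    return fused


if __name__ == "__main__":
    # With all-zero weights and bias the gate is 0.5, so the result is the
    # plain average of the two vectors.
    zero_weights = [[0.0] * 4 for _ in range(2)]
    print(gated_fusion([1.0, 2.0], [3.0, 6.0], zero_weights, [0.0, 0.0]))
```

In a trained model the weights and bias would be learned, letting the gate lean toward whichever feature type is more informative for a given file; here they are fixed by hand purely to show the arithmetic.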