Software Defect Prediction Based on Gated Hierarchical LSTMs

Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on reliability 2021-06, Vol.70 (2), p.711-727
Hauptverfasser: Wang, Hao, Zhuang, Weiyuan, Zhang, Xiaofang
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Software defect prediction, aimed at assisting software practitioners in allocating test resources more efficiently, predicts the potential defective modules in software products. With the development of defect prediction technology, the inability of traditional software features to capture semantic information is exposed, hence related researchers have turned to semantic features to build defect prediction models. However, sometimes traditional features such as lines of code (LOC) also play an important role in defect prediction. Most of the existing researches only focus on using a single type of feature as the input of the model. In this article, a defect prediction method based on gated hierarchical long short-term memory networks (GH-LSTMs) is proposed, which uses hierarchical LSTM networks to extract both semantic features from word embeddings of abstract syntax trees (ASTs) of source code files, and traditional features provided by the PROMISE repository. More importantly, we adopt a gated fusion strategy to combine the outputs of the hierarchical networks properly. Experimental results show that GH-LSTMs outperforms existing methods under both noneffort-aware and effort-aware scenarios.
ISSN:0018-9529
1558-1721
DOI:10.1109/TR.2020.3047396