SPE ^: Self-Paced Ensemble of Ensembles for Software Defect Prediction

Software defect prediction aims to predict defect-prone code regions automatically before defects are discovered. Accurate prediction helps software practitioners to prioritize their testing efforts. In recent decades, dozens of approaches have been put forward and acquired good results in this fiel...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on reliability 2022-06, Vol.71 (2), p.865-879
Hauptverfasser: Wan, Xiaohui, Zheng, Zheng, Liu, Yang
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Software defect prediction aims to predict defect-prone code regions automatically before defects are discovered. Accurate prediction helps software practitioners to prioritize their testing efforts. In recent decades, dozens of approaches have been put forward and acquired good results in this field. However, in practical scenarios, many projects have limited labeled instances; more than that, most of these labeled instances are nondefective. The lack of training data and class imbalance problem together bring serious challenges to software defect prediction tasks. So far, few of prevailing approaches can well handle these two difficulties simultaneously. One important reason is that they do not pay adequate attention to several key instances, which are difficult to classify in a small imbalanced dataset. This article introduces the concept of " instance hardness " to integrate various difficulties of imbalance classification tasks. Based on it, a novel imbalance learning framework named self-paced ensemble of ensembles (SPE^{2}) is proposed to perform software defect prediction. SPE^{2} aims to generate a strong ensemble of ensembles by self-paced harmonizing instance hardness via undersampling. Finally, SPE^{2} is extensively compared with eight imbalance learning approaches on ten open-source defect datasets. Experiments indicate that SPE^{2} improves the performance and achieves better and more significant F-measure values than its existing counterparts, based on Brunner's statistical significance test and Cliff's effect sizes.
ISSN:0018-9529
1558-1721
DOI:10.1109/TR.2022.3155183