Refining software defect prediction through attentive neural models for code understanding

Bibliographic details
Published in:	The Journal of systems and software 2025-02, Vol.220, p.112266, Article 112266
Main authors:	Nashaat, Mona, Miller, James
Format:	Article
Language:	eng
Subjects:	Code understanding Defect prediction Reliability Software Transformers
Online access:	Full text

Description
Summary:
•Learning code representation does not guarantee good defect classification.
•The performance of traditional defect prediction depends on the selected software metrics and the underlying learning model.
•Learning bidirectional encoder representations of source code corpora and fine-tuning them for software defect detection can yield better defect classification.
•Masked token training objectives outperform other representation learning techniques in detecting software defects.

Identifying defects through manual software testing is a resource-intensive task in software development. To alleviate this, software defect prediction identifies code segments likely to contain faults using data-driven methods. Traditional techniques rely on static code metrics, which often fail to reflect the deeper syntactic and semantic features of the code. This paper introduces a novel framework that uses transformer-based networks with attention mechanisms to predict software defects. The framework encodes input vectors to develop meaningful representations of software modules. A bidirectional transformer encoder is employed to model programming languages, followed by fine-tuning with labeled data to detect defects. The performance of the framework is assessed through experiments across various software projects and compared against baseline techniques. Additionally, statistical hypothesis testing and an ablation study are performed to assess the impact of different parameter choices. The empirical findings indicate that the proposed approach can increase classification accuracy by an average of 15.93% and improve the F1 score by up to 44.26% compared to traditional methods.
ISSN:	0164-1212
DOI:	10.1016/j.jss.2024.112266
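
Illustrative note: the summary mentions a masked token training objective for a bidirectional encoder. The sketch below shows, in plain Python, how such an objective typically prepares its inputs: some tokens are hidden and the model is asked to recover them. It is not the authors' implementation. The function name `mask_tokens`, the `[MASK]` placeholder, the whitespace tokenization, and the masking probability are all assumptions made for illustration; the record does not state the tokenizer or masking scheme used in the paper.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_tokens, labels) for a masked-token objective.

    labels[i] holds the original token where position i was masked,
    and None elsewhere (positions a training loss would ignore).
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)   # hide the token from the encoder
            labels.append(tok)    # the encoder must predict this token
        else:
            masked.append(tok)    # token stays visible as context
            labels.append(None)   # no prediction target here
    return masked, labels


# Naive whitespace tokenization of a code fragment (an assumption).
code = "if ( x == null ) return y ;".split()
masked, labels = mask_tokens(code, mask_prob=0.3, seed=1)
```

Because the encoder is bidirectional, each hidden token can be predicted from the visible tokens on both sides of it. A defect classifier would then be fine-tuned on top of the pretrained encoder using labeled modules, as the summary describes.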