Improving Bug Localization With Effective Contrastive Learning Representation

Bibliographic Details
Published in:	IEEE Access, 2023, Vol. 11, pp. 32523-32533
Main Authors:	Luo, Zhengmao; Wang, Wenyao; Cen, Caichun
Format:	Article
Language:	English
Subjects:	Artificial neural networks; bug localization; Computer bugs; Contrastive learning representation; Debugging; Learning; Learning systems; Localization; Location awareness; pre-trained language model; Representations; Semantics; Software; Software maintenance; Source coding; Task analysis; Transformers
Online Access:	Full text

Description
Summary:	Automated localization of buggy files can make developers more efficient at software maintenance and thereby improve the quality of software products. State-of-the-art approaches for bug localization are based on neural networks, e.g., RNNs or CNNs, and can learn semantic features from a given bug report. However, these simple neural architectures struggle to learn deep contextual features from bug reports, which hurts the semantic mapping between bug reports and their corresponding buggy files. To resolve this problem, we propose CoLoc, a bug localization approach that combines pre-trained language models and contrastive learning. Specifically, CoLoc is first pre-trained on a large-scale bug report corpus in an unsupervised way, to learn the deep contextual features of each token in a bug report according to its context. Afterward, CoLoc is further pre-trained with a contrastive learning objective to learn contrastive representations of both bug reports and buggy files. Contrastive learning helps CoLoc learn the semantic differences between different bug reports and buggy files. To evaluate the effectiveness of CoLoc, we choose five baseline approaches and compare their performance on a public dataset. The experimental results show that CoLoc outperforms all baseline approaches by up to 76.00% in terms of mean reciprocal rank (MRR), achieving new results for bug localization.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3228802
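
The summary reports its results in terms of MRR. As a reading aid, the following is a minimal sketch of how mean reciprocal rank is conventionally computed for a file-ranking task such as bug localization. It is not taken from the article: the function names, file names, and toy rankings are hypothetical, and the article's actual evaluation setup may differ in its details.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked_files: Sequence[str], buggy_files: Set[str]) -> float:
    """Return 1/rank of the first truly buggy file, or 0.0 if none is ranked."""
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    results: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Average the reciprocal rank over all bug reports (0.0 for no reports)."""
    pairs = list(results)
    if not pairs:
        return 0.0
    return sum(reciprocal_rank(ranked, buggy) for ranked, buggy in pairs) / len(pairs)


# Hypothetical toy data: one (ranking, ground-truth set) pair per bug report.
queries = [
    (["a.java", "b.java", "c.java"], {"a.java"}),            # first hit at rank 1
    (["a.java", "b.java", "c.java"], {"b.java"}),            # first hit at rank 2
    (["a.java", "b.java", "c.java"], {"d.java"}),            # buggy file not ranked
    (["a.java", "b.java", "c.java", "d.java"], {"d.java"}),  # first hit at rank 4
]

mrr = mean_reciprocal_rank(queries)
print(f"MRR = {mrr:.4f}")  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```

A higher MRR means the first correct buggy file tends to appear nearer the top of the ranked list, which is why the metric suits tools whose output developers inspect from the top down.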