Deep learning based identification of inconsistent method names: How far are we?
Published in: | Empirical software engineering : an international journal 2025-02, Vol.30 (1), p.31, Article 31 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | For any software system, concise and meaningful method names are critical for program comprehension and maintenance. However, for various reasons, method names might be inconsistent with their corresponding implementations. Such inconsistent method names are confusing and misleading, often resulting in incorrect method invocations. To this end, a few deep learning-based (DL-based) approaches have been proposed to identify such inconsistent method names. Existing evaluations suggest that the performance of such DL-based approaches is promising. However, these evaluations are conducted with a perfectly balanced dataset in which the number of inconsistent method names equals that of consistent ones. In addition, the construction method of this balanced dataset is flawed, leading to false positives in the dataset. Consequently, the reported performance may not represent the efficiency of these approaches in the field, where most method names are consistent with their corresponding method bodies and only a small fraction are inconsistent. In this paper, we therefore conduct an empirical study to assess the state-of-the-art DL-based approaches in the automated identification of inconsistent method names. We first build a new benchmark (dataset) using both automatic identification from commit history and manual inspection by developers, aiming to reduce the number of false positives. Based on the benchmark, we evaluate five representative DL-based approaches to identifying inconsistent method names, covering both retrieval-based and generation-based approaches. Our evaluation results suggest that the performance of the evaluated approaches is substantially reduced when we switch from the existing balanced dataset to our new benchmark. 
Furthermore, to reveal where and why the evaluated approaches work or fail, we conduct quantitative and qualitative analyses of the evaluation results. Our analysis results suggest that the evaluated approaches work well on methods with simple bodies and short names, and that retrieval-based approaches are especially good at methods whose names start with popular first sub-tokens. Retrieval-based approaches fail frequently because the adopted method representation technique is not efficient enough. Another possible reason for the failures is their unverified rationale, i.e., that two methods with similar bodies should have similar names. Generation-based approache |
---|---|
ISSN: | 1382-3256 1573-7616 |
DOI: | 10.1007/s10664-024-10592-z |
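
The summary's point that results on a perfectly balanced dataset may not carry over to the field can be illustrated with a small calculation. The sketch below is not taken from the article: the function name `precision_at_prevalence` and the recall and false-positive-rate values are hypothetical, chosen only to show how the precision of a fixed detector falls as inconsistent method names become rarer.

```python
def precision_at_prevalence(recall: float,
                            false_positive_rate: float,
                            prevalence: float) -> float:
    """Expected precision when a fraction `prevalence` of methods is inconsistent.

    recall: share of inconsistent names the detector flags.
    false_positive_rate: share of consistent names the detector flags by mistake.
    """
    true_positives = recall * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged > 0 else 0.0


# Hypothetical detector quality, NOT figures reported in the article.
RECALL = 0.80
FALSE_POSITIVE_RATE = 0.10

# 50% corresponds to a balanced dataset; the smaller shares mimic a setting
# where most method names are consistent with their bodies.
for prevalence in (0.50, 0.10, 0.01):
    precision = precision_at_prevalence(RECALL, FALSE_POSITIVE_RATE, prevalence)
    print(f"inconsistent share {prevalence:.0%}: precision {precision:.3f}")
```

Under these assumed rates, the same detector that looks precise on balanced data flags mostly consistent names once inconsistent ones make up only a few percent of all methods, which is the kind of gap the summary describes between the balanced dataset and the new benchmark.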