Enhancing Semantic Code Search with Deep Graph Matching


Bibliographic details
Published in: IEEE Access 2023-01, Vol. 11, p. 1-1
Main authors: Bibi, Nazia; Maqbool, Ayesha; Rana, Tauseef; Afzal, Farkhanda; Akgul, Ali; El Din, Sayed M.
Format: Article
Language: English
Online access: Full text
Description
Summary: Finding code snippets that match a natural language query is an important task for software developers, and accurate code retrieval improves both productivity and software quality. Unlike traditional information retrieval, code search must bridge the semantic gap between programming languages and natural language. Deep neural networks for code search have recently become an active research topic. Standard neural code search approaches represent the source code and the query as text, embed each independently, and then compute their semantic similarity with a vector distance (e.g., cosine similarity). Although recent work uses both the query and the code snippet during search, it overlooks the rich semantic information and deep structural features they contain. This study addresses code search with a deep neural model that combines neural graph matching with a search approach for semantic code retrieval. The model first converts both the query and the code fragments into graphs; a semantic matching module then matches them to retrieve the best-matching code snippets. The model exploits the enriched semantics and features of both inputs and uses a cross-attention mechanism to learn the fine-grained similarity between query and code. It is evaluated on the CodeSearchNet dataset, which covers six representative programming languages, and it compares favorably with existing baselines. Results are ranked, and the top 10 are returned to the user. The reported accuracy of the proposed system is approximately 97%.
ISSN:2169-3536
DOI:10.1109/ACCESS.2023.3263878
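
Illustrative sketch (not taken from the article): the summary contrasts two ways of scoring a query against a code snippet. The first is the standard baseline, where each side is embedded independently and compared by cosine similarity, with the top 10 results returned. The second is a cross-attention step between the nodes of a query graph and a code graph. The Python sketch below shows one minimal way each idea could look with NumPy. The function names, the scaled dot-product form of the attention, and the mean-cosine pooling are assumptions made for illustration only; the article's actual graph construction and matching module are not described in this record.

```python
import numpy as np


def cosine_rank(query_vec, code_vecs, top_k=10):
    """Baseline: rank snippets by cosine similarity of independent embeddings."""
    q = query_vec / np.linalg.norm(query_vec)
    c = code_vecs / np.linalg.norm(code_vecs, axis=1, keepdims=True)
    scores = c @ q                        # one cosine score per snippet
    order = np.argsort(-scores)[:top_k]   # indices of the best-scoring snippets
    return order, scores[order]


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_score(query_nodes, code_nodes, eps=1e-12):
    """Hypothetical fine-grained score between two sets of node embeddings.

    Each query node attends over all code nodes (scaled dot-product
    attention, an assumption) and is then compared with its attended
    code context. The mean cosine over query nodes is the score.
    """
    d = query_nodes.shape[1]
    logits = query_nodes @ code_nodes.T / np.sqrt(d)   # shape (n_query, n_code)
    context = softmax(logits, axis=1) @ code_nodes     # shape (n_query, d)
    num = (query_nodes * context).sum(axis=1)
    den = np.linalg.norm(query_nodes, axis=1) * np.linalg.norm(context, axis=1)
    return float((num / (den + eps)).mean())


# Small demonstration on random data (all shapes are arbitrary choices).
rng = np.random.default_rng(0)
query_embedding = rng.normal(size=16)
snippet_embeddings = rng.normal(size=(50, 16))
top_indices, top_scores = cosine_rank(query_embedding, snippet_embeddings)
print("top snippet indices:", top_indices)

query_graph_nodes = rng.normal(size=(4, 16))
code_graph_nodes = rng.normal(size=(9, 16))
print("cross-attention score:",
      cross_attention_score(query_graph_nodes, code_graph_nodes))
```

In the baseline, the query and the snippet never see each other before scoring, whereas the attention variant builds a code-aware context for every query node, which is the kind of fine-grained comparison the summary attributes to the cross-attention mechanism.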