Reusable Component Retrieval: A Semantic Search Approach for Low-Resource Languages

Detailed description

Bibliographic details
Published in:	ACM Transactions on Asian and Low-Resource Language Information Processing, 2023-05, Vol. 22 (5), p. 1-31, Article 141
Main authors:	Bibi, Nazia; Rana, Tauseef; Maqbool, Ayesha; Alkhalifah, Tamim; Khan, Wazir Zada; Bashir, Ali Kashif; Zikria, Yousaf Bin
Format:	Article
Language:	English
Subjects:	Information retrieval; Information retrieval query processing; Information storage systems; Information systems; Query reformulation; Retrieval models and ranking
Online access:	Full text

Description
Abstract:	Programmers commonly reuse existing code, which they find by issuing natural language queries to search engines. The main aim of code retrieval is to find the most relevant snippet in a corpus of code snippets. However, code retrieval frameworks for low-resource languages are insufficient. Retrieving the most relevant code snippet efficiently requires closing the semantic gap between the code snippets in the repository and the user’s query (a natural language description). The primary objective of this research is to contribute a code search framework that can be extended to low-resource languages. The secondary objective is to provide a code retrieval mechanism that returns results semantically relevant to the user query, giving programmers the ability to locate source code they want to use when developing new applications. The proposed approach is implemented as a web platform for searching source code. Because code retrieval is a sophisticated task, the approach incorporates a semantic search mechanism: a semantic model that generates meanings or synonyms of words and integrates ontologies with Natural Language Processing. System performance and classification accuracy are measured using precision, recall, and F1-score. We also compare the proposed approach with state-of-the-art baseline models. The retrieved results are ranked, and they show that our approach significantly outperforms robust code matching. Our evaluation shows that semantic matching leads to improved source code retrieval. This study marks a substantial advancement in integrating programming expertise with code retrieval techniques. Moreover, our system lets users know when and how it is used for successful semantic searching.
ISSN:	2375-4699 2375-4702
DOI:	10.1145/3564604
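
Illustrative sketch (not taken from the article): the abstract describes expanding a query with synonyms before matching it against code snippets, and evaluating the results with precision, recall, and F1-score. The toy example below shows that general idea only. The synonym table, the corpus, the overlap-based scoring, and all function names are hypothetical stand-ins; the article's actual ontology, ranking model, and data are not reproduced here.

```python
import re

# Hypothetical synonym table standing in for an ontology lookup.
SYNONYMS = {
    "sort": {"order", "arrange"},
    "list": {"array", "sequence"},
    "read": {"load", "open"},
}

# Hypothetical corpus: snippet id -> natural language description.
CORPUS = {
    "s1": "arrange an array of numbers in ascending order",
    "s2": "open a file and load its lines",
    "s3": "sort a list of strings",
    "s4": "send an http request",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def expand_query(query):
    """Add the synonyms of each query term to the set of query terms."""
    terms = tokenize(query)
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def search(query, corpus, top_k=2):
    """Rank snippets by how many expanded query terms their description shares."""
    terms = expand_query(query)
    scored = []
    for snippet_id, description in corpus.items():
        score = len(terms & tokenize(description))
        if score > 0:
            scored.append((score, snippet_id))
    # Highest score first; ties broken by snippet id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [snippet_id for _, snippet_id in scored[:top_k]]


def precision_recall_f1(retrieved, relevant):
    """Compute set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    results = search("sort list", CORPUS)
    print(results)
    print(precision_recall_f1(results, {"s1", "s3"}))
```

In this toy corpus, snippet s1 shares no literal word with the query "sort list" and is found only because of the synonym expansion, which is the kind of semantic gap the abstract refers to.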