Deep API Sequence Generation via Golden Solution Samples and API Seeds

Automatic API recommendation can accelerate developers’ programming, and has been studied for years. There are two orthogonal lines of approaches for this task, i.e., information retrieval-based (IR-based) approaches and sequence to sequence (seq2seq) model based approaches. Although these approache...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	ACM transactions on software engineering and methodology 2024-09
Hauptverfasser:	Huang, Yuekai, Wang, Junjie, Wang, Song, Wei, Moshi, Shi, Lin, Liu, Zhe, Wang, Qing
Format:	Artikel
Sprache:	eng
Schlagworte:	Software and its engineering Software notations and tools
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Automatic API recommendation can accelerate developers’ programming, and has been studied for years. There are two orthogonal lines of approaches for this task, i.e., information retrieval-based (IR-based) approaches and sequence to sequence (seq2seq) model based approaches. Although these approaches were reported to have remarkable performance, our observation finds two major drawbacks, i.e., IR-based approaches lack the consideration of relations among the recommended APIs, and seq2seq models do not model the API’s semantic meaning. To alleviate the above two problems, we propose APIGens, which is a retrieval-enhanced large language model (LLM) based API recommendation approach to recommend an API sequence for a natural language query. The approach first retrieves similar programming questions in history based on the input natural language query, and then scores the results based on API documents via a scorer model. Finally, these results are used as samples for few-shot learning of LLM. To reduce the risk of encountering local optima, we also extract API seeds from the retrieved results to increase the search scope during the LLM generation process. The results show that our approach can achieve 48.41% ROUGE@10 on API sequence recommendation and the 82.61% MAP on API set recommendation, largely outperforming the state-of-the-art baselines.
ISSN:	1049-331X 1557-7392
DOI:	10.1145/3695995