Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Pretrained code language models have enabled great progress towards program synthesis. However, common approaches only consider in-file local context and thus miss information and constraints imposed by other parts of the codebase and its external dependencies. Existing code completion benchmarks also lack such context. To resolve these restrictions, we curate a new dataset of permissively licensed Python packages that includes full projects and their dependencies, and we provide tools to extract non-local information with the help of program analyzers. We then focus on the task of function call argument completion, which requires predicting the arguments to function calls. We show that existing code completion models do not yield good results on our completion task. To better solve this task, we query a program analyzer for information relevant to a given function call and consider ways to provide the analyzer results to different code completion models during inference and training. Our experiments show that providing access to the function implementation and function usages greatly improves argument completion performance. Our ablation study provides further insights into how different types of information available from the program analyzer, and different ways of incorporating that information, affect model performance.
DOI: 10.48550/arxiv.2306.00381
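
The idea sketched in the abstract can be illustrated with a short example. The paper builds its own dataset and extraction tooling on top of program analyzers; the snippet below is only a minimal sketch of the general setup, assuming the open-source `jedi` analyzer as a stand-in: resolve the callee of a call site, pull out its implementation and other usages, and prepend that non-local context to the prompt given to a code completion model that must predict the call arguments.

```python
# Illustrative sketch only (assumes `pip install jedi`); not the authors' tooling.
import jedi

source = '''\
import json

def load_config(path, encoding="utf-8"):
    """Read a JSON config file from disk."""
    with open(path, encoding=encoding) as f:
        return json.load(f)

cfg = load_config()
'''

# Point the analyzer at the call site of load_config on the last line.
# jedi uses 1-based lines and 0-based columns.
script = jedi.Script(code=source, path="example.py")
call_line = source.count("\n")
call_col = source.splitlines()[-1].index("load_config")

# Resolve the callee and grab its implementation (signature plus a few body lines).
definitions = script.infer(call_line, call_col)
implementation = "\n".join(d.get_line_code(after=4) for d in definitions)

# Collect other usages of the same function in the file/project.
usages = "\n".join(r.get_line_code() for r in script.get_references(call_line, call_col))

# Prepend the analyzer output to the completion prompt so the model sees the
# callee's implementation and usages when predicting the call arguments.
prompt = (
    "# Function implementation:\n" + implementation
    + "\n# Known usages:\n" + usages
    + "\n# Complete the call arguments:\n" + source
)
print(prompt)
```

In this sketch the analyzer output is simply concatenated into the model's context; the paper additionally studies different ways of supplying such information during inference and training.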