Function-constrained Program Synthesis
Saved in:
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: This work introduces (1) a technique that allows large language models (LLMs)
to leverage user-provided code when solving programming tasks and (2) a method
to iteratively generate modular sub-functions that can aid future code
generation attempts when the initial code generated by the LLM is inadequate.
Generating computer programs in general-purpose programming languages like
Python poses a challenge for LLMs when instructed to use code provided in the
prompt. Code-specific LLMs (e.g., GitHub Copilot, CodeLlama2) can generate code
completions in real-time by drawing on all code available in a development
environment. However, restricting code-specific LLMs to use only in-context
code is not straightforward, as the model is not explicitly instructed to use
the user-provided code and users cannot highlight precisely which snippets of
code the model should incorporate into its context. Moreover, current systems
lack effective recovery methods, forcing users to iteratively re-prompt the
model with modified prompts until a sufficient solution is reached. Our method
differs from traditional LLM-powered code generation by constraining
code generation to an explicit function set and enabling recovery from failed
attempts through automatically generated sub-functions. When the LLM cannot
produce working code, we generate modular sub-functions to aid subsequent
attempts at generating functional code. A by-product of our method is a library
of reusable sub-functions that can solve related tasks, imitating a software
team where efficiency scales with experience. We also introduce a new
"half-shot" evaluation paradigm that provides tighter estimates of LLMs' coding
abilities compared to traditional zero-shot evaluation. Our proposed evaluation
method encourages models to output solutions in a structured format, decreasing
syntax errors that can be mistaken for poor coding ability.
DOI: 10.48550/arxiv.2311.15500
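
The abstract above describes two mechanisms: restricting code generation to an explicit set of user-provided functions, and recovering from failed attempts by growing that set with model-proposed sub-functions. The sketch below illustrates that loop only at a conceptual level; the names (`constrained_synthesis`, `llm`, `tests`, the prompt wording) are illustrative assumptions, not the authors' actual implementation or API.

```python
# Conceptual sketch of function-constrained synthesis with sub-function
# recovery, as summarized in the abstract. All names here are hypothetical.

from typing import Callable, Dict


def constrained_synthesis(
    llm: Callable[[str], str],          # any text-completion function (placeholder)
    task: str,                          # natural-language task description
    allowed_functions: Dict[str, str],  # name -> one-line docstring the model may call
    tests: Callable[[str], bool],       # returns True if the generated code is acceptable
    max_attempts: int = 3,
) -> str:
    """Ask the model to solve `task` using only the allowed functions.

    When an attempt fails, ask the model for a small helper sub-function,
    add it to the allowed set, and retry (recovery via generated
    sub-functions, as the abstract describes).
    """
    library = dict(allowed_functions)
    for _ in range(max_attempts):
        prompt = (
            f"Task: {task}\n"
            "Use ONLY these functions:\n"
            + "\n".join(f"- {name}: {doc}" for name, doc in library.items())
            + "\nReturn a single Python function named solve()."
        )
        code = llm(prompt)
        if tests(code):
            return code  # success: solution restricted to the allowed function set
        # Recovery step: request a modular sub-function that might make the
        # task easier, then expose it to the model on the next attempt.
        helper_prompt = (
            f"The task '{task}' was not solved. Propose one small reusable "
            "helper function (name and one-line docstring) that would help."
        )
        library[f"helper_{len(library)}"] = llm(helper_prompt)
    return ""  # no working solution within the attempt budget
```

In this sketch the growing `library` dictionary plays the role of the reusable sub-function library mentioned in the abstract; a real system would also need to execute and validate proposed helpers before adding them, and would structure the model's output (as in the paper's "half-shot" evaluation) rather than accept free-form text.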