CodeFusion: A Pre-trained Diffusion Model for Code Generation
| Field | Value |
|---|---|
| Main authors | , , , , , |
| Format | Article |
| Language | English |
| Online access | Order full text |
Summary: Imagine a developer who can only change their last line of code: how often would they have to start writing a function from scratch before it is correct? Auto-regressive models for code generation from natural language have a similar limitation: they do not easily allow reconsidering earlier generated tokens. We introduce CodeFusion, a pre-trained diffusion code generation model that addresses this limitation by iteratively denoising a complete program conditioned on the encoded natural language. We evaluate CodeFusion on the task of natural language to code generation for Bash, Python, and Microsoft Excel conditional formatting (CF) rules. Experiments show that CodeFusion (75M parameters) performs on par with state-of-the-art auto-regressive systems (350M-175B parameters) in top-1 accuracy and outperforms them in top-3 and top-5 accuracy due to its better balance of diversity and quality.
DOI: 10.48550/arxiv.2310.17680
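
The summary's key idea, iteratively denoising a complete program conditioned on the encoded prompt rather than decoding left to right, can be sketched in a few lines. The PyTorch sketch below is an illustration only, not the paper's implementation: the module names (`nl_encoder`, `denoiser`, `to_tokens`), the toy linear noise schedule, and all dimensions are assumptions made for this example.

```python
# Minimal sketch of diffusion-style code generation (not CodeFusion's actual code).
# All module names, sizes, and the noise schedule are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, DIM, SEQ_LEN, STEPS = 1000, 64, 32, 10

nl_encoder = nn.Embedding(VOCAB, DIM)       # stands in for a pre-trained NL encoder
denoiser = nn.Transformer(                  # predicts a cleaner program embedding
    d_model=DIM, nhead=4, num_encoder_layers=2,
    num_decoder_layers=2, batch_first=True,
)
to_tokens = nn.Linear(DIM, VOCAB)           # maps embeddings back to code tokens

@torch.no_grad()
def generate(nl_tokens: torch.Tensor) -> torch.Tensor:
    """Iteratively denoise a full program conditioned on the encoded prompt."""
    cond = nl_encoder(nl_tokens)                        # (B, L_nl, DIM)
    x = torch.randn(nl_tokens.size(0), SEQ_LEN, DIM)    # start from pure noise
    for t in range(STEPS, 0, -1):
        x0_hat = denoiser(cond, x)                      # predict the clean program
        alpha = (t - 1) / STEPS                         # toy linear schedule
        # Re-noise toward the predicted clean state; every position is
        # re-predicted at every step, unlike left-to-right decoding.
        x = alpha * torch.randn_like(x) + (1 - alpha) * x0_hat
    return to_tokens(x).argmax(dim=-1)                  # round embeddings to tokens

print(generate(torch.randint(0, VOCAB, (1, 8))).shape)  # torch.Size([1, 32])
```

Because the whole sequence is revised at each step, the model can reconsider early tokens late in generation, which is exactly the property the summary contrasts with auto-regressive decoding.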