GAP-Gen: Guided Automatic Python Code Generation
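Main authors: | Zhao, Junchen; Song, Yurun; Wang, Junlin; Harris, Ian G |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computation and Language; Computer Science - Learning; Computer Science - Programming Languages; Computer Science - Software Engineering |
Online access: | Order full text |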
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Zhao, Junchen; Song, Yurun; Wang, Junlin; Harris, Ian G |
description | Automatic code generation from natural language descriptions can be highly
beneficial during software development. In this work, we propose
GAP-Gen, a Guided Automatic Python Code Generation method based on Python
syntactic and semantic constraints. We first introduce Python
syntactic constraints in the form of Syntax-Flow, a simplified version
of the Abstract Syntax Tree (AST) that reduces the AST's size and complexity
while retaining the crucial syntactic information of Python code. In
addition to Syntax-Flow, we introduce Variable-Flow, which abstracts variable
and function names consistently throughout the code. Rather than
pretraining, we focus on modifying the fine-tuning process, which reduces
computational requirements while retaining high performance on the automatic
Python code generation task. GAP-Gen fine-tunes the transformer-based language
models T5 and CodeT5 using the Code-to-Docstring datasets CodeSearchNet,
CodeSearchNet AdvTest, and Code-Docstring Corpus from EdinburghNLP. Our
experiments show that GAP-Gen achieves better results on the automatic Python code
generation task than previous work. |
doi_str_mv | 10.48550/arxiv.2201.08810 |
format | Article |
creationdate | 2022-01-19 |
rights | http://creativecommons.org/licenses/by-sa/4.0 |
oa | free_for_read |
linktorsrc | https://arxiv.org/abs/2201.08810 |
backlink | https://doi.org/10.48550/arXiv.2201.08810 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2201.08810 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2201_08810 |
source | arXiv.org |
subjects | Computer Science - Computation and Language; Computer Science - Learning; Computer Science - Programming Languages; Computer Science - Software Engineering |
title | GAP-Gen: Guided Automatic Python Code Generation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T09%3A04%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=GAP-Gen:%20Guided%20Automatic%20Python%20Code%20Generation&rft.au=Zhao,%20Junchen&rft.date=2022-01-19&rft_id=info:doi/10.48550/arxiv.2201.08810&rft_dat=%3Carxiv_GOX%3E2201_08810%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
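
The description above mentions two ideas: Syntax-Flow (a simplified view of the AST) and Variable-Flow (consistent abstraction of variable and function names). The sketch below is only an illustration of those two ideas using Python's standard `ast` module. It is not the authors' implementation: the function names `syntax_flow` and `variable_flow`, the choice of retained node types in `KEPT_NODES`, and the placeholder scheme `func_N` / `var_N` are all assumptions made for this example.

```python
import ast

# Assumption: which node types count as "crucial" is chosen here for illustration only.
KEPT_NODES = (ast.FunctionDef, ast.Assign, ast.For, ast.If,
              ast.Return, ast.Call, ast.BinOp)


def syntax_flow(source: str) -> list[str]:
    """Return the names of selected AST node types in depth-first order."""
    flow = []

    def visit(node):
        if isinstance(node, KEPT_NODES):
            flow.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return flow


class _Renamer(ast.NodeTransformer):
    """Map each locally bound name to a stable placeholder (hypothetical scheme)."""

    def __init__(self):
        self.names = {}

    def _alias(self, name, prefix):
        if name not in self.names:
            count = sum(1 for v in self.names.values() if v.startswith(prefix))
            self.names[name] = f"{prefix}{count}"
        return self.names[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name, "func_")
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg, "var_")
        return node

    def visit_Name(self, node):
        # Rename only names bound in this snippet; builtins such as len stay as-is.
        if isinstance(node.ctx, ast.Store) or node.id in self.names:
            node.id = self._alias(node.id, "var_")
        return node


def variable_flow(source: str) -> str:
    """Return the source with names replaced by placeholders (needs Python 3.9+)."""
    tree = _Renamer().visit(ast.parse(source))
    return ast.unparse(tree)


SOURCE = '''
def add_all(items, start):
    total = start
    for item in items:
        total = total + item
    return total
'''

print(syntax_flow(SOURCE))
print(variable_flow(SOURCE))
```

The same original name always maps to the same placeholder, which is the "consistent" property the description refers to; how the paper itself encodes either representation is not stated in this record.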