Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency

Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that i...

Detailed description

Bibliographic details
Main authors:	Gee, Leonidas; Gritta, Milan; Lampouras, Gerasimos; Iacobacci, Ignacio
Format:	Article
Language:	eng
Subjects:	Computer Science - Computation and Language
Online access:	Order full text

creator	Gee, Leonidas; Gritta, Milan; Lampouras, Gerasimos; Iacobacci, Ignacio
description	Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a byproduct, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase will be open-sourced at www.open-source.link.
doi_str_mv	10.48550/arxiv.2406.12502
format	Article
creationdate	2024-06-18
rights	http://creativecommons.org/licenses/by/4.0
linktorsrc	https://arxiv.org/abs/2406.12502
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2406.12502
language	eng
recordid	cdi_arxiv_primary_2406_12502
source	arXiv.org
subjects	Computer Science - Computation and Language
title	Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T05%3A54%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Code-Optimise:%20Self-Generated%20Preference%20Data%20for%20Correctness%20and%20Efficiency&rft.au=Gee,%20Leonidas&rft.date=2024-06-18&rft_id=info:doi/10.48550/arxiv.2406.12502&rft_dat=%3Carxiv_GOX%3E2406_12502%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
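
The description field above says that Code-Optimise uses correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data, and that it selects solutions dynamically. The following is a minimal sketch of how such preference pairs might be assembled, not the authors' implementation. The `Solution` class and the `preference_pairs` and `sample_pair` helpers are hypothetical names introduced only for illustration, and drawing one random pair per call is an assumed reading of "dynamically selects".

```python
import random
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Solution:
    """One sampled solution with its annotations (hypothetical structure)."""
    code: str
    passed: bool      # correctness signal: did the unit tests pass?
    runtime: float    # runtime signal in seconds; only compared among passing solutions


def preference_pairs(solutions):
    """Return (chosen, rejected) pairs built from both signals.

    Assumption: a passing solution is preferred over a failing one, and among
    passing solutions the quicker one is preferred over the slower one.
    """
    passed = [s for s in solutions if s.passed]
    failed = [s for s in solutions if not s.passed]

    # Correctness signal: every passing solution beats every failing one.
    pairs = [(p, f) for p in passed for f in failed]

    # Runtime signal: among passing solutions, quick beats slow.
    for a, b in combinations(passed, 2):
        if a.runtime != b.runtime:
            quick, slow = (a, b) if a.runtime < b.runtime else (b, a)
            pairs.append((quick, slow))
    return pairs


def sample_pair(solutions, rng):
    """Draw one pair at random, e.g. a fresh draw per training step (assumed)."""
    pairs = preference_pairs(solutions)
    return rng.choice(pairs) if pairs else None


quick = Solution("return sum(xs)", passed=True, runtime=0.1)
slow = Solution("total = 0\nfor x in xs: total += x\nreturn total", passed=True, runtime=0.5)
wrong = Solution("return 0", passed=False, runtime=0.05)

pairs = preference_pairs([quick, slow, wrong])
pair = sample_pair([quick, slow, wrong], random.Random(0))
```

Pairs in this shape (chosen, rejected) are the usual input to preference-based fine-tuning objectives; which objective the paper uses is not stated in this record.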