Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency

Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that i...

Bibliographic details
Main authors: Gee, Leonidas; Gritta, Milan; Lampouras, Gerasimos; Iacobacci, Ignacio
Format: Article
Language: eng
creator Gee, Leonidas
Gritta, Milan
Lampouras, Gerasimos
Iacobacci, Ignacio
description Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a byproduct, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase will be open-sourced at www.open-source.link.
doi_str_mv 10.48550/arxiv.2406.12502
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2406_12502</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2406_12502</sourcerecordid><originalsourceid>FETCH-LOGICAL-a672-d65c2ebb09ce700ba0d7834ba2b0d674b320105cfcf352721995e9793c2b78843</originalsourceid><addsrcrecordid>eNotz81KAzEUhuFsupDWC3BlbiDjmfxMJu5krFUYqGD3w0lyAoF2pmQGsXevVlff5uWDh7G7GirdGgMPWL7yZyU1NFUtDcgb1ndTJLE_L_mUZ3rkH3RMYkcjFVwo8vdCiQqNgfgzLsjTVHg3lUJhGWmeOY6Rb1PKIf80lw1bJTzOdPu_a3Z42R66V9Hvd2_dUy-wsVLExgRJ3oMLZAE8QrSt0h6lh9hY7ZWEGkxIISkjraydM-SsU0F627Zardn93-2VM5xLPmG5DL-s4cpS337YRxs</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency</title><source>arXiv.org</source><creator>Gee, Leonidas ; Gritta, Milan ; Lampouras, Gerasimos ; Iacobacci, Ignacio</creator><creatorcontrib>Gee, Leonidas ; Gritta, Milan ; Lampouras, Gerasimos ; Iacobacci, Ignacio</creatorcontrib><description>Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. 
As a byproduct, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase will be open-sourced at www.open-source.link.</description><identifier>DOI: 10.48550/arxiv.2406.12502</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2024-06</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2406.12502$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2406.12502$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Gee, Leonidas</creatorcontrib><creatorcontrib>Gritta, Milan</creatorcontrib><creatorcontrib>Lampouras, Gerasimos</creatorcontrib><creatorcontrib>Iacobacci, Ignacio</creatorcontrib><title>Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency</title><description>Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. 
Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a byproduct, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase will be open-sourced at www.open-source.link.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz81KAzEUhuFsupDWC3BlbiDjmfxMJu5krFUYqGD3w0lyAoF2pmQGsXevVlff5uWDh7G7GirdGgMPWL7yZyU1NFUtDcgb1ndTJLE_L_mUZ3rkH3RMYkcjFVwo8vdCiQqNgfgzLsjTVHg3lUJhGWmeOY6Rb1PKIf80lw1bJTzOdPu_a3Z42R66V9Hvd2_dUy-wsVLExgRJ3oMLZAE8QrSt0h6lh9hY7ZWEGkxIISkjraydM-SsU0F627Zardn93-2VM5xLPmG5DL-s4cpS337YRxs</recordid><startdate>20240618</startdate><enddate>20240618</enddate><creator>Gee, Leonidas</creator><creator>Gritta, Milan</creator><creator>Lampouras, Gerasimos</creator><creator>Iacobacci, Ignacio</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240618</creationdate><title>Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency</title><author>Gee, Leonidas ; Gritta, Milan ; Lampouras, Gerasimos ; Iacobacci, Ignacio</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a672-d65c2ebb09ce700ba0d7834ba2b0d674b320105cfcf352721995e9793c2b78843</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Gee, Leonidas</creatorcontrib><creatorcontrib>Gritta, Milan</creatorcontrib><creatorcontrib>Lampouras, Gerasimos</creatorcontrib><creatorcontrib>Iacobacci, Ignacio</creatorcontrib><collection>arXiv Computer 
Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Gee, Leonidas</au><au>Gritta, Milan</au><au>Lampouras, Gerasimos</au><au>Iacobacci, Ignacio</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency</atitle><date>2024-06-18</date><risdate>2024</risdate><abstract>Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a byproduct, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase will be open-sourced at www.open-source.link.</abstract><doi>10.48550/arxiv.2406.12502</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2406.12502
language eng
recordid cdi_arxiv_primary_2406_12502
source arXiv.org
subjects Computer Science - Computation and Language
title Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T05%3A54%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Code-Optimise:%20Self-Generated%20Preference%20Data%20for%20Correctness%20and%20Efficiency&rft.au=Gee,%20Leonidas&rft.date=2024-06-18&rft_id=info:doi/10.48550/arxiv.2406.12502&rft_dat=%3Carxiv_GOX%3E2406_12502%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true