Code and Data Synthesis for Genetic Improvement in Emergent Software Systems

Emergent software systems are assembled from a collection of small code blocks, where some of those blocks have alternative implementation variants; they optimise at run-time by learning which compositions of alternative blocks best suit each deployment environment encountered. In this paper we study...

Detailed description

Bibliographic details
Published in:	ACM transactions on evolutionary learning, 2022-08, Vol.2 (2), p.1-35, Article 7
Main authors:	Rainford, Penny Faulkner; Porter, Barry
Format:	Article
Language:	eng
Subjects:	Artificial life; Computing methodologies; Genetic algorithms
Online access:	Full text

container_end_page	35
container_issue	2
container_start_page	1
container_title	ACM transactions on evolutionary learning
container_volume	2
creator	Rainford, Penny Faulkner; Porter, Barry
description	Emergent software systems are assembled from a collection of small code blocks, where some of those blocks have alternative implementation variants; they optimise at run-time by learning which compositions of alternative blocks best suit each deployment environment encountered. In this paper we study the automated synthesis of new implementation variants for a running system using genetic improvement (GI). Typical GI approaches, however, rely on large amounts of data for accurate training and large code bases from which to source genetic material. In emergent systems we have neither asset, with sparsely sampled runtime data and small code volumes in each building block. We therefore examine two approaches to more effective GI under these constraints: the synthesis of data from sparse samples to construct statistically representative larger training corpora; and the synthesis of code to counter the relative lack of genetic material in our starting population members. Our results demonstrate that a mixture of synthesised and existing code is a viable optimisation strategy, and that phases of increased synthesis can make GI more robust to deleterious mutations. On synthesised data, we find that we can produce equivalent optimisation compared to GI methods using larger data sets, and that this optimisation can produce both useful specialists and generalists.
doi_str_mv	10.1145/3542823
format	Article
fullrecord	<record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3542823</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3542823</sourcerecordid><originalsourceid>FETCH-LOGICAL-a1543-d568fd5a6b94384c8c02f24122331333df1b5ffada9847d8c1fec8344b3404d43</originalsourceid><addsrcrecordid>eNo9kMFLwzAYxYMoOObw7ik3T9UkX9KmR6lzGxQ8TMFbSZMvWrHtSIKy_96NVU_vwfu9d3iEXHN2x7lU96Ck0ALOyEzkWmfAWHE-eVGWb5dkEeMnY0woDgUrZqSuRofUDI4-mmTodj-kD4xdpH4MdIUDps7STb8L4zf2OCTaDXTZY3g_-u3o048JeKjFhH28IhfefEVcTDonr0_Ll2qd1c-rTfVQZ4YrCZlTufZOmbwtJWhptWXCC8mFAOAA4DxvlffGmVLLwmnLPVoNUrYgmXQS5uT2tGvDGGNA3-xC15uwbzhrjj800w8H8uZEGtv_Q3_hL_VmVr0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Code and Data Synthesis for Genetic Improvement in Emergent Software Systems</title><source>ACM Digital Library Complete</source><creator>Rainford, Penny Faulkner ; Porter, Barry</creator><creatorcontrib>Rainford, Penny Faulkner ; Porter, Barry</creatorcontrib><description>Emergent software systems are assembled from a collection of small code blocks, where some of those blocks have alternative implementation variants; they optimise at run-time by learning which compositions of alternative blocks best suit each deployment environment encountered.In this paper we study the automated synthesis of new implementation variants for a running system using genetic improvement (GI). Typical GI approaches, however, rely on large amounts of data for accurate training and large code bases from which to source genetic material. 
In emergent systems we have neither asset, with sparsely sampled runtime data and small code volumes in each building block.We therefore examine two approaches to more effective GI under these constraints: the synthesis of data from sparse samples to construct statistically representative larger training corpora; and the synthesis of code to counter the relative lack of genetic material in our starting population members.Our results demonstrate that a mixture of synthesised and existing code is a viable optimisation strategy, and that phases of increased synthesis can make GI more robust to deleterious mutations. On synthesised data, we find that we can produce equivalent optimisation compared to GI methods using larger data sets, and that this optimisation can produce both useful specialists and generalists.</description><identifier>ISSN: 2688-299X</identifier><identifier>EISSN: 2688-3007</identifier><identifier>DOI: 10.1145/3542823</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Artificial life ; Computing methodologies ; Genetic algorithms</subject><ispartof>ACM transactions on evolutionary learning, 2022-08, Vol.2 (2), p.1-35, Article 7</ispartof><rights>Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 
Request permissions from</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a1543-d568fd5a6b94384c8c02f24122331333df1b5ffada9847d8c1fec8344b3404d43</cites><orcidid>0000-0001-8376-736X ; 0000-0002-0552-2209</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3542823$$EPDF$$P50$$Gacm$$H</linktopdf><link.rule.ids>314,780,784,2282,27924,27925,40196,76228</link.rule.ids></links><search><creatorcontrib>Rainford, Penny Faulkner</creatorcontrib><creatorcontrib>Porter, Barry</creatorcontrib><title>Code and Data Synthesis for Genetic Improvement in Emergent Software Systems</title><title>ACM transactions on evolutionary learning</title><addtitle>ACM TELO</addtitle><description>Emergent software systems are assembled from a collection of small code blocks, where some of those blocks have alternative implementation variants; they optimise at run-time by learning which compositions of alternative blocks best suit each deployment environment encountered.In this paper we study the automated synthesis of new implementation variants for a running system using genetic improvement (GI). Typical GI approaches, however, rely on large amounts of data for accurate training and large code bases from which to source genetic material. 
In emergent systems we have neither asset, with sparsely sampled runtime data and small code volumes in each building block.We therefore examine two approaches to more effective GI under these constraints: the synthesis of data from sparse samples to construct statistically representative larger training corpora; and the synthesis of code to counter the relative lack of genetic material in our starting population members.Our results demonstrate that a mixture of synthesised and existing code is a viable optimisation strategy, and that phases of increased synthesis can make GI more robust to deleterious mutations. On synthesised data, we find that we can produce equivalent optimisation compared to GI methods using larger data sets, and that this optimisation can produce both useful specialists and generalists.</description><subject>Artificial life</subject><subject>Computing methodologies</subject><subject>Genetic algorithms</subject><issn>2688-299X</issn><issn>2688-3007</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNo9kMFLwzAYxYMoOObw7ik3T9UkX9KmR6lzGxQ8TMFbSZMvWrHtSIKy_96NVU_vwfu9d3iEXHN2x7lU96Ck0ALOyEzkWmfAWHE-eVGWb5dkEeMnY0woDgUrZqSuRofUDI4-mmTodj-kD4xdpH4MdIUDps7STb8L4zf2OCTaDXTZY3g_-u3o048JeKjFhH28IhfefEVcTDonr0_Ll2qd1c-rTfVQZ4YrCZlTufZOmbwtJWhptWXCC8mFAOAA4DxvlffGmVLLwmnLPVoNUrYgmXQS5uT2tGvDGGNA3-xC15uwbzhrjj800w8H8uZEGtv_Q3_hL_VmVr0</recordid><startdate>20220817</startdate><enddate>20220817</enddate><creator>Rainford, Penny Faulkner</creator><creator>Porter, Barry</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0001-8376-736X</orcidid><orcidid>https://orcid.org/0000-0002-0552-2209</orcidid></search><sort><creationdate>20220817</creationdate><title>Code and Data Synthesis for Genetic Improvement in Emergent Software Systems</title><author>Rainford, Penny Faulkner ; Porter, 
Barry</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a1543-d568fd5a6b94384c8c02f24122331333df1b5ffada9847d8c1fec8344b3404d43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Artificial life</topic><topic>Computing methodologies</topic><topic>Genetic algorithms</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rainford, Penny Faulkner</creatorcontrib><creatorcontrib>Porter, Barry</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on evolutionary learning</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rainford, Penny Faulkner</au><au>Porter, Barry</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Code and Data Synthesis for Genetic Improvement in Emergent Software Systems</atitle><jtitle>ACM transactions on evolutionary learning</jtitle><stitle>ACM TELO</stitle><date>2022-08-17</date><risdate>2022</risdate><volume>2</volume><issue>2</issue><spage>1</spage><epage>35</epage><pages>1-35</pages><artnum>7</artnum><issn>2688-299X</issn><eissn>2688-3007</eissn><abstract>Emergent software systems are assembled from a collection of small code blocks, where some of those blocks have alternative implementation variants; they optimise at run-time by learning which compositions of alternative blocks best suit each deployment environment encountered.In this paper we study the automated synthesis of new implementation variants for a running system using genetic improvement (GI). Typical GI approaches, however, rely on large amounts of data for accurate training and large code bases from which to source genetic material. 
In emergent systems we have neither asset, with sparsely sampled runtime data and small code volumes in each building block.We therefore examine two approaches to more effective GI under these constraints: the synthesis of data from sparse samples to construct statistically representative larger training corpora; and the synthesis of code to counter the relative lack of genetic material in our starting population members.Our results demonstrate that a mixture of synthesised and existing code is a viable optimisation strategy, and that phases of increased synthesis can make GI more robust to deleterious mutations. On synthesised data, we find that we can produce equivalent optimisation compared to GI methods using larger data sets, and that this optimisation can produce both useful specialists and generalists.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3542823</doi><tpages>35</tpages><orcidid>https://orcid.org/0000-0001-8376-736X</orcidid><orcidid>https://orcid.org/0000-0002-0552-2209</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2688-299X
ispartof	ACM transactions on evolutionary learning, 2022-08, Vol.2 (2), p.1-35, Article 7
issn	2688-299X 2688-3007
language	eng
recordid	cdi_crossref_primary_10_1145_3542823
source	ACM Digital Library Complete
subjects	Artificial life; Computing methodologies; Genetic algorithms
title	Code and Data Synthesis for Genetic Improvement in Emergent Software Systems
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-23T16%3A39%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Code%20and%20Data%20Synthesis%20for%20Genetic%20Improvement%20in%20Emergent%20Software%20Systems&rft.jtitle=ACM%20transactions%20on%20evolutionary%20learning&rft.au=Rainford,%20Penny%20Faulkner&rft.date=2022-08-17&rft.volume=2&rft.issue=2&rft.spage=1&rft.epage=35&rft.pages=1-35&rft.artnum=7&rft.issn=2688-299X&rft.eissn=2688-3007&rft_id=info:doi/10.1145/3542823&rft_dat=%3Cacm_cross%3E3542823%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true