Efficient generator of mathematical expressions for symbolic regression

We propose an approach to symbolic regression based on a novel variational autoencoder for generating hierarchical structures, HVAE. It combines simple atomic units with shared weights to recursively encode and decode the individual nodes in the hierarchy. Encoding is performed bottom-up and decodin...

Bibliographic details
Published in: Machine learning 2023-11, Vol.112 (11), p.4563-4596
Main authors: Mežnar, Sebastian, Džeroski, Sašo, Todorovski, Ljupčo
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 4596
container_issue 11
container_start_page 4563
container_title Machine learning
container_volume 112
creator Mežnar, Sebastian
Džeroski, Sašo
Todorovski, Ljupčo
description We propose an approach to symbolic regression based on a novel variational autoencoder for generating hierarchical structures, HVAE. It combines simple atomic units with shared weights to recursively encode and decode the individual nodes in the hierarchy. Encoding is performed bottom-up and decoding top-down. We empirically show that HVAE can be trained efficiently with small corpora of mathematical expressions and can accurately encode expressions into a smooth low-dimensional latent space. The latter can be efficiently explored with various optimization methods to address the task of symbolic regression. Indeed, random search through the latent space of HVAE performs better than random search through expressions generated by manually crafted probabilistic grammars for mathematical expressions. Finally, the EDHiE system for symbolic regression, which applies an evolutionary algorithm to the latent space of HVAE, reconstructs equations from a standard symbolic regression benchmark better than a state-of-the-art system based on a similar combination of deep learning and evolutionary algorithms.
doi_str_mv 10.1007/s10994-023-06400-2
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2881543112</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2881543112</sourcerecordid><originalsourceid>FETCH-LOGICAL-c363t-b7e63f35079cc52e759b1097d4a74450669ff893126ac8cfab5d2ebfbe9a4baf3</originalsourceid><addsrcrecordid>eNp9kE9LAzEQxYMoWKtfwNOC59XJ380epdQqFLzoOSTppG5pNzXZgv32pq7gzcsMw3tvZvgRckvhngI0D5lC24oaGK9BCYCanZEJlU0ZpZLnZAJay1pRJi_JVc4bAGBKqwlZzEPofIf9UK2xx2SHmKoYqp0dPrCUzttthV_7hDl3sc9VKHo-7lzcdr5KuP4VrslFsNuMN799St6f5m-z53r5uniZPS5rzxUfateg4oFLaFrvJcNGtq683qyEbYSQoFQbgm45Zcp67YN1csXQBYetFc4GPiV34959ip8HzIPZxEPqy0nDtKZScEpZcbHR5VPMOWEw-9TtbDoaCuYEzIzATAFmfoCZU4iPoVzM_RrT3-p_Ut-x-m8Z</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2881543112</pqid></control><display><type>article</type><title>Efficient generator of mathematical expressions for symbolic regression</title><source>Springer Nature - Complete Springer Journals</source><creator>Mežnar, Sebastian ; Džeroski, Sašo ; Todorovski, Ljupčo</creator><creatorcontrib>Mežnar, Sebastian ; Džeroski, Sašo ; Todorovski, Ljupčo</creatorcontrib><description>We propose an approach to symbolic regression based on a novel variational autoencoder for generating hierarchical structures, HVAE. It combines simple atomic units with shared weights to recursively encode and decode the individual nodes in the hierarchy. Encoding is performed bottom-up and decoding top-down. We empirically show that HVAE can be trained efficiently with small corpora of mathematical expressions and can accurately encode expressions into a smooth low-dimensional latent space. The latter can be efficiently explored with various optimization methods to address the task of symbolic regression. 
Indeed, random search through the latent space of HVAE performs better than random search through expressions generated by manually crafted probabilistic grammars for mathematical expressions. Finally, EDHiE system for symbolic regression, which applies an evolutionary algorithm to the latent space of HVAE, reconstructs equations from a standard symbolic regression benchmark better than a state-of-the-art system based on a similar combination of deep learning and evolutionary algorithms.</description><identifier>ISSN: 0885-6125</identifier><identifier>EISSN: 1573-0565</identifier><identifier>DOI: 10.1007/s10994-023-06400-2</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Artificial Intelligence ; Atomic properties ; Computer Science ; Control ; Decoding ; Evolutionary algorithms ; Genetic algorithms ; Grammars ; Machine Learning ; Mathematical analysis ; Mechatronics ; Natural Language Processing (NLP) ; Optimization ; Regression ; Robotics ; Simulation and Modeling ; Special Issue of the ECML PKDD 2023 Journal Track ; Statistical analysis</subject><ispartof>Machine learning, 2023-11, Vol.112 (11), p.4563-4596</ispartof><rights>The Author(s) 2023. corrected publication 2023</rights><rights>The Author(s) 2023. corrected publication 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c363t-b7e63f35079cc52e759b1097d4a74450669ff893126ac8cfab5d2ebfbe9a4baf3</citedby><cites>FETCH-LOGICAL-c363t-b7e63f35079cc52e759b1097d4a74450669ff893126ac8cfab5d2ebfbe9a4baf3</cites><orcidid>0000-0002-0469-6696</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10994-023-06400-2$$EPDF$$P50$$Gspringer$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10994-023-06400-2$$EHTML$$P50$$Gspringer$$Hfree_for_read</linktohtml><link.rule.ids>315,781,785,27926,27927,41490,42559,51321</link.rule.ids></links><search><creatorcontrib>Mežnar, Sebastian</creatorcontrib><creatorcontrib>Džeroski, Sašo</creatorcontrib><creatorcontrib>Todorovski, Ljupčo</creatorcontrib><title>Efficient generator of mathematical expressions for symbolic regression</title><title>Machine learning</title><addtitle>Mach Learn</addtitle><description>We propose an approach to symbolic regression based on a novel variational autoencoder for generating hierarchical structures, HVAE. It combines simple atomic units with shared weights to recursively encode and decode the individual nodes in the hierarchy. Encoding is performed bottom-up and decoding top-down. We empirically show that HVAE can be trained efficiently with small corpora of mathematical expressions and can accurately encode expressions into a smooth low-dimensional latent space. The latter can be efficiently explored with various optimization methods to address the task of symbolic regression. 
Indeed, random search through the latent space of HVAE performs better than random search through expressions generated by manually crafted probabilistic grammars for mathematical expressions. Finally, EDHiE system for symbolic regression, which applies an evolutionary algorithm to the latent space of HVAE, reconstructs equations from a standard symbolic regression benchmark better than a state-of-the-art system based on a similar combination of deep learning and evolutionary algorithms.</description><subject>Artificial Intelligence</subject><subject>Atomic properties</subject><subject>Computer Science</subject><subject>Control</subject><subject>Decoding</subject><subject>Evolutionary algorithms</subject><subject>Genetic algorithms</subject><subject>Grammars</subject><subject>Machine Learning</subject><subject>Mathematical analysis</subject><subject>Mechatronics</subject><subject>Natural Language Processing (NLP)</subject><subject>Optimization</subject><subject>Regression</subject><subject>Robotics</subject><subject>Simulation and Modeling</subject><subject>Special Issue of the ECML PKDD 2023 Journal Track</subject><subject>Statistical 
analysis</subject><issn>0885-6125</issn><issn>1573-0565</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9kE9LAzEQxYMoWKtfwNOC59XJ380epdQqFLzoOSTppG5pNzXZgv32pq7gzcsMw3tvZvgRckvhngI0D5lC24oaGK9BCYCanZEJlU0ZpZLnZAJay1pRJi_JVc4bAGBKqwlZzEPofIf9UK2xx2SHmKoYqp0dPrCUzttthV_7hDl3sc9VKHo-7lzcdr5KuP4VrslFsNuMN799St6f5m-z53r5uniZPS5rzxUfateg4oFLaFrvJcNGtq683qyEbYSQoFQbgm45Zcp67YN1csXQBYetFc4GPiV34959ip8HzIPZxEPqy0nDtKZScEpZcbHR5VPMOWEw-9TtbDoaCuYEzIzATAFmfoCZU4iPoVzM_RrT3-p_Ut-x-m8Z</recordid><startdate>20231101</startdate><enddate>20231101</enddate><creator>Mežnar, Sebastian</creator><creator>Džeroski, Sašo</creator><creator>Todorovski, Ljupčo</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>C6C</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7XB</scope><scope>88I</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M2P</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0002-0469-6696</orcidid></search><sort><creationdate>20231101</creationdate><title>Efficient generator of mathematical expressions for symbolic regression</title><author>Mežnar, Sebastian ; Džeroski, Sašo ; Todorovski, 
Ljupčo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c363t-b7e63f35079cc52e759b1097d4a74450669ff893126ac8cfab5d2ebfbe9a4baf3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Artificial Intelligence</topic><topic>Atomic properties</topic><topic>Computer Science</topic><topic>Control</topic><topic>Decoding</topic><topic>Evolutionary algorithms</topic><topic>Genetic algorithms</topic><topic>Grammars</topic><topic>Machine Learning</topic><topic>Mathematical analysis</topic><topic>Mechatronics</topic><topic>Natural Language Processing (NLP)</topic><topic>Optimization</topic><topic>Regression</topic><topic>Robotics</topic><topic>Simulation and Modeling</topic><topic>Special Issue of the ECML PKDD 2023 Journal Track</topic><topic>Statistical analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mežnar, Sebastian</creatorcontrib><creatorcontrib>Džeroski, Sašo</creatorcontrib><creatorcontrib>Todorovski, Ljupčo</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Science Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central 
Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer science database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>ProQuest Science Journals</collection><collection>ProQuest advanced technologies &amp; aerospace journals</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><jtitle>Machine learning</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mežnar, Sebastian</au><au>Džeroski, Sašo</au><au>Todorovski, Ljupčo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient generator of mathematical expressions for symbolic regression</atitle><jtitle>Machine learning</jtitle><stitle>Mach Learn</stitle><date>2023-11-01</date><risdate>2023</risdate><volume>112</volume><issue>11</issue><spage>4563</spage><epage>4596</epage><pages>4563-4596</pages><issn>0885-6125</issn><eissn>1573-0565</eissn><abstract>We propose an approach to symbolic regression based on a novel variational autoencoder for generating hierarchical structures, HVAE. 
It combines simple atomic units with shared weights to recursively encode and decode the individual nodes in the hierarchy. Encoding is performed bottom-up and decoding top-down. We empirically show that HVAE can be trained efficiently with small corpora of mathematical expressions and can accurately encode expressions into a smooth low-dimensional latent space. The latter can be efficiently explored with various optimization methods to address the task of symbolic regression. Indeed, random search through the latent space of HVAE performs better than random search through expressions generated by manually crafted probabilistic grammars for mathematical expressions. Finally, EDHiE system for symbolic regression, which applies an evolutionary algorithm to the latent space of HVAE, reconstructs equations from a standard symbolic regression benchmark better than a state-of-the-art system based on a similar combination of deep learning and evolutionary algorithms.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10994-023-06400-2</doi><tpages>34</tpages><orcidid>https://orcid.org/0000-0002-0469-6696</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0885-6125
ispartof Machine learning, 2023-11, Vol.112 (11), p.4563-4596
issn 0885-6125
1573-0565
language eng
recordid cdi_proquest_journals_2881543112
source Springer Nature - Complete Springer Journals
subjects Artificial Intelligence
Atomic properties
Computer Science
Control
Decoding
Evolutionary algorithms
Genetic algorithms
Grammars
Machine Learning
Mathematical analysis
Mechatronics
Natural Language Processing (NLP)
Optimization
Regression
Robotics
Simulation and Modeling
Special Issue of the ECML PKDD 2023 Journal Track
Statistical analysis
title Efficient generator of mathematical expressions for symbolic regression
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T19%3A02%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20generator%20of%20mathematical%20expressions%20for%20symbolic%20regression&rft.jtitle=Machine%20learning&rft.au=Me%C5%BEnar,%20Sebastian&rft.date=2023-11-01&rft.volume=112&rft.issue=11&rft.spage=4563&rft.epage=4596&rft.pages=4563-4596&rft.issn=0885-6125&rft.eissn=1573-0565&rft_id=info:doi/10.1007/s10994-023-06400-2&rft_dat=%3Cproquest_cross%3E2881543112%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2881543112&rft_id=info:pmid/&rfr_iscdi=true