Quantifying Behavioural Distance Between Mathematical Expressions

Existing symbolic regression methods organize the space of candidate mathematical expressions primarily based on their syntactic, structural similarity. However, this approach overlooks crucial equivalences between expressions that arise from mathematical symmetries, such as commutativity, associati...

Detailed description

Bibliographic details
Published in:	arXiv.org 2024-08
Main authors:	Mežnar, Sebastian; Džeroski, Sašo; Todorovski, Ljupčo
Format:	Article
Language:	eng
Subjects:	Associativity; Commutativity; Error analysis; Searching; Smoothness
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Mežnar, Sebastian; Džeroski, Sašo; Todorovski, Ljupčo
description	Existing symbolic regression methods organize the space of candidate mathematical expressions primarily based on their syntactic, structural similarity. However, this approach overlooks crucial equivalences between expressions that arise from mathematical symmetries, such as commutativity, associativity, and distribution laws for arithmetic operations. Consequently, expressions with similar errors on a given data set can lie far apart from each other in the search space. This leads to a rough error landscape in the search space that efficient local, gradient-based methods cannot explore. This paper proposes and implements a measure of behavioral distance, BED, that clusters together expressions with similar errors. The experimental results show that the stochastic method for calculating BED achieves consistency with a modest number of sampled values for evaluating the expressions. This leads to computational efficiency comparable to that of the tree-based syntactic distance. Our findings also reveal that BED significantly improves the smoothness of the error landscape in the search space for symbolic regression.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3095811190</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3095811190</sourcerecordid><originalsourceid>FETCH-proquest_journals_30958111903</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRwDCxNzCvJTKvMzEtXcErNSCzLzC8tSsxRcMksLknMS04FCpaUp6bmKfgmlmSk5iaWZCYDZV0rCopSi4sz8_OKeRhY0xJzilN5oTQ3g7Kba4izh25BUX5haWpxSXwW0MQ8oFS8sYGlqYWhoaGlgTFxqgDKozoT</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3095811190</pqid></control><display><type>article</type><title>Quantifying Behavioural Distance Between Mathematical Expressions</title><source>Freely Accessible Journals</source><creator>Mežnar, Sebastian ; Džeroski, Sašo ; Todorovski, Ljupčo</creator><creatorcontrib>Mežnar, Sebastian ; Džeroski, Sašo ; Todorovski, Ljupčo</creatorcontrib><description>Existing symbolic regression methods organize the space of candidate mathematical expressions primarily based on their syntactic, structural similarity. However, this approach overlooks crucial equivalences between expressions that arise from mathematical symmetries, such as commutativity, associativity, and distribution laws for arithmetic operations. Consequently, expressions with similar errors on a given data set are apart from each other in the search space. This leads to a rough error landscape in the search space that efficient local, gradient-based methods cannot explore. This paper proposes and implements a measure of a behavioral distance, BED, that clusters together expressions with similar errors. The experimental results show that the stochastic method for calculating BED achieves consistency with a modest number of sampled values for evaluating the expressions. This leads to computational efficiency comparable to the tree-based syntactic distance. 
Our findings also reveal that BED significantly improves the smoothness of the error landscape in the search space for symbolic regression.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Associativity ; Commutativity ; Error analysis ; Searching ; Smoothness</subject><ispartof>arXiv.org, 2024-08</ispartof><rights>2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>781,785</link.rule.ids></links><search><creatorcontrib>Mežnar, Sebastian</creatorcontrib><creatorcontrib>Džeroski, Sašo</creatorcontrib><creatorcontrib>Todorovski, Ljupčo</creatorcontrib><title>Quantifying Behavioural Distance Between Mathematical Expressions</title><title>arXiv.org</title><description>Existing symbolic regression methods organize the space of candidate mathematical expressions primarily based on their syntactic, structural similarity. However, this approach overlooks crucial equivalences between expressions that arise from mathematical symmetries, such as commutativity, associativity, and distribution laws for arithmetic operations. Consequently, expressions with similar errors on a given data set are apart from each other in the search space. This leads to a rough error landscape in the search space that efficient local, gradient-based methods cannot explore. This paper proposes and implements a measure of a behavioral distance, BED, that clusters together expressions with similar errors. 
The experimental results show that the stochastic method for calculating BED achieves consistency with a modest number of sampled values for evaluating the expressions. This leads to computational efficiency comparable to the tree-based syntactic distance. Our findings also reveal that BED significantly improves the smoothness of the error landscape in the search space for symbolic regression.</description><subject>Associativity</subject><subject>Commutativity</subject><subject>Error analysis</subject><subject>Searching</subject><subject>Smoothness</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRwDCxNzCvJTKvMzEtXcErNSCzLzC8tSsxRcMksLknMS04FCpaUp6bmKfgmlmSk5iaWZCYDZV0rCopSi4sz8_OKeRhY0xJzilN5oTQ3g7Kba4izh25BUX5haWpxSXwW0MQ8oFS8sYGlqYWhoaGlgTFxqgDKozoT</recordid><startdate>20240821</startdate><enddate>20240821</enddate><creator>Mežnar, Sebastian</creator><creator>Džeroski, Sašo</creator><creator>Todorovski, Ljupčo</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240821</creationdate><title>Quantifying Behavioural Distance Between Mathematical Expressions</title><author>Mežnar, Sebastian ; Džeroski, Sašo ; Todorovski, 
Ljupčo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30958111903</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Associativity</topic><topic>Commutativity</topic><topic>Error analysis</topic><topic>Searching</topic><topic>Smoothness</topic><toplevel>online_resources</toplevel><creatorcontrib>Mežnar, Sebastian</creatorcontrib><creatorcontrib>Džeroski, Sašo</creatorcontrib><creatorcontrib>Todorovski, Ljupčo</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mežnar, Sebastian</au><au>Džeroski, Sašo</au><au>Todorovski, Ljupčo</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Quantifying Behavioural Distance Between Mathematical 
Expressions</atitle><jtitle>arXiv.org</jtitle><date>2024-08-21</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Existing symbolic regression methods organize the space of candidate mathematical expressions primarily based on their syntactic, structural similarity. However, this approach overlooks crucial equivalences between expressions that arise from mathematical symmetries, such as commutativity, associativity, and distribution laws for arithmetic operations. Consequently, expressions with similar errors on a given data set are apart from each other in the search space. This leads to a rough error landscape in the search space that efficient local, gradient-based methods cannot explore. This paper proposes and implements a measure of a behavioral distance, BED, that clusters together expressions with similar errors. The experimental results show that the stochastic method for calculating BED achieves consistency with a modest number of sampled values for evaluating the expressions. This leads to computational efficiency comparable to the tree-based syntactic distance. Our findings also reveal that BED significantly improves the smoothness of the error landscape in the search space for symbolic regression.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-08
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_3095811190
source	Freely Accessible Journals
subjects	Associativity; Commutativity; Error analysis; Searching; Smoothness
title	Quantifying Behavioural Distance Between Mathematical Expressions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T07%3A38%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Quantifying%20Behavioural%20Distance%20Between%20Mathematical%20Expressions&rft.jtitle=arXiv.org&rft.au=Me%C5%BEnar,%20Sebastian&rft.date=2024-08-21&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3095811190%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3095811190&rft_id=info:pmid/&rfr_iscdi=true