SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning

Automated Machine Learning (AutoML) approaches encompass traditional methods that optimize fixed pipelines for model selection and ensembling, as well as newer LLM-based frameworks that autonomously build pipelines. While LLM-based agents have shown promise in automating machine learning tasks, they...



Bibliographic Details
Published in:	arXiv.org 2024-10
Main Authors:	Chi, Yizhou ; Lin, Yizhang ; Hong, Sirui ; Pan, Duyi ; Yaying Fei ; Guanghao Mei ; Liu, Bangbang ; Pang, Tianqi ; Kwok, Jacky ; Zhang, Ceyao ; Liu, Bang ; Wu, Chenglin
Format:	Article
Language:	eng
Subjects:	Automation ; Cognitive tasks ; Datasets ; Machine learning ; Optimization ; Pipelines ; Searching ; Solution space
Online Access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Chi, Yizhou ; Lin, Yizhang ; Hong, Sirui ; Pan, Duyi ; Yaying Fei ; Guanghao Mei ; Liu, Bangbang ; Pang, Tianqi ; Kwok, Jacky ; Zhang, Ceyao ; Liu, Bang ; Wu, Chenglin
description	Automated Machine Learning (AutoML) approaches encompass traditional methods that optimize fixed pipelines for model selection and ensembling, as well as newer LLM-based frameworks that autonomously build pipelines. While LLM-based agents have shown promise in automating machine learning tasks, they often generate low-diversity and suboptimal code, even after multiple iterations. To overcome these limitations, we introduce Tree-Search Enhanced LLM Agents (SELA), an innovative agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process. By representing pipeline configurations as trees, our framework enables agents to conduct experiments intelligently and iteratively refine their strategies, facilitating a more effective exploration of the machine learning solution space. This novel approach allows SELA to discover optimal pathways based on experimental feedback, improving the overall quality of the solutions. In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods, demonstrating that SELA achieves a win rate of 65% to 80% against each baseline across all datasets. These results underscore the significant potential of agent-based strategies in AutoML, offering a fresh perspective on tackling complex machine learning challenges.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3119817548</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3119817548</sourcerecordid><originalsourceid>FETCH-proquest_journals_31198175483</originalsourceid><addsrcrecordid>eNqNi7sOgjAUQBsTE4nyDzdxJqEtCLoRxTiUSXbS4OUVvdW2_L8MfoDTGc45KxYIKXmUJ0JsWOjcFMexOGQiTWXALvdSFSeoLWJ0R23bAUoaNLX4AKUqKHok76AzForZm5f2i6h0O4yEoJaBRup3bN3pp8Pwxy3bX8v6fIve1nxmdL6ZzGxpUY3k_JjzLE1y-V_1BTmUOMY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3119817548</pqid></control><display><type>article</type><title>SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning</title><source>Free E- Journals</source><creator>Chi, Yizhou ; Lin, Yizhang ; Hong, Sirui ; Pan, Duyi ; Yaying Fei ; Guanghao Mei ; Liu, Bangbang ; Pang, Tianqi ; Kwok, Jacky ; Zhang, Ceyao ; Liu, Bang ; Wu, Chenglin</creator><creatorcontrib>Chi, Yizhou ; Lin, Yizhang ; Hong, Sirui ; Pan, Duyi ; Yaying Fei ; Guanghao Mei ; Liu, Bangbang ; Pang, Tianqi ; Kwok, Jacky ; Zhang, Ceyao ; Liu, Bang ; Wu, Chenglin</creatorcontrib><description>Automated Machine Learning (AutoML) approaches encompass traditional methods that optimize fixed pipelines for model selection and ensembling, as well as newer LLM-based frameworks that autonomously build pipelines. While LLM-based agents have shown promise in automating machine learning tasks, they often generate low-diversity and suboptimal code, even after multiple iterations. To overcome these limitations, we introduce Tree-Search Enhanced LLM Agents (SELA), an innovative agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process. 
By representing pipeline configurations as trees, our framework enables agents to conduct experiments intelligently and iteratively refine their strategies, facilitating a more effective exploration of the machine learning solution space. This novel approach allows SELA to discover optimal pathways based on experimental feedback, improving the overall quality of the solutions. In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods, demonstrating that SELA achieves a win rate of 65% to 80% against each baseline across all datasets. These results underscore the significant potential of agent-based strategies in AutoML, offering a fresh perspective on tackling complex machine learning challenges.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Automation ; Cognitive tasks ; Datasets ; Machine learning ; Optimization ; Pipelines ; Searching ; Solution space</subject><ispartof>arXiv.org, 2024-10</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Chi, Yizhou</creatorcontrib><creatorcontrib>Lin, Yizhang</creatorcontrib><creatorcontrib>Hong, Sirui</creatorcontrib><creatorcontrib>Pan, Duyi</creatorcontrib><creatorcontrib>Yaying Fei</creatorcontrib><creatorcontrib>Guanghao Mei</creatorcontrib><creatorcontrib>Liu, Bangbang</creatorcontrib><creatorcontrib>Pang, Tianqi</creatorcontrib><creatorcontrib>Kwok, Jacky</creatorcontrib><creatorcontrib>Zhang, Ceyao</creatorcontrib><creatorcontrib>Liu, Bang</creatorcontrib><creatorcontrib>Wu, Chenglin</creatorcontrib><title>SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning</title><title>arXiv.org</title><description>Automated Machine Learning (AutoML) approaches encompass traditional methods that optimize fixed pipelines for model selection and ensembling, as well as newer LLM-based frameworks that autonomously build pipelines. While LLM-based agents have shown promise in automating machine learning tasks, they often generate low-diversity and suboptimal code, even after multiple iterations. To overcome these limitations, we introduce Tree-Search Enhanced LLM Agents (SELA), an innovative agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process. By representing pipeline configurations as trees, our framework enables agents to conduct experiments intelligently and iteratively refine their strategies, facilitating a more effective exploration of the machine learning solution space. 
This novel approach allows SELA to discover optimal pathways based on experimental feedback, improving the overall quality of the solutions. In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods, demonstrating that SELA achieves a win rate of 65% to 80% against each baseline across all datasets. These results underscore the significant potential of agent-based strategies in AutoML, offering a fresh perspective on tackling complex machine learning challenges.</description><subject>Automation</subject><subject>Cognitive tasks</subject><subject>Datasets</subject><subject>Machine learning</subject><subject>Optimization</subject><subject>Pipelines</subject><subject>Searching</subject><subject>Solution space</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNi7sOgjAUQBsTE4nyDzdxJqEtCLoRxTiUSXbS4OUVvdW2_L8MfoDTGc45KxYIKXmUJ0JsWOjcFMexOGQiTWXALvdSFSeoLWJ0R23bAUoaNLX4AKUqKHok76AzForZm5f2i6h0O4yEoJaBRup3bN3pp8Pwxy3bX8v6fIve1nxmdL6ZzGxpUY3k_JjzLE1y-V_1BTmUOMY</recordid><startdate>20241022</startdate><enddate>20241022</enddate><creator>Chi, Yizhou</creator><creator>Lin, Yizhang</creator><creator>Hong, Sirui</creator><creator>Pan, Duyi</creator><creator>Yaying Fei</creator><creator>Guanghao Mei</creator><creator>Liu, Bangbang</creator><creator>Pang, Tianqi</creator><creator>Kwok, Jacky</creator><creator>Zhang, Ceyao</creator><creator>Liu, Bang</creator><creator>Wu, Chenglin</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20241022</creationdate><title>SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning</title><author>Chi, Yizhou ; Lin, Yizhang ; Hong, Sirui ; Pan, Duyi ; Yaying Fei ; Guanghao Mei ; Liu, Bangbang ; Pang, Tianqi ; Kwok, Jacky ; Zhang, Ceyao ; Liu, Bang ; Wu, Chenglin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_31198175483</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Automation</topic><topic>Cognitive tasks</topic><topic>Datasets</topic><topic>Machine learning</topic><topic>Optimization</topic><topic>Pipelines</topic><topic>Searching</topic><topic>Solution space</topic><toplevel>online_resources</toplevel><creatorcontrib>Chi, Yizhou</creatorcontrib><creatorcontrib>Lin, Yizhang</creatorcontrib><creatorcontrib>Hong, Sirui</creatorcontrib><creatorcontrib>Pan, Duyi</creatorcontrib><creatorcontrib>Yaying Fei</creatorcontrib><creatorcontrib>Guanghao Mei</creatorcontrib><creatorcontrib>Liu, Bangbang</creatorcontrib><creatorcontrib>Pang, Tianqi</creatorcontrib><creatorcontrib>Kwok, Jacky</creatorcontrib><creatorcontrib>Zhang, Ceyao</creatorcontrib><creatorcontrib>Liu, Bang</creatorcontrib><creatorcontrib>Wu, Chenglin</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central 
UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chi, Yizhou</au><au>Lin, Yizhang</au><au>Hong, Sirui</au><au>Pan, Duyi</au><au>Yaying Fei</au><au>Guanghao Mei</au><au>Liu, Bangbang</au><au>Pang, Tianqi</au><au>Kwok, Jacky</au><au>Zhang, Ceyao</au><au>Liu, Bang</au><au>Wu, Chenglin</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning</atitle><jtitle>arXiv.org</jtitle><date>2024-10-22</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Automated Machine Learning (AutoML) approaches encompass traditional methods that optimize fixed pipelines for model selection and ensembling, as well as newer LLM-based frameworks that autonomously build pipelines. While LLM-based agents have shown promise in automating machine learning tasks, they often generate low-diversity and suboptimal code, even after multiple iterations. To overcome these limitations, we introduce Tree-Search Enhanced LLM Agents (SELA), an innovative agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process. 
By representing pipeline configurations as trees, our framework enables agents to conduct experiments intelligently and iteratively refine their strategies, facilitating a more effective exploration of the machine learning solution space. This novel approach allows SELA to discover optimal pathways based on experimental feedback, improving the overall quality of the solutions. In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods, demonstrating that SELA achieves a win rate of 65% to 80% against each baseline across all datasets. These results underscore the significant potential of agent-based strategies in AutoML, offering a fresh perspective on tackling complex machine learning challenges.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-10
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_3119817548
source	Free E- Journals
subjects	Automation ; Cognitive tasks ; Datasets ; Machine learning ; Optimization ; Pipelines ; Searching ; Solution space
title	SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T01%3A15%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=SELA:%20Tree-Search%20Enhanced%20LLM%20Agents%20for%20Automated%20Machine%20Learning&rft.jtitle=arXiv.org&rft.au=Chi,%20Yizhou&rft.date=2024-10-22&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3119817548%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3119817548&rft_id=info:pmid/&rfr_iscdi=true
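
The description field says that SELA represents pipeline configurations as a tree and uses Monte Carlo Tree Search (MCTS) to choose which experiments to run. The record gives no implementation details, so the following is only a minimal sketch of generic MCTS with UCT selection over a toy pipeline space, not the authors' method. The stage names, the options, the `TOY_SCORE` table, and the `evaluate` function are all hypothetical stand-ins; in a real system `evaluate` would train and validate a pipeline.

```python
import math
import random

# Hypothetical search space: each pipeline stage offers a few options.
STAGES = [
    ("imputation", ["mean", "median"]),
    ("features", ["raw", "polynomial"]),
    ("model", ["linear", "tree", "boosting"]),
]

# Made-up per-option scores standing in for real experimental feedback.
TOY_SCORE = {
    "mean": 0.1, "median": 0.2,
    "raw": 0.1, "polynomial": 0.3,
    "linear": 0.1, "tree": 0.2, "boosting": 0.4,
}


def evaluate(config):
    """Stand-in for running one experiment on a complete pipeline."""
    return sum(TOY_SCORE[choice] for choice in config)


class Node:
    """A partial pipeline: one option chosen for each of the first k stages."""

    def __init__(self, config=(), parent=None):
        self.config = config
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total = 0.0

    def is_terminal(self):
        return len(self.config) == len(STAGES)

    def untried(self):
        tried = {child.config[-1] for child in self.children}
        return [o for o in STAGES[len(self.config)][1] if o not in tried]


def uct(child, c=1.4):
    """Mean score plus an exploration bonus for rarely visited children."""
    bonus = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return child.total / child.visits + bonus


def search(iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node()
    best_config, best_score = None, float("-inf")
    for _ in range(iterations):
        node = root
        # Selection: descend while every option at this node has been tried.
        while not node.is_terminal() and not node.untried():
            node = max(node.children, key=uct)
        # Expansion: add one untried option as a new child.
        if not node.is_terminal():
            option = rng.choice(node.untried())
            child = Node(node.config + (option,), node)
            node.children.append(child)
            node = child
        # Simulation: complete the pipeline at random and score it.
        config = list(node.config)
        for _, options in STAGES[len(config):]:
            config.append(rng.choice(options))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = tuple(config), score
        # Backpropagation: push the score up to the root.
        while node is not None:
            node.visits += 1
            node.total += score
            node = node.parent
    return best_config, best_score
```

Calling `search()` returns the best complete configuration seen during the search together with its score. The four phases in the loop (selection, expansion, simulation, backpropagation) are the standard MCTS steps; how SELA itself defines nodes, rollouts, and rewards is not stated in this record.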