ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application
This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI's ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition programs,...
Published in: | IEEE Access 2023, Vol.11, p.95060-95078 |
---|---|
Main authors: | Wake, Naoki; Kanehira, Atsushi; Sasabuchi, Kazuhiro; Takamatsu, Jun; Ikeuchi, Katsushi |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 95078 |
---|---|
container_issue | |
container_start_page | 95060 |
container_title | IEEE access |
container_volume | 11 |
creator | Wake, Naoki; Kanehira, Atsushi; Sasabuchi, Kazuhiro; Takamatsu, Jun; Ikeuchi, Katsushi |
description | This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI's ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition programs, adapt to various environments, and create multi-step task plans while mitigating the impact of the token limit imposed on ChatGPT. In our approach, ChatGPT receives both instructions and textual environmental data, and outputs a task plan and an updated environment. These environmental data are reused in subsequent task planning, thus eliminating the extensive record-keeping of prior task plans within the prompts of ChatGPT. Experimental results demonstrated the effectiveness of these prompts across various domestic environments, such as manipulations in front of a shelf, a fridge, and a drawer. The conversational capability of ChatGPT allows users to adjust the output via natural-language feedback. Additionally, a quantitative evaluation using VirtualHome showed that our results are comparable to previous studies. Specifically, 36% of task planning met both executability and correctness, and the rate approached 100% after several rounds of feedback. Our experiments revealed that ChatGPT can reasonably plan tasks and estimate post-operation environments without actual experience in object manipulation. Despite the allure of ChatGPT-based task planning in robotics, a standardized methodology remains elusive, making our work a substantial contribution. These prompts can serve as customizable templates, offering practical resources for the robotics research community. Our prompts and source code are open source and publicly available at https://github.com/microsoft/ChatGPT-Robot-Manipulation-Prompts . |
doi_str_mv | 10.1109/ACCESS.2023.3310935 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_ACCESS_2023_3310935</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10235949</ieee_id><doaj_id>oai_doaj_org_article_d310831d5296475aa84f1dabd9cfc6d2</doaj_id><sourcerecordid>2862622312</sourcerecordid><originalsourceid>FETCH-LOGICAL-c409t-6beb3d7cb1b4361c6a26c7de3c852617663131808fd1b836ccc652d768ebb00e3</originalsourceid><addsrcrecordid>eNpNUU1LxDAQLaKgqL9ADwHPXZNMm6belrKuCwuKq15DvqpZdpuaZBX_vV0r4lxmeLz3ZpiXZRcETwjB9fW0aWar1YRiChOAAYHyIDuhhNU5lMAO_83H2XmMazwUH6CyOsmemzeZ5g9PaLbt_acN1qCl717zVbI9evTKJ9T4LgW_Qa5DLzI4v4to1n244Lut7VK8QVPUyGjRtO83TsvkfHeWHbVyE-35bz_Nnm9nT81dvryfL5rpMtcFrlPOlFVgKq2IKoARzSRlujIWNC8pIxVjQIBwzFtDFAemtWYlNRXjVimMLZxmi9HXeLkWfXBbGb6El078AD68ChmS0xsrzPAZDsSUtGZFVUrJi5YYqUytW80MHbyuRq8--PedjUms_S50w_mCckYZpUD2LBhZOvgYg23_thIs9nGIMQ6xj0P8xjGoLkeVs9b-U1Ao66KGb6SuhQg</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2862622312</pqid></control><display><type>article</type><title>ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Wake, Naoki ; Kanehira, Atsushi ; Sasabuchi, Kazuhiro ; Takamatsu, Jun ; Ikeuchi, Katsushi</creator><creatorcontrib>Wake, Naoki ; Kanehira, Atsushi ; Sasabuchi, Kazuhiro ; Takamatsu, Jun ; Ikeuchi, Katsushi</creatorcontrib><description>This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI's ChatGPT in a few-shot setting. 
We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition programs, adapt to various environments, and create multi-step task plans while mitigating the impact of token limit imposed on ChatGPT. In our approach, ChatGPT receives both instructions and textual environmental data, and outputs a task plan and an updated environment. These environmental data are reused in subsequent task planning, thus eliminating the extensive record-keeping of prior task plans within the prompts of ChatGPT. Experimental results demonstrated the effectiveness of these prompts across various domestic environments, such as manipulations in front of a shelf, a fridge, and a drawer. The conversational capability of ChatGPT allows users to adjust the output via natural-language feedback. Additionally, a quantitative evaluation using VirtualHome showed that our results are comparable to previous studies. Specifically, 36% of task planning met both executability and correctness, and the rate approached 100% after several rounds of feedback. Our experiments revealed that ChatGPT can reasonably plan tasks and estimate post-operation environments without actual experience in object manipulation. Despite the allure of ChatGPT-based task planning in robotics, a standardized methodology remains elusive, making our work a substantial contribution. These prompts can serve as customizable templates, offering practical resources for the robotics research community. 
Our prompts and source code are open source and publicly available at https://github.com/microsoft/ChatGPT-Robot-Manipulation-Prompts .</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2023.3310935</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Chatbots ; ChatGPT ; Feedback ; large language models ; Natural language processing ; Oral communication ; Planning ; Robot control ; robot manipulation ; Robotics ; Robots ; Source code ; Task analysis ; Task planning ; Task planning (robotics) ; Translating ; Visualization</subject><ispartof>IEEE access, 2023, Vol.11, p.95060-95078</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c409t-6beb3d7cb1b4361c6a26c7de3c852617663131808fd1b836ccc652d768ebb00e3</citedby><cites>FETCH-LOGICAL-c409t-6beb3d7cb1b4361c6a26c7de3c852617663131808fd1b836ccc652d768ebb00e3</cites><orcidid>0000-0001-9758-9357 ; 0000-0001-7457-2878 ; 0000-0001-8278-2373 ; 0000-0002-5408-3089</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10235949$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,864,2102,4024,27633,27923,27924,27925,54933</link.rule.ids></links><search><creatorcontrib>Wake, Naoki</creatorcontrib><creatorcontrib>Kanehira, Atsushi</creatorcontrib><creatorcontrib>Sasabuchi, Kazuhiro</creatorcontrib><creatorcontrib>Takamatsu, Jun</creatorcontrib><creatorcontrib>Ikeuchi, Katsushi</creatorcontrib><title>ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application</title><title>IEEE 
access</title><addtitle>Access</addtitle><description>This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI's ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition programs, adapt to various environments, and create multi-step task plans while mitigating the impact of token limit imposed on ChatGPT. In our approach, ChatGPT receives both instructions and textual environmental data, and outputs a task plan and an updated environment. These environmental data are reused in subsequent task planning, thus eliminating the extensive record-keeping of prior task plans within the prompts of ChatGPT. Experimental results demonstrated the effectiveness of these prompts across various domestic environments, such as manipulations in front of a shelf, a fridge, and a drawer. The conversational capability of ChatGPT allows users to adjust the output via natural-language feedback. Additionally, a quantitative evaluation using VirtualHome showed that our results are comparable to previous studies. Specifically, 36% of task planning met both executability and correctness, and the rate approached 100% after several rounds of feedback. Our experiments revealed that ChatGPT can reasonably plan tasks and estimate post-operation environments without actual experience in object manipulation. Despite the allure of ChatGPT-based task planning in robotics, a standardized methodology remains elusive, making our work a substantial contribution. These prompts can serve as customizable templates, offering practical resources for the robotics research community. 
Our prompts and source code are open source and publicly available at https://github.com/microsoft/ChatGPT-Robot-Manipulation-Prompts .</description><subject>Chatbots</subject><subject>ChatGPT</subject><subject>Feedback</subject><subject>large language models</subject><subject>Natural language processing</subject><subject>Oral communication</subject><subject>Planning</subject><subject>Robot control</subject><subject>robot manipulation</subject><subject>Robotics</subject><subject>Robots</subject><subject>Source code</subject><subject>Task analysis</subject><subject>Task planning</subject><subject>Task planning (robotics)</subject><subject>Translating</subject><subject>Visualization</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUU1LxDAQLaKgqL9ADwHPXZNMm6belrKuCwuKq15DvqpZdpuaZBX_vV0r4lxmeLz3ZpiXZRcETwjB9fW0aWar1YRiChOAAYHyIDuhhNU5lMAO_83H2XmMazwUH6CyOsmemzeZ5g9PaLbt_acN1qCl717zVbI9evTKJ9T4LgW_Qa5DLzI4v4to1n244Lut7VK8QVPUyGjRtO83TsvkfHeWHbVyE-35bz_Nnm9nT81dvryfL5rpMtcFrlPOlFVgKq2IKoARzSRlujIWNC8pIxVjQIBwzFtDFAemtWYlNRXjVimMLZxmi9HXeLkWfXBbGb6El078AD68ChmS0xsrzPAZDsSUtGZFVUrJi5YYqUytW80MHbyuRq8--PedjUms_S50w_mCckYZpUD2LBhZOvgYg23_thIs9nGIMQ6xj0P8xjGoLkeVs9b-U1Ao66KGb6SuhQg</recordid><startdate>2023</startdate><enddate>2023</enddate><creator>Wake, Naoki</creator><creator>Kanehira, Atsushi</creator><creator>Sasabuchi, Kazuhiro</creator><creator>Takamatsu, Jun</creator><creator>Ikeuchi, Katsushi</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-9758-9357</orcidid><orcidid>https://orcid.org/0000-0001-7457-2878</orcidid><orcidid>https://orcid.org/0000-0001-8278-2373</orcidid><orcidid>https://orcid.org/0000-0002-5408-3089</orcidid></search><sort><creationdate>2023</creationdate><title>ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application</title><author>Wake, Naoki ; Kanehira, Atsushi ; Sasabuchi, Kazuhiro ; Takamatsu, Jun ; Ikeuchi, Katsushi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c409t-6beb3d7cb1b4361c6a26c7de3c852617663131808fd1b836ccc652d768ebb00e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Chatbots</topic><topic>ChatGPT</topic><topic>Feedback</topic><topic>large language models</topic><topic>Natural language processing</topic><topic>Oral communication</topic><topic>Planning</topic><topic>Robot control</topic><topic>robot manipulation</topic><topic>Robotics</topic><topic>Robots</topic><topic>Source code</topic><topic>Task analysis</topic><topic>Task planning</topic><topic>Task planning (robotics)</topic><topic>Translating</topic><topic>Visualization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wake, Naoki</creatorcontrib><creatorcontrib>Kanehira, Atsushi</creatorcontrib><creatorcontrib>Sasabuchi, Kazuhiro</creatorcontrib><creatorcontrib>Takamatsu, Jun</creatorcontrib><creatorcontrib>Ikeuchi, Katsushi</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access 
Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wake, Naoki</au><au>Kanehira, Atsushi</au><au>Sasabuchi, Kazuhiro</au><au>Takamatsu, Jun</au><au>Ikeuchi, Katsushi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2023</date><risdate>2023</risdate><volume>11</volume><spage>95060</spage><epage>95078</epage><pages>95060-95078</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>This paper introduces a novel method for translating natural-language instructions into executable robot actions using OpenAI's ChatGPT in a few-shot setting. We propose customizable input prompts for ChatGPT that can easily integrate with robot execution systems or visual recognition programs, adapt to various environments, and create multi-step task plans while mitigating the impact of token limit imposed on ChatGPT. 
In our approach, ChatGPT receives both instructions and textual environmental data, and outputs a task plan and an updated environment. These environmental data are reused in subsequent task planning, thus eliminating the extensive record-keeping of prior task plans within the prompts of ChatGPT. Experimental results demonstrated the effectiveness of these prompts across various domestic environments, such as manipulations in front of a shelf, a fridge, and a drawer. The conversational capability of ChatGPT allows users to adjust the output via natural-language feedback. Additionally, a quantitative evaluation using VirtualHome showed that our results are comparable to previous studies. Specifically, 36% of task planning met both executability and correctness, and the rate approached 100% after several rounds of feedback. Our experiments revealed that ChatGPT can reasonably plan tasks and estimate post-operation environments without actual experience in object manipulation. Despite the allure of ChatGPT-based task planning in robotics, a standardized methodology remains elusive, making our work a substantial contribution. These prompts can serve as customizable templates, offering practical resources for the robotics research community. Our prompts and source code are open source and publicly available at https://github.com/microsoft/ChatGPT-Robot-Manipulation-Prompts .</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2023.3310935</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0001-9758-9357</orcidid><orcidid>https://orcid.org/0000-0001-7457-2878</orcidid><orcidid>https://orcid.org/0000-0001-8278-2373</orcidid><orcidid>https://orcid.org/0000-0002-5408-3089</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2023, Vol.11, p.95060-95078 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_crossref_primary_10_1109_ACCESS_2023_3310935 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals |
subjects | Chatbots; ChatGPT; Feedback; large language models; Natural language processing; Oral communication; Planning; Robot control; robot manipulation; Robotics; Robots; Source code; Task analysis; Task planning; Task planning (robotics); Translating; Visualization |
title | ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T16%3A18%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ChatGPT%20Empowered%20Long-Step%20Robot%20Control%20in%20Various%20Environments:%20A%20Case%20Application&rft.jtitle=IEEE%20access&rft.au=Wake,%20Naoki&rft.date=2023&rft.volume=11&rft.spage=95060&rft.epage=95078&rft.pages=95060-95078&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2023.3310935&rft_dat=%3Cproquest_cross%3E2862622312%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2862622312&rft_id=info:pmid/&rft_ieee_id=10235949&rft_doaj_id=oai_doaj_org_article_d310831d5296475aa84f1dabd9cfc6d2&rfr_iscdi=true |