Triangle Inequality for Inverse Optimal Control

Inverse optimal control (IOC) is a problem of estimating a cost function based on the behaviors of an expert that behaves optimally with respect to the cost function. Although the Hamilton-Jacobi-Bellman (HJB) equation for the value function that evaluates the temporal integral of the cost function...

Bibliographic details
Published in: IEEE access 2023, Vol. 11, p. 119187-119199
Main authors: Mitsuhashi, Sho; Ishii, Shin
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 119199
container_issue
container_start_page 119187
container_title IEEE access
container_volume 11
creator Mitsuhashi, Sho
Ishii, Shin
description Inverse optimal control (IOC) is a problem of estimating a cost function based on the behaviors of an expert that behaves optimally with respect to the cost function. Although the Hamilton-Jacobi-Bellman (HJB) equation for the value function that evaluates the temporal integral of the cost function provides a necessary condition for the optimality of expert behaviors, the use of the HJB equation alone is insufficient for solving the IOC problem. In this study, we propose a triangle inequality which is useful for estimating the better representation of the value function, along with a new IOC method incorporating the triangle inequality. Through several IOC problems and imitation learning problems of time-dependent control behaviors, we show that our IOC method performs substantially better than an existing IOC method. Showing our IOC method is also applicable to an imitation of expert control of a 2-link manipulator, we demonstrate applicability of our method to real-world problems.
doi_str_mv 10.1109/ACCESS.2023.3327426
format Article
fullrecord <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_journals_2885652512</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10295444</ieee_id><doaj_id>oai_doaj_org_article_55529466b9d24673a102c4a5736d3441</doaj_id><sourcerecordid>2885652512</sourcerecordid><originalsourceid>FETCH-LOGICAL-c322t-a2a3b6cb807a8963e2af252a0b0774a61e7dd98fc3928f2c4cf625d54121d6ab3</originalsourceid><addsrcrecordid>eNpNUE1rwkAQXUoLFesvaA-BnqO7s1_ZowTbCoIH7XmZJBuJpK5uYsF_37WR0rnMzGPem8cj5JnRKWPUzOZ5vthspkCBTzkHLUDdkREwZVIuubr_Nz-SSdftaawsQlKPyGwbGjzsWpcsD-50xrbpL0ntQ1y_Xehcsj72zRe2Se4PffDtE3mose3c5NbH5PNtsc0_0tX6fZnPV2nJAfoUAXmhyiKjGjOjuAOsQQLSgmotUDGnq8pkdckNZDWUoqwVyEoKBqxSWPAxWQ66lce9PYboIVysx8b-Aj7sLIa-KVtnpZRghFKFqUAozZHRKIhSc1VxIVjUeh20jsGfzq7r7d6fwyHat5BlUkmQDOIVH67K4LsuuPrvK6P2GrQdgrbXoO0t6Mh6GViNc-4fA4wUQvAfK6l21w</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2885652512</pqid></control><display><type>article</type><title>Triangle Inequality for Inverse Optimal Control</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Mitsuhashi, Sho ; Ishii, Shin</creator><creatorcontrib>Mitsuhashi, Sho ; Ishii, Shin</creatorcontrib><description>Inverse optimal control (IOC) is a problem of estimating a cost function based on the behaviors of an expert that behaves optimally with respect to the cost function. Although the Hamilton-Jacobi-Bellman (HJB) equation for the value function that evaluates the temporal integral of the cost function provides a necessary condition for the optimality of expert behaviors, the use of the HJB equation alone is insufficient for solving the IOC problem. 
In this study, we propose a triangle inequality which is useful for estimating the better representation of the value function, along with a new IOC method incorporating the triangle inequality. Through several IOC problems and imitation learning problems of time-dependent control behaviors, we show that our IOC method performs substantially better than an existing IOC method. Showing our IOC method is also applicable to an imitation of expert control of a 2-link manipulator, we demonstrate applicability of our method to real-world problems.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2023.3327426</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Aerospace electronics ; Behavioral sciences ; Cost estimation ; Cost function ; Costs ; Estimation ; imitation learning ; Inequality ; inverse optimal control ; inverse reinforcement learning ; Optimal control ; Optimization ; Task analysis ; Trajectory</subject><ispartof>IEEE access, 2023, Vol.11, p.119187-119199</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c322t-a2a3b6cb807a8963e2af252a0b0774a61e7dd98fc3928f2c4cf625d54121d6ab3</cites><orcidid>0000-0001-9385-8230 ; 0000-0003-4217-6883</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10295444$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,864,2102,4024,27633,27923,27924,27925,54933</link.rule.ids></links><search><creatorcontrib>Mitsuhashi, Sho</creatorcontrib><creatorcontrib>Ishii, Shin</creatorcontrib><title>Triangle Inequality for Inverse Optimal Control</title><title>IEEE access</title><addtitle>Access</addtitle><description>Inverse optimal control (IOC) is a problem of estimating a cost function based on the behaviors of an expert that behaves optimally with respect to the cost function. Although the Hamilton-Jacobi-Bellman (HJB) equation for the value function that evaluates the temporal integral of the cost function provides a necessary condition for the optimality of expert behaviors, the use of the HJB equation alone is insufficient for solving the IOC problem. In this study, we propose a triangle inequality which is useful for estimating the better representation of the value function, along with a new IOC method incorporating the triangle inequality. Through several IOC problems and imitation learning problems of time-dependent control behaviors, we show that our IOC method performs substantially better than an existing IOC method. 
Showing our IOC method is also applicable to an imitation of expert control of a 2-link manipulator, we demonstrate applicability of our method to real-world problems.</description><subject>Aerospace electronics</subject><subject>Behavioral sciences</subject><subject>Cost estimation</subject><subject>Cost function</subject><subject>Costs</subject><subject>Estimation</subject><subject>imitation learning</subject><subject>Inequality</subject><subject>inverse optimal control</subject><subject>inverse reinforcement learning</subject><subject>Optimal control</subject><subject>Optimization</subject><subject>Task analysis</subject><subject>Trajectory</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUE1rwkAQXUoLFesvaA-BnqO7s1_ZowTbCoIH7XmZJBuJpK5uYsF_37WR0rnMzGPem8cj5JnRKWPUzOZ5vthspkCBTzkHLUDdkREwZVIuubr_Nz-SSdftaawsQlKPyGwbGjzsWpcsD-50xrbpL0ntQ1y_Xehcsj72zRe2Se4PffDtE3mose3c5NbH5PNtsc0_0tX6fZnPV2nJAfoUAXmhyiKjGjOjuAOsQQLSgmotUDGnq8pkdckNZDWUoqwVyEoKBqxSWPAxWQ66lce9PYboIVysx8b-Aj7sLIa-KVtnpZRghFKFqUAozZHRKIhSc1VxIVjUeh20jsGfzq7r7d6fwyHat5BlUkmQDOIVH67K4LsuuPrvK6P2GrQdgrbXoO0t6Mh6GViNc-4fA4wUQvAfK6l21w</recordid><startdate>2023</startdate><enddate>2023</enddate><creator>Mitsuhashi, Sho</creator><creator>Ishii, Shin</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-9385-8230</orcidid><orcidid>https://orcid.org/0000-0003-4217-6883</orcidid></search><sort><creationdate>2023</creationdate><title>Triangle Inequality for Inverse Optimal Control</title><author>Mitsuhashi, Sho ; Ishii, Shin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c322t-a2a3b6cb807a8963e2af252a0b0774a61e7dd98fc3928f2c4cf625d54121d6ab3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Aerospace electronics</topic><topic>Behavioral sciences</topic><topic>Cost estimation</topic><topic>Cost function</topic><topic>Costs</topic><topic>Estimation</topic><topic>imitation learning</topic><topic>Inequality</topic><topic>inverse optimal control</topic><topic>inverse reinforcement learning</topic><topic>Optimal control</topic><topic>Optimization</topic><topic>Task analysis</topic><topic>Trajectory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mitsuhashi, Sho</creatorcontrib><creatorcontrib>Ishii, Shin</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005–Present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology 
Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mitsuhashi, Sho</au><au>Ishii, Shin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Triangle Inequality for Inverse Optimal Control</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2023</date><risdate>2023</risdate><volume>11</volume><spage>119187</spage><epage>119199</epage><pages>119187-119199</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>Inverse optimal control (IOC) is a problem of estimating a cost function based on the behaviors of an expert that behaves optimally with respect to the cost function. Although the Hamilton-Jacobi-Bellman (HJB) equation for the value function that evaluates the temporal integral of the cost function provides a necessary condition for the optimality of expert behaviors, the use of the HJB equation alone is insufficient for solving the IOC problem. In this study, we propose a triangle inequality which is useful for estimating the better representation of the value function, along with a new IOC method incorporating the triangle inequality. Through several IOC problems and imitation learning problems of time-dependent control behaviors, we show that our IOC method performs substantially better than an existing IOC method. 
Showing our IOC method is also applicable to an imitation of expert control of a 2-link manipulator, we demonstrate applicability of our method to real-world problems.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2023.3327426</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0001-9385-8230</orcidid><orcidid>https://orcid.org/0000-0003-4217-6883</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2023, Vol.11, p.119187-119199
issn 2169-3536
2169-3536
language eng
recordid cdi_proquest_journals_2885652512
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
subjects Aerospace electronics
Behavioral sciences
Cost estimation
Cost function
Costs
Estimation
imitation learning
Inequality
inverse optimal control
inverse reinforcement learning
Optimal control
Optimization
Task analysis
Trajectory
title Triangle Inequality for Inverse Optimal Control
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T21%3A08%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Triangle%20Inequality%20for%20Inverse%20Optimal%20Control&rft.jtitle=IEEE%20access&rft.au=Mitsuhashi,%20Sho&rft.date=2023&rft.volume=11&rft.spage=119187&rft.epage=119199&rft.pages=119187-119199&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2023.3327426&rft_dat=%3Cproquest_ieee_%3E2885652512%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2885652512&rft_id=info:pmid/&rft_ieee_id=10295444&rft_doaj_id=oai_doaj_org_article_55529466b9d24673a102c4a5736d3441&rfr_iscdi=true