Expert Intervention Learning

Scalable robot learning from human-robot interaction is critical if robots are to solve a multitude of tasks in the real world. Current approaches to imitation learning suffer from one of two drawbacks. On the one hand, they rely solely on off-policy human demonstration, which in some cases leads to...

Detailed description

Bibliographic details
Published in:	Autonomous robots 2022-01, Vol.46 (1), p.99-113
Main authors:	Spencer, Jonathan; Choudhury, Sanjiban; Barnes, Matthew; Schmittle, Matthew; Chiang, Mung; Ramadge, Peter; Srinivasa, Sidd
Format:	Article
Language:	eng
Subjects:	Collision avoidance; Distance learning; Human engineering; Intervention; Robots
Online access:	Full text

container_end_page	113
container_issue	1
container_start_page	99
container_title	Autonomous robots
container_volume	46
creator	Spencer, Jonathan; Choudhury, Sanjiban; Barnes, Matthew; Schmittle, Matthew; Chiang, Mung; Ramadge, Peter; Srinivasa, Sidd
description	Scalable robot learning from human-robot interaction is critical if robots are to solve a multitude of tasks in the real world. Current approaches to imitation learning suffer from one of two drawbacks. On the one hand, they rely solely on off-policy human demonstration, which in some cases leads to a mismatch in train-test distribution. On the other, they burden the human to label every state the learner visits, rendering it impractical in many applications. We argue that learning interactively from expert interventions enjoys the best of both worlds. Our key insight is that any amount of expert feedback, whether by intervention or non-intervention, provides information about the quality of the current state, the quality of the action, or both. We formalize this as a constraint on the learner’s value function, which we can efficiently learn using no-regret online learning techniques. We call our approach Expert Intervention Learning (EIL), and evaluate it on a real and simulated driving task with a human expert, where it learns collision avoidance from scratch with just a few hundred samples (about one minute) of expert control.
doi_str_mv	10.1007/s10514-021-10006-9
format	Article
publisher	Dordrecht: Springer Nature B.V
rights	The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021.
toplevel	peer_reviewed; online_resources
tpages	15
fulltext	fulltext
identifier	ISSN: 0929-5593
ispartof	Autonomous robots, 2022-01, Vol.46 (1), p.99-113
issn	0929-5593 1573-7527
language	eng
recordid	cdi_proquest_journals_2625646342
source	Springer Nature - Complete Springer Journals
subjects	Collision avoidance; Distance learning; Human engineering; Intervention; Robots
title	Expert Intervention Learning
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T14%3A16%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Expert%20Intervention%20Learning&rft.jtitle=Autonomous%20robots&rft.au=Spencer,%20Jonathan&rft.date=2022-01-01&rft.volume=46&rft.issue=1&rft.spage=99&rft.epage=113&rft.pages=99-113&rft.issn=0929-5593&rft.eissn=1573-7527&rft_id=info:doi/10.1007/s10514-021-10006-9&rft_dat=%3Cproquest%3E2625646342%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2625646342&rft_id=info:pmid/&rfr_iscdi=true