ROBOT LEARNING FROM DEMONSTRATION VIA META-IMITATION LEARNING

A method for robot learning from demonstration based on meta-imitation learning. The method includes: obtaining a set of robot demonstration and teaching tasks; constructing a network model to adapt the objective loss function; in a meta-training stage, using Algorithm 1 to learn and optimize the ad...
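
The meta-training and meta-testing stages named in the abstract can be illustrated with a toy example. The sketch below is not the patent's Algorithm 1 or Algorithm 2, which this record does not reproduce. It assumes a scalar linear policy `action = theta * state`, a learned loss of the form `w * MSE` (a stand-in for the adaptive objective loss function), and a MAML-style single inner gradient step. All names (`meta_train`, `adapt`, `make_demo`, the state grids and gains) are hypothetical.

```python
# Toy meta-imitation sketch (assumed setup, not the patented algorithms).
# Policy: action = theta * state. Each task is an expert with its own gain.

SUPPORT_STATES = [-1.0, -0.5, 0.5, 1.0]   # states in the demo used for adaptation
QUERY_STATES = [-0.8, -0.2, 0.3, 0.9]     # held-out states used for the meta-objective


def make_demo(gain, states):
    """Noise-free expert demonstration: list of (state, action) pairs."""
    return [(s, gain * s) for s in states]


def mse_grad(theta, demo):
    """Gradient of the mean squared imitation error with respect to theta."""
    return sum(2.0 * s * (theta * s - a) for s, a in demo) / len(demo)


def mean_sq_state(demo):
    """Mean of state**2 over the demo (half the second derivative of the MSE)."""
    return sum(s * s for s, _ in demo) / len(demo)


def adapt(theta, w, demo, inner_lr=0.5):
    """Meta-testing step: one gradient step on the learned loss w * MSE."""
    return theta - inner_lr * w * mse_grad(theta, demo)


def meta_train(tasks, steps=300, meta_lr=0.1, inner_lr=0.5):
    """Meta-training: learn the initial policy parameter theta and loss weight w."""
    theta, w = 0.0, 1.0
    for _ in range(steps):
        g_theta = g_w = 0.0
        for support, query in tasks:
            adapted = adapt(theta, w, support, inner_lr)
            g_query = mse_grad(adapted, query)
            # Chain rule through the inner step:
            # d(adapted)/d(theta) = 1 - 2 * inner_lr * w * mean(state**2)
            g_theta += g_query * (1.0 - 2.0 * inner_lr * w * mean_sq_state(support))
            # d(adapted)/d(w) = -inner_lr * gradient of the support MSE
            g_w += g_query * (-inner_lr * mse_grad(theta, support))
        theta -= meta_lr * g_theta / len(tasks)
        w -= meta_lr * g_w / len(tasks)
    return theta, w


if __name__ == "__main__":
    train_gains = [-2.0, -1.0, 0.5, 1.5, 3.0]
    tasks = [(make_demo(k, SUPPORT_STATES), make_demo(k, QUERY_STATES))
             for k in train_gains]
    theta, w = meta_train(tasks)
    # New task, unseen during meta-training, with a single demonstration.
    new_demo = make_demo(2.5, SUPPORT_STATES)
    print("adapted policy gain:", adapt(theta, w, new_demo))
```

In this toy setting the outer loop learns a loss weight that makes a single inner step on one demonstration recover the new expert's gain, which mirrors the abstract's claim of generalizing from a small number of demonstrations. A real system would replace the scalar policy and the scalar loss weight with neural networks.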

Bibliographic details
Main authors:	LI, Xiuhao; XU, Jie; HAN, Zhangxiu; GUI, Guangchao; PAN, Yipeng; LIU, Ji; WANG, Weijun; WANG, Yuhe; LIANG, Bo; LEI, Qujiang
Format:	Patent
Language:	eng ; fre
Subjects:	CONTROL OR REGULATING SYSTEMS IN GENERAL ; CONTROLLING ; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS ; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS ; PHYSICS ; REGULATING

creator	LI, Xiuhao ; XU, Jie ; HAN, Zhangxiu ; GUI, Guangchao ; PAN, Yipeng ; LIU, Ji ; WANG, Weijun ; WANG, Yuhe ; LIANG, Bo ; LEI, Qujiang
description	A method for robot learning from demonstration based on meta-imitation learning. The method includes: obtaining a set of robot demonstration and teaching tasks; constructing a network model to adapt the objective loss function; in a meta-training stage, using Algorithm 1 to learn and optimize the adaptive objective loss function and then obtain the strategy parameters; in a meta-testing stage, using Algorithm 2 to learn from the demonstrated trajectory and obtain the learned strategy; and taking the expert's demonstration trajectory as input, combining it with the learned strategy to generate the robot simulation trajectory, and mapping that trajectory to the robot's actions. The method can quickly generalize to new scenes from a small number of expert demonstrations, without task-specific engineering. In addition, the robot can learn task-independent strategies by itself from the expert's demonstration, thereby generating a trajectory and achieving demonstration-based rapid teaching.
format	Patent
fulltext	fulltext_linktorsrc
language	eng ; fre
recordid	cdi_epo_espacenet_WO2022012265A1
source	esp@cenet
subjects	CONTROL OR REGULATING SYSTEMS IN GENERAL ; CONTROLLING ; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS ; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS ; PHYSICS ; REGULATING
title	ROBOT LEARNING FROM DEMONSTRATION VIA META-IMITATION LEARNING
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T07%3A02%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=LI,%20Xiuhao&rft.date=2022-01-20&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EWO2022012265A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true