DATA-DRIVEN ROBOT CONTROL

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific traini...
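The steps named in the abstract (maintain experience data, obtain annotations, train a reward model, label a second subset of experiences with predicted rewards, train a policy on the labeled data) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Experience` class, the helper `fit_linear`, the two-element observations, the synthetic annotations, and the use of linear models in place of neural networks. The abstract does not name a policy training algorithm; this sketch swaps in simple reward-weighted regression onto logged actions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Experience:
    """Hypothetical experience record: an observation and the action taken."""
    observation: List[float]
    action: float
    reward: Optional[float] = None  # filled in later by the reward model


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_linear(inputs, targets, weights=None, lr=0.1, epochs=1000):
    """Weighted least squares by gradient descent (stand-in for a neural network)."""
    n = len(inputs[0])
    w = [0.0] * n
    if weights is None:
        weights = [1.0] * len(inputs)
    total = sum(weights)
    for _ in range(epochs):
        grad = [0.0] * n
        for x, y, s in zip(inputs, targets, weights):
            err = dot(w, x) - y
            for i in range(n):
                grad[i] += s * err * x[i]
        w = [wi - lr * g / total for wi, g in zip(w, grad)]
    return w


random.seed(0)

# 1. Robot experience data (toy: observation [x, 1.0], scalar action).
experiences = [
    Experience([random.uniform(-1, 1), 1.0], random.uniform(-1, 1))
    for _ in range(200)
]

# 2. Annotation data for a first subset. A fixed rule stands in for the
#    annotator here; this rule is an assumption for the sketch.
annotated = experiences[:40]
annotations = [0.5 * e.observation[0] + 0.5 for e in annotated]

# 3. Train the reward model on the annotation data.
reward_w = fit_linear([e.observation for e in annotated], annotations)

# 4. Task-specific training data: process each observation in a second
#    subset with the trained reward model and attach the prediction.
task_data = experiences[40:]
for e in task_data:
    e.reward = dot(reward_w, e.observation)

# 5. Train the policy on the task-specific data. Reward-weighted regression
#    is an assumed choice; negative predictions are clipped to zero weight.
policy_w = fit_linear(
    [e.observation for e in task_data],
    [e.action for e in task_data],
    weights=[max(e.reward, 0.0) + 1e-6 for e in task_data],
)


def policy(observation: Sequence[float]) -> float:
    """Maps an observation to an action (the policy output in this toy setup)."""
    return dot(policy_w, observation)
```

The point of the structure is that only the first subset needs annotations; the reward model then labels the remaining stored experiences, so the same experience data could be relabeled for a different task by training a different reward model.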

Detailed description

Bibliographic details
Main authors: Cabi, Serkan; Scholz, Jonathan Karl; Jeong, Rae Chan; Novikov, Alexander; Reed, Scott Ellison; Sushkov, Oleg O; Konyushkova, Ksenia; Budden, David; Denil, Misha Man Ray; Gomes de Freitas, Joao Ferdinando; Vecerik, Mel; Aytar, Yusuf; Barker, David; Gomez Colmenarejo, Sergio; Wang, Ziyu
Format: Patent
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Cabi, Serkan
Scholz, Jonathan Karl
Jeong, Rae Chan
Novikov, Alexander
Reed, Scott Ellison
Sushkov, Oleg O
Konyushkova, Ksenia
Budden, David
Denil, Misha Man Ray
Gomes de Freitas, Joao Ferdinando
Vecerik, Mel
Aytar, Yusuf
Barker, David
Gomez Colmenarejo, Sergio
Wang, Ziyu
description Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US2021078169A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US2021078169A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US2021078169A13</originalsourceid><addsrcrecordid>eNrjZJB0cQxx1HUJ8gxz9VMI8nfyD1Fw9vcLCfL34WFgTUvMKU7lhdLcDMpuriHOHrqpBfnxqcUFicmpeakl8aHBRgZGhgbmFoZmlo6GxsSpAgCR5SDy</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>DATA-DRIVEN ROBOT CONTROL</title><source>esp@cenet</source><creator>Cabi, Serkan ; Scholz, Jonathan Karl ; Jeong, Rae Chan ; Novikov, Alexander ; Reed, Scott Ellison ; Sushkov, Oleg O ; Konyushkova, Ksenia ; Budden, David ; Denil, Misha Man Ray ; Gomes de Freitas, Joao Ferdinando ; Vecerik, Mel ; Aytar, Yusuf ; Barker, David ; Gomez Colmenarejo, Sergio ; Wang, Ziyu</creator><creatorcontrib>Cabi, Serkan ; Scholz, Jonathan Karl ; Jeong, Rae Chan ; Novikov, Alexander ; Reed, Scott Ellison ; Sushkov, Oleg O ; Konyushkova, Ksenia ; Budden, David ; Denil, Misha Man Ray ; Gomes de Freitas, Joao Ferdinando ; Vecerik, Mel ; Aytar, Yusuf ; Barker, David ; Gomez Colmenarejo, Sergio ; Wang, Ziyu</creatorcontrib><description>Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. 
One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.</description><language>eng</language><subject>CHAMBERS PROVIDED WITH MANIPULATION DEVICES ; HAND TOOLS ; MANIPULATORS ; PERFORMING OPERATIONS ; PORTABLE POWER-DRIVEN TOOLS ; TRANSPORTING</subject><creationdate>2021</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20210318&amp;DB=EPODOC&amp;CC=US&amp;NR=2021078169A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76547</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20210318&amp;DB=EPODOC&amp;CC=US&amp;NR=2021078169A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Cabi, Serkan</creatorcontrib><creatorcontrib>Scholz, Jonathan Karl</creatorcontrib><creatorcontrib>Jeong, Rae Chan</creatorcontrib><creatorcontrib>Novikov, Alexander</creatorcontrib><creatorcontrib>Reed, Scott 
Ellison</creatorcontrib><creatorcontrib>Sushkov, Oleg O</creatorcontrib><creatorcontrib>Konyushkova, Ksenia</creatorcontrib><creatorcontrib>Budden, David</creatorcontrib><creatorcontrib>Denil, Misha Man Ray</creatorcontrib><creatorcontrib>Gomes de Freitas, Joao Ferdinando</creatorcontrib><creatorcontrib>Vecerik, Mel</creatorcontrib><creatorcontrib>Aytar, Yusuf</creatorcontrib><creatorcontrib>Barker, David</creatorcontrib><creatorcontrib>Gomez Colmenarejo, Sergio</creatorcontrib><creatorcontrib>Wang, Ziyu</creatorcontrib><title>DATA-DRIVEN ROBOT CONTROL</title><description>Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.</description><subject>CHAMBERS PROVIDED WITH MANIPULATION DEVICES</subject><subject>HAND TOOLS</subject><subject>MANIPULATORS</subject><subject>PERFORMING OPERATIONS</subject><subject>PORTABLE POWER-DRIVEN 
TOOLS</subject><subject>TRANSPORTING</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2021</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZJB0cQxx1HUJ8gxz9VMI8nfyD1Fw9vcLCfL34WFgTUvMKU7lhdLcDMpuriHOHrqpBfnxqcUFicmpeakl8aHBRgZGhgbmFoZmlo6GxsSpAgCR5SDy</recordid><startdate>20210318</startdate><enddate>20210318</enddate><creator>Cabi, Serkan</creator><creator>Scholz, Jonathan Karl</creator><creator>Jeong, Rae Chan</creator><creator>Novikov, Alexander</creator><creator>Reed, Scott Ellison</creator><creator>Sushkov, Oleg O</creator><creator>Konyushkova, Ksenia</creator><creator>Budden, David</creator><creator>Denil, Misha Man Ray</creator><creator>Gomes de Freitas, Joao Ferdinando</creator><creator>Vecerik, Mel</creator><creator>Aytar, Yusuf</creator><creator>Barker, David</creator><creator>Gomez Colmenarejo, Sergio</creator><creator>Wang, Ziyu</creator><scope>EVB</scope></search><sort><creationdate>20210318</creationdate><title>DATA-DRIVEN ROBOT CONTROL</title><author>Cabi, Serkan ; Scholz, Jonathan Karl ; Jeong, Rae Chan ; Novikov, Alexander ; Reed, Scott Ellison ; Sushkov, Oleg O ; Konyushkova, Ksenia ; Budden, David ; Denil, Misha Man Ray ; Gomes de Freitas, Joao Ferdinando ; Vecerik, Mel ; Aytar, Yusuf ; Barker, David ; Gomez Colmenarejo, Sergio ; Wang, Ziyu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US2021078169A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2021</creationdate><topic>CHAMBERS PROVIDED WITH MANIPULATION DEVICES</topic><topic>HAND TOOLS</topic><topic>MANIPULATORS</topic><topic>PERFORMING OPERATIONS</topic><topic>PORTABLE POWER-DRIVEN TOOLS</topic><topic>TRANSPORTING</topic><toplevel>online_resources</toplevel><creatorcontrib>Cabi, Serkan</creatorcontrib><creatorcontrib>Scholz, Jonathan Karl</creatorcontrib><creatorcontrib>Jeong, Rae Chan</creatorcontrib><creatorcontrib>Novikov, 
Alexander</creatorcontrib><creatorcontrib>Reed, Scott Ellison</creatorcontrib><creatorcontrib>Sushkov, Oleg O</creatorcontrib><creatorcontrib>Konyushkova, Ksenia</creatorcontrib><creatorcontrib>Budden, David</creatorcontrib><creatorcontrib>Denil, Misha Man Ray</creatorcontrib><creatorcontrib>Gomes de Freitas, Joao Ferdinando</creatorcontrib><creatorcontrib>Vecerik, Mel</creatorcontrib><creatorcontrib>Aytar, Yusuf</creatorcontrib><creatorcontrib>Barker, David</creatorcontrib><creatorcontrib>Gomez Colmenarejo, Sergio</creatorcontrib><creatorcontrib>Wang, Ziyu</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Cabi, Serkan</au><au>Scholz, Jonathan Karl</au><au>Jeong, Rae Chan</au><au>Novikov, Alexander</au><au>Reed, Scott Ellison</au><au>Sushkov, Oleg O</au><au>Konyushkova, Ksenia</au><au>Budden, David</au><au>Denil, Misha Man Ray</au><au>Gomes de Freitas, Joao Ferdinando</au><au>Vecerik, Mel</au><au>Aytar, Yusuf</au><au>Barker, David</au><au>Gomez Colmenarejo, Sergio</au><au>Wang, Ziyu</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>DATA-DRIVEN ROBOT CONTROL</title><date>2021-03-18</date><risdate>2021</risdate><abstract>Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. 
One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language eng
recordid cdi_epo_espacenet_US2021078169A1
source esp@cenet
subjects CHAMBERS PROVIDED WITH MANIPULATION DEVICES
HAND TOOLS
MANIPULATORS
PERFORMING OPERATIONS
PORTABLE POWER-DRIVEN TOOLS
TRANSPORTING
title DATA-DRIVEN ROBOT CONTROL
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T14%3A08%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Cabi,%20Serkan&rft.date=2021-03-18&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2021078169A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true