Solving Robotic Manipulation With Sparse Reward Reinforcement Learning Via Graph-Based Diversity and Proximity

In multigoal reinforcement learning (RL), algorithms usually suffer from inefficiency in collecting successful experiences in tasks with sparse rewards. By utilizing the ideas of hindsight experience relabeling and curriculum learning, prior works such as hindsight experience replay (HER), hindsight goal generation (HGG), graph-based HGG (G-HGG), and curriculum-guided HER (CHER) have greatly improved sample efficiency in robotic manipulation tasks. However, none of these can learn efficiently to solve challenging manipulation tasks with distant goals and obstacles, since they rely on either heuristic or simple distance-guided exploration. In this article, we introduce graph-curriculum-guided HGG (GC-HGG), an extension of CHER and G-HGG that selects hindsight goals on the basis of graph-based proximity and diversity. We evaluated GC-HGG in four challenging manipulation tasks involving obstacles, in both simulations and real-world experiments, and demonstrated significant improvements in both sample efficiency and overall success rates over prior works. Videos and code can be viewed at this link: https://videoviewsite.wixsite.com/gc-hgg.
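To make the core idea concrete, the following is a minimal sketch of hindsight-goal selection that trades off graph-based proximity against diversity, in the spirit the abstract describes. It is not the authors' implementation: the graph construction (`adj`), the Dijkstra routine `shortest_paths`, the helper `select_goals`, and the weighted score with `w_prox` and `w_div` are all assumptions of this sketch, and the paper's actual selection criterion differs in detail.

```python
import heapq


def shortest_paths(adj, source):
    """Dijkstra shortest-path distances over a discretized goal-space graph.

    `adj` maps a node to a list of (neighbor, edge_cost) pairs. How the
    graph is built (e.g., a grid over the robot's workspace with edges
    through obstacles removed) is an assumption of this sketch.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def select_goals(candidates, target, adj, k, w_prox=1.0, w_div=1.0):
    """Greedily pick k hindsight goals, balancing proximity to the target
    (small graph distance) against diversity (large minimum graph distance
    to goals already picked). The weighted score is an illustrative
    stand-in for the paper's selection criterion.
    """
    d_target = shortest_paths(adj, target)
    selected = []
    d_selected = []  # cached distance maps, one per selected goal

    def score(g):
        prox = -d_target.get(g, float("inf"))  # closer to the target is better
        # distance to the nearest already-selected goal; 0.0 (no bonus)
        # if g is unreachable from it or nothing is selected yet
        div = min((d.get(g, 0.0) for d in d_selected), default=0.0)
        return w_prox * prox + w_div * div

    for _ in range(min(k, len(candidates))):
        remaining = [g for g in candidates if g not in selected]
        best = max(remaining, key=score)
        selected.append(best)
        d_selected.append(shortest_paths(adj, best))
    return selected
```

On a toy 4-connected grid with obstacle nodes deleted from `adj`, `select_goals(candidates, target, adj, k=3)` returns three goals spread out along collision-free routes toward the target rather than clustered behind an obstacle, which is the qualitative behavior that distinguishes graph-based proximity from plain Euclidean distance.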

Bibliographic Details

Published in: IEEE Transactions on Industrial Electronics (1982), 2023-03, Vol. 70 (3), pp. 2759-2769
Authors: Bing, Zhenshan; Zhou, Hongkuan; Li, Rui; Su, Xiaojie; Morin, Fabrice O.; Huang, Kai; Knoll, Alois
Format: Article
Language: English
Publisher: New York: IEEE
DOI: 10.1109/TIE.2022.3172754
ISSN: 0278-0046
EISSN: 1557-9948
CODEN: ITIED6
Subjects: Algorithms; Barriers; Curricula; Euclidean distance; Hindsight experience replay (HER); Machine learning; Optimization; path planning; Reinforcement learning; robotic arm manipulation; Robots; Task analysis; Training; Trajectory