Solving Robotic Manipulation With Sparse Reward Reinforcement Learning Via Graph-Based Diversity and Proximity
In multigoal reinforcement learning (RL), algorithms usually suffer from inefficiency in the collection of successful experiences in tasks with sparse rewards. By utilizing the ideas of relabeling hindsight experience and curriculum learning, some prior works have greatly improved the sample efficiency in robotic manipulation tasks, such as hindsight experience replay (HER), hindsight goal generation (HGG), graph-based HGG (G-HGG), and curriculum-guided HER (CHER). However, none of these can learn efficiently to solve challenging manipulation tasks with distant goals and obstacles, since they rely either on heuristic or simple distance-guided exploration. In this article, we introduce graph-curriculum-guided HGG (GC-HGG), an extension of CHER and G-HGG, which works by selecting hindsight goals on the basis of graph-based proximity and diversity. We evaluated GC-HGG in four challenging manipulation tasks involving obstacles in both simulations and real-world experiments, in which significant enhancements in both sample efficiency and overall success rates over prior works were demonstrated. Videos and codes can be viewed at this link: https://videoviewsite.wixsite.com/gc-hgg .
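The abstract describes selecting hindsight goals by trading off graph-based proximity (to the final target) against diversity (among the goals already chosen). A minimal greedy sketch of that selection idea follows; the function and parameter names are illustrative, and plain Euclidean distance stands in for the obstacle-aware graph distances the paper itself uses.

```python
# Hedged sketch of diversity/proximity-based hindsight-goal selection.
# Assumptions: `dist` is a stand-in for GC-HGG's graph-based (obstacle-aware)
# shortest-path distance; `select_hindsight_goals` and `w_div` are
# hypothetical names, not the paper's API.
import math

def dist(a, b):
    # Euclidean stand-in; the paper would use graph shortest-path distances.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_hindsight_goals(candidates, target, k, w_div=1.0):
    """Greedily pick k goals that are close to `target` (proximity)
    yet far from the goals already selected (diversity)."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(g):
            proximity = -dist(g, target)  # closer to the target scores higher
            # distance to the nearest already-selected goal rewards diversity
            diversity = min((dist(g, s) for s in selected), default=0.0)
            return proximity + w_div * diversity
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```

Raising `w_div` favors spreading the curriculum of goals over the workspace; lowering it concentrates them near the target.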
Saved in:
Published in: | IEEE transactions on industrial electronics (1982) 2023-03, Vol.70 (3), p.2759-2769 |
---|---|
Main authors: | Bing, Zhenshan; Zhou, Hongkuan; Li, Rui; Su, Xiaojie; Morin, Fabrice O.; Huang, Kai; Knoll, Alois |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 2769 |
---|---|
container_issue | 3 |
container_start_page | 2759 |
container_title | IEEE transactions on industrial electronics (1982) |
container_volume | 70 |
creator | Bing, Zhenshan; Zhou, Hongkuan; Li, Rui; Su, Xiaojie; Morin, Fabrice O.; Huang, Kai; Knoll, Alois |
description | In multigoal reinforcement learning (RL), algorithms usually suffer from inefficiency in the collection of successful experiences in tasks with sparse rewards. By utilizing the ideas of relabeling hindsight experience and curriculum learning, some prior works have greatly improved the sample efficiency in robotic manipulation tasks, such as hindsight experience replay (HER), hindsight goal generation (HGG), graph-based HGG (G-HGG), and curriculum-guided HER (CHER). However, none of these can learn efficiently to solve challenging manipulation tasks with distant goals and obstacles, since they rely either on heuristic or simple distance-guided exploration. In this article, we introduce graph-curriculum-guided HGG (GC-HGG), an extension of CHER and G-HGG, which works by selecting hindsight goals on the basis of graph-based proximity and diversity. We evaluated GC-HGG in four challenging manipulation tasks involving obstacles in both simulations and real-world experiments, in which significant enhancements in both sample efficiency and overall success rates over prior works were demonstrated. Videos and codes can be viewed at this link: https://videoviewsite.wixsite.com/gc-hgg . |
doi_str_mv | 10.1109/TIE.2022.3172754 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0278-0046 |
ispartof | IEEE transactions on industrial electronics (1982), 2023-03, Vol.70 (3), p.2759-2769 |
issn | 0278-0046 1557-9948 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TIE_2022_3172754 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms; Barriers; Curricula; Euclidean distance; Hindsight experience replay (HER); Machine learning; Optimization; path planning; Reinforcement learning; robotic arm manipulation; Robots; Task analysis; Training; Trajectory |
title | Solving Robotic Manipulation With Sparse Reward Reinforcement Learning Via Graph-Based Diversity and Proximity |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T06%3A50%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Solving%20Robotic%20Manipulation%20With%20Sparse%20Reward%20Reinforcement%20Learning%20Via%20Graph-Based%20Diversity%20and%20Proximity&rft.jtitle=IEEE%20transactions%20on%20industrial%20electronics%20(1982)&rft.au=Bing,%20Zhenshan&rft.date=2023-03-01&rft.volume=70&rft.issue=3&rft.spage=2759&rft.epage=2769&rft.pages=2759-2769&rft.issn=0278-0046&rft.eissn=1557-9948&rft.coden=ITIED6&rft_id=info:doi/10.1109/TIE.2022.3172754&rft_dat=%3Cproquest_RIE%3E2737567159%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2737567159&rft_id=info:pmid/&rft_ieee_id=9772990&rfr_iscdi=true |