Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning

This paper presents a learning-based method that uses simulation data to learn an object manipulation task using two model-free reinforcement learning (RL) algorithms. Learning performance is compared between an on-policy and an off-policy algorithm: Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). To accelerate the learning process, a fine-tuning procedure is proposed that demonstrates the continuous adaptation of on-policy RL to new environments, allowing the learned policy to adapt to and execute the (partially) modified task. A dense reward function is designed for the task to enable efficient learning by the agent. A grasping task involving a Franka Emika Panda manipulator is considered as the reference task to be learned. The learned control policy is demonstrated to be generalizable across multiple object geometries and initial robot/parts configurations. The approach is finally tested on a real Franka Emika Panda robot, showing that the learned behavior can be transferred from simulation. Experimental results show a 100% grasping success rate, making the proposed approach applicable to real applications.
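
The abstract credits a dense reward function for efficient learning. As a minimal sketch of what such a reward can look like (the function name, inputs, terms, and weights below are illustrative assumptions, not the authors' published formulation):

```python
import numpy as np

def dense_grasp_reward(ee_pos, obj_pos, target_pos, gripper_closed, obj_grasped):
    """Hypothetical dense reward for a reach-grasp-lift task."""
    # Reaching term: a shaped signal at every step, penalizing the
    # distance between the end-effector and the object.
    reach = -np.linalg.norm(np.asarray(ee_pos) - np.asarray(obj_pos))

    # Grasp bonus: a discrete reward for closing the gripper on the object.
    grasp = 1.0 if (gripper_closed and obj_grasped) else 0.0

    # Transport term: once the object is grasped, reward moving it
    # toward the target pose.
    transport = (-np.linalg.norm(np.asarray(obj_pos) - np.asarray(target_pos))
                 if obj_grasped else 0.0)

    return reach + 2.0 * grasp + transport
```

Unlike a sparse success-only reward, terms like these give the agent informative feedback at every timestep, which is what makes learning "efficient" in the abstract's sense.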

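The abstract also describes a fine-tuning procedure that lets an on-policy (PPO) policy adapt to a (partially) modified task instead of retraining from scratch. A minimal sketch of that warm-start workflow, assuming Stable-Baselines3 and using a generic Gymnasium environment as a placeholder for the paper's simulated grasping task:

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Placeholders for the simulated grasping environments; the paper's
# reference task uses a Franka Emika Panda manipulator in simulation.
base_env = gym.make("Pendulum-v1")      # reference task (stand-in)
modified_env = gym.make("Pendulum-v1")  # (partially) modified task (stand-in)

# 1) Learn the reference task from scratch with on-policy PPO.
model = PPO("MlpPolicy", base_env, verbose=0)
model.learn(total_timesteps=100_000)
model.save("ppo_reference")

# 2) Fine-tune: reload the learned weights in the modified environment and
#    continue training, so the policy adapts rather than restarting.
model = PPO.load("ppo_reference", env=modified_env)
model.learn(total_timesteps=20_000, reset_num_timesteps=False)
```

The timestep budgets here are arbitrary; the point is only the structure: train, save, reload against the new environment, continue learning.
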
Bibliographic Details
Published in: Autonomous Robots, 2022-03, Vol. 46 (3), p. 483-498
Main Authors: Shahid, Asad Ali; Piga, Dario; Braghin, Francesco; Roveda, Loris
Format: Article
Language: English
Subjects: Adaptation; Algorithms; Artificial Intelligence; Computer Imaging; Control; Engineering; Machine learning; Mechatronics; Optimization; Pattern Recognition and Graphics; Robotics; Robotics and Automation; Robots; Vision
Online Access: Full text
DOI: 10.1007/s10514-022-10034-z
ISSN: 0929-5593
EISSN: 1573-7527
Publisher: Springer US, New York