Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning

This paper presents a learning-based method that uses simulation data to learn an object manipulation task using two model-free reinforcement learning (RL) algorithms. Learning performance is compared between an on-policy and an off-policy algorithm: Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). To accelerate the learning process, a fine-tuning procedure is proposed that demonstrates the continuous adaptation of on-policy RL to new environments, allowing the learned policy to adapt to and execute the (partially) modified task. A dense reward function is designed for the task to enable efficient learning by the agent. A grasping task involving a Franka Emika Panda manipulator is considered as the reference task to be learned. The learned control policy is demonstrated to be generalizable across multiple object geometries and initial robot/parts configurations. The approach is finally tested on a real Franka Emika Panda robot, showing that the learned behavior can be transferred from simulation. Experimental results show a 100% grasping success rate, making the proposed approach applicable to real applications.
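
The abstract credits a dense reward function for efficient learning. As a minimal sketch of what such a reward can look like (the function name, inputs, terms, and weights below are illustrative assumptions, not the authors' published formulation):

```python
import numpy as np

def dense_grasp_reward(ee_pos, obj_pos, target_pos, gripper_closed, obj_grasped):
    """Hypothetical dense reward for a reach-grasp-lift task."""
    # Reaching term: a shaped signal at every step, penalizing the
    # distance between the end-effector and the object.
    reach = -np.linalg.norm(np.asarray(ee_pos) - np.asarray(obj_pos))

    # Grasp bonus: a discrete reward for closing the gripper on the object.
    grasp = 1.0 if (gripper_closed and obj_grasped) else 0.0

    # Transport term: once the object is grasped, reward moving it
    # toward the target pose.
    transport = (-np.linalg.norm(np.asarray(obj_pos) - np.asarray(target_pos))
                 if obj_grasped else 0.0)

    return reach + 2.0 * grasp + transport
```

Unlike a sparse success-only reward, terms like these give the agent informative feedback at every timestep, which is what makes learning "efficient" in the abstract's sense.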

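The abstract also describes a fine-tuning procedure that lets an on-policy (PPO) policy adapt to a (partially) modified task instead of retraining from scratch. A minimal sketch of that warm-start workflow, assuming Stable-Baselines3 and using a generic Gymnasium environment as a placeholder for the paper's simulated grasping task:

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Placeholders for the simulated grasping environments; the paper's
# reference task uses a Franka Emika Panda manipulator in simulation.
base_env = gym.make("Pendulum-v1")      # reference task (stand-in)
modified_env = gym.make("Pendulum-v1")  # (partially) modified task (stand-in)

# 1) Learn the reference task from scratch with on-policy PPO.
model = PPO("MlpPolicy", base_env, verbose=0)
model.learn(total_timesteps=100_000)
model.save("ppo_reference")

# 2) Fine-tune: reload the learned weights in the modified environment and
#    continue training, so the policy adapts rather than restarting.
model = PPO.load("ppo_reference", env=modified_env)
model.learn(total_timesteps=20_000, reset_num_timesteps=False)
```

The timestep budgets here are arbitrary; the point is only the structure: train, save, reload against the new environment, continue learning.
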
Bibliographic Details
Published in: Autonomous Robots, 2022-03, Vol. 46 (3), p. 483-498
Main Authors: Shahid, Asad Ali; Piga, Dario; Braghin, Francesco; Roveda, Loris
Format: Article
Language: English
Subjects: Adaptation; Algorithms; Artificial Intelligence; Computer Imaging; Control; Engineering; Machine learning; Mechatronics; Optimization; Pattern Recognition and Graphics; Robotics; Robotics and Automation; Robots; Vision
Online Access: Full text
DOI: 10.1007/s10514-022-10034-z
ISSN: 0929-5593
EISSN: 1573-7527
Publisher: Springer US, New York