Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices

Reinforcement Learning (RL) is now applied to various real-world tasks and attracts much attention in the fields of games, robotics, and autonomous driving. Applying RL directly to real-world environments is very challenging and can overwhelm the devices involved. Due to the reality gap, the simulated environ...

Bibliographic details
Published in: IEEE access 2021, Vol. 9, p. 76296-76306
Main authors: Lim, Hyun-Kyo, Kim, Ju-Bong, Ullah, Ihsan, Heo, Joo-Seong, Han, Youn-Hee
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 76306
container_issue
container_start_page 76296
container_title IEEE access
container_volume 9
creator Lim, Hyun-Kyo
Kim, Ju-Bong
Ullah, Ihsan
Heo, Joo-Seong
Han, Youn-Hee
description Reinforcement Learning (RL) is now applied to various real-world tasks and attracts much attention in the fields of games, robotics, and autonomous driving. Applying RL directly to real-world environments is very challenging and can overwhelm the devices involved. Because of the reality gap, a simulated environment does not perfectly match the real-world scenario, and additional learning cannot be performed. Therefore, an efficient approach is required for RL to find an optimal control policy and achieve better learning efficiency. In this paper, we propose federated reinforcement learning in a multi-agent environment that applies a new federation policy. The new federation policy allows multiple agents to learn and to share their learning experiences with each other, e.g., gradients and model parameters, to raise their learning level. The Actor-Critic PPO algorithm is used with four RL simulation environments: OpenAI Gym's CartPole, MountainCar, Acrobot, and Pendulum. In addition, we performed real experiments with multiple Rotary Inverted Pendulum (RIP) devices to evaluate and compare the learning efficiency of the proposed scheme in both environments.
doi_str_mv 10.1109/ACCESS.2021.3083087
format Article
fullrecord <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_ieee_primary_9439484</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9439484</ieee_id><doaj_id>oai_doaj_org_article_3e3e65d660e74c579a47b25a96f83dc6</doaj_id><sourcerecordid>2533491343</sourcerecordid><originalsourceid>FETCH-LOGICAL-c408t-1ad667d96034aa073ca56dac6d1d617fb7c45cffa1ce6d35defbd12474d7c4093</originalsourceid><addsrcrecordid>eNpNUV1rGzEQPEIDCUl-QV4EebYr3erj9GiudhtwSGmSt4CQpVUicz65unOg_75yL5guC7vszswuTFXdMjpnjOqvi7ZdPj3Na1qzOdCmpDqrLmsm9QwEyC__9RfVzTBsaYmmjIS6rF5X6DHbET35hbEPKTvcYT-SNdrcx_6NLJzD7giJqScPOL4nTwqM_Mzo4oCkTf2YU0dSIA-Hboz7Dsk3_IgOh-vqPNhuwJvPelW9rJbP7Y_Z-vH7fbtYzxynzThj1kupvJYUuLVUgbNCeuukZ14yFTbKceFCsMyh9CA8ho1nNVfclw3VcFXdT7o-2a3Z57iz-Y9JNpp_g5TfjM1jdB0aQEApyj2KijuhtOVqUwurZWjAO1m07iatfU6_DziMZpsOuS_vm1oAcM2AQ0HBhHI5DUPGcLrKqDm6YiZXzNEV8-lKYd1OrIiIJ4bmoHnD4S-EXokX</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2533491343</pqid></control><display><type>article</type><title>Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Lim, Hyun-Kyo ; Kim, Ju-Bong ; Ullah, Ihsan ; Heo, Joo-Seong ; Han, Youn-Hee</creator><creatorcontrib>Lim, Hyun-Kyo ; Kim, Ju-Bong ; Ullah, Ihsan ; Heo, Joo-Seong ; Han, Youn-Hee</creatorcontrib><description>Nowadays, Reinforcement Learning (RL) is applied to various real-world tasks and attracts much attention in the fields of games, robotics, and autonomous driving. It is very challenging and devices overwhelming to directly apply RL to real-world environments. Due to the reality gap simulated environment does not match perfectly to the real-world scenario and additional learning cannot be performed. 
Therefore, an efficient approach is required for RL to find an optimal control policy and get better learning efficacy. In this paper, we propose federated reinforcement learning based on multi agent environment which applying a new federation policy. The new federation policy allows multi agents to perform learning and share their learning experiences with each other e.g., gradient and model parameters to increase their learning level. The Actor-Critic PPO algorithm is used with four types of RL simulation environments, OpenAI Gym's CartPole, MoutainCar, Acrobot, and Pendulum. In addition, we did real experiments with multiple Rotary Inverted Pendulum (RIP) to evaluate and compare the learning efficiency of the proposed scheme with both environments.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2021.3083087</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; Cart-pole problem ; Federated reinforcement learning ; Games ; gradient sharing ; Machine learning ; multi-agent ; Multiagent systems ; Optimal control ; Pendulums ; Performance evaluation ; Reinforcement learning ; Robotics ; Servers ; Systems architecture ; Training ; Transfer learning</subject><ispartof>IEEE access, 2021, Vol.9, p.76296-76306</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c408t-1ad667d96034aa073ca56dac6d1d617fb7c45cffa1ce6d35defbd12474d7c4093</citedby><cites>FETCH-LOGICAL-c408t-1ad667d96034aa073ca56dac6d1d617fb7c45cffa1ce6d35defbd12474d7c4093</cites><orcidid>0000-0001-6406-3092 ; 0000-0002-8807-1158 ; 0000-0002-5204-2283 ; 0000-0002-5835-7972</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9439484$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,777,781,861,2096,4010,27614,27904,27905,27906,54914</link.rule.ids></links><search><creatorcontrib>Lim, Hyun-Kyo</creatorcontrib><creatorcontrib>Kim, Ju-Bong</creatorcontrib><creatorcontrib>Ullah, Ihsan</creatorcontrib><creatorcontrib>Heo, Joo-Seong</creatorcontrib><creatorcontrib>Han, Youn-Hee</creatorcontrib><title>Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices</title><title>IEEE access</title><addtitle>Access</addtitle><description>Nowadays, Reinforcement Learning (RL) is applied to various real-world tasks and attracts much attention in the fields of games, robotics, and autonomous driving. It is very challenging and devices overwhelming to directly apply RL to real-world environments. Due to the reality gap simulated environment does not match perfectly to the real-world scenario and additional learning cannot be performed. Therefore, an efficient approach is required for RL to find an optimal control policy and get better learning efficacy. In this paper, we propose federated reinforcement learning based on multi agent environment which applying a new federation policy. 
The new federation policy allows multi agents to perform learning and share their learning experiences with each other e.g., gradient and model parameters to increase their learning level. The Actor-Critic PPO algorithm is used with four types of RL simulation environments, OpenAI Gym's CartPole, MoutainCar, Acrobot, and Pendulum. In addition, we did real experiments with multiple Rotary Inverted Pendulum (RIP) to evaluate and compare the learning efficiency of the proposed scheme with both environments.</description><subject>Algorithms</subject><subject>Cart-pole problem</subject><subject>Federated reinforcement learning</subject><subject>Games</subject><subject>gradient sharing</subject><subject>Machine learning</subject><subject>multi-agent</subject><subject>Multiagent systems</subject><subject>Optimal control</subject><subject>Pendulums</subject><subject>Performance evaluation</subject><subject>Reinforcement learning</subject><subject>Robotics</subject><subject>Servers</subject><subject>Systems architecture</subject><subject>Training</subject><subject>Transfer learning</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUV1rGzEQPEIDCUl-QV4EebYr3erj9GiudhtwSGmSt4CQpVUicz65unOg_75yL5guC7vszswuTFXdMjpnjOqvi7ZdPj3Na1qzOdCmpDqrLmsm9QwEyC__9RfVzTBsaYmmjIS6rF5X6DHbET35hbEPKTvcYT-SNdrcx_6NLJzD7giJqScPOL4nTwqM_Mzo4oCkTf2YU0dSIA-Hboz7Dsk3_IgOh-vqPNhuwJvPelW9rJbP7Y_Z-vH7fbtYzxynzThj1kupvJYUuLVUgbNCeuukZ14yFTbKceFCsMyh9CA8ho1nNVfclw3VcFXdT7o-2a3Z57iz-Y9JNpp_g5TfjM1jdB0aQEApyj2KijuhtOVqUwurZWjAO1m07iatfU6_DziMZpsOuS_vm1oAcM2AQ0HBhHI5DUPGcLrKqDm6YiZXzNEV8-lKYd1OrIiIJ4bmoHnD4S-EXokX</recordid><startdate>2021</startdate><enddate>2021</enddate><creator>Lim, Hyun-Kyo</creator><creator>Kim, Ju-Bong</creator><creator>Ullah, Ihsan</creator><creator>Heo, 
Joo-Seong</creator><creator>Han, Youn-Hee</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-6406-3092</orcidid><orcidid>https://orcid.org/0000-0002-8807-1158</orcidid><orcidid>https://orcid.org/0000-0002-5204-2283</orcidid><orcidid>https://orcid.org/0000-0002-5835-7972</orcidid></search><sort><creationdate>2021</creationdate><title>Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices</title><author>Lim, Hyun-Kyo ; Kim, Ju-Bong ; Ullah, Ihsan ; Heo, Joo-Seong ; Han, Youn-Hee</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c408t-1ad667d96034aa073ca56dac6d1d617fb7c45cffa1ce6d35defbd12474d7c4093</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Cart-pole problem</topic><topic>Federated reinforcement learning</topic><topic>Games</topic><topic>gradient sharing</topic><topic>Machine learning</topic><topic>multi-agent</topic><topic>Multiagent systems</topic><topic>Optimal control</topic><topic>Pendulums</topic><topic>Performance evaluation</topic><topic>Reinforcement learning</topic><topic>Robotics</topic><topic>Servers</topic><topic>Systems architecture</topic><topic>Training</topic><topic>Transfer learning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lim, Hyun-Kyo</creatorcontrib><creatorcontrib>Kim, Ju-Bong</creatorcontrib><creatorcontrib>Ullah, Ihsan</creatorcontrib><creatorcontrib>Heo, Joo-Seong</creatorcontrib><creatorcontrib>Han, 
Youn-Hee</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lim, Hyun-Kyo</au><au>Kim, Ju-Bong</au><au>Ullah, Ihsan</au><au>Heo, Joo-Seong</au><au>Han, Youn-Hee</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2021</date><risdate>2021</risdate><volume>9</volume><spage>76296</spage><epage>76306</epage><pages>76296-76306</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>Nowadays, Reinforcement Learning (RL) is applied to various real-world tasks and attracts much attention in the fields of games, robotics, and autonomous driving. It is very challenging and devices overwhelming to directly apply RL to real-world environments. 
Due to the reality gap simulated environment does not match perfectly to the real-world scenario and additional learning cannot be performed. Therefore, an efficient approach is required for RL to find an optimal control policy and get better learning efficacy. In this paper, we propose federated reinforcement learning based on multi agent environment which applying a new federation policy. The new federation policy allows multi agents to perform learning and share their learning experiences with each other e.g., gradient and model parameters to increase their learning level. The Actor-Critic PPO algorithm is used with four types of RL simulation environments, OpenAI Gym's CartPole, MoutainCar, Acrobot, and Pendulum. In addition, we did real experiments with multiple Rotary Inverted Pendulum (RIP) to evaluate and compare the learning efficiency of the proposed scheme with both environments.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2021.3083087</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0001-6406-3092</orcidid><orcidid>https://orcid.org/0000-0002-8807-1158</orcidid><orcidid>https://orcid.org/0000-0002-5204-2283</orcidid><orcidid>https://orcid.org/0000-0002-5835-7972</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2021, Vol.9, p.76296-76306
issn 2169-3536
2169-3536
language eng
recordid cdi_ieee_primary_9439484
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Algorithms
Cart-pole problem
Federated reinforcement learning
Games
gradient sharing
Machine learning
multi-agent
Multiagent systems
Optimal control
Pendulums
Performance evaluation
Reinforcement learning
Robotics
Servers
Systems architecture
Training
Transfer learning
title Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T23%3A00%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Federated%20Reinforcement%20Learning%20Acceleration%20Method%20for%20Precise%20Control%20of%20Multiple%20Devices&rft.jtitle=IEEE%20access&rft.au=Lim,%20Hyun-Kyo&rft.date=2021&rft.volume=9&rft.spage=76296&rft.epage=76306&rft.pages=76296-76306&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2021.3083087&rft_dat=%3Cproquest_ieee_%3E2533491343%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2533491343&rft_id=info:pmid/&rft_ieee_id=9439484&rft_doaj_id=oai_doaj_org_article_3e3e65d660e74c579a47b25a96f83dc6&rfr_iscdi=true