Adaptive Leader-Follower Formation Control and Obstacle Avoidance via Deep Reinforcement Learning

We propose a deep reinforcement learning (DRL) methodology for the tracking, obstacle avoidance, and formation control of nonholonomic robots. By separating vision-based control into a perception module and a controller module, we can train a DRL agent without sophisticated physics or 3D modeling. In addition, the modular framework averts daunting retrains of an image-to-action end-to-end neural network, and provides flexibility in transferring the controller to different robots. First, we train a convolutional neural network (CNN) to accurately localize in an indoor setting with dynamic foreground/background. Then, we design a new DRL algorithm named Momentum Policy Gradient (MPG) for continuous control tasks and prove its convergence. We also show that MPG is robust at tracking varying leader movements and can naturally be extended to problems of formation control. Leveraging reward shaping, features such as collision and obstacle avoidance can be easily integrated into a DRL controller.
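
The abstract's central design choice is the split between a perception module and a controller module. The sketch below illustrates that decoupling under stated assumptions: the class names, the (x, y, heading) pose representation, the observation layout, and the (v, omega) wheel-velocity output are illustrative placeholders, not details taken from the paper.

```python
import numpy as np

# Hypothetical interfaces for the two modules; names and signatures are
# illustrative only and are not taken from the paper.

class PerceptionModule:
    """Stand-in for the CNN that localizes the robot from a camera image."""

    def estimate_pose(self, image: np.ndarray) -> np.ndarray:
        # A real implementation would run a trained CNN on the image;
        # here we just return a placeholder (x, y, heading) estimate.
        return np.zeros(3)

class DRLController:
    """Stand-in for the learned continuous-control policy."""

    def act(self, observation: np.ndarray) -> np.ndarray:
        # A real policy network would map the observation to wheel
        # velocities (v, omega) for a nonholonomic robot; placeholder only.
        return np.zeros(2)

def control_step(image, leader_pose, perception: PerceptionModule, controller: DRLController):
    """One decoupled perception -> control step.

    Because the controller never sees raw pixels, it can be retrained or
    transferred to a different robot without touching the vision model.
    """
    follower_pose = perception.estimate_pose(image)
    observation = np.concatenate([follower_pose, leader_pose])
    v, omega = controller.act(observation)
    return v, omega
```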

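The abstract also notes that collision and obstacle avoidance are folded into the DRL controller through reward shaping. The following minimal sketch shows one common way such a shaped reward can be composed from a leader-tracking term and an obstacle penalty; the specific terms, distances, and weights are hypothetical and are not the paper's reward function.

```python
import numpy as np

def shaped_reward(follower_pos, leader_pos, obstacles,
                  desired_dist=0.5, collision_dist=0.2,
                  w_track=1.0, w_obs=0.5):
    """Illustrative shaped reward for leader-follower tracking.

    Rewards keeping a desired offset from the leader and penalizes
    proximity to obstacles; all distances and weights are made-up
    example values.
    """
    # Tracking term: largest (closest to zero) when the follower keeps
    # the desired separation from the leader.
    dist_to_leader = np.linalg.norm(np.asarray(follower_pos) - np.asarray(leader_pos))
    tracking_reward = -w_track * abs(dist_to_leader - desired_dist)

    # Obstacle term: grows as the follower enters an obstacle's
    # collision radius.
    obstacle_penalty = 0.0
    for obs in obstacles:
        d = np.linalg.norm(np.asarray(follower_pos) - np.asarray(obs))
        if d < collision_dist:
            obstacle_penalty += w_obs * (collision_dist - d) / collision_dist

    return tracking_reward - obstacle_penalty
```
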
Bibliographic Details

Published in: arXiv.org, 2019-11
Main authors: Zhou, Yanlin; Lu, Fan; Pu, George; Ma, Xiyao; Sun, Runhan; Hsi-Yuan, Chen; Li, Xiaolin; Wu, Dapeng
Format: Article
Language: English
EISSN: 2331-8422
Subjects: Adaptive control; Algorithms; Artificial neural networks; Collision avoidance; Control tasks; Controllers; Machine learning; Modules; Neural networks; Obstacle avoidance; Robot control; Three dimensional models; Tracking
Online access: Full text