PEARL: PrEference Appraisal Reinforcement Learning for Motion Planning

Robot motion planning often requires finding trajectories that balance different user intents, or preferences. One of these preferences is usually arrival at the goal, while another might be obstacle avoidance. Here, we formalize these, and similar, tasks as preference balancing tasks (PBTs) on acceleration-controlled robots, and propose a motion planning solution, PrEference Appraisal Reinforcement Learning (PEARL). PEARL uses reinforcement learning on a restricted training domain, combined with features engineered from user-given intents. PEARL's planner then generates trajectories in expanded domains for more complex problems. We present an adaptation for rejection of stochastic disturbances and offer in-depth analysis, including task completion conditions and behavior analysis when the conditions do not hold. PEARL is evaluated on five problems, two multi-agent obstacle avoidance tasks and three that stochastically disturb the system at run-time: 1) a multi-agent pursuit problem with 1000 pursuers, 2) robot navigation through 900 moving obstacles, which is trained in an environment with only 4 static obstacles, 3) aerial cargo delivery, 4) two-robot rendezvous, and 5) flying inverted pendulum. Lastly, we evaluate the method on a physical quadrotor UAV with a suspended load influenced by a stochastic disturbance. The video at https://youtu.be/ZkFt1uY6vlw contains the experiments and visualization of the simulations.
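
A minimal, illustrative sketch of the idea described in the abstract (assumed Python, not the authors' implementation): each user preference is encoded as a state feature, the value estimate is a weighted sum of those features whose weights would be learned with reinforcement learning in a small training domain, and the planner acts greedily over candidate accelerations for an acceleration-controlled robot. All function names, feature forms, and parameters below are illustrative assumptions.

import numpy as np

def goal_feature(pos, goal):
    # Preference: arrival at the goal (closer is better).
    return -np.linalg.norm(pos - goal) ** 2

def obstacle_feature(pos, obstacles):
    # Preference: obstacle avoidance (farther from the nearest obstacle is better).
    if len(obstacles) == 0:
        return 0.0
    dists = np.linalg.norm(obstacles - pos, axis=1)
    return -1.0 / (np.min(dists) + 1e-3)

def value(state, goal, obstacles, weights):
    # Value estimate as a weighted combination of preference features.
    pos = state[:2]
    feats = np.array([goal_feature(pos, goal), obstacle_feature(pos, obstacles)])
    return weights @ feats

def greedy_action(state, goal, obstacles, weights, dt=0.1, a_max=1.0):
    # Acceleration-controlled point robot: state = [x, y, vx, vy].
    # Pick the acceleration whose one-step successor state has the highest value.
    best_a, best_v = None, -np.inf
    for ax in np.linspace(-a_max, a_max, 5):
        for ay in np.linspace(-a_max, a_max, 5):
            a = np.array([ax, ay])
            vel = state[2:] + a * dt
            pos = state[:2] + vel * dt
            successor = np.concatenate([pos, vel])
            v = value(successor, goal, obstacles, weights)
            if v > best_v:
                best_a, best_v = a, v
    return best_a

# The weights would normally come from training; they are set by hand here
# only to exercise the planner on a toy 2D scene.
weights = np.array([1.0, 0.5])
state = np.array([0.0, 0.0, 0.0, 0.0])
goal = np.array([5.0, 5.0])
obstacles = np.array([[2.5, 2.5]])
print(greedy_action(state, goal, obstacles, weights))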

Detailed description

Saved in:
Bibliographic Details
Main Authors: Faust, Aleksandra, Chiang, Hao-Tien Lewis, Tapia, Lydia
Format: Article
Language: eng
Subjects: Computer Science - Robotics
Online Access: Order full text
creator Faust, Aleksandra
Chiang, Hao-Tien Lewis
Tapia, Lydia
description Robot motion planning often requires finding trajectories that balance different user intents, or preferences. One of these preferences is usually arrival at the goal, while another might be obstacle avoidance. Here, we formalize these, and similar, tasks as preference balancing tasks (PBTs) on acceleration-controlled robots, and propose a motion planning solution, PrEference Appraisal Reinforcement Learning (PEARL). PEARL uses reinforcement learning on a restricted training domain, combined with features engineered from user-given intents. PEARL's planner then generates trajectories in expanded domains for more complex problems. We present an adaptation for rejection of stochastic disturbances and offer in-depth analysis, including task completion conditions and behavior analysis when the conditions do not hold. PEARL is evaluated on five problems, two multi-agent obstacle avoidance tasks and three that stochastically disturb the system at run-time: 1) a multi-agent pursuit problem with 1000 pursuers, 2) robot navigation through 900 moving obstacles, which is trained in an environment with only 4 static obstacles, 3) aerial cargo delivery, 4) two-robot rendezvous, and 5) flying inverted pendulum. Lastly, we evaluate the method on a physical quadrotor UAV with a suspended load influenced by a stochastic disturbance. The video at https://youtu.be/ZkFt1uY6vlw contains the experiments and visualization of the simulations.
doi_str_mv 10.48550/arxiv.1811.12651
format Article
identifier DOI: 10.48550/arxiv.1811.12651
language eng
recordid cdi_arxiv_primary_1811_12651
source arXiv.org
subjects Computer Science - Robotics
title PEARL: PrEference Appraisal Reinforcement Learning for Motion Planning
url https://arxiv.org/abs/1811.12651