UAV navigation in high dynamic environments: A deep reinforcement learning approach

Unmanned Aerial Vehicle (UAV) navigation is aimed at guiding a UAV to the desired destinations along a collision-free and efficient path without human interventions, and it plays a crucial role in autonomous missions in harsh environments. The recently emerging Deep Reinforcement Learning (DRL) meth...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Chinese journal of aeronautics 2021-02, Vol.34 (2), p.479-489
Hauptverfasser:	GUO, Tong, JIANG, Nan, LI, Biyue, ZHU, Xi, WANG, Ya, DU, Wenbo
Format:	Artikel
Sprache:	eng
Schlagworte:	Autonomous vehicles Deep learning Motion planning Navigation Reinforcement learning Unmanned Aerial Vehicle (UAV)
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Unmanned Aerial Vehicle (UAV) navigation is aimed at guiding a UAV to the desired destinations along a collision-free and efficient path without human interventions, and it plays a crucial role in autonomous missions in harsh environments. The recently emerging Deep Reinforcement Learning (DRL) methods have shown promise for addressing the UAV navigation problem, but most of these methods cannot converge due to the massive amounts of interactive data when a UAV is navigating in high dynamic environments, where there are numerous obstacles moving fast. In this work, we propose an improved DRL-based method to tackle these fundamental limitations. To be specific, we develop a distributed DRL framework to decompose the UAV navigation task into two simpler sub-tasks, each of which is solved through the designed Long Short-Term Memory (LSTM) based DRL network by using only part of the interactive data. Furthermore, a clipped DRL loss function is proposed to closely stack the two sub-solutions into one integral for the UAV navigation problem. Extensive simulation results are provided to corroborate the superiority of the proposed method in terms of the convergence and effectiveness compared with those of the state-of-the-art DRL methods.
ISSN:	1000-9361
DOI:	10.1016/j.cja.2020.05.011