Deep Reinforcement Learning with Corrective Feedback for Autonomous UAV Landing on a Mobile Platform

Bibliographic Details
Published in:	Drones (Basel) 2022-09, Vol.6 (9), p.238
Main Authors:	Wu, Lizhen; Wang, Chang; Zhang, Pengpeng; Wei, Changyun
Format:	Article
Language:	English
Subjects:	Algorithms; Control systems design; Controllers; DDPG; Deep learning; Deep reinforcement learning; Feedback; Heuristic; Interactive learning; Landing; Machine learning; Methods; Modules; Parameters; PID; Proportional integral derivative; Tuning; UAV landing; Unmanned aerial vehicles; Unmanned ground vehicles

Description
Summary:	Autonomous Unmanned Aerial Vehicle (UAV) landing remains a challenge in uncertain environments, e.g., landing on a mobile ground platform such as an Unmanned Ground Vehicle (UGV) without knowing its motion dynamics. A traditional PID (Proportional, Integral, Derivative) controller is one option for the UAV landing task, but it suffers from the problem of manual parameter tuning, which becomes intractable if the initial landing condition changes or the mobile platform keeps moving. In this paper, we design a novel learning-based controller that integrates a standard PID module with a deep reinforcement learning (DRL) module, which automatically optimizes the PID parameters for velocity control. In addition, corrective feedback based on parameter-tuning heuristics speeds up the learning process compared with traditional DRL algorithms, which are typically time-consuming. Moreover, the learned policy makes the UAV landing smooth and fast by allowing the UAV to adjust its speed adaptively according to the dynamics of the environment. We demonstrate the effectiveness of the proposed algorithm in a variety of quadrotor UAV landing tasks with both static and dynamic environmental settings.
ISSN:	2504-446X
DOI:	10.3390/drones6090238
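
Illustrative note: the summary describes a PID velocity controller whose gains are set by a learned policy rather than tuned by hand. The following is a minimal sketch of that general structure only, not the authors' implementation. The class `PID`, the function `placeholder_policy`, the 1-D kinematics, and all numeric values are assumptions made for illustration; in particular, `placeholder_policy` is a hand-written rule standing in for a trained DRL actor, and no corrective-feedback mechanism is modeled.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Discrete PID controller whose gains can be overwritten at every step."""
    kp: float = 1.0
    ki: float = 0.0
    kd: float = 0.0
    integral: float = 0.0
    prev_error: float = 0.0

    def set_gains(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, error, dt):
        # Standard PID law: proportional + integral + derivative terms.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def placeholder_policy(error):
    """Hypothetical stand-in for a learned actor: maps the state to (kp, ki, kd).

    A trained network would go here; this rule simply raises kp when the
    horizontal error is large.
    """
    kp = 1.0 + min(abs(error), 2.0)
    return kp, 0.0, 0.1


def simulate(steps=400, dt=0.05, platform_speed=0.5):
    """Track a platform moving at constant speed along one axis (assumed setup)."""
    uav_x, platform_x = 0.0, 3.0
    pid = PID()
    errors = []
    for _ in range(steps):
        error = platform_x - uav_x
        errors.append(error)
        # The policy picks the gains; the PID turns the error into a velocity command.
        pid.set_gains(*placeholder_policy(error))
        velocity_cmd = pid.step(error, dt)
        # Idealized kinematics: the UAV follows the velocity command exactly.
        uav_x += velocity_cmd * dt
        platform_x += platform_speed * dt
    return errors


if __name__ == "__main__":
    errs = simulate()
    print(f"initial error: {errs[0]:.3f} m, final error: {errs[-1]:.3f} m")
```

With the integral gain fixed at zero, this sketch leaves a residual tracking error against a moving platform, which is the kind of situation where adapting the gains online becomes relevant.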