Drone Elevation Control Based on Python-Unity Integrated Framework for Reinforcement Learning Applications

Reinforcement learning (RL) applications require a huge effort to become established in real-world environments, due to the injury and break down risks during interactions between the RL agent and the environment, in the online training process. In addition, the RL platform tools (e.g., Python OpenA...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Drones (Basel) 2023-04, Vol.7 (4), p.225
Hauptverfasser: Abbass, Mahmoud Abdelkader Bashery, Kang, Hyun-Soo
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Reinforcement learning (RL) applications require a huge effort to become established in real-world environments, due to the injury and break down risks during interactions between the RL agent and the environment, in the online training process. In addition, the RL platform tools (e.g., Python OpenAI’s Gym, Unity ML-Agents, PyBullet, DART, MoJoCo, RaiSim, Isaac, and AirSim), that are required to reduce the real-world challenges, suffer from drawbacks (e.g., the limited number of examples and applications, and difficulties in implementation of the RL algorithms, due to difficulties with the programing language). This paper presents an integrated RL framework, based on Python–Unity interaction, to demonstrate the ability to create a new RL platform tool, based on making a stable user datagram protocol (UDP) communication between the RL agent algorithm (developed using the Python programing language as a server), and the simulation environment (created using the Unity simulation software as a client). This Python–Unity integration process, increases the advantage of the overall RL platform (i.e., flexibility, scalability, and robustness), with the ability to create different environment specifications. The challenge of RL algorithms’ implementation and development is also achieved. The proposed framework is validated by applying two popular deep RL algorithms (i.e., Vanilla Policy Gradient (VPG) and Actor-Critic (A2C)), on an elevation control challenge for a quadcopter drone. The validation results for these experimental tests, prove the innovation of the proposed framework, to be used in RL applications, because both implemented algorithms achieve high stability, by achieving convergence to the required performance through the semi-online training process.
ISSN:2504-446X
2504-446X
DOI:10.3390/drones7040225