Graph-based multi-agent reinforcement learning for large-scale UAVs swarm system control

Detailed description

Bibliographic details
Published in:	Aerospace science and technology 2024-07, Vol.150, p.109166, Article 109166
Main authors:	Zhao, Bocheng; Huo, Mingying; Li, Zheng; Yu, Ze; Qi, Naiming
Format:	Article
Language:	English
Keywords:	Collision avoidance; Graph neural network; Multi-agent reinforcement learning; Potential field function; Unmanned aerial vehicles swarm
Online access:	Full text

Description
Abstract:	In this study, a novel graph-embedding technique based on a graph neural network (GNN) is proposed to identify the topology in the motion of an unmanned aerial vehicle (UAV) swarm and quickly obtain local information around each agent. We also propose a model reference reinforcement learning method to learn the potential field function and determine an appropriate strategy for each agent that can satisfy the requirements of collaborative motion and obstacle avoidance for large-scale UAV swarms. First, a new swarm structure is proposed to provide reserved maneuvering space for UAVs during flight. After the obstacle avoidance behavior of multiple UAVs is encoded into spatial graphs, a graph attention mechanism (GAT) is employed to extract the dynamic information from them. Consequently, each individual autonomously generates actions based on its local data. Second, a new distributed control algorithm based on multi-agent reinforcement learning (MARL) is proposed to learn the potential field function from the local information. Each individual can repel and cooperate with the target within a short range and attract objects over a long distance. Finally, simulation results demonstrate the effectiveness and superiority of the proposed method, which has great potential for application in online autonomous collaboration.
ISSN:	1270-9638, 1626-3219
DOI:	10.1016/j.ast.2024.109166