Multi-agent cooperative area coverage: A two-stage planning approach based on reinforcement learning

Bibliographic Details
Published in:	Information sciences 2024-09, Vol.678, p.121025, Article 121025
Main Authors:	Yuan, Guohui; Xiao, Jian; He, Jinhui; Jia, Honyu; Wang, Yaoting; Wang, Zhuoran
Format:	Article
Language:	English
Subjects:	Cooperative navigation control; Coverage path planning; Multi-agent area coverage; Reinforcement learning
Online Access:	Full text

Description
Abstract:	Multi-agent area coverage aims to achieve complete traversal of a target area through cooperation between agents. To address the low coverage efficiency and limited practicality of existing methods, we propose a two-stage area coverage method based on multi-agent deep reinforcement learning. In the first stage, we convert the coverage path planning problem into an optimal grid selection problem and, exploiting the equivalence of agents in cooperative tasks, propose a distributed coverage path planning algorithm based on QMIX and a grid coverage map. The second stage realizes cooperative navigation control in a constrained environment with obstacles and non-ideal communication conditions. To implement this stage, we design a hybrid attention mechanism that adaptively aggregates important feature information from adjacent agents and obstacles, efficiently exploiting the agents' limited local perception and communication capabilities to perform cooperative control. The experimental results show that the proposed two-stage multi-agent area coverage method can accomplish the area coverage task in environments with random obstacles, and that its coverage efficiency and robustness are significantly better than those of other reinforcement-learning-based or traditional coverage algorithms. The results also verify that the proposed method adapts to dynamic changes in the number of agents and in the communication range.
ISSN:	0020-0255, 1872-6291
DOI:	10.1016/j.ins.2024.121025
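
Illustrative sketch (not from the article): the abstract describes two ideas that a small example may make more concrete, namely treating coverage path planning as grid selection on a grid coverage map, and aggregating neighbor features with attention. The code below is a minimal sketch under stated assumptions. It replaces the learned QMIX policy with a greedy nearest-uncovered-cell rule, and it replaces the paper's hybrid attention mechanism with plain scaled dot-product attention. All names (`select_grid`, `run_coverage`, `coverage_ratio`, `attention_aggregate`) and the cell encoding are hypothetical and do not come from the article.

```python
import math

# Hypothetical cell encoding for the grid coverage map.
FREE, COVERED, OBSTACLE = 0, 1, 2


def select_grid(coverage, agent_pos):
    """Greedy stand-in for a learned policy: pick the nearest uncovered,
    obstacle-free cell by Manhattan distance, or None if none remain."""
    best, best_dist = None, None
    for r, row in enumerate(coverage):
        for c, state in enumerate(row):
            if state != FREE:
                continue
            dist = abs(r - agent_pos[0]) + abs(c - agent_pos[1])
            if best_dist is None or dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def run_coverage(coverage, agents):
    """Let agents take turns claiming cells until no free cell is left.
    Simplification: an agent jumps straight to its selected cell.
    Returns the number of selection rounds that were needed."""
    agents = list(agents)
    for r, c in agents:
        coverage[r][c] = COVERED  # starting cells count as covered
    rounds = 0
    while any(FREE in row for row in coverage):
        for i, pos in enumerate(agents):
            target = select_grid(coverage, pos)
            if target is None:
                break
            coverage[target[0]][target[1]] = COVERED
            agents[i] = target
        rounds += 1
    return rounds


def coverage_ratio(coverage):
    """Fraction of obstacle-free cells that have been covered."""
    cells = [s for row in coverage for s in row if s != OBSTACLE]
    return sum(s == COVERED for s in cells) / len(cells)


def attention_aggregate(query, neighbors):
    """Generic scaled dot-product attention (an assumption, not the paper's
    hybrid design): weight each neighbor feature vector by softmax of its
    similarity to the query and return the weighted sum."""
    if not neighbors:
        return [0.0] * len(query)
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, n)) / scale for n in neighbors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * n[d] for w, n in zip(weights, neighbors))
            for d in range(len(query))]


if __name__ == "__main__":
    grid = [[FREE, FREE, OBSTACLE],
            [FREE, FREE, FREE]]
    rounds = run_coverage(grid, [(0, 0), (1, 2)])
    print(rounds, coverage_ratio(grid))
    print(attention_aggregate([1.0, 0.0], [[2.0, 3.0], [0.0, 1.0]]))
```

In a learned version, `select_grid` would presumably be replaced by each agent's value network, with a mixing network combining per-agent values during training as in QMIX; the greedy rule here only illustrates the grid-selection framing.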