Multi-UAV Collaborative Path Planning using Hierarchical Reinforcement Learning and Simulated Annealing
Published in: | International journal of performability engineering 2022-07, Vol.18 (7), p.463 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | In practice, classical path optimization algorithms perform poorly in unknown environments, swarm intelligence algorithms still lack the agility and accuracy needed to avoid moving objects in dynamic environments, and reinforcement learning, a common machine learning approach, can suffer from the curse of dimensionality as scenario complexity grows. To address these problems, this paper proposes using the MAXQ hierarchical reinforcement learning method to reduce dimensionality through abstraction, and combining the leader-wingman approach with a dynamic dead zone to model cooperative flight in a triangular formation. A novel algorithm based on MAXQ and simulated annealing (SA-MAXQ) is designed to solve the unmanned aerial vehicle (UAV) path planning problem and is evaluated in grid-based path planning simulations in 2D scenarios. Q-Learning, ε-Q-Learning, standard MAXQ, and SA-MAXQ are compared in terms of convergence, time consumption, and search steps. In addition, the leader-wingman method is combined with a dynamic dead zone to model the triangular formation for multi-UAV collaboration. The experimental results indicate that SA-MAXQ converges faster, with lower volatility, a better learning effect, less time consumed, and a more optimized search route. |
---|---|
ISSN: | 0973-1318 |
DOI: | 10.23940/ijpe.22.07.p1.463474 |
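
The summary describes combining reinforcement learning with simulated annealing for grid-based path planning, but the record does not give the algorithm itself. The following is only a minimal sketch of the general idea, not the paper's SA-MAXQ: it uses flat tabular Q-learning (no MAXQ task hierarchy) and a Metropolis-style acceptance rule for exploration. The grid layout, reward values, cooling schedule, and all function names are illustrative assumptions.

```python
import math
import random

# Hypothetical 2D grid: 0 = free cell, 1 = obstacle (layout is an assumption).
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply a move; leaving the grid or hitting an obstacle keeps the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(GRID) and 0 <= c < len(GRID[0])) or GRID[r][c] == 1:
        return state, -5.0, False   # assumed collision penalty
    if (r, c) == GOAL:
        return (r, c), 100.0, True  # assumed goal reward
    return (r, c), -1.0, False      # assumed per-step cost


def sa_select(q_row, temperature, rng):
    """Metropolis-style choice: take a random action instead of the greedy one
    with probability exp((Q_random - Q_greedy) / T)."""
    greedy = max(range(len(q_row)), key=q_row.__getitem__)
    candidate = rng.randrange(len(q_row))
    delta = q_row[candidate] - q_row[greedy]  # never positive
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy


def train(episodes=500, alpha=0.1, gamma=0.9, t0=10.0, cooling=0.99,
          t_min=0.01, max_steps=200, seed=0):
    """Tabular Q-learning where the temperature is cooled once per episode."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(GRID)) for c in range(len(GRID[0]))}
    temperature = t0
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            a = sa_select(q[state], temperature, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
        temperature = max(t_min, temperature * cooling)  # geometric cooling
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START for at most max_steps moves."""
    path, state = [START], START
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=q[state].__getitem__)
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

At high temperature the rule accepts almost any random action, so early episodes explore widely; as the temperature cools, the agent behaves almost greedily. This replaces the fixed exploration rate of ε-greedy Q-learning with a schedule that decays on its own, which is one plausible reading of why an annealed variant might converge more smoothly.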