Dashing for the Golden Snitch: Multi-Drone Time-Optimal Motion Planning with Multi-Agent Reinforcement Learning
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Recent innovations in autonomous drones have facilitated time-optimal flight
in single-drone configurations and enhanced maneuverability in multi-drone
systems through the application of optimal control and learning-based methods.
However, few studies have achieved time-optimal motion planning for multi-drone
systems, particularly during highly agile maneuvers or in dynamic scenarios.
This paper presents a decentralized policy network for time-optimal multi-drone
flight using multi-agent reinforcement learning. To strike a balance between
flight efficiency and collision avoidance, we introduce a soft collision
penalty inspired by optimization-based methods. By customizing PPO in a
centralized training, decentralized execution (CTDE) fashion, we unlock higher
efficiency and stability in training, while ensuring lightweight
implementation. Extensive simulations show that, despite slight performance
trade-offs compared to single-drone systems, our multi-drone approach maintains
near-time-optimal performance with low collision rates. Real-world experiments
validate our method, with two quadrotors using the same network as simulation
achieving a maximum speed of 13.65 m/s and a maximum body rate of 13.4 rad/s in
a 5.5 m × 5.5 m × 2.0 m space across various tracks, relying entirely on
onboard computation. |
---|---|
DOI: | 10.48550/arxiv.2409.16720 |