Federated deep reinforcement learning based trajectory design for UAV-assisted networks with mobile ground devices
Published in: | Scientific reports 2024-10, Vol.14 (1), p.22753-21, Article 22753 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Exploiting highly maneuverable unmanned aerial vehicles (UAVs) is considered an efficient way to assist wireless systems, e.g., in data collection applications. However, several challenges remain in the design of such UAV-assisted networks, including multi-UAV joint trajectory determination, data privacy protection, and adaptation to complex channel environments, particularly with mobile ground devices (GDs). In this paper, we study a multi-UAV-assisted data collection system in which UAVs collect data locally from mobile GDs. The aim is to minimize the total operation time by jointly optimizing the UAVs’ three-dimensional (3D) trajectories and the GDs’ communication scheduling, subject to no-fly zone (NFZ) and collision-avoidance constraints. Because the NFZs make the feasible set nonconvex, the resulting problem is nonconvex. Moreover, the randomness of the GDs’ movements significantly degrades the performance of a typical predesign, i.e., one that determines the UAVs’ trajectories and the users’ scheduling before the data collection task starts. To tackle these issues, we first transform the problem into a Markov decision process and then propose a multi-agent federated reinforcement learning (MAFRL)-based approach that optimizes the dynamic long-term objective by jointly determining the UAVs’ 3D trajectories and the GDs’ communication scheduling. A multi-step propagation technique and a dueling network architecture are adopted to enhance the neural network used to train the agents, i.e., to accelerate the convergence of the proposed method and improve its overall stability. Finally, experimental results demonstrate the effectiveness of the proposed method in the considered practical scenario. |
---|---|
ISSN: | 2045-2322 |
DOI: | 10.1038/s41598-024-72654-y |
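
The abstract names three building blocks without giving details: a multi-step propagation technique, a dueling network architecture, and federated training across agents. The following is a minimal sketch of the textbook form of each, not the paper's implementation. It assumes that "multi-step propagation" refers to n-step bootstrapped returns and that the federated step is an unweighted parameter average; all function and parameter names are hypothetical.

```python
from typing import Dict, List, Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """Assumed multi-step target: sum_k gamma^k * r_k + gamma^n * bootstrap_value."""
    g = bootstrap_value
    # Fold the rewards from the last step back to the first.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def federated_average(agent_weights: Sequence[Dict[str, List[float]]]) -> Dict[str, List[float]]:
    """Assumed aggregation rule: element-wise unweighted mean of each agent's parameters."""
    n = len(agent_weights)
    first = agent_weights[0]
    return {
        name: [sum(w[name][i] for w in agent_weights) / n for i in range(len(first[name]))]
        for name in first
    }


if __name__ == "__main__":
    # Two rewards of 1.0, a bootstrap value of 10.0, and a discount of 0.5.
    print(n_step_return([1.0, 1.0], 10.0, 0.5))
    # One state value and two action advantages.
    print(dueling_q_values(2.0, [1.0, 3.0]))
    # Two agents (e.g. two UAVs) sharing a single parameter vector "w".
    print(federated_average([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]))
```

In a federated setup of this kind, each agent would exchange only model parameters rather than the data it collected, which is the usual reason such schemes are associated with data privacy protection.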