Task offloading and trajectory scheduling for UAV-enabled MEC networks: An MADRL algorithm with prioritized experience replay

Bibliographic details
Published in: Ad Hoc Networks 2024-03, Vol. 154, p. 103371, Article 103371
Main authors: Shi, Huaguang; Tian, Yuxiang; Li, Hengji; Huang, Jian; Shi, Lei; Zhou, Yi
Format: Article
Language: English
Description
Summary: As a new network architecture, the air-ground cooperative network supports future 6G networks in achieving ubiquitous connectivity. To effectively relieve the computational pressure of massive data in 6G wireless networks, Unmanned Aerial Vehicles (UAVs) equipped with Mobile Edge Computing (MEC) servers have become an emerging technology that provides computing resources for Mobile Devices (MDs). Because UAVs have limited on-board energy and computational capabilities, this paper investigates a multi-UAV collaborative MEC architecture. The optimization problem of minimizing the total computational cost is formulated by jointly optimizing the UAVs' trajectories and the scheduling of the MDs' offloading strategies. The coupling between the optimization variables and the non-convexity of the problem make it difficult to solve directly. To address these concerns, the non-convex optimization problem is converted into a Markov decision process. The UAVs-assisted Offloading Strategy based on Reinforcement Learning (UOS-RL) algorithm is proposed to address the convergence difficulties caused by the high-dimensional continuous action space. Furthermore, the experience data generated by agents interacting with the environment is highly differentiated because the environment varies rapidly. Hence, a Priority Experience Replay (PER) mechanism is proposed to improve the training efficiency of the UOS-RL algorithm based on the priority of the experience data. Simulation results show that the proposed PER-UOS-RL algorithm outperforms existing approaches in terms of computational cost.
ISSN: 1570-8705, 1570-8713
DOI: 10.1016/j.adhoc.2023.103371
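
The summary mentions a Priority Experience Replay (PER) mechanism but does not describe how it works. As a rough illustration only, the sketch below shows the widely used proportional form of prioritized replay, in which transitions with a larger temporal-difference (TD) error are sampled more often. It is not the paper's PER-UOS-RL implementation: the class name, the hyperparameters `alpha`, `beta`, and `eps`, and the use of TD error as the priority are all assumptions made for this example.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # likely to be replayed before their TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, indices, td_errors):
        # After a learning step, the priority becomes the absolute TD error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, an agent would call `add` after each environment step, draw a batch with `sample`, scale each per-sample loss by the returned weight, and then pass the new TD errors to `update_priorities`. Whether the paper's agents share one buffer or keep separate ones is not stated in the summary.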