DNN Task Assignment in UAV Networks: A Generative AI Enhanced Multi-Agent Reinforcement Learning Approach
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | Unmanned Aerial Vehicles (UAVs) possess high mobility and flexible deployment
capabilities, prompting the development of UAVs for various application
scenarios within the Internet of Things (IoT). The unique capabilities of UAVs
give rise to increasingly critical and complex tasks in uncertain and
potentially harsh environments. The substantial amount of data generated from
these applications necessitates processing and analysis through deep neural
networks (DNNs). However, UAVs encounter challenges due to their limited
computing resources when managing DNN models. This paper presents a joint
approach that combines multi-agent reinforcement learning (MARL) and
generative diffusion models (GDM) for assigning DNN tasks to a UAV swarm, aimed
at reducing latency from task capture to result output. To address these
challenges, we first consider the task size of the target area to be inspected
and the shortest flying path as optimization constraints, employing a greedy
algorithm to resolve the subproblem with a focus on minimizing the UAV's flying
path and the overall system cost. In the second stage, we introduce a novel DNN
task assignment algorithm, termed GDM-MADDPG, which utilizes the reverse
denoising process of GDM to replace the actor network in multi-agent deep
deterministic policy gradient (MADDPG). This approach generates specific DNN
task assignment actions based on agents' observations in a dynamic environment.
Simulation results indicate that our algorithm performs favorably compared to
benchmarks in terms of path planning, Age of Information (AoI), energy
consumption, and task load balancing. |
---|---|
DOI: | 10.48550/arxiv.2411.08299 |
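
The summary mentions a greedy algorithm for the first-stage path subproblem but does not say which greedy rule is used. As a rough illustration only, the sketch below assumes a nearest-neighbor rule over 2D target coordinates: from its current position, the UAV always flies to the closest target it has not yet visited. The function name `greedy_path`, the coordinate representation, and the example points are hypothetical and not taken from the paper, and the sketch ignores task size and system cost.

```python
import math


def greedy_path(start, targets):
    """Nearest-neighbor ordering (an assumed greedy rule, not the paper's).

    Repeatedly flies to the closest unvisited target and returns the
    visiting order together with the total flown distance.
    """
    remaining = list(targets)
    path = [start]
    total = 0.0
    current = start
    while remaining:
        # Greedy choice: the unvisited target closest to the current position.
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        total += math.dist(current, nxt)
        remaining.remove(nxt)
        path.append(nxt)
        current = nxt
    return path, total


# Hypothetical example: one UAV starting at the origin, three target areas.
path, length = greedy_path((0.0, 0.0), [(5.0, 0.0), (1.0, 0.0), (1.0, 3.0)])
print(path, length)
```

A nearest-neighbor rule is cheap to compute but does not guarantee the shortest possible tour.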