Multitask Transfer Deep Reinforcement Learning for Timely Data Collection in Rechargeable-UAV-Aided IoT Networks

Detailed description

Bibliographic details
Published in: IEEE Internet of Things Journal, 2023-12, Vol. 10 (23), p. 20545-20559
Main authors: Yi, Mengjie; Wang, Xijun; Liu, Juan; Zhang, Yan; Hou, Ronghui
Format: Article
Language: English
Online access: Full text
Description
Summary: Thanks to their high flexibility and low operational cost, unmanned aerial vehicles (UAVs) can be used to support mission-critical applications in the Internet of Things (IoT). However, due to their limited onboard energy, it is difficult for UAVs to provide continuous data collection. In this article, we study the problem of rechargeable-UAV-aided timely data collection in IoT networks, where the UAV collects status updates from multiple sensors and is recharged at charging stations (CSs) to keep its energy level above a threshold. To trade off information freshness against energy consumption, we formulate a Markov decision process (MDP) with the objective of minimizing the weighted sum of the average total Age of Information and the average recharging price. To cope with the dynamics and uncertainty of the environment, we propose a multitask transfer deep reinforcement learning method to jointly optimize the UAV's flight trajectory, transmission scheduling, and battery recharging. To enable the learned policy to be applied to new environments with similar settings without training from scratch, we develop a multitask network made up of common knowledge layers and task-specific knowledge layers. Specifically, this design enables the transfer of common knowledge between environments with different network scales (e.g., different numbers of sensors/CSs) and/or topologies (e.g., different locations of sensors/CSs). Simulation results demonstrate that the proposed algorithm can adapt to new environments and achieve superior performance compared to the baseline algorithms.
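The objective named in the summary, a weighted sum of the average total Age of Information and the average recharging price, can be illustrated with a small sketch. The record does not give the article's exact formulation, so everything here is an assumption made for illustration: ages grow by one per time slot, a collected sensor's age resets to 1, the weight multiplies the recharging price, and the names `step_aoi`, `step_cost`, and `average_cost` are hypothetical.

```python
from typing import List, Optional, Sequence


def step_aoi(aoi: List[int], collected: Optional[int]) -> List[int]:
    """Age every sensor by one slot; reset the collected sensor (if any) to 1.

    The reset-to-1 convention is an assumption, not taken from the article.
    """
    return [1 if i == collected else age + 1 for i, age in enumerate(aoi)]


def step_cost(aoi: List[int], recharge_price: float, weight: float) -> float:
    """Per-slot cost: total AoI plus the weighted recharging price (assumed form)."""
    return sum(aoi) + weight * recharge_price


def average_cost(
    schedule: Sequence[Optional[int]],
    prices: Sequence[float],
    num_sensors: int,
    weight: float,
) -> float:
    """Average per-slot cost over a finite horizon.

    schedule[t] is the index of the sensor collected in slot t, or None.
    prices[t] is the recharging price paid in slot t (0.0 if not recharging).
    """
    if not schedule or len(schedule) != len(prices):
        raise ValueError("schedule and prices must be non-empty and equally long")
    aoi = [1] * num_sensors  # assumed initial ages
    total = 0.0
    for collected, price in zip(schedule, prices):
        aoi = step_aoi(aoi, collected)
        total += step_cost(aoi, price, weight)
    return total / len(schedule)


if __name__ == "__main__":
    # Two sensors, three slots: collect sensor 0, recharge, collect sensor 1.
    print(average_cost([0, None, 1], [0.0, 2.0, 0.0], num_sensors=2, weight=0.5))
```

A policy that minimizes this average has to balance visiting sensors often enough to keep their ages low against detours to a charging station, which is the trade-off the summary describes.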
ISSN:2327-4662
DOI:10.1109/JIOT.2023.3300927