Auxiliary-Task-Based Energy-Efficient Resource Orchestration in Mobile Edge Computing


Bibliographic Details

Published in: IEEE Transactions on Green Communications and Networking, 2023-03, Vol. 7 (1), pp. 313-327
Authors: Zhu, Kaige; Zhang, Zhenjiang; Zhao, Mingxiong
Format: Article
Language: English
Abstract: Advances in edge computing significantly impact the development of mobile networks. As the most important research goal related to edge networks, resource orchestration has been well studied in recent years; however, existing approaches based on deep reinforcement learning share similar bottlenecks in training inefficiency. In this paper, we treat drones, whose available time is significantly limited by their batteries, as the mobile terminals of a target edge network and aim to maximize energy efficiency. The battery-constrained resource orchestration problem is formulated as a nonconvex optimization problem that accounts for both operating costs and remaining battery capacity. Owing to the NP-hard nature of mixed-integer programming, the Auxiliary-Task-based dynamic Weighting Resource Orchestration (ATWRO) algorithm is proposed. To improve sample efficiency, related parameters serving as auxiliary tasks are employed to provide additional gradient information. We further refine the exploration space and apply an alternative replay buffer to develop a customized reinforcement learning approach. Extensive experiments demonstrate the effectiveness of the proposed scheme, showing that auxiliary tasks allow reinforcement learning agents to be trained more efficiently. Moreover, the service time of the whole system can be prolonged, and a higher number of completed tasks can be guaranteed.
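The abstract describes combining a main reinforcement learning objective with weighted auxiliary-task losses that contribute additional gradient signal. The following is a minimal sketch of that general idea; the function names, the linear weighted-sum form, and the exponential weight decay are illustrative assumptions, not the paper's actual ATWRO formulation.

```python
def combined_loss(main_loss, aux_losses, aux_weights):
    """Weighted sum of the main RL loss and auxiliary-task losses.

    main_loss:   scalar loss of the primary orchestration policy.
    aux_losses:  scalar losses of auxiliary prediction tasks (e.g.,
                 hypothetically, predicting battery level or channel
                 state from a shared encoder).
    aux_weights: per-task weights; dynamic weighting shifts gradient
                 signal between auxiliary tasks and the main task.
    """
    assert len(aux_losses) == len(aux_weights)
    return main_loss + sum(w * l for w, l in zip(aux_weights, aux_losses))


def decay_weights(aux_weights, rate=0.99):
    """One simple dynamic-weighting rule: exponentially decay the
    auxiliary weights per update, so training gradually emphasizes
    the main task (an assumed scheme, for illustration only)."""
    return [w * rate for w in aux_weights]
```

For example, with a main loss of 1.0, auxiliary losses [0.5, 0.2], and weights [0.4, 0.1], the combined loss is 1.0 + 0.2 + 0.02 = 1.22; decaying the weights each step then reduces the auxiliary contribution over training.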
ISSN: 2473-2400
DOI: 10.1109/TGCN.2022.3201615