A Deep-Reinforcement-Learning-Based Approach to Dynamic eMBB/URLLC Multiplexing in 5G NR

Detailed description

Bibliographic details
Published in:	IEEE Internet of Things Journal, 2020-07, Vol. 7 (7), pp. 6439-6456
Main authors:	Huang, Yan; Li, Shaoran; Li, Chengzhang; Hou, Y. Thomas; Lou, Wenjing
Format:	Article
Language:	English
Subjects:	3GPP; Standards; 5G NR; Algorithms; Approximation algorithms; Broadband; Computer Science; Computer Science, Information Systems; Decoding; Deep learning; deep reinforcement learning (DRL); Engineering; Engineering, Electrical & Electronic; enhanced mobile broadband (eMBB)/ultrareliable and low latency communication (URLLC) multiplexing; Mobile computing; Multiplexing; Network latency; Neural networks; Optimization; Piercing; Preempting; preemption; puncturing; resource allocation; Resource management; Resource scheduling; Science & Technology; Technology; Telecommunications

Description
Abstract:	This article investigates the dynamic multiplexing of enhanced mobile broadband (eMBB) and ultrareliable and low latency communications (URLLC) on the same channel in 5G NR. Because the two services transmit on significantly different time scales, URLLC uses preemptive puncturing to multiplex its traffic onto eMBB traffic for transmission. The optimization problem is to minimize the adverse impact of this preemptive puncturing on eMBB users. We present DEMUX, a model-free deep reinforcement learning (DRL)-based solution to this problem. The essence of DEMUX is to use deep function approximators (neural networks) to learn an optimal algorithm for determining the preemption solution in each eMBB transmission time interval (TTI). Our novel contributions in the design of DEMUX include the first use of the DRL method with a large and continuous action domain for resource scheduling in NR, a mechanism to ensure fast and stable learning convergence by exploiting the intrinsic properties of the problem, and a mechanism to obtain a feasible preemption solution from the unconstrained output of a neural network while minimizing loss of information. The experimental results show that DEMUX significantly outperforms state-of-the-art algorithms proposed in the 3GPP standards body and the literature.
ISSN:	2327-4662
DOI:	10.1109/JIOT.2020.2978692
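
Illustrative note: the abstract mentions a mechanism that turns the unconstrained output of a neural network into a feasible preemption solution. The record does not describe how DEMUX does this, so the following is only a generic sketch of one way such a mapping could look, not the authors' method. It assumes a hypothetical setting in which `raw_scores` holds one real-valued network output per eMBB user, `demand` is the number of resource units URLLC must puncture in the current step, and `capacity` is the number of units each eMBB user currently holds. The function name and all three parameters are assumptions made for illustration.

```python
import math


def project_to_allocation(raw_scores, demand, capacity):
    """Map unconstrained scores to non-negative integers that sum to
    `demand`, with no entry exceeding the matching entry of `capacity`.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if demand > sum(capacity):
        raise ValueError("demand exceeds total capacity")

    # Softmax turns arbitrary real scores into fractions that sum to 1.
    # Subtracting the maximum keeps exp() from overflowing.
    peak = max(raw_scores)
    weights = [math.exp(s - peak) for s in raw_scores]
    total = sum(weights)
    shares = [w / total for w in weights]

    # Ideal (fractional) number of punctured units per user.
    ideal = [s * demand for s in shares]

    # Round down and clip to what each user actually holds.
    alloc = [min(math.floor(x), c) for x, c in zip(ideal, capacity)]

    # Hand out the remaining units one at a time to the user whose
    # allocation falls furthest below its ideal share and who still
    # has spare capacity.
    remaining = demand - sum(alloc)
    while remaining > 0:
        candidates = [i for i in range(len(alloc)) if alloc[i] < capacity[i]]
        best = max(candidates, key=lambda i: ideal[i] - alloc[i])
        alloc[best] += 1
        remaining -= 1
    return alloc


if __name__ == "__main__":
    # Three eMBB users, 6 units to puncture, scores favour the first user.
    print(project_to_allocation([1.5, 0.2, -0.7], demand=6, capacity=[4, 4, 4]))
```

The softmax step keeps the relative ordering of the network's outputs, and the rounding step changes each entry as little as the integer and capacity constraints allow, which is one simple reading of "minimizing loss of information".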