D2D Resource Allocation with Power Control Based on Multi-player Multi-armed Bandit

Bibliographic details
Published in: Wireless Personal Communications, 2020-08, Vol. 113 (3), pp. 1455-1470
Main authors: Kuo, Fang-Chang; Schindelhauer, Christian; Wang, Hwang-Cheng; Lin, Wen-Jun; Tseng, Chih-Cheng
Format: Article
Language: English
Description
Summary: Device-to-device (D2D) communication is defined as direct communication between two D2D user equipments (DUEs) without traversing the evolved NodeB of 5G networks. In the underlay mode of resource reuse, DUEs and cellular user equipments share resource blocks, improving system throughput by reusing the spectrum. To further enhance performance, an extended version of a reinforcement learning algorithm, the Multi-Player Multi-Armed Bandit, is employed to control the transmission power of the DUEs and thereby reduce the interference induced by resource sharing. Three learning strategies are applied: Epsilon-first, Epsilon-greedy, and Upper-Confidence-Bound. Simulation results show that the proposed method improves performance in terms of the average transmission power of D2D pairs, the ratio of unallocated D2D pairs, energy efficiency, and total throughput.
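The three learning strategies named in the summary can be illustrated with a minimal single-player sketch. It is not the authors' implementation: the multi-player coordination, the interference model, and the paper's reward definition are not given in this record, so the sketch treats each arm as a hypothetical transmit power level with a made-up Bernoulli success probability. The names `POWER_LEVELS_DBM`, `SUCCESS_PROB`, `toy_reward`, and `run` are assumptions introduced only for illustration, and the Upper-Confidence-Bound rule shown is the standard UCB1 index, which may differ from the variant used in the paper.

```python
import math
import random

# Hypothetical arms: candidate transmit power levels (not taken from the paper).
POWER_LEVELS_DBM = [5, 10, 15, 20]
# Made-up success probability per arm, standing in for an unknown reward model.
SUCCESS_PROB = [0.2, 0.5, 0.8, 0.6]


def greedy_arm(means):
    """Index of the arm with the highest estimated mean reward."""
    return max(range(len(means)), key=means.__getitem__)


def epsilon_first(t, horizon, epsilon, means, rng):
    """Explore uniformly for the first epsilon * horizon rounds, then exploit."""
    if t < int(epsilon * horizon):
        return rng.randrange(len(means))
    return greedy_arm(means)


def epsilon_greedy(epsilon, means, rng):
    """In every round, explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))
    return greedy_arm(means)


def ucb1(t, counts, means):
    """Standard UCB1: try each arm once, then maximize mean plus a bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(means)),
        key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def toy_reward(arm, rng):
    """Bernoulli reward drawn from the made-up success probabilities."""
    return 1.0 if rng.random() < SUCCESS_PROB[arm] else 0.0


def run(strategy, reward_fn, n_arms, horizon, epsilon=0.1, seed=0):
    """Play one bandit for `horizon` rounds; return pull counts and mean estimates."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(horizon):
        if strategy == "epsilon_first":
            arm = epsilon_first(t, horizon, epsilon, means, rng)
        elif strategy == "epsilon_greedy":
            arm = epsilon_greedy(epsilon, means, rng)
        elif strategy == "ucb":
            arm = ucb1(t, counts, means)  # t equals the number of pulls so far
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        reward = reward_fn(arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


if __name__ == "__main__":
    for name in ("epsilon_first", "epsilon_greedy", "ucb"):
        counts, _ = run(name, toy_reward, len(POWER_LEVELS_DBM), horizon=2000)
        print(name, dict(zip(POWER_LEVELS_DBM, counts)))
```

The strategies differ only in when they explore: Epsilon-first spends a fixed initial budget on exploration, Epsilon-greedy spreads exploration over all rounds, and UCB1 explores implicitly through a confidence bonus that shrinks as an arm is pulled more often.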
ISSN: 0929-6212, 1572-834X
DOI: 10.1007/s11277-020-07313-2