Deep Reinforcement Learning for Dynamic Spectrum Access: Convergence Analysis and System Design

Bibliographic Details
Published in:	IEEE Transactions on Wireless Communications, 2024-12, Vol. 23 (12), pp. 18888-18902
Main authors:	Safavinejad, Ramin; Chang, Hao-Hsuan; Liu, Lingjia
Format:	Article
Language:	English
Subjects:	5G beyond and 6G; 5G mobile communication; 6G mobile communication; Convergence; covering numbers; Deep learning; Deep reinforcement learning (DRL); dynamic spectrum access (DSA); echo state network (ESN); Networks; Parameters; Performance evaluation; Radio spectra; recurrent neural network; Spectrum allocation; Systems design; Training; Training data; Upper bound; Upper bounds; Wireless communication

Description
Abstract:	In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since the SU-PU interaction is limited, deep reinforcement learning has been introduced to help SUs conduct spectrum access. Specifically, the deep recurrent Q-network (DRQN) has been utilized in DSA networks so that SUs can aggregate information from recent experiences to make spectrum access decisions. DRQN is notorious for its poor sample efficiency: it needs a rather large number of training samples to tune its parameters, which is a computationally demanding task. The deep echo state network (DEQN) has been introduced for DSA networks to address the sample efficiency issue of DRQN. In this work, we compare the convergence of DRQN and DEQN by comparing the upper bounds we obtain on their covering numbers, a notion of richness. Furthermore, we introduce a method to determine the right hyper-parameters for DEQN, providing system design guidance for DEQN-based DSA networks. Extensive performance evaluation confirms that the DEQN-based DSA strategy is the superior choice in terms of computational cost while also outperforming DRQN-based strategies.
ISSN:	1536-1276, 1558-2248
DOI:	10.1109/TWC.2024.3414428