Cooperative Multi-Agent Reinforcement-Learning-Based Distributed Dynamic Spectrum Access in Cognitive Radio Networks

Bibliographic details
Published in:	IEEE Internet of Things Journal 2022-10, Vol.9 (19), p.19477-19488
Main authors:	Tan, Xiang; Zhou, Li; Wang, Haijun; Sun, Yuli; Zhao, Haitao; Seet, Boon-Chong; Wei, Jibo; Leung, Victor C. M.
Format:	Article
Language:	English
Subjects:	Algorithms; Cognitive radio; Cognitive radio networks; Collision rates; Command and control; cooperative game; decentralized partially observable Markov decision process (Dec-POMDP); deep recurrent Q-network (DRQN); dynamic spectrum access (DSA); Games; Internet of Things; Machine learning; Markov game; Markov processes; multi-agent reinforcement learning (MARL); Multiagent systems; Radio networks; Reinforcement learning; Spectrum allocation; Wireless communication; Wireless communications; Wireless sensor networks

Description
Summary:	With the development of wireless communication and the Internet of Things (IoT), a massive number of wireless devices need to share limited spectrum resources. Dynamic spectrum access (DSA) is a promising paradigm to remedy the inefficient spectrum utilization brought about by the historical command-and-control approach to spectrum allocation. In this article, we investigate the distributed DSA problem for multiple users in a typical multichannel cognitive radio network. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), and we propose a centralized offline training and distributed online execution framework based on cooperative multi-agent reinforcement learning (MARL). We employ the deep recurrent Q-network (DRQN) to address the partial observability of the state for each cognitive user. The ultimate goal is to learn a cooperative strategy that maximizes the sum throughput of the cognitive radio network in a distributed fashion, without information exchange between cognitive users. Finally, we validate the proposed algorithm in various settings through extensive experiments. The experimental results show that the proposed CoMARL-DSA algorithm outperforms the state-of-the-art deep Q-learning for spectrum access (DQSA) by at least 14% in successful access rate and at least 12% in collision rate.
ISSN:	2327-4662
DOI:	10.1109/JIOT.2022.3168296
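
Illustrative note: the summary compares algorithms by successful access rate and collision rate in a multiuser, multichannel setting. The Python sketch below shows one plausible way such rates could be tallied from per-slot channel choices. It is not the authors' evaluation code. The names `IDLE`, `slot_outcome`, and `access_rates` are hypothetical, and two assumptions are made that the record does not confirm: a channel carries a transmission successfully only when exactly one user selects it, and both rates are normalized by the total number of transmission attempts.

```python
from collections import Counter
from typing import Sequence

IDLE = -1  # hypothetical marker: the user does not transmit in this slot


def slot_outcome(actions: Sequence[int]) -> tuple[int, int]:
    """Return (successes, collisions) for one time slot.

    actions[i] is the channel index chosen by user i, or IDLE.
    Assumption: a transmission succeeds only if its channel was chosen
    by exactly one user; every user on a shared channel collides.
    """
    load = Counter(a for a in actions if a != IDLE)
    successes = sum(1 for n in load.values() if n == 1)
    collisions = sum(n for n in load.values() if n > 1)
    return successes, collisions


def access_rates(episode: Sequence[Sequence[int]]) -> tuple[float, float]:
    """Success and collision rates over all transmission attempts."""
    total_success = 0
    total_collision = 0
    for actions in episode:
        s, c = slot_outcome(actions)
        total_success += s
        total_collision += c
    attempts = total_success + total_collision
    if attempts == 0:
        return 0.0, 0.0  # nobody transmitted, so both rates are undefined; report 0
    return total_success / attempts, total_collision / attempts


# Three users, three slots: channel indices per user, IDLE means no transmission.
episode = [
    [0, 1, 1],     # user 0 succeeds on channel 0; users 1 and 2 collide on channel 1
    [0, 1, IDLE],  # two successes, one user stays silent
    [2, 2, 2],     # all three users collide on channel 2
]
success_rate, collision_rate = access_rates(episode)
print(success_rate, collision_rate)
```

In a cooperative formulation, the per-slot success count could also serve as a shared team reward that stands in for sum throughput. That choice is likewise an assumption of this sketch.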