Cooperative Multi-Agent Reinforcement-Learning-Based Distributed Dynamic Spectrum Access in Cognitive Radio Networks
Published in: | IEEE Internet of Things Journal, 2022-10, Vol. 9 (19), p. 19477-19488 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | With the development of wireless communication and the Internet of Things (IoT), a massive number of wireless devices must share limited spectrum resources. Dynamic spectrum access (DSA) is a promising paradigm for remedying the inefficient spectrum utilization brought about by the historical command-and-control approach to spectrum allocation. In this article, we investigate the distributed DSA problem for multiple users in a typical multichannel cognitive radio network. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), and we propose a centralized offline training and distributed online execution framework based on cooperative multi-agent reinforcement learning (MARL). We employ the deep recurrent Q-network (DRQN) to address the partial observability of the state for each cognitive user. The ultimate goal is to learn a cooperative strategy that maximizes the sum throughput of the cognitive radio network in a distributed fashion, without information exchange between cognitive users. Finally, we validate the proposed algorithm in various settings through extensive experiments. The experimental results show that the proposed CoMARL-DSA algorithm outperforms the state-of-the-art deep Q-learning for spectrum access (DQSA) in terms of successful access rate and collision rate by at least 14% and 12%, respectively. |
---|---|
ISSN: | 2327-4662 |
DOI: | 10.1109/JIOT.2022.3168296 |
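
The abstract reports results as a successful access rate and a collision rate but does not define them in this record. As a rough illustration only, the sketch below computes both under a common simplifying assumption: an access succeeds when exactly one cognitive user transmits on a channel that no primary user occupies, and every other attempt counts as a collision. The function names, the `None` encoding for an idle user, and the per-attempt normalization are hypothetical choices made for this example, not taken from the article.

```python
from collections import Counter


def slot_outcome(choices, busy_channels):
    """Count successful and colliding transmissions in one time slot.

    choices: choices[i] is the channel index picked by cognitive user i,
             or None if that user stays idle (hypothetical encoding).
    busy_channels: set of channels occupied by primary users in this slot
             (assumed visible to the simulator, not to the users).
    """
    # Number of cognitive users transmitting on each channel.
    load = Counter(c for c in choices if c is not None)
    successes = collisions = 0
    for c in choices:
        if c is None:
            continue  # idle users neither succeed nor collide
        if c not in busy_channels and load[c] == 1:
            successes += 1
        else:
            collisions += 1
    return successes, collisions


def access_rates(history, busy_history):
    """Return (success rate, collision rate) per transmission attempt."""
    successes = collisions = 0
    for choices, busy in zip(history, busy_history):
        s, c = slot_outcome(choices, busy)
        successes += s
        collisions += c
    attempts = successes + collisions
    if attempts == 0:
        return 0.0, 0.0
    return successes / attempts, collisions / attempts
```

For instance, `slot_outcome([0, 1, 1, None], {2})` counts one success (user 0 alone on a free channel) and two collisions (users 1 and 2 on the same channel). The article itself may normalize these rates differently, for example per slot or per user.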