Joint relay and channel selection against mobile and smart jammer: A deep reinforcement learning approach

Bibliographic details
Published in: IET communications 2021-10, Vol.15 (17), p.2237-2251
Main authors: Yuan, Hongcheng, Song, Fei, Chu, Xiaojing, Li, Wen, Wang, Ximing, Han, Hao, Gong, Yuping
Format: Article
Language: English
Description
Abstract: This paper investigates the joint relay and channel selection problem using a deep reinforcement learning (DRL) algorithm for cooperative communications in a dynamic jamming environment. The latest types of jammers include the mobile and smart jammer, which can switch among multiple jamming patterns. This new type of jammer poses serious challenges to reliable communications, such as a huge environment state space, tightly coupled joint action selection and real‐time decision requirements. To cope with these challenges, a DRL‐based relay‐assisted cooperative communication scheme is proposed. In this scheme, the joint selection problem is formulated as a Markov decision process (MDP), and a double deep Q network (DDQN) based anti‐jamming scheme is used to address the unknown and dynamic jamming behaviors. Concretely, a joint decision‐making network composed of three sub‐networks is designed, and an independent learning method for each sub‐network is proposed. The simulation results show that the user agent is able to anticipate the jammer's behavior and elude the jamming in advance. Furthermore, compared with the sensing‐based algorithm, the Q learning‐based algorithm and the existing DRL‐based anti‐jamming approaches, the proposed algorithm maintains a higher average normalized throughput.
ISSN: 1751-8628, 1751-8636
DOI: 10.1049/cmu2.12257
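
The abstract names a double deep Q network (DDQN) but this record does not give the update rule. As a rough illustration only, the standard double Q-learning target can be sketched as below: the online network picks the next action and the target network scores it. The function name `ddqn_targets`, its parameters, and the plain-list representation of Q-values are assumptions made for this sketch and are not taken from the paper, whose three-sub-network design is not reproduced here.

```python
def ddqn_targets(rewards, q_online_next, q_target_next, dones, gamma=0.9):
    """Compute standard double-DQN targets for a batch of transitions.

    rewards:        list of floats, one per transition
    q_online_next:  per-transition list of online-network Q-values at the next state
    q_target_next:  per-transition list of target-network Q-values at the next state
    dones:          list of booleans, True if the transition ended the episode
    gamma:          discount factor (hypothetical default, not from the paper)
    """
    targets = []
    for r, q_on, q_tg, done in zip(rewards, q_online_next, q_target_next, dones):
        # Action selection uses the online network ...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ... while action evaluation uses the target network.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


if __name__ == "__main__":
    # Two toy transitions with two actions each; the second one is terminal.
    print(ddqn_targets(
        rewards=[1.0, 0.0],
        q_online_next=[[0.2, 0.8], [0.5, 0.1]],
        q_target_next=[[1.0, 2.0], [3.0, 4.0]],
        dones=[False, True],
        gamma=0.5,
    ))
```

In a joint relay and channel setting, each action index would presumably stand for one relay/channel combination, but that mapping is an assumption here rather than something stated in this record.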