Efficient Jamming Resource Allocation Against Frequency-Hopping Spread Spectrum in WSNs with Asynchronous Deep Reinforcement Learning

Bibliographic details
Published in:	IEEE Sensors Journal, 2024-04, Vol. 24 (8), p. 1-1
Main authors:	Rao, Ning; Xu, Hua; Wang, Dan; Qi, Zisen; Zhang, Yue; Gu, Wanyi; Peng, Xiang
Format:	Article
Language:	English
Subjects:	Algorithms; asynchronous deep reinforcement learning; Bandwidths; Convergence; Decision making; Decisions; Deep learning; Frequency hopping; Jamming; jamming against FHSS; Long Short-Term Memory; Markov processes; partial-band noise jamming; Resource allocation; Spectrum allocation; Spread spectrum; Synchronism; Wireless sensor networks

Description
Abstract:	Jamming against frequency-hopping spread spectrum (FHSS) in wireless sensor networks (WSNs) has been investigated primarily in the follower jamming mode. However, implementing follower jamming in practice faces several challenges, such as stringent hardware performance requirements and the difficulty of attaining accurate synchronization with the signals. Diverging from existing works, this paper proposes a novel partial-band noise jamming (PBNJ) decision-making algorithm based on asynchronous deep reinforcement learning, which allocates the central jamming frequency and bandwidth more efficiently in FHSS jamming. First, we model the problem of allocating PBNJ jamming resources to disrupt FHSS communication in WSNs as a Markov decision process (MDP). Next, considering the interrelationship among decisions made by different jamming nodes (JNs), we construct a multi-step decision framework in a time-division manner. A Long Short-Term Memory (LSTM) network is leveraged to extract decision features from historical data and capture correlations between the jamming strategies of the deployed JNs, which guides future jamming decisions and enhances collaboration among JNs in jamming resource allocation. Furthermore, to accelerate convergence, we adopt the asynchronous advantage actor-critic (A3C) algorithm to optimize the allocation of the central jamming frequency and bandwidth of the JNs, using a multi-threaded parallel training architecture and updating the actor and critic networks by asynchronous gradient descent. Simulation results show that the proposed LSTM-A3C algorithm outperforms various baselines in convergence speed, jamming success rate, and total reward.
ISSN:	1530-437X 1558-1748
DOI:	10.1109/JSEN.2024.3369038
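
Illustrative note: the abstract describes each jamming node's decision as a choice of central jamming frequency and bandwidth. The following minimal Python sketch shows only what such a decision means for a frequency-hopping signal. It is not the paper's LSTM-A3C algorithm. The function names, the MHz units, and the success criterion (a hop counts as jammed when its frequency falls inside any jammed band, with jamming power ignored) are assumptions made for illustration and are not taken from the article.

```python
from typing import Sequence, Tuple

# A jammer decision is modelled here as (center frequency, bandwidth) in MHz.
# This representation is a hypothetical simplification, not the paper's model.
Jammer = Tuple[float, float]


def jammed_band(center_mhz: float, bandwidth_mhz: float) -> Tuple[float, float]:
    """Return the (low, high) edges of the band covered by one jammer."""
    half = bandwidth_mhz / 2.0
    return (center_mhz - half, center_mhz + half)


def hop_is_jammed(hop_mhz: float, jammers: Sequence[Jammer]) -> bool:
    """Assumed criterion: a hop is jammed if any jammed band contains it."""
    for center, bandwidth in jammers:
        low, high = jammed_band(center, bandwidth)
        if low <= hop_mhz <= high:
            return True
    return False


def jamming_success_rate(hops_mhz: Sequence[float], jammers: Sequence[Jammer]) -> float:
    """Fraction of hops that land inside at least one jammed band."""
    if not hops_mhz:
        return 0.0
    hits = sum(1 for hop in hops_mhz if hop_is_jammed(hop, jammers))
    return hits / len(hops_mhz)
```

Under these assumptions, a wider bandwidth covers more hop frequencies, and a learned policy would have to trade that coverage against whatever resource cost the real model assigns to bandwidth.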