Reinforcement Learning-Based Adaptive Stateless Routing for Ambient Backscatter Wireless Sensor Networks

Bibliographic details
Published in: IEEE Transactions on Communications, 2024-07, Vol. 72 (7), p. 4206-4225
Main authors: Guo, Huanyu; Yang, Donghua; Gao, Hong
Format: Article
Language: English
Description
Abstract: This paper explores the routing problem in ambient backscatter wireless sensor networks (AB-WSNs) using reinforcement learning approaches. Ambient RF signals serve as the only power source for battery-less sensor nodes and are also leveraged to enable backscatter communication among these nodes. This results in intermittent connections and a dynamic topology within AB-WSNs, making it difficult to route data to the sink; for example, data may not reach the sink in a timely manner. We first introduce a multi-agent network model with two mechanisms to address this issue. We then model the routing problem as a Markov decision process, allowing each node to make informed routing decisions based on the current state of its neighbors. To enable each node to learn the optimal routing policy and perform adaptive stateless routing, we propose two learning algorithms. The first, a value-based learning algorithm, is designed for sparse AB-WSNs. The second, a policy-based learning algorithm, is intended to tackle the curse of dimensionality in dense AB-WSNs. We analyze the convergence of both learning algorithms and evaluate their performance through extensive experiments. The experimental results validate the convergence and efficiency of the proposed learning algorithms under various conditions.
ISSN: 0090-6778, 1558-0857
DOI: 10.1109/TCOMM.2024.3369694