Adaptive Duty Cycling in Sensor Networks With Energy Harvesting Using Continuous-Time Markov Chain and Fluid Models

Bibliographic details
Published in: IEEE Journal on Selected Areas in Communications, 2015-12, Vol. 33 (12), p. 2687-2700
Main authors: Chan, Wai Hong Ronald; Zhang, Pengfei; Nevat, Ido; Nagarajan, Sai Ganesh; Valera, Alvin C.; Tan, Hwee-Xian; Gautam, Natarajan
Format: Article
Language: English
Description
Summary: The dynamic and unpredictable nature of energy harvesting sources available for wireless sensor networks, and the time variation in network statistics like packet transmission rates and link qualities, necessitate the use of adaptive duty cycling techniques. Such adaptive control allows sensor nodes to achieve long-run energy neutrality, where energy supply and demand are balanced in a dynamic environment such that the nodes function continuously. In this paper, we develop a new framework enabling an adaptive duty cycling scheme for sensor networks that takes into account the node battery level, ambient energy that can be harvested, and application-level QoS requirements. We model the system as a Markov decision process (MDP) that modifies its state transition policy using reinforcement learning. The MDP uses continuous-time Markov chains (CTMCs) to model the network state of a node to obtain key QoS metrics like latency, loss probability, and power consumption, as well as to model the node battery level taking into account physically feasible rates of change. We show that with an appropriate choice of the reward function for the MDP, as well as a suitable learning rate, exploitation probability, and discount factor, the need to maintain minimum QoS levels for optimal network performance can be balanced with the need to promote the maintenance of a finite battery level to ensure node operability. Extensive simulation results show the benefit of our algorithm for different reward functions and parameters.
ISSN: 0733-8716, 1558-0008
DOI: 10.1109/JSAC.2015.2478717
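
Illustrative sketch: The summary describes an MDP whose policy is adapted by reinforcement learning, tuned through a learning rate, an exploitation probability, and a discount factor. The record does not state which learning algorithm, state space, or reward function the paper uses, so the following is only a minimal sketch that assumes tabular Q-learning on a toy battery model. The duty-cycle set, battery discretization, harvest and drain values, and reward shape are all hypothetical and stand in for the paper's CTMC-based models.

```python
import random

# All constants below are hypothetical and chosen only for illustration.
DUTY_CYCLES = [0.1, 0.3, 0.6, 0.9]  # assumed action set (fraction of time awake)
N_LEVELS = 11                       # assumed battery discretization: levels 0..10


def step(level, action, rng):
    """Toy environment (not the paper's CTMC model): returns (next_level, reward)."""
    harvest = rng.choice([0, 1, 2, 3])  # assumed random ambient energy per step
    drain = action + 1                  # assumed cost: a higher duty cycle drains more
    nxt = max(0, min(N_LEVELS - 1, level + harvest - drain))
    reward = DUTY_CYCLES[action]        # QoS proxy: more awake time scores higher
    if nxt == 0:
        reward -= 5.0                   # penalty for a depleted battery
    return nxt, reward


def train(steps=20000, learning_rate=0.1, discount=0.9, exploit_prob=0.9, seed=0):
    """Tabular Q-learning; explores with probability 1 - exploit_prob."""
    rng = random.Random(seed)
    n_actions = len(DUTY_CYCLES)
    q = [[0.0] * n_actions for _ in range(N_LEVELS)]
    level = N_LEVELS // 2
    for _ in range(steps):
        if rng.random() < exploit_prob:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(n_actions), key=lambda a: q[level][a])
        else:
            # Explore: pick a uniformly random action.
            action = rng.randrange(n_actions)
        nxt, reward = step(level, action, rng)
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + discount * max(q[nxt])
        q[level][action] += learning_rate * (target - q[level][action])
        level = nxt
    return q


def greedy_policy(q):
    """Index of the preferred duty cycle for each battery level."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    policy = greedy_policy(train())
    for level, action in enumerate(policy):
        print(f"battery level {level}: duty cycle {DUTY_CYCLES[action]}")
```

In this toy setting the depletion penalty plays the role of the battery-maintenance term and the duty cycle itself plays the role of the QoS term; changing their relative weight is one way to explore the trade-off the summary mentions.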