Intelligent multi-zone residential HVAC control strategy based on deep reinforcement learning

Bibliographic details
Published in:	Applied energy 2020-11, Vol.281 (1)
Main authors:	Du, Yan; Zandi, Helia; Kotevska, Olivera; Kurte, Kuldeep; Munk, Jeffery; Amasyali, Kadir; Mckee, Evan; Li, Fangxing
Format:	Article
Language:	English
Subjects:	Actor-critic learning; Deep deterministic policy gradient (DDPG); Deep reinforcement learning (deep RL); Demand response; ENERGY CONSERVATION, CONSUMPTION, AND UTILIZATION; Multi-zone residential HVAC
Online access:	Full text

Description
Summary:	Residential heating, ventilation, and air conditioning (HVAC) is considered an important demand response resource. However, optimizing residential HVAC control is no trivial task because of the complexity of building thermal dynamic models and the uncertainty associated with both occupant-driven heat loads and weather forecasts. In this paper, we apply a novel model-free deep reinforcement learning (RL) method, known as the deep deterministic policy gradient (DDPG), to generate an optimal control strategy for a multi-zone residential HVAC system with the goal of minimizing energy consumption cost while maintaining the users’ comfort. The applied deep RL-based method learns through continuous interaction with a simulated building environment, without referring to any prior model knowledge. Simulation results show that, compared with the state-of-the-art deep Q network (DQN), the DDPG-based HVAC control strategy reduces the energy consumption cost by 15% and the comfort violation by 79%; compared with a rule-based HVAC control strategy, it reduces the comfort violation by 98%. In addition, experiments with different building models and retail price models demonstrate that the well-trained DDPG-based HVAC control strategy generalizes and adapts well to unseen environments, which indicates its practicability for real-world implementation.
ISSN:	0306-2619 1872-9118
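
Illustrative note: the summary describes an objective that trades energy consumption cost against comfort violation across several zones. This record does not give the paper's actual reward formulation, so the following is only a minimal sketch of how such a per-step reward might be written. The function name `hvac_reward`, the comfort band of 20 to 24 °C, and the penalty weight are all assumptions made for illustration, not values taken from the article.

```python
from typing import Sequence


def hvac_reward(
    energy_kwh: float,
    price_per_kwh: float,
    zone_temps_c: Sequence[float],
    comfort_low_c: float = 20.0,   # assumed lower comfort bound (hypothetical)
    comfort_high_c: float = 24.0,  # assumed upper comfort bound (hypothetical)
    penalty_weight: float = 1.0,   # assumed cost/comfort trade-off weight
) -> float:
    """Return the negative of (energy cost + weighted comfort violation)."""
    # Cost of the energy used in this control step at the current retail price.
    energy_cost = energy_kwh * price_per_kwh
    # Sum, over all zones, of the degrees by which each zone temperature
    # falls outside the comfort band; zero for zones inside the band.
    violation = sum(
        max(comfort_low_c - t, 0.0) + max(t - comfort_high_c, 0.0)
        for t in zone_temps_c
    )
    # An RL agent maximizes reward, so both terms enter with a negative sign.
    return -(energy_cost + penalty_weight * violation)


if __name__ == "__main__":
    # Two zones: one inside the band, one 1 degree above it.
    print(hvac_reward(2.0, 0.5, [22.0, 25.0]))
```

In a sketch like this, raising `penalty_weight` would push a learned policy toward comfort at the expense of cost, and lowering it would do the opposite.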