Distributed Online Dispatch for Microgrids Using Hierarchical Reinforcement Learning Embedded With Operation Knowledge


Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE Transactions on Power Systems, 2023-07, Vol. 38 (4), pp. 2989-3002
Main Authors: Lu, Tianguang; Hao, Ran; Ai, Qian; He, Hongying
Format: Article
Language: English
Subjects:
Online Access: Order full text
Description
Summary: This paper considers the problem of distributed online economic dispatch (DOED) from sequential data using reinforcement learning. Learning operation behavior in high-dimensional environments with constraints is a major challenge for the DOED of networked microgrids (MGs), where insufficient exploration prevents agents from building complex policies. Therefore, this paper develops a hierarchical reinforcement learning (HRL) algorithm to handle the DOED problem, where radial basis function (RBF) approximation is incorporated to generate policies in continuous space. Based on the hierarchical framework, the HRL algorithm improves learning efficiency and reduces computational cost. The online HRL achieves distributed self-adaptation and better real-time dispatch performance with a modest number of interacting variables. In addition, guided by domain knowledge, the HRL algorithm avoids baseline violations and additional learning beyond the feasible action space. In a case study of an actual networked MG cluster in Qingdao with real operation data, simulations verify that the proposed hierarchical learning can reduce long-term operation costs and enhance operation stability. To explore the learning process, the paper also provides its convergence condition and analyzes the sensitivity of the learning parameters.
ISSN: 0885-8950; 1558-0679
DOI: 10.1109/TPWRS.2021.3092220
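The abstract mentions radial basis function (RBF) approximation for learning policies over a continuous state space. The following is a minimal sketch of what RBF-based value approximation looks like in general; the centers, width, learning rate, and toy target function are illustrative assumptions, not the parameters or method used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: 10 Gaussian RBF centers on a 1-D continuous state axis.
centers = np.linspace(0.0, 1.0, 10)
sigma = 0.1                       # common RBF width (assumed, not from the paper)
weights = np.zeros_like(centers)  # linear weights to be learned

def features(s):
    """Gaussian RBF feature vector for a scalar state s."""
    return np.exp(-((s - centers) ** 2) / (2 * sigma ** 2))

def value(s):
    """Approximate value: linear combination of RBF features."""
    return features(s) @ weights

def td_update(s, target, alpha=0.1):
    """One TD(0)-style update of the weights toward a target value."""
    global weights
    weights += alpha * (target - value(s)) * features(s)

# Fit the approximator to a toy target v*(s) = sin(2*pi*s) from random samples,
# standing in for value targets an RL agent would compute from rewards.
for _ in range(2000):
    s = rng.random()
    td_update(s, np.sin(2 * np.pi * s))
```

The linear-in-features structure is what makes such approximators attractive for continuous-space dispatch: the update cost per step is proportional to the number of basis functions, not the (infinite) number of states.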