Safe Deep Reinforcement Learning for Microgrid Energy Management in Distribution Networks with Leveraged Spatial-Temporal Perception

Bibliographic details
Published in: IEEE Transactions on Smart Grid 2023-09, Vol. 14 (5), p. 1-1
Main authors: Ye, Yujian; Wang, Hongru; Chen, Peiling; Tang, Yi; Strbac, Goran
Format: Article
Language: English
Description
Summary: Microgrids (MGs) have recently attracted great interest as an effective solution to the challenging problem of managing distributed energy resources in distribution networks. Although deep reinforcement learning (DRL) is a well-suited model-free, data-driven methodological framework, its application to MG energy management remains challenging because of its limitations in perceiving the environment status and satisfying constraints. In this paper, the MG energy management problem is formalized as a Constrained Markov Decision Process and solved with the state-of-the-art interior-point policy optimization (IPO) method. In contrast to conventional DRL approaches, IPO facilitates efficient learning in multi-dimensional, continuous state and action spaces while promising to satisfy the complex constraints of the distribution network. The generalization capability of IPO is further enhanced by extracting spatial-temporal correlation features from the original MG operating status, combining the strengths of an edge-conditioned convolutional network and a long short-term memory network. Case studies on IEEE 15-bus and 123-bus test feeders with real-world data, benchmarked against model-based and DRL-based baseline methods, demonstrate the superior performance of the proposed method in improving MG cost effectiveness, safeguarding the secure operation of the network, and adapting to uncertainty. Finally, the case studies also analyze the computational and scalability performance of the proposed and baseline methods.
ISSN: 1949-3053, 1949-3061
DOI: 10.1109/TSG.2023.3243170
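
Note on the method: the abstract above names interior-point policy optimization (IPO) for a Constrained Markov Decision Process but does not reproduce the formulation. As a rough illustration only, the sketch below assumes the logarithmic-barrier form commonly associated with IPO, in which each constraint "expected cost <= limit" adds a term log(limit - cost) / t to the reward objective. The function names, the parameter `t`, and its default value are hypothetical and are not taken from the article.

```python
import math


def log_barrier(constraint_gap: float, t: float = 20.0) -> float:
    """Assumed barrier phi(x) = log(-x) / t, where x = expected cost - limit.

    A negative gap means the constraint is satisfied. A gap of zero or more
    is infeasible and is mapped to minus infinity.
    """
    if constraint_gap >= 0.0:
        return -math.inf
    return math.log(-constraint_gap) / t


def barrier_objective(reward_objective, expected_costs, cost_limits, t=20.0):
    """Reward objective augmented with one barrier term per constraint.

    Hypothetical helper: a real agent would use a surrogate policy objective
    here and differentiate it with respect to the policy parameters.
    """
    total = reward_objective
    for cost, limit in zip(expected_costs, cost_limits):
        total += log_barrier(cost - limit, t)
    return total
```

Under these assumptions, a larger `t` makes the barrier a closer approximation of a hard constraint, while any violated constraint drives the objective to minus infinity so that infeasible policies are never preferred.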