An optimal solutions-guided deep reinforcement learning approach for online energy storage control

Bibliographic details
Published in: Applied Energy 2024-05, Vol. 361, Article 122915
Authors: Xu, Gaoyuan; Shi, Jian; Wu, Jiaman; Lu, Chenbei; Wu, Chenye; Wang, Dan; Han, Zhu
Format: Article
Language: English
Online access: Full text
Description
Abstract: As renewable energy becomes more prevalent in the power grid, energy storage systems (ESSs) play an increasingly crucial role in mitigating short-term supply–demand imbalances. However, operating and controlling an ESS is not straightforward, given the ever-changing electricity prices in the market environment and the stochastic, intermittent nature of renewable energy generation that must respond to real-time load variations. In this paper, we propose a deep reinforcement learning (DRL) approach to address the electricity arbitrage problem associated with optimal ESS management. First, we analyze the structure of the optimal offline ESS control problem using a mixed-integer linear programming (MILP) formulation, which identifies the optimal control actions for absorbing excess renewable energy and performing price arbitrage. To tackle the uncertainty inherent in prediction data, we then recast the online ESS control problem as a Markov Decision Process (MDP) and develop a DRL approach that integrates the optimal offline control solution obtained from the training data into the training process and introduces noise into the state transitions. Unlike typical offline DRL training over a long time interval, we employ the Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) algorithms with smaller neural networks trained over a short time interval. Numerical studies demonstrate that the proposed DRL-enabled approach achieves better online control performance than the model predictive control (MPC) method under different price errors, highlighting the sample efficiency and robustness of our DRL approach in managing an ESS for electricity arbitrage.

Highlights:
•Incorporate the optimal strategy as prior knowledge by revising the DRL reward.
•Employ small neural networks to train on samples and update models efficiently.
•Utilize noise to enhance error tolerance and reduce the impact of distribution shifts.
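The record gives only a high-level description of the method. As a rough illustration, the Python sketch below shows one possible way the two ideas named in the highlights, reward shaping from an offline-optimal schedule and noise injected into state transitions, could be realized in a simple arbitrage environment. Everything here is an assumption for illustration: the class name ESSEnv, the state layout (state of charge and current price), the shaping weight, the efficiency model, and the noise scale are hypothetical and are not taken from the paper.

```python
import numpy as np

class ESSEnv:
    """Hypothetical single-battery arbitrage environment (illustrative only)."""

    def __init__(self, prices, optimal_actions, capacity=1.0, eta=0.95,
                 noise_std=0.02, seed=0):
        self.prices = np.asarray(prices, dtype=float)             # price per time step
        self.optimal = np.asarray(optimal_actions, dtype=float)   # offline MILP schedule (assumed given)
        self.capacity = capacity      # usable energy capacity
        self.eta = eta                # one-way charge/discharge efficiency (assumed)
        self.noise_std = noise_std    # std of noise injected into state transitions
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.t = 0
        self.soc = 0.5 * self.capacity   # start half full (assumption)
        return self._obs()

    def _obs(self):
        return np.array([self.soc, self.prices[self.t]])

    def step(self, action):
        # action in [-1, 1]: fraction of capacity to charge (+) or discharge (-)
        power = float(np.clip(action, -1.0, 1.0)) * self.capacity

        # Arbitrage revenue: pay the price when charging, earn it when discharging.
        revenue = -self.prices[self.t] * power

        # Reward shaping: penalize deviation from the offline-optimal action,
        # i.e., fold the offline optimal solution into the reward as prior knowledge.
        shaping = -abs(power - self.optimal[self.t])
        reward = revenue + 0.5 * shaping   # 0.5 is an assumed shaping weight

        # Noisy state transition to mimic forecast/price errors and improve robustness.
        delta = self.eta * power if power > 0 else power / self.eta
        noise = self.rng.normal(0.0, self.noise_std)
        self.soc = float(np.clip(self.soc + delta + noise, 0.0, self.capacity))

        self.t += 1
        done = self.t >= len(self.prices)
        return (None if done else self._obs()), reward, done
```

A DDPG or PPO agent with small networks, trained over a short horizon on such an environment, would correspond to the training regime the abstract describes; the exact reward revision and noise model used in the paper may differ.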
ISSN: 0306-2619, 1872-9118
DOI: 10.1016/j.apenergy.2024.122915