Constrained EV Charging Scheduling Based on Safe Deep Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Smart Grid, 2020-05, Vol. 11 (3), p. 2427-2439
Main authors: Li, Hepeng; Wan, Zhiqiang; He, Haibo
Format: Article
Language: English
Description
Summary: Electric vehicles (EVs) have been widely adopted and deployed over the past few years because they are environmentally friendly. When integrated into smart grids, EVs can operate as flexible loads or energy storage devices and participate in demand response (DR). By taking advantage of time-varying electricity prices in DR, the charging cost can be reduced by optimizing the charging/discharging schedule. However, because the arrival and departure times of an EV and the electricity price are random, it is difficult to determine an optimal charging/discharging schedule that guarantees the EV is fully charged upon departure. To address this issue, we formulate the EV charging/discharging scheduling problem as a constrained Markov decision process (CMDP). The aim is to find a constrained charging/discharging scheduling strategy that minimizes the charging cost while guaranteeing that the EV is fully charged. To solve the CMDP, a model-free approach based on safe deep reinforcement learning (SDRL) is proposed. The proposed approach does not require any domain knowledge about the randomness. It learns directly to generate constrained optimal charging/discharging schedules with a deep neural network (DNN). Unlike existing reinforcement learning (RL) or deep RL (DRL) paradigms, the proposed approach does not require manually designing a penalty term or tuning a penalty coefficient. Numerical experiments with real-world electricity prices demonstrate the effectiveness of the proposed approach.
ISSN: 1949-3053, 1949-3061
DOI: 10.1109/TSG.2019.2955437
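
The summary's point that time-varying prices allow the charging cost to be reduced while still requiring a full charge at departure can be illustrated with a toy example. The sketch below is not the paper's SDRL method. It is a simple charge-only baseline that assumes the prices are known in advance (the paper explicitly does not assume this), uses one-hour time slots, and ignores discharging. The function names, prices, battery demand, and charger power are all hypothetical values chosen for illustration.

```python
def cheapest_hours_schedule(prices, energy_needed_kwh, max_power_kw):
    """Charge-only schedule that fills the cheapest one-hour slots first."""
    schedule = [0.0] * len(prices)
    remaining = energy_needed_kwh
    # Visit the hours from cheapest to most expensive.
    for hour in sorted(range(len(prices)), key=lambda h: prices[h]):
        if remaining <= 0:
            break
        energy = min(max_power_kw, remaining)  # kWh delivered in a 1-hour slot
        schedule[hour] = energy
        remaining -= energy
    # The "fully charged upon departure" requirement acts as a hard constraint.
    if remaining > 1e-9:
        raise ValueError("cannot fully charge before departure")
    return schedule


def charging_cost(prices, schedule):
    """Total cost: price per kWh times energy charged, summed over hours."""
    return sum(p * e for p, e in zip(prices, schedule))


# Hypothetical prices ($/kWh) for the six hours the EV is plugged in.
prices = [0.30, 0.10, 0.20, 0.05, 0.25, 0.15]

# Hypothetical demand: 12 kWh needed, 6 kW charger.
schedule = cheapest_hours_schedule(prices, energy_needed_kwh=12.0, max_power_kw=6.0)

# Baseline for comparison: charge at full power immediately on arrival.
naive = [6.0, 6.0, 0.0, 0.0, 0.0, 0.0]

print("price-aware cost:", round(charging_cost(prices, schedule), 2))
print("charge-on-arrival cost:", round(charging_cost(prices, naive), 2))
```

Both schedules deliver the same energy, so the constraint is met either way and only the cost differs. With random prices and departure times, as in the setting the summary describes, such a lookup is no longer possible, which is the motivation given for a learned, constrained policy.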