Battery life constrained real-time energy management strategy for hybrid electric vehicles based on reinforcement learning

Bibliographic details
Published in: Energy (Oxford), 2022-11, Vol. 259, p. 124986, Article 124986
Main authors: Han, Lijin; Yang, Ke; Ma, Tian; Yang, Ningkang; Liu, Hui; Guo, Lingxiong
Format: Article
Language: English
Online access: Full text
Description
Abstract: Hybrid electric vehicles (HEVs) bridge the gap between internal combustion engine vehicles and pure electric vehicles, and are therefore regarded as a promising solution to the energy crisis. This paper proposes a real-time energy management strategy (EMS) for hybrid electric vehicles based on reinforcement learning (RL) to improve fuel economy and minimize battery degradation. First, an online recursive Markov chain (MC) is developed that continuously collects statistical features from actual driving conditions, thereby establishing an adaptive and accurate environment model. Then, a novel RL algorithm, eligibility trace, is introduced to learn the control policy online based on the MC model. By introducing a trace-decay parameter, the eligibility trace algorithm unifies the returns of different steps, forming a more reliable estimate of the optimal value function, and therefore outperforms traditional RL algorithms in optimization. Furthermore, the induced matrix norm (IMN) is employed as a criterion to measure the difference between transition probability matrices (TPMs) of the MC and to decide when to update the environment model and recalculate the control policy. As a result, the EMS's adaptability to various driving conditions is significantly enhanced. Simulation results indicate that eligibility trace achieves the best performance in both improving fuel economy and reducing battery life loss compared with Q-learning and a rule-based method.
• A battery life loss model is established and considered in the energy management.
• An indirect reinforcement learning algorithm, eligibility trace, is developed.
• The power transition probability of the Markov chain is updated online and recursively.
• The induced matrix norm is used to determine the update of the control policy.
• The proposed strategy improves fuel economy and battery life while operating in real time.
ISSN:0360-5442
DOI:10.1016/j.energy.2022.124986
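
The online model update described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the recursion, so the exponential-forgetting update, the class name `RecursiveMarkovChain`, the `forgetting` factor, and the `threshold` value are all assumptions chosen for illustration. The sketch only shows the general idea of refreshing a transition probability matrix (TPM) from observed transitions and using an induced matrix norm of the difference to decide when the policy would need to be recalculated.

```python
import numpy as np


class RecursiveMarkovChain:
    """Online TPM estimate (hypothetical helper, not from the paper)."""

    def __init__(self, n_states, forgetting=0.01):
        # Start from a uniform TPM; every row sums to 1.
        self.tpm = np.full((n_states, n_states), 1.0 / n_states)
        # Assumed forgetting factor: weight given to the newest observation.
        self.forgetting = forgetting

    def update(self, i, j):
        """Recursively update row i after observing a transition i -> j."""
        target = np.zeros(self.tpm.shape[1])
        target[j] = 1.0
        # Convex combination of the old row and the one-hot observation,
        # so the row remains a valid probability distribution.
        self.tpm[i] += self.forgetting * (target - self.tpm[i])


def tpm_changed(reference, current, threshold, order=2):
    """Return True if the induced norm of (current - reference) exceeds threshold.

    order=1, 2, or np.inf selects the induced 1-, 2-, or infinity-norm.
    """
    return np.linalg.norm(current - reference, ord=order) > threshold


# Usage: keep the TPM that the current policy was computed from, and
# flag a policy recalculation once the live TPM has drifted far enough.
mc = RecursiveMarkovChain(n_states=3, forgetting=0.5)
reference_tpm = mc.tpm.copy()
mc.update(0, 1)
needs_recalculation = tpm_changed(reference_tpm, mc.tpm, threshold=0.1)
```

In a setup like this, the threshold trades off adaptability against computation: a small value triggers frequent policy recalculation, while a large value keeps the policy fixed until driving conditions shift substantially.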