Real-time energy management for HEV combining naturalistic driving data and deep reinforcement learning with high generalization
Published in: | Applied energy 2025-01, Vol.377, p.124350, Article 124350 |
---|---|
Main authors: | , , , , , , , , |
Format: | Article |
Language: | English |
Online access: | Full text |
Summary: | Generalization to unseen environments remains a challenge for deep reinforcement learning (DRL)-based energy management strategies (EMSs). This paper proposes a real-time EMS with high generalization for a light-duty hybrid electric vehicle (HEV) from two perspectives: enhancing the generalization of the DRL algorithm and improving how accurately the training environment represents the application scenario. The enhanced DRL algorithm, named ATSAC, automatically adjusts the update frequency and learning rate of soft actor-critic (SAC) to improve generalization. Drawing on naturalistic driving big data (NDBD) and machine learning, a dedicated training cycle is synthesized from NDBD to reflect a real-world urban-suburban driving scenario more accurately. In a comprehensive comparison with SAC- and TD3-based EMSs applied to unseen driving scenarios, the proposed algorithm achieves significant improvements in computational efficiency, optimality, and generalization. The computational efficiency of ATSAC is 52.32% higher than that of SAC. The negative total reward (NTR) of ATSAC is 18.22% and 69.81% lower than that of SAC and TD3, respectively. Further analysis shows that the EMS trained on the synthetic driving cycle obtains an 18.37% lower NTR than the one trained on WLTC, which demonstrates that the synthetic method reflects the state transition probability of real-world driving scenarios better than WLTC.
• A novel DRL algorithm with high generalization is studied for energy management. • An SODC is synthesized through big data and machine learning as the training cycle. • Iteration dropout can avoid overfitting to improve the generalization of DRL. • An adaptive learning rate can balance exploration and exploitation. |
---|---|
ISSN: | 0306-2619 |
DOI: | 10.1016/j.apenergy.2024.124350 |
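
The summary says that ATSAC adjusts the update frequency and learning rate of SAC automatically, and the highlights mention iteration dropout and an adaptive learning rate, but this record does not give the actual rules. The following is only a minimal sketch of the general idea, not the authors' method: the class `AdaptiveUpdateSchedule`, its parameters, the random skipping of update iterations, and the "shrink the learning rate when the episode return stops improving" rule are all assumptions made for illustration.

```python
import random


class AdaptiveUpdateSchedule:
    """Hypothetical helper (not from the paper): decides whether a training
    step performs a gradient update and which learning rate it should use."""

    def __init__(self, base_lr=3e-4, min_lr=1e-5, skip_prob=0.3, decay=0.5, seed=0):
        self.lr = base_lr            # current learning rate
        self.min_lr = min_lr         # lower bound for the learning rate
        self.skip_prob = skip_prob   # assumed probability of dropping an update
        self.decay = decay           # assumed multiplicative shrink factor
        self.best_return = float("-inf")
        self.rng = random.Random(seed)

    def should_update(self):
        # Assumed form of "iteration dropout": skip the gradient update
        # for this iteration with probability skip_prob.
        return self.rng.random() >= self.skip_prob

    def end_of_episode(self, episode_return):
        # Assumed adaptation rule: keep the learning rate while the episode
        # return improves, otherwise shrink it (never below min_lr).
        if episode_return > self.best_return:
            self.best_return = episode_return
        else:
            self.lr = max(self.min_lr, self.lr * self.decay)
        return self.lr


if __name__ == "__main__":
    schedule = AdaptiveUpdateSchedule()
    performed = sum(schedule.should_update() for _ in range(1000))
    print("updates performed out of 1000 iterations:", performed)
    for episode_return in (-120.0, -100.0, -105.0, -101.0):
        print("learning rate:", schedule.end_of_episode(episode_return))
```

In a real training loop, `should_update()` would gate the call to the agent's gradient step, and the value returned by `end_of_episode()` would be written into the optimizer before the next episode.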