Dynamic joint optimization of power generation and voyage scheduling in ship power system based on deep reinforcement learning
Published in: | Electric power systems research 2024-04, Vol.229, p.110165, Article 110165 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Abstract: | • The joint optimization problem of the all-electric ship (AES) aims to minimize generator operation and battery loss costs. • The MSD3QN method integrates three techniques to enhance the training performance of the DQN agent. • Based on action classification, the bi-level MSD3QN method is developed to optimize power generation and sailing speed. • Comparative studies show that the proposed method outperforms the compared methods in optimization performance and scalability.
A joint optimization strategy for power generation and voyage scheduling in the ship power system (SPS) is crucial for enhancing the flexibility and economy of the all-electric ship (AES). However, traditional optimization-based methods are limited in robustness and require explicit modeling of uncertainty. This paper proposes a novel deep reinforcement learning (DRL) method to address the joint optimization problem of the AES under uncertain navigation conditions and variable load demands. The joint optimization model of the AES is formulated with the goal of minimizing generator operation and battery degradation costs. Then, a deep Q network (DQN) integrated with a dueling network architecture, double Q-learning, and multi-step bootstrapping, termed the multi-step dueling double DQN (MSD3QN) algorithm, is applied to optimize power generation and sailing speed. Moreover, by incorporating an action classification mechanism and a hierarchical optimization concept, the MSD3QN algorithm is combined with an optimization solver to form the bi-level MSD3QN algorithm, which improves the optimization performance of the agent. The proposed bi-level MSD3QN method enables end-to-end control from measured data to operating instructions. Two case studies are conducted using operational data obtained from an SPS. The numerical results validate the effectiveness, dynamic optimization performance, and scalability of the bi-level MSD3QN method. |
---|---|
ISSN: | 0378-7796 1873-2046 |
DOI: | 10.1016/j.epsr.2024.110165 |
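
The abstract names three DQN extensions (dueling architecture, double Q-learning, multi-step bootstrapping) without showing how they combine. The sketch below illustrates one common textbook way to form the learning target from them; it is not taken from the paper. The function names `dueling_q` and `n_step_double_target`, the toy numbers, and the use of plain Python lists in place of neural networks are all assumptions made for illustration.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def n_step_double_target(rewards, gamma, q_online_next, q_target_next, done):
    """Multi-step double-Q target for one transition segment.

    rewards        -- the n rewards collected after the current state
    q_online_next  -- online-network Q-values at the state reached after n steps
    q_target_next  -- target-network Q-values at that same state
    done           -- True if the episode ended within the n steps
    """
    # Discounted sum of the n observed rewards.
    n_step_return = sum(gamma ** k * r for k, r in enumerate(rewards))
    if done:
        return n_step_return
    # Double Q-learning: the online network selects the action,
    # the target network evaluates it.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return n_step_return + gamma ** len(rewards) * q_target_next[best_action]


# Toy values standing in for network outputs (hypothetical numbers).
q_online_next = dueling_q(1.0, [0.0, 2.0, 1.0])
q_target_next = dueling_q(0.5, [1.0, 0.0, 2.0])
target = n_step_double_target([1.0, 1.0, 1.0], 0.5,
                              q_online_next, q_target_next, done=False)
terminal_target = n_step_double_target([1.0, 1.0, 1.0], 0.5,
                                       q_online_next, q_target_next, done=True)
```

In this toy setup the online values favor the second action, so the target network's estimate for that action is the one bootstrapped from, even though the target network itself would rank a different action highest. The bi-level part of the method (action classification plus an optimization solver) is not covered by this sketch.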