Automated deep reinforcement learning for real-time scheduling strategy of multi-energy system integrated with post-carbon and direct-air carbon captured system
Published in: | Applied Energy 2023-03, Vol.333, p.120633, Article 120633 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: |
• PCC and DAC are considered as part of the energy system for the feasibility of ZCMES.
• A soft actor-critic (SAC) DRL agent is developed for the real-time scheduling of the proposed ZCMES.
• An automatic hyperparameter tuning feature is integrated with the proposed DRL agent.
• The configuration with PCCS and solid-sorbent DACS is considered the most suitable configuration.
• The proposed configuration achieved the highest CO2 captured-released ratio (CCRR) and the lowest CO2 released indicator (CRI).
Carbon capture with the aid of CO2 removal technology (CDRT) has been recognised as a prominent alternative approach to deep decarbonisation. However, the main hindrances are the enormous energy demand and the economic implications of CDRT if it is not managed effectively. Hence, a novel deep reinforcement learning (DRL) agent, integrated with an automated hyperparameter selection feature, is proposed in this study for the real-time scheduling of a multi-energy system (MES) coupled with CDRT. Post-carbon capture systems (PCCS) and direct-air capture systems (DACS) are the CDRT options considered. Various possible configurations are evaluated using real-time multi-energy data of a district in Arizona, the United States, and CDRT parameters from manufacturers' catalogues and pilot project documentation. The simulation results show that an optimised soft actor-critic (SAC) DRL algorithm outperformed the twin-delayed deep deterministic policy gradient (TD3) algorithm due to its maximum entropy feature. We then trained four SAC DRL agents, one for each case study considered, using optimised hyperparameter values and deployed them in real time for evaluation. The results show that the proposed DRL agent can meet the prosumers' multi-energy demand and schedule the CDRT energy demand economically without violating the specified constraints. In addition, the proposed DRL agent outperformed rule-based scheduling by 23.65%. Among the configurations, the one with PCCS and solid-sorbent DACS is considered the most suitable, with a high CO2 captured-released ratio (CCRR) of 38.54, a low CO2 released indicator (CRI) value of 2.53, and a 36.5% reduction in CDR cost due to waste heat utilisation and the high absorption capacity of the selected sorbent. However, the adoption of CDRT is not economically viable at the current carbon price. Finally, we showed that CDRT would be attractive at a carbon price of 400-450 USD/ton with the provision of tax incentives by policymakers. |
---|---|
ISSN: | 0306-2619 1872-9118 |
DOI: | 10.1016/j.apenergy.2022.120633 |
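
Note on the "maximum entropy feature" mentioned in the summary: the summary credits it for SAC outperforming TD3 but does not spell it out. As a reminder, the objective commonly given for SAC in the general reinforcement learning literature is sketched below. It is not quoted from this article, and the symbols (policy π, reward r, temperature α, horizon T) are the usual textbook ones, assumed here rather than taken from the paper's notation.

```latex
% Standard maximum-entropy RL objective (general SAC formulation; notation assumed)
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
         \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\!\left(\pi(\cdot \mid s_t)\right) \right],
\qquad
\mathcal{H}\!\left(\pi(\cdot \mid s_t)\right)
  = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

The entropy term rewards the policy for staying stochastic, which encourages exploration. TD3 learns a deterministic policy and has no such term, which is presumably the difference the summary refers to.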