Domain-adapted Learning and Imitation: DRL for Power Arbitrage
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | In this paper, we discuss the Dutch power market, which comprises a
day-ahead market and an intraday balancing market that operates like an
auction. Due to fluctuations in power supply and demand, there is often an
imbalance that leads to different prices in the two markets, providing an
opportunity for arbitrage. To exploit this opportunity, we restructure the problem
and propose a collaborative dual-agent reinforcement learning approach for this
bi-level simulation and optimization of European power arbitrage trading. We
also introduce two new implementations designed to incorporate domain-specific
knowledge by imitating the trading behaviours of power traders. By utilizing
reward engineering to imitate domain expertise, we are able to reform the
reward system for the RL agent, which improves convergence during training and
enhances overall performance. Additionally, the tranching of orders increases
bidding success rates and significantly boosts profit and loss (P&L). Our study
demonstrates that by leveraging domain expertise in a general learning problem,
the performance can be improved substantially, and the final integrated
approach leads to a three-fold improvement in cumulative P&L compared to the
original agent. Furthermore, our methodology outperforms the highest benchmark
policy by around 50% while maintaining efficient computational performance. |
---|---|
DOI: | 10.48550/arxiv.2301.08360 |