Statistical arbitrage trading on the intraday market using the asynchronous advantage actor–critic method

Bibliographic details
Published in: Applied Energy, 2022-05, Vol. 314, p. 118912, Article 118912
Main authors: Demir, Sumeyra; Stappers, Bart; Kok, Koen; Paterakis, Nikolaos G.
Format: Article
Language: English
Description
Abstract: In this paper, we focus on statistical arbitrage trading opportunities involving the continuous exploitation of price differences arising during an intraday trading period, with the option of closing positions on the balancing market. We aim to maximise the reward–risk ratio of an autonomous trading strategy. To find an optimal trading policy, we propose utilising the asynchronous advantage actor–critic (A3C) algorithm, a deep reinforcement learning method, with function approximators of two-headed shared deep neural networks. We enforce a risk-constrained trading strategy by limiting the maximum allowed position, and conduct state engineering and selection processes. We introduce a novel reward function and goal-based exploration, i.e. behaviour cloning. Our methodology is evaluated on a case study using the limit order book of the European single intraday coupled market (SIDC) available for the Dutch market area. The majority of hourly products on the test set return a profit. We expect our study to benefit electricity traders, renewable electricity producers and researchers who seek to implement state-of-the-art intelligent trading strategies.
• A purely financial arbitrage trading strategy is explored for the intraday market.
• A3C, a deep RL method, is utilised to develop a risk-constrained trading strategy.
• State engineering and selection are implemented to increase the performance of A3C.
• A novel reward function and behaviour cloning are proposed to motivate A3C agents.
• A3C surpasses the benchmarks by returning higher revenue with fewer transactions.
ISSN: 0306-2619; 1872-9118
DOI: 10.1016/j.apenergy.2022.118912