Non-stationarity in multiagent reinforcement learning in electricity market simulation

Bibliographic details
Published in: Electric power systems research 2024-10, Vol.235, p.110712, Article 110712
Main authors: Renshaw-Whitman, Charles; Zobernig, Viktor; Cremer, Jochen L.; de Vries, Laurens
Format: Article
Language: English
Online access: Full text
Description
Summary: The design of electricity markets may be facilitated by simulating actors’ behaviors. Recent studies model human decision-makers within markets as agents that learn strategies to maximize expected profits. This work investigates ‘non-stationarity’ in the context of market simulations: a problem with the learning algorithms used by such studies that results in agents behaving irrationally, limiting the studies’ applicability to real-world strategic behavior. Isolating the source of the problem for a day-ahead electricity market, this paper proposes methods that ameliorate the problem in simple test cases, and proves requirements under which ‘centralized-training, decentralized-execution’ value-learning methods will converge to correct behavior in general. Subsequently, this paper proposes a framework for ‘adversarial market design’ that includes the market designer as an agent. This allows the optimization of market designs subject to possibly strategic behavior of participating firms, in turn enabling the automated selection of the optimal market from any set of markets.

• Reinforcement learning can be used to simulate firm behavior in electricity markets.
• Reinforcement-learning algorithms may fail to converge in cases with multiple agents.
• The value function must account for other firms’ actions to guarantee convergence.
• ‘Adversarial market design’ uses reinforcement learning to create new market designs.
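The summary's point that a value function must account for other firms' actions can be illustrated with a toy two-firm bidding game. The sketch below is not the paper's model or algorithm: the profit table `PROFIT` and the helper names are hypothetical, chosen only to show that a value indexed by a firm's own bid alone shifts whenever the rival's strategy shifts, while a value indexed by the joint bids does not.

```python
# Hypothetical profit of firm 1 (illustrative numbers, not from the paper).
# Rows: firm 1's own bid (0 = low, 1 = high); columns: the rival's bid.
PROFIT = [[1.0, 2.0],
          [0.0, 3.0]]


def own_action_values(rival_policy):
    """Expected profit per own bid, with the rival's bid averaged out.

    This is the learning target seen by an agent that ignores the rival;
    it depends on the rival's current policy.
    """
    return [sum(p * r for p, r in zip(rival_policy, row)) for row in PROFIT]


def joint_value(own_bid, rival_bid):
    """Profit indexed by both bids; independent of the rival's policy."""
    return PROFIT[own_bid][rival_bid]


def greedy(values):
    """Index of the highest-valued bid."""
    return max(range(len(values)), key=lambda a: values[a])


# The rival always bids low early in training, always bids high later on.
early = own_action_values([1.0, 0.0])
late = own_action_values([0.0, 1.0])

print("own-bid values, early:", early, "-> best bid", greedy(early))
print("own-bid values, late: ", late, "-> best bid", greedy(late))
print("joint value of (high, high):", joint_value(1, 1))
```

In this toy case the own-bid target moves as the rival learns, so the best bid flips between the two phases, whereas the joint table stays fixed throughout.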
ISSN: 0378-7796, 1873-2046
DOI: 10.1016/j.epsr.2024.110712