Online Residential Demand Response via Contextual Multi-Armed Bandits
Format: Article
Language: English
Abstract: Residential loads have great potential to enhance the efficiency and reliability of electricity systems via demand response (DR) programs. One major challenge in residential DR is handling unknown and uncertain customer behaviors. Previous works use learning techniques to predict customer DR behaviors, but generally neglect the influence of time-varying environmental factors, which can lead to inaccurate prediction and inefficient load adjustment. In this paper, we consider the residential DR problem in which the load service entity (LSE) aims to select an optimal subset of customers to maximize the expected load reduction subject to a financial budget. To learn the uncertain customer behaviors under environmental influence, we formulate residential DR as a contextual multi-armed bandit (MAB) problem and propose an online learning and selection (OLS) algorithm based on Thompson sampling to solve it. The algorithm takes contextual information into account and is applicable to complicated DR settings. Numerical simulations demonstrate the learning effectiveness of the proposed algorithm.
DOI: 10.48550/arxiv.2003.03627
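The approach described in the abstract (contextual bandit learning with Thompson sampling, plus budgeted customer selection) can be illustrated with a minimal sketch. This is not the paper's OLS algorithm, only an assumed setup: each customer's load reduction is taken to be linear in a context vector, the posterior over each customer's parameters is a standard Bayesian linear-regression update, and the costs, budget, and greedy value-per-cost packing are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

n_customers = 5    # number of arms (customers)
context_dim = 3    # dimension of the environmental context vector
budget = 2.0       # hypothetical financial budget per round
costs = np.array([0.8, 1.0, 0.7, 1.2, 0.9])  # hypothetical incentive cost per customer

# True (unknown) linear response: reduction_i = theta_i . context + noise
true_theta = rng.normal(0.5, 0.2, size=(n_customers, context_dim))

# Per-customer Gaussian posterior (prior N(0, I)): precision matrix A_i and vector b_i
A = np.stack([np.eye(context_dim) for _ in range(n_customers)])
b = np.zeros((n_customers, context_dim))

def select_customers(context):
    """Thompson sampling: draw theta_i from each posterior, then pack greedily within budget."""
    sampled = np.empty(n_customers)
    for i in range(n_customers):
        cov = np.linalg.inv(A[i])
        mean = cov @ b[i]
        theta = rng.multivariate_normal(mean, cov)
        sampled[i] = theta @ context          # sampled expected load reduction
    order = np.argsort(-sampled / costs)      # greedy by sampled reduction per unit cost
    chosen, spent = [], 0.0
    for i in order:
        if sampled[i] > 0 and spent + costs[i] <= budget:
            chosen.append(int(i))
            spent += costs[i]
    return chosen

def update(i, context, reduction):
    """Bayesian linear-regression posterior update for the selected customer."""
    A[i] += np.outer(context, context)
    b[i] += reduction * context

total_reduction = 0.0
for t in range(200):
    context = rng.uniform(0, 1, context_dim)  # observed environmental context
    for i in select_customers(context):
        reduction = true_theta[i] @ context + rng.normal(0, 0.05)
        total_reduction += reduction
        update(i, context, reduction)
```

Over repeated rounds, the posteriors of frequently selected customers concentrate around their true response parameters, so the selection increasingly favors customers with high context-dependent load reduction per unit cost, which is the exploration-exploitation behavior the abstract attributes to the Thompson-sampling-based OLS algorithm.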