Incorporating learning into direct policy search for flood risk management
Published in: | Risk analysis 2024-01, Vol.44 (1), p.190-202 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Direct policy search (DPS) is a method for identifying optimal policies (i.e., rules) for managing a system in response to changing conditions. In this article, we introduce a new adaptive way to incorporate learning into DPS. The standard DPS approach identifies “robust” policies by optimizing their average performance over a large ensemble of future states of the world (SOW). Our approach exploits information gained over time, updating prior beliefs about the kind of SOW being experienced. We first run the standard DPS approach multiple times, but with varying sets of weights applied to the SOWs when calculating average performance. Adaptive “metapolicies” then further improve performance by specifying how control of the system should switch between policies identified using different weight sets, depending on our updated beliefs about the relative likelihood of being in certain SOWs. We outline the general method and illustrate it using a case study of efficient dike heightening that simultaneously minimizes protection system costs and flood damage resulting from rising sea levels and storm surge. The solutions identified by our adaptive algorithm dominate the standard DPS on these two objectives, with an average marginal damage reduction of 35.1% for policies with similar costs; improvements are largest in SOWs with relatively lower sea level rise. We also evaluate how performance varies under different ways of implementing the algorithm, such as changing the frequency with which beliefs are updated. |
---|---|
ISSN: | 0272-4332 1539-6924 |
DOI: | 10.1111/risa.14136 |
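
The belief-updating idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the three SOW classes, their mean sea level rise rates, the Gaussian observation model, the policy names, and the "pick the most likely class" switching rule are all assumptions made for illustration. The abstract says only that metapolicies switch between policies based on updated beliefs, without giving the rule.

```python
import math

# Hypothetical SOW classes, indexed by mean sea level rise rate (mm/yr).
# These values and the policy names are illustrative assumptions only.
SOW_MEANS = [3.0, 6.0, 10.0]
SOW_SD = 2.0
POLICIES = ["policy_low_slr", "policy_mid_slr", "policy_high_slr"]


def gaussian_likelihood(observed, mean, sd):
    """Density of one observation under an assumed Gaussian model."""
    z = (observed - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update_beliefs(prior, likelihoods):
    """Bayes' rule over a discrete set of SOW classes."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # No class explains the observation; keep the prior unchanged.
        return list(prior)
    return [u / total for u in unnormalized]


def select_policy(beliefs, policies):
    """Stand-in switching rule: use the policy tuned for the most likely class."""
    best = max(range(len(beliefs)), key=lambda i: beliefs[i])
    return policies[best]


def run(observations, prior=None):
    """Update beliefs after each observation and record the active policy."""
    beliefs = list(prior) if prior is not None else [1.0 / 3.0] * 3
    history = []
    for obs in observations:
        likelihoods = [gaussian_likelihood(obs, m, SOW_SD) for m in SOW_MEANS]
        beliefs = update_beliefs(beliefs, likelihoods)
        history.append(select_policy(beliefs, POLICIES))
    return beliefs, history


beliefs, history = run([3.2, 2.8, 3.5])
print(history[-1])
```

In this sketch each entry of `POLICIES` stands for a policy that would have been found by a separate DPS run with SOW weights tilted toward the matching class. Updating less often, which the abstract mentions as an implementation choice, would correspond to calling `update_beliefs` on batches of observations rather than on every one.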