Bayesian Safe Policy Learning with Chance Constrained Optimization: Application to Military Security Assessment during the Vietnam War
Main Authors: , ,
Format: Article
Language: English
Online Access: Order full text
Summary:
Algorithmic decisions and recommendations are used in many high-stakes
decision-making settings such as criminal justice, medicine, and public policy.
We investigate whether it would have been possible to improve a security
assessment algorithm employed during the Vietnam War, using outcomes measured
immediately after its introduction in late 1969. This empirical application
raises several methodological challenges that frequently arise in high-stakes
algorithmic decision-making. First, before implementing a new algorithm, it is
essential to characterize and control the risk of yielding worse outcomes than
the existing algorithm. Second, the existing algorithm is deterministic, and
learning a new algorithm requires transparent extrapolation. Third, the
existing algorithm involves discrete decision tables that are difficult to
optimize over.
To address these challenges, we introduce the Average Conditional Risk (ACRisk), which first quantifies the risk that a new algorithmic policy leads to worse outcomes for subgroups of individual units and then averages this risk over the distribution of subgroups.
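One way to formalize this quantity, in notation introduced here for illustration (not necessarily the paper's): let $\pi_0$ denote the status-quo policy, $\pi$ a candidate new policy, and $R(x)$ the risk that $\pi$ yields a worse outcome than $\pi_0$ for the subgroup with covariates $X = x$, e.g. $R(x) = \Pr(\tau(x) < 0)$ where $\tau(x)$ is the conditional average effect of switching. The average conditional risk is then

$$\mathrm{ACRisk}(\pi) = \mathbb{E}_X\left[ R(X) \right],$$

the subgroup-level risk of harm averaged over the distribution of subgroups.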
We also propose a Bayesian policy learning framework that maximizes the posterior expected value while controlling the posterior expected ACRisk. This framework separates the estimation of heterogeneous treatment effects from policy optimization, enabling flexible estimation of effects and optimization over complex policy classes. We characterize the resulting chance-constrained optimization problem as a constrained linear programming problem.
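In the illustrative notation above, the learning problem is a chance-constrained program,

$$\max_{\pi \in \Pi} \; \mathbb{E}\left[ V(\pi) \mid \text{data} \right] \quad \text{subject to} \quad \mathbb{E}\left[ \mathrm{ACRisk}(\pi) \mid \text{data} \right] \le \alpha,$$

where $V(\pi)$ is the value of policy $\pi$, $\Pi$ is the policy class, and $\alpha$ is a risk budget. Below is a minimal numerical sketch of such a program as a linear program over stochastic unit-level decisions; it assumes posterior draws of subgroup effects are available and uses scipy, and all names and data are illustrative rather than the paper's implementation.

```python
# Illustrative sketch only: a linear-program relaxation of a
# chance-constrained policy problem. Names, data, and the policy
# class are hypothetical, not the paper's implementation.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_units, n_draws = 50, 200

# Posterior draws of the effect of switching each unit to the new
# policy (e.g., from a Bayesian heterogeneous-effects model).
tau = rng.normal(loc=0.05, scale=0.2, size=(n_draws, n_units))

value = tau.mean(axis=0)        # posterior expected gain per unit
risk = (tau < 0).mean(axis=0)   # posterior P(worse outcome) per unit
alpha = 0.10                    # budget on the average risk

# Decision variables pi_i in [0, 1]: probability that unit i is
# switched to the new policy. Keeping the status quo incurs no
# risk, so the policy's average risk is (1/n) * sum_i pi_i * risk_i.
res = linprog(
    c=-value / n_units,           # linprog minimizes; negate the value
    A_ub=[risk / n_units],        # average-risk constraint ...
    b_ub=[alpha],                 # ... must stay within the budget
    bounds=[(0.0, 1.0)] * n_units,
)
print("expected value:", -res.fun)
print("switch probabilities:", np.round(res.x, 2))
```

Relaxing the per-unit decisions to probabilities in $[0, 1]$ is what makes this formulation linear; the paper's policy class (discrete decision tables) is more structured than this sketch.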
Our analysis shows that, compared to the actual algorithm used during the Vietnam War, the learned algorithm assesses most regions as more secure and emphasizes economic and political factors over military factors.
DOI: 10.48550/arxiv.2307.08840