Policy Gradient Importance Sampling for Bayesian Inference

Bibliographic Details
Published in: IEEE Transactions on Signal Processing, 2021, Vol. 69, pp. 4245-4256
Main Authors: El-Laham, Yousef; Bugallo, Monica F.
Format: Article
Language: English
Description
Abstract: In this paper, we propose a novel adaptive importance sampling (AIS) algorithm for probabilistic inference. The sampler learns a proposal distribution adaptation strategy by framing AIS as a reinforcement learning problem. Under this structure, the proposal distribution of the sampler is treated as an agent whose state is controlled using a parameterized policy. At each iteration of the algorithm, the agent earns a reward that is related to its contribution to the variance of the AIS estimator of the normalization constant of the target distribution. Policy gradient methods are employed to learn a locally optimal policy that maximizes the expected value of the sum of all rewards. Numerical simulations on two different examples demonstrate promising results for the future application of the proposed method to complex Bayesian models.
ISSN: 1053-587X, 1941-0476
DOI: 10.1109/TSP.2021.3093792
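
To make the framing in the abstract concrete, the following is a minimal Python sketch of policy-gradient-driven adaptive importance sampling on a toy one-dimensional target. It is an illustration, not the authors' algorithm: the Gaussian target and proposal, the Gaussian policy over proposal-mean increments, the REINFORCE update, the use of log normalized effective sample size as a stand-in reward (the paper instead ties the reward to the agent's contribution to the variance of the normalizing-constant estimator), and all parameter values are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized toy target \tilde{p}(x) = exp(-(x - 3)^2 / 2); true Z = sqrt(2*pi).
def log_target(x):
    return -0.5 * (x - 3.0) ** 2

# Gaussian proposal density q(x; mu, sigma), evaluated in the log domain.
def log_proposal(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

# "Agent" state: the proposal mean. Policy: Gaussian over mean increments.
mu, sigma = 1.0, 1.0        # proposal starts offset from the target mode
phi, tau = 0.0, 0.5         # policy mean (learned) and exploration std (fixed)
lr, N, T = 0.05, 200, 300   # step size, samples per iteration, iterations
baseline = 0.0              # running reward baseline to cut gradient variance

for t in range(T):
    a = rng.normal(phi, tau)                  # action: shift the proposal mean
    mu += a                                   # state transition of the agent
    x = rng.normal(mu, sigma, size=N)         # draw importance samples from q
    logw = log_target(x) - log_proposal(x, mu, sigma)
    w = np.exp(logw - logw.max())             # numerically stabilized weights
    # Stand-in reward: log of the normalized effective sample size, which is
    # high when the importance-weight variance is low (an assumption of this
    # sketch, not the reward used in the paper).
    reward = np.log(w.sum() ** 2 / (N * (w ** 2).sum()))
    # REINFORCE step on the policy mean: grad_phi log N(a; phi, tau^2) * advantage.
    phi += lr * ((a - phi) / tau ** 2) * (reward - baseline)
    baseline = 0.9 * baseline + 0.1 * reward  # slow-moving average of rewards

# Importance-sampling estimate of the normalizing constant from the last batch.
log_Z = logw.max() + np.log(np.exp(logw - logw.max()).mean())
print(f"proposal mean {mu:.2f}, Z estimate {np.exp(log_Z):.3f}, "
      f"true Z {np.sqrt(2.0 * np.pi):.3f}")
```

Over the iterations, actions that move the proposal toward the target earn higher rewards, so the learned policy mean phi acquires a drift that steers the proposal, mirroring the agent/state/reward structure described in the abstract.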