Focusing on Difficult Directions for Learning HMC Trajectory Lengths
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Hamiltonian Monte Carlo (HMC) is a premier Markov Chain Monte Carlo (MCMC)
algorithm for continuous target distributions. Its full potential can only be
unleashed when its problem-dependent hyperparameters are tuned well. The
adaptation of one such hyperparameter, trajectory length ($\tau$), has been
closely examined by many research programs with the No-U-Turn Sampler (NUTS)
coming out as the preferred method in 2011. A decade later, the evolving
hardware profile has led to the proliferation of personal and cloud-based SIMD
hardware in the form of Graphics and Tensor Processing Units (GPUs, TPUs) which
are hostile to certain algorithmic details of NUTS. This has opened up a hole
in the MCMC toolkit for an algorithm that can learn $\tau$ while maintaining
good hardware utilization. In this work we build on recent advances along this
direction and introduce SNAPER-HMC, a SIMD-accelerator-friendly adaptive-MCMC
scheme for learning $\tau$. The algorithm maximizes an upper bound on
per-gradient effective sample size along an estimated principal component. We
empirically show that SNAPER-HMC is stable when combined with mass-matrix
adaptation, and is tolerant of certain pathological target distribution
covariance spectra while providing excellent long and short run sampling
efficiency. We provide a complete implementation for continuous multi-chain
adaptive HMC combining trajectory learning with standard step-size and
mass-matrix adaptation in one turnkey inference package. |
---|---|
DOI: | 10.48550/arxiv.2110.11576 |
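
For readers unfamiliar with the hyperparameter the summary refers to, the following minimal sketch shows where the trajectory length $\tau$ enters a plain HMC transition. It is not SNAPER-HMC and does not learn $\tau$. It fixes $\tau$ by hand and uses a fixed number of leapfrog steps per transition. The two-dimensional Gaussian target, the identity mass matrix, and all names (`leapfrog`, `hmc_step`, `scales`, the chosen step size and $\tau$) are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical target: zero-mean Gaussian with very different scales per axis.
scales = np.array([1.0, 10.0])

def log_prob(q):
    return -0.5 * np.sum((q / scales) ** 2)

def grad_log_prob(q):
    return -q / scales ** 2

def leapfrog(q, p, grad_fn, step_size, num_steps):
    """Leapfrog integration of Hamiltonian dynamics (identity mass matrix)."""
    p = p + 0.5 * step_size * grad_fn(q)       # initial half step in momentum
    for i in range(num_steps):
        q = q + step_size * p                  # full step in position
        if i < num_steps - 1:
            p = p + step_size * grad_fn(q)     # full step in momentum
    p = p + 0.5 * step_size * grad_fn(q)       # final half step in momentum
    return q, p

def hmc_step(rng, q, step_size, trajectory_length):
    """One HMC transition; tau = step_size * num_steps is the trajectory length."""
    num_steps = max(1, int(np.ceil(trajectory_length / step_size)))
    p0 = rng.standard_normal(q.shape)
    q_new, p_new = leapfrog(q, p0, grad_log_prob, step_size, num_steps)
    h_old = -log_prob(q) + 0.5 * p0 @ p0
    h_new = -log_prob(q_new) + 0.5 * p_new @ p_new
    accept = np.log(rng.uniform()) < h_old - h_new   # Metropolis correction
    return (q_new if accept else q), bool(accept)

rng = np.random.default_rng(0)
q = np.zeros(2)
samples, accepts = [], []
for _ in range(2000):
    q, accepted = hmc_step(rng, q, step_size=0.5, trajectory_length=15.0)
    samples.append(q)
    accepts.append(accepted)
samples = np.array(samples)
accept_rate = float(np.mean(accepts))
print("acceptance rate:", accept_rate)
print("sample std per axis:", samples.std(axis=0))
```

In this sketch the step size is limited by the narrow axis while the wide axis needs a long trajectory to be traversed, so the choice of `trajectory_length` largely determines how many gradient evaluations each useful sample costs.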