Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes
Main authors: | , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | We consider the optimal approximate posterior over the top-layer weights in a
Bayesian neural network for regression, and show that it exhibits strong
dependencies on the lower-layer weights. We adapt this result to develop a
correlated approximate posterior over the weights at all layers in a Bayesian
neural network. We extend this approach to deep Gaussian processes, unifying
inference in the two model classes. Our approximate posterior uses learned
"global" inducing points, which are defined only at the input layer and
propagated through the network to obtain inducing inputs at subsequent layers.
By contrast, standard "local" inducing point methods from the deep Gaussian
process literature optimise a separate set of inducing inputs at every layer,
and thus do not model correlations across layers. Our method achieves
state-of-the-art accuracy for a variational Bayesian method on CIFAR-10
(86.7%), without data augmentation or tempering, which is comparable to SGMCMC
without tempering but with data augmentation (88% in Wenzel et al. 2020). |
---|---|
DOI: | 10.48550/arxiv.2005.08140 |
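
The propagation scheme described in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation: it assumes a fully connected ReLU network with a standard normal prior on the weights, treats the weight columns as independent given the inducing inputs (a simplification of the paper's posterior), and uses illustrative names (`sample_layer_posterior`, `forward`) with learned per-layer pseudo-outputs `v` and pseudo-precisions `lam`. The key point is that each layer's weights are sampled conditioned on inducing inputs propagated from the single input-layer set `Z`, so the posterior is correlated across layers.

```python
# Minimal, illustrative sketch of global inducing points: the inducing
# inputs Z are defined only at the input layer and propagated through the
# network alongside the data, so every layer's weight posterior depends on
# all earlier layers. Fully connected ReLU network, N(0, I) weight prior.
import numpy as np

rng = np.random.default_rng(0)

def sample_layer_posterior(U, v, lam):
    """Sample W ~ q(W | U): the N(0, I) prior conditioned on pseudo-outputs
    v with pseudo-precisions lam at the propagated inducing inputs U.
    Output columns are treated as independent given U (a simplification)."""
    d_in, d_out = U.shape[1], v.shape[1]
    # Posterior precision and mean of a Bayesian linear model per output unit.
    S_inv = np.eye(d_in) + U.T @ (lam * U)        # (d_in, d_in)
    S = np.linalg.inv(S_inv)
    mean = S @ U.T @ (lam * v)                    # (d_in, d_out)
    chol = np.linalg.cholesky(S)
    return mean + chol @ rng.standard_normal((d_in, d_out))

def forward(X, Z, layers):
    """Propagate data X and global inducing inputs Z jointly through the
    network; `layers` holds the (v, lam) variational parameters per layer."""
    F, U = X, Z
    for i, (v, lam) in enumerate(layers):
        W = sample_layer_posterior(U, v, lam)     # weights conditioned on U,
        F, U = F @ W, U @ W                       # hence on all earlier layers
        if i < len(layers) - 1:                   # ReLU on hidden layers only
            F, U = np.maximum(F, 0), np.maximum(U, 0)
    return F

# Toy shapes: 5 inducing points in a 4 -> 8 -> 1 network.
X, Z = rng.standard_normal((16, 4)), rng.standard_normal((5, 4))
layers = [(rng.standard_normal((5, 8)), np.ones((5, 1))),
          (rng.standard_normal((5, 1)), np.ones((5, 1)))]
print(forward(X, Z, layers).shape)  # (16, 1)
```

By contrast, a "local" scheme in the sense of the abstract would optimise a separate `U` at every layer rather than propagating the shared `Z`, which severs the cross-layer dependence this sketch exhibits.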