Learning inducing points and uncertainty on molecular data by scalable variational Gaussian processes
Saved in:

Main authors: , ,
Format: Article
Language: English
Keywords:
Online access: Order full text
Abstract: Uncertainty control and scalability to large datasets are the two main issues for deploying Gaussian process (GP) models within autonomous machine-learning-based prediction pipelines in materials science and chemistry. One way to address both issues is to introduce latent inducing-point variables and choose the right approximation for the marginal log-likelihood objective. Here, we empirically show that variational learning of the inducing points in a molecular descriptor space improves the prediction of energies and atomic forces on two molecular dynamics datasets. First, we show that variational GPs can learn to represent configurations of molecule types that were not present in the initialization set of configurations. We provide a comparison of alternative log-likelihood training objectives and variational distributions. Among several evaluated approximate marginal log-likelihood objectives, the predictive log-likelihood provides excellent uncertainty estimates at a slight expense in predictive quality. Furthermore, we extend our study to a large molecular crystal system, showing that variational GP models perform well for predicting atomic forces by efficiently learning a sparse representation of the dataset.
DOI: 10.48550/arxiv.2207.07654
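The abstract's "latent inducing point variables" and "approximate marginal log-likelihood objectives" refer to standard sparse variational GP machinery. As an illustration only (not the paper's code), the sketch below implements the collapsed variational lower bound of Titsias (2009) on the GP log marginal likelihood in plain NumPy; all function names, kernel settings, and the toy 2-D "descriptor" data are invented for this example. The bound never exceeds the exact log marginal likelihood, and it tightens as inducing points are added, which is what makes optimizing inducing-point locations meaningful.

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two point sets (n x d, m x d)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def titsias_elbo(X, y, Z, noise=0.1, lengthscale=1.0, variance=1.0):
    """Collapsed variational lower bound (Titsias, 2009) for inducing inputs Z."""
    n, m = len(X), len(Z)
    Kuu = rbf(Z, Z, lengthscale, variance) + 1e-6 * np.eye(m)  # jitter
    Kuf = rbf(Z, X, lengthscale, variance)
    L = np.linalg.cholesky(Kuu)
    A = np.linalg.solve(L, Kuf) / np.sqrt(noise)               # m x n
    B = np.eye(m) + A @ A.T
    LB = np.linalg.cholesky(B)
    c = np.linalg.solve(LB, A @ y) / np.sqrt(noise)
    # log N(y | 0, Qff + noise*I), evaluated via the Woodbury identity ...
    log_marg = (-0.5 * n * np.log(2 * np.pi * noise)
                - np.sum(np.log(np.diag(LB)))
                - 0.5 * (y @ y) / noise
                + 0.5 * (c @ c))
    # ... minus the trace term penalising badly placed inducing points:
    # tr(Kff - Qff) with diag(Kff) = variance and tr(Qff) = noise * sum(A*A)
    trace_term = -0.5 / noise * (n * variance - noise * np.sum(A * A))
    return log_marg + trace_term

def exact_lml(X, y, noise=0.1, lengthscale=1.0, variance=1.0):
    """Exact GP log marginal likelihood, for comparison against the bound."""
    n = len(X)
    K = rbf(X, X, lengthscale, variance) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2 * np.pi))

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 2))        # toy 2-D "descriptors"
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)
Z = X[:10].copy()                           # initialise inducing points from data
print(titsias_elbo(X, y, Z), exact_lml(X, y))
```

In a full variational GP pipeline, `Z` (and the kernel hyperparameters) would be optimized by gradient ascent on this bound, or a stochastic variant such as the predictive log-likelihood objective mentioned in the abstract would replace it.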