Designing accurate emulators for scientific processes using calibration-driven deep models

Bibliographic Details
Published in:	Nature communications 2020-11, Vol.11 (1), p.5622-5622, Article 5622
Main Authors:	Thiagarajan, Jayaraman J., Venkatesh, Bindya, Anirudh, Rushil, Bremer, Peer-Timo, Gaffney, Jim, Anderson, Gemma, Spears, Brian
Format:	Article
Language:	English
Subjects:	639/705/1046; 639/705/117; Calibration; Computer science; Emulators; Humanities and Social Sciences; Learning algorithms; Machine learning; MATHEMATICS AND COMPUTING; multidisciplinary; Noise; Optimization; Prediction models; Science; Science (multidisciplinary); Scientific data; Simulators; Skewed distributions
Online Access:	Full text

Description
Abstract:	Predictive models that accurately emulate complex scientific processes can achieve speed-ups over numerical simulators or experiments and at the same time provide surrogates for improving the subsequent analysis. Consequently, there is a recent surge in utilizing modern machine learning methods to build data-driven emulators. In this work, we study an often overlooked, yet important, problem of choosing loss functions while designing such emulators. Popular choices such as the mean squared error or the mean absolute error are based on a symmetric noise assumption and can be unsuitable for heterogeneous data or asymmetric noise distributions. We propose Learn-by-Calibrating, a novel deep learning approach based on interval calibration for designing emulators that can effectively recover the inherent noise structure without any explicit priors. Using a large suite of use-cases, we demonstrate the efficacy of our approach in providing high-quality emulators, when compared to widely-adopted loss function choices, even in small-data regimes. The success of machine learning for scientific discovery normally depends on how well the inherent assumptions match the problem in hand. Here, Thiagarajan et al. alleviate this constraint by allowing the change of optimization criterion in a data-driven approach to emulate complex scientific processes.
ISSN:	2041-1723
DOI:	10.1038/s41467-020-19448-8
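
Illustrative note: the abstract's point about symmetric noise assumptions and interval calibration can be sketched with a small, self-contained example. This is not the authors' Learn-by-Calibrating method; it only shows, on synthetic exponential noise, how a symmetric interval built from a Gaussian assumption can miss its target coverage while an interval built from empirical quantiles does not. All function and variable names here (`empirical_coverage`, `calibration_error`, `sym_err`, `quant_err`) are hypothetical, and the 90% level is an arbitrary choice.

```python
import random
import statistics


def empirical_coverage(targets, lowers, uppers):
    """Fraction of targets that fall inside their [lower, upper] interval."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(targets, lowers, uppers))
    return hits / len(targets)


def calibration_error(targets, lowers, uppers, alpha=0.9):
    """Absolute gap between the desired level alpha and the observed coverage."""
    return abs(alpha - empirical_coverage(targets, lowers, uppers))


rng = random.Random(0)
# Assumption for illustration: skewed (exponential) noise, asymmetric by construction.
noise = [rng.expovariate(1.0) for _ in range(10_000)]
n = len(noise)

mean_fit = statistics.fmean(noise)     # constant that minimizes mean squared error
median_fit = statistics.median(noise)  # constant that minimizes mean absolute error

# Symmetric 90% interval under a Gaussian assumption: mean +/- 1.645 * std.
std = statistics.pstdev(noise)
sym_lo, sym_hi = mean_fit - 1.645 * std, mean_fit + 1.645 * std
sym_err = calibration_error(noise, [sym_lo] * n, [sym_hi] * n)

# Asymmetric 90% interval from the empirical 5% and 95% quantiles.
cuts = statistics.quantiles(noise, n=20)  # 19 cut points at 5%, 10%, ..., 95%
q_lo, q_hi = cuts[0], cuts[-1]
quant_err = calibration_error(noise, [q_lo] * n, [q_hi] * n)

print(f"MSE-optimal constant (mean):   {mean_fit:.3f}")
print(f"MAE-optimal constant (median): {median_fit:.3f}")
print(f"calibration error, symmetric interval: {sym_err:.4f}")
print(f"calibration error, quantile interval:  {quant_err:.4f}")
```

On skewed noise the mean and median disagree, and the symmetric interval's observed coverage drifts away from the 90% target, which is the kind of mismatch that a calibration-based training objective is meant to address.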