The dangers of sparse sampling for the quantification of margin and uncertainty
Published in: Reliability Engineering & System Safety, 2011-09, Vol. 96 (9), pp. 1220-1231
Format: Article
Language: English
Online access: Full text
Abstract: Activities such as global sensitivity analysis, statistical effect screening, uncertainty propagation, or model calibration have become integral to the Verification and Validation (V&V) of numerical models and computer simulations. One of the goals of V&V is to assess prediction accuracy and uncertainty, which feeds directly into reliability analysis or the Quantification of Margin and Uncertainty (QMU) of engineered systems. Because these analyses involve multiple runs of a computer code, they can rapidly become computationally expensive. An alternative to Monte Carlo-like sampling is to combine a design of computer experiments with meta-modeling, replacing the potentially expensive computer simulation by a fast-running emulator. The surrogate can then be used to estimate sensitivities, propagate uncertainty, and calibrate model parameters at a fraction of the cost of wrapping a sampling algorithm or optimization solver around the physics-based code. Doing so, however, carries the risk of developing an incorrect emulator that erroneously approximates the “true-but-unknown” sensitivities of the physics-based code. We demonstrate the extent to which this occurs when Gaussian Process Modeling (GPM) emulators are trained in high-dimensional spaces using too-sparsely populated designs of experiments. Our illustration analyzes a variant of the Rosenbrock function in which several effects are made statistically insignificant while others are strongly coupled, mimicking a situation that is often encountered in practice. In this example, the combination of GPM emulator and design of experiments leads to an incorrect approximation of the function. A mathematical proof of the origin of the problem is proposed. The adverse effects that too-sparsely populated designs may produce are discussed for the coverage of the design space, the estimation of sensitivities, and the calibration of parameters. This work attempts to raise awareness of the potential dangers of not allocating enough resources when exploring a design space to develop fast-running emulators.
Highlights:
► Statistical emulators developed in high-dimensional spaces can be inaccurate.
► This happens when too-sparsely populated datasets are used to develop an emulator.
► The accuracy of Gaussian Process Models (GPM) is studied in a 17-dimensional space.
► A mathematical proof explains the root cause of this phenomenon for GPM emulators.
ISSN: 0951-8320, 1879-0836
DOI: 10.1016/j.ress.2011.02.015
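
As a brief illustration of the scenario the abstract describes (not the authors' code), the sketch below trains a Gaussian Process emulator on a sparse Latin hypercube design of a Rosenbrock-like function in 17 dimensions and checks its predictive error on an independent test set. The exact function variant used in the paper, as well as the sample sizes and kernel choice here, are assumptions made only for illustration.

```python
"""
Minimal sketch (assumptions noted below): a Gaussian Process emulator
trained on a sparse design in 17 dimensions can badly mis-approximate
a coupled, Rosenbrock-like response.
"""
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel


def rosenbrock_like(x):
    # Hypothetical 17-D Rosenbrock-style variant (assumption, not the
    # paper's exact function): consecutive inputs are strongly coupled.
    x = 4.0 * x - 2.0                      # map [0, 1]^17 to [-2, 2]^17
    return np.sum(100.0 * (x[:, 1:] - x[:, :-1] ** 2) ** 2
                  + (1.0 - x[:, :-1]) ** 2, axis=1)


dim = 17

# Sparse training design: few runs relative to the 17-dimensional space.
x_train = qmc.LatinHypercube(d=dim, seed=0).random(n=50)
y_train = rosenbrock_like(x_train)

# GPM emulator with an anisotropic squared-exponential (RBF) kernel,
# one length scale per input.
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(dim))
gpm = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gpm.fit(x_train, y_train)

# Independent test design to check how well the emulator predicts
# the "true-but-unknown" response.
x_test = qmc.LatinHypercube(d=dim, seed=1).random(n=1000)
y_test = rosenbrock_like(x_test)
y_pred = gpm.predict(x_test)

rmse = np.sqrt(np.mean((y_pred - y_test) ** 2))
print(f"Emulator RMSE on test set: {rmse:.1f} "
      f"(response standard deviation: {y_test.std():.1f})")
```

Comparing the reported RMSE at different training sizes (for example n = 50 versus n = 500) gives a quick, hands-on sense of how design sparsity degrades the emulator, which is the failure mode the paper examines.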