Fast Emulation of Expensive Simulations Using Approximate Gaussian Processes

Steven Stetzler, University of Washington

Fitting a theoretical model to experimental data typically requires evaluating the model at many points in its input space. When the model is a slow-to-compute physics simulation, it becomes infeasible to evaluate it an arbitrary number of times, which in turn rules out Bayesian model fitting with Markov chain Monte Carlo methods: producing accurate posterior distributions of the best-fit model parameters typically requires thousands (or millions) of model evaluations. To remedy this, a model that predicts the simulation output, an "emulator," can be used in place of the full simulation during model fitting. The emulator of choice in previous work is the Gaussian process, a flexible, non-linear model that provides both a predictive mean and a predictive variance at each input point. The Gaussian process works well for small training sets (fewer than ~10^3 points) but becomes slow to train and to use for prediction as the data set grows. Various methods can speed up the Gaussian process in the medium-to-large data regime (more than ~10^5 points), trading some predictive accuracy for drastically reduced runtimes. In this work, we analyze several of these approximate Gaussian process methods, focusing on the accuracy-runtime tradeoff, in emulating nuclear binding energies predicted by density functional theory (DFT) models using the UNEDF1 and UNEDF2 parameterizations of the Skyrme energy functional. This work makes Bayesian calibration of the UNEDF model parameters to experimental data computationally feasible.
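To make the idea concrete, the sketch below shows one common family of approximations, an inducing-point (subset-of-regressors) Gaussian process, written in plain NumPy. It is a minimal illustration, not the implementation used in this work: the RBF kernel, its fixed hyperparameters, and the random choice of inducing points are placeholder assumptions.

import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel matrix between the rows of A and B.
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def fit_sor(X, y, Z, noise=1e-2, lengthscale=1.0, variance=1.0, jitter=1e-8):
    # Subset-of-regressors (SoR) approximation with m inducing points Z:
    # cost is O(n m^2) in the number of training runs n, instead of the
    # O(n^3) of an exact Gaussian process.
    Kmm = rbf_kernel(Z, Z, lengthscale, variance) + jitter * np.eye(len(Z))
    Kmn = rbf_kernel(Z, X, lengthscale, variance)
    Sigma = np.linalg.inv(Kmm + Kmn @ Kmn.T / noise)
    weights = Sigma @ Kmn @ y / noise
    return Z, Sigma, weights, (lengthscale, variance)

def predict_sor(model, Xstar):
    # Approximate predictive mean and variance at new input points.
    Z, Sigma, weights, (lengthscale, variance) = model
    Ksm = rbf_kernel(Xstar, Z, lengthscale, variance)
    mean = Ksm @ weights
    var = np.einsum("ij,jk,ik->i", Ksm, Sigma, Ksm)
    return mean, var

# Toy usage: emulate an expensive 1-D "simulation" from 20,000 runs,
# compressing them onto 50 inducing points (placeholder data, not DFT output).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20_000, 1))
y = np.sin(2 * X[:, 0]) + 0.05 * rng.normal(size=len(X))
Z = X[rng.choice(len(X), size=50, replace=False)]
model = fit_sor(X, y, Z, noise=0.05**2)
mean, var = predict_sor(model, np.linspace(-3, 3, 5).reshape(-1, 1))
print(mean, np.sqrt(var))

In practice one would typically learn the kernel hyperparameters and inducing-point locations from the training runs (for example with a sparse variational Gaussian process) and pass the emulator's predictive mean and variance to the MCMC likelihood in place of the full DFT calculation.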

Abstract Author(s): Steven Stetzler, Michael Grosskopf