Invited Review
Kriging metamodeling in simulation: A review
Introduction
Metamodels are also known as response surfaces, surrogates, emulators, auxiliary models, etc. By definition, a metamodel is an approximation of the Input/Output (I/O) function that is implied by the underlying simulation model. Metamodels are fitted to the I/O data produced by the experiment with the simulation model. This simulation model may be either deterministic or random (stochastic). Note that simulation is applied in many different disciplines, so the terminology varies widely; therefore, this article gives several terms for the same concept.
Examples of deterministic simulation are models of airplanes, automobiles, TV sets, and computer chips, as applied in Computer Aided Engineering (CAE) and Computer Aided Design (CAD) at Boeing, General Motors, Philips, etc. Detailed examples are the helicopter test example in [4], the vehicle safety example in [18], and the other examples in [30], [47].
Deterministic simulations give the same output for the same input. However, deterministic simulations may show numerical inaccuracies; i.e., a minor (infinitesimal) change in the input produces a major change in the output. A well-known mathematical example is the inversion of a matrix that is nearly singular (ill-conditioned). Simulation examples are discussed in [12], [15], [16], [50]. Because of these inaccuracies, deterministic simulation may resemble random simulation.
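The sensitivity of a nearly singular matrix can be illustrated with a small sketch. The 2x2 system below and the helper `solve_2x2` are hypothetical illustrations, not taken from the cited references; solving the system is equivalent to applying the inverse of the matrix to the right-hand side.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system A x = b with Cramer's rule."""
    det = a11 * a22 - a12 * a21
    x1 = (b1 * a22 - a12 * b2) / det
    x2 = (a11 * b2 - b1 * a21) / det
    return x1, x2


# Nearly singular (ill-conditioned) matrix: the second row is almost a
# copy of the first row, so the determinant is close to zero.
A = (1.0, 1.0,
     1.0, 1.0001)

base = solve_2x2(*A, 2.0, 2.0001)       # solution close to (1, 1)
perturbed = solve_2x2(*A, 2.0, 2.0002)  # right-hand side changed by 1e-4

input_change = 2.0002 - 2.0001
output_change = max(abs(p - q) for p, q in zip(perturbed, base))

print("input change: ", input_change)
print("output change:", output_change)
```

In this sketch a change of about 0.0001 in one input moves the solution by about 1, which is the kind of amplification that can make a deterministic model look noisy.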
Random simulations use Pseudo-Random Numbers (PRNs) inside their models, so repeated simulations of the same input combination give different outputs (unless the PRN streams are identical; i.e., the PRN seeds are identical). Examples are models of logistic and telecommunication systems. Details are given in textbooks on discrete-event simulation, such as [1], [32]. This article covers both deterministic and random simulation.
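The role of the PRN seed can be sketched with a toy single-server queue. The function `simulate_mean_wait`, its parameters, and the chosen rates are assumptions made for illustration only; the waiting times follow the standard Lindley recursion for a first-come-first-served queue.

```python
import random


def simulate_mean_wait(arrival_rate, service_rate, n_customers, seed):
    """Average waiting time in a single-server queue (toy random simulation)."""
    rng = random.Random(seed)  # private PRN stream, fully determined by the seed
    wait = 0.0                 # the first customer does not wait
    total_wait = 0.0
    for _ in range(n_customers):
        total_wait += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: waiting time of the next customer.
        wait = max(0.0, wait + service - interarrival)
    return total_wait / n_customers


# Same input combination (arrival rate 0.8, service rate 1.0) in every run.
run_a = simulate_mean_wait(0.8, 1.0, 1000, seed=123)
run_b = simulate_mean_wait(0.8, 1.0, 1000, seed=123)  # identical seed
run_c = simulate_mean_wait(0.8, 1.0, 1000, seed=456)  # different seed

print(run_a == run_b)  # identical PRN streams give identical outputs
print(run_a == run_c)  # different PRN streams give different outputs
```

Replicating the same input combination with different seeds gives the multiple outputs per point that a metamodel for random simulation has to account for.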
Most publications on metamodels focus on low-order polynomial regression. This type of metamodel may be used for the explanation of the underlying simulation model’s behavior, and for prediction of the expected simulation output for combinations of input values that have not yet been simulated (inputs are also called factors; combinations are also called points or scenarios). The final goals of metamodeling may be Validation and Verification (V&V) of the simulation model, sensitivity or what-if analysis of that model, and optimization of the simulated system; see [25], [27], [32].
This article focuses on Kriging metamodels. Typically, Kriging models are fitted to data that are obtained for larger experimental areas than the areas used in low-order polynomial regression; i.e., Kriging models are global (rather than local). These models are used for prediction; the final goals are sensitivity analysis and optimization.
Kriging was originally developed in geostatistics (also known as spatial statistics) by the South African mining engineer Krige (who is still alive). The mathematics were further developed by Matheron; see his 1963 article [39]. A classic geostatistics textbook is Cressie’s 1993 book [9]. More recent references are items 17 through 21 in [38].
Later on, Kriging models were applied to the I/O data of deterministic simulation models. These models have k-dimensional input where k is a given positive integer (whereas geostatistics considers only two-dimensional input); see Sacks et al.’s classic 1989 article [43]. More recent publications are [24], [44], [49].
Only in 2003 did Van Beers and Kleijnen [51] start applying Kriging to random simulation models. Although Kriging in random simulation is still rare, the track record that Kriging has achieved in deterministic simulation holds great promise for Kriging in random simulation.
Note: Searching for ‘Kriging’ via Google on August 20, 2007 gave 661,000 hits, which illustrates the popularity of this mathematical method. Searching within these pages for ‘Operations Research’ gave 134,000 hits.
The goal of this article is to review the basics of Kriging, and some recent extensions. These basics may convince analysts in deterministic or random simulation of the potential usefulness of Kriging. Furthermore, the review of recent extensions may also interest those analysts who are already familiar with Kriging in simulation.
The rest of this article is organized as follows. Section 2 compares Kriging with linear regression, and covers the basic assumptions and formulas of Kriging. Section 3 presents some relatively new results, including Kriging in random simulation and estimating the variance of the Kriging predictor through bootstrapping. Section 4 includes one-shot and sequential statistical designs for Kriging metamodels, distinguishing between sensitivity analysis and optimization. Section 5 presents conclusions and topics for future research.
Kriging versus linear regression
This section first highlights the differences between classic linear regression—especially low-order polynomial regression—and modern Kriging. This article focuses on a single (univariate, scalar) simulation output, because most published Kriging models also assume such output. In practice, a simulation model has multiple (multivariate, vector) output, but univariate Kriging may then be applied per output variable.
Assuming a single output, a general black-box representation of a simulation
Kriging: New results
This section first summarizes some new results for Kriging applied in random (not deterministic) simulation. Next, it discusses the problems caused by the estimation of the optimal Kriging weights.
Designs for Kriging
Simulation analysts often use Latin Hypercube Sampling (LHS) to generate the I/O simulation data to which they fit a Kriging model. Note that LHS was not invented for Kriging but for Risk Analysis; see [26].
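A basic LHS design can be sketched as follows. This is a minimal illustration, assuming inputs scaled to the unit interval and independent uniform sampling within each stratum; the function name `latin_hypercube` and its parameters are hypothetical and do not come from the cited references.

```python
import random


def latin_hypercube(n_points, k_inputs, seed=None):
    """Return n_points combinations of k_inputs inputs, each scaled to [0, 1).

    Every input range is split into n_points equally wide strata, and every
    stratum of every input is sampled exactly once.
    """
    rng = random.Random(seed)
    design = [[0.0] * k_inputs for _ in range(n_points)]
    for j in range(k_inputs):
        strata = list(range(n_points))
        rng.shuffle(strata)  # random pairing of strata across the inputs
        for i in range(n_points):
            # Sample uniformly inside the stratum assigned to point i.
            design[i][j] = (strata[i] + rng.random()) / n_points
    return design


design = latin_hypercube(n_points=10, k_inputs=3, seed=1)
for point in design:
    print([round(value, 3) for value in point])
```

Each row is one input combination to be simulated. In practice the unit-interval values would be rescaled to the actual range of each simulation input.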
LHS assumes that an adequate metamodel is more complicated than a low-order polynomial such as (2), which is assumed by classic designs such as fractional factorials. LHS, however, does not assume a specific metamodel or simulation model. Instead, LHS focuses on the design space formed by
Conclusions and future research
This article may be summarized as follows.
- The article emphasized the basic assumption of Kriging, namely that old simulation observations closer to the new point to be predicted should receive more weight. This assumption is formalized through a stationary covariance process with correlations that decrease as the distances between the inputs of observations increase.
- Moreover, the Kriging model is an exact interpolator; i.e., predicted outputs equal observed simulated outputs at old points, which is
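The two properties summarized in the list above can be illustrated with a minimal ordinary-Kriging sketch for a single input. Several choices here are assumptions made for illustration: the Gaussian correlation function, the value of the correlation parameter `theta` (treated as known, whereas in practice it is estimated), the test data, and all function names.

```python
import math


def gauss_corr(x1, x2, theta):
    """Correlation that decreases as the distance between inputs increases."""
    return math.exp(-theta * (x1 - x2) ** 2)


def solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def kriging_weights(x_old, x_new, theta):
    """Ordinary-Kriging weights for the old outputs; the weights sum to one."""
    n = len(x_old)
    R = [[gauss_corr(a, b, theta) for b in x_old] for a in x_old]
    r = [gauss_corr(a, x_new, theta) for a in x_old]
    rinv_r = solve(R, r)
    rinv_1 = solve(R, [1.0] * n)
    # Correction term that enforces the sum-to-one (unbiasedness) constraint.
    correction = (1.0 - sum(rinv_r)) / sum(rinv_1)
    return [a + correction * b for a, b in zip(rinv_r, rinv_1)]


def kriging_predict(x_old, y_old, x_new, theta):
    """Predictor: weighted linear combination of the old outputs."""
    weights = kriging_weights(x_old, x_new, theta)
    return sum(w * y for w, y in zip(weights, y_old))


x_old = [0.0, 1.0, 2.0, 3.0, 4.0]
y_old = [math.sin(x) for x in x_old]  # stand-in for simulation outputs
theta = 1.0                           # assumed known in this sketch

print(kriging_weights(x_old, 1.2, theta))         # largest weight at x = 1.0
print(kriging_predict(x_old, y_old, 2.0, theta))  # reproduces y_old[2]
```

In this sketch the old point nearest to the new point receives the largest weight, and predicting at an old point returns the observed output up to rounding. Individual weights can be negative, so only their sum is constrained to one.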
Acknowledgements
I thank the anonymous referee for very useful comments on the previous version, including 20 references. I also thank my colleagues Dick den Hertog and Wim van Beers (both at Tilburg University) for their comments on earlier versions of this article.
References (55)
- et al. Comparison of designs for computer experiments. Journal of Statistical Planning and Inference (2006)
- et al. A review of design and modeling in computer experiments
- et al. A methodology for the fitting and validation of metamodels in simulation. European Journal of Operational Research (2000)
- et al. Robustness of Kriging when interpolating in random simulation with heterogeneous variances: Some experiments. European Journal of Operational Research (2005)
- et al. Computer experiments
- et al. Customized sequential designs for random simulation experiments: Kriging metamodeling and bootstrapping. European Journal of Operational Research (2008)
- et al. Discrete-Event System Simulation (2005)
- B. Bettonvil, E. del Castillo, J.P.C. Kleijnen, Statistical testing of optimality conditions in multiresponse...
- W.E. Biles, J.P.C. Kleijnen, W.C.M. van Beers, I. van Nieuwenhuyse, Kriging metamodels in constrained simulation...
- A.J. Booker, J.E. Dennis, P.D. Frank, D.B. Serafini, V. Torczon, Optimization using surrogate objectives on a...
- Bayesian experimental design: A review. Statistical Science
- Statistics for Spatial Data: Revised Edition
- The correct Kriging variance estimated by bootstrapping. Journal of the Operational Research Society
- An Introduction to the Bootstrap
- Approximation of computationally expensive and noisy functions for constrained nonlinear optimization. Journal of Mechanisms, Transmissions, and Automation in Design
- Update strategies for Kriging models for using in variable fidelity optimization. Structural and Multidisciplinary Optimization
- Practical Optimization
- Optimization and experiments: A survey. Applied Mechanics Review
- Introducing BACCO, an R bundle for Bayesian analysis of computer code output. Journal of Statistical Software
- Global optimization of stochastic black-box systems via sequential Kriging meta-models. Journal of Global Optimization