Invited Review
Kriging metamodeling in simulation: A review

https://doi.org/10.1016/j.ejor.2007.10.013

Abstract

This article reviews Kriging (also called spatial correlation modeling). It presents the basic Kriging assumptions and formulas—contrasting Kriging and classic linear regression metamodels. Furthermore, it extends Kriging to random simulation, and discusses bootstrapping to estimate the variance of the Kriging predictor. Besides classic one-shot statistical designs such as Latin Hypercube Sampling, it reviews sequentialized and customized designs for sensitivity analysis and optimization. It ends with topics for future research.

Introduction

Metamodels are also known as response surfaces, surrogates, emulators, auxiliary models, etc. By definition, a metamodel is an approximation of the Input/Output (I/O) function that is implied by the underlying simulation model. Metamodels are fitted to the I/O data produced by the experiment with the simulation model. This simulation model may be either deterministic or random (stochastic). Note that simulation is applied in many different disciplines, so the terminology varies widely; therefore, this article gives several terms for the same concept.

Examples of deterministic simulation are models of airplanes, automobiles, TV sets, and computer chips, as applied in Computer Aided Engineering (CAE) and Computer Aided Design (CAD) at Boeing, General Motors, Philips, etc. Detailed examples are the helicopter test example in [4], the vehicle safety example in [18], and the other examples in [30], [47].

Deterministic simulations give the same output for the same input. However, deterministic simulations may show numerical inaccuracies; i.e., a minor (infinitesimal) change in the input produces a major change in the output. A well-known mathematical example is the inversion of a matrix that is nearly singular (ill-conditioned). Simulation examples are discussed in [12], [15], [16], [50]. Because of these inaccuracies, the output of a deterministic simulation may behave much like the output of a random simulation.
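The effect of ill-conditioning can be illustrated with a minimal sketch. The 2×2 matrix, the right-hand sides, and the helper name `solve_2x2` below are illustrative assumptions and are not taken from the cited references; the sketch only shows that a change of 0.0001 in one input can shift the solution of a nearly singular system by several orders of magnitude more.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]


# Nearly singular matrix: the two rows are almost identical (illustrative values).
A = [[1.0, 1.0],
     [1.0, 1.0001]]

input_change = 1e-4
x_base = solve_2x2(A, [2.0, 2.0001])                  # original input
x_pert = solve_2x2(A, [2.0, 2.0001 + input_change])   # slightly changed input

# Largest change in any component of the solution.
output_change = max(abs(p - q) for p, q in zip(x_pert, x_base))
amplification = output_change / input_change

print("base solution:     ", x_base)
print("perturbed solution:", x_pert)
print("amplification:     ", amplification)
```

A simulation model that solves such a system internally would remain deterministic, yet its output would react to tiny input changes in a way that is hard to distinguish from noise.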

Random simulations use Pseudo-Random Numbers (PRNs) inside their models, so simulations of the same input combination give different outputs (unless the PRN streams are identical; i.e., the PRN seeds are identical). Examples are models of logistic and telecommunication systems. Details are given in textbooks on discrete-event simulation, such as [1], [32]. This article covers both deterministic and random simulation.
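The role of the PRN seed can be sketched with a toy random simulation. The single-server queue below, its parameter values, and the function name `simulate_mean_wait` are assumptions chosen for illustration only; they do not come from the cited textbooks. The sketch shows that the same input combination gives different outputs for different seeds, and identical outputs for identical seeds.

```python
import random


def simulate_mean_wait(arrival_rate, service_rate, n_customers, seed):
    """Average waiting time in a single-server FIFO queue (Lindley recursion)."""
    rng = random.Random(seed)   # the PRN stream is fully determined by the seed
    wait = 0.0                  # waiting time of the current customer
    total_wait = 0.0
    for _ in range(n_customers):
        total_wait += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Waiting time of the next customer.
        wait = max(0.0, wait + service - interarrival)
    return total_wait / n_customers


same_input = dict(arrival_rate=0.8, service_rate=1.0, n_customers=1000)

y_seed1 = simulate_mean_wait(seed=1, **same_input)
y_seed2 = simulate_mean_wait(seed=2, **same_input)        # same input, other seed
y_seed1_again = simulate_mean_wait(seed=1, **same_input)  # same input, same seed

print("seed 1:      ", y_seed1)
print("seed 2:      ", y_seed2)
print("seed 1 again:", y_seed1_again)
```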

Most publications on metamodels focus on low-order polynomial regression. This type of metamodel may be used for the explanation of the underlying simulation model’s behavior, and for prediction of the expected simulation output for combinations of input values that have not yet been simulated (inputs are also called factors; combinations are also called points or scenarios). The final goals of metamodeling may be Validation and Verification (V&V) of the simulation model, sensitivity or what-if analysis of that model, and optimization of the simulated system; see [25], [27], [32].

This article focuses on Kriging metamodels. Typically, Kriging models are fitted to data that are obtained for larger experimental areas than the areas used in low-order polynomial regression; i.e., Kriging models are global (rather than local). These models are used for prediction; the final goals are sensitivity analysis and optimization.

Kriging was originally developed in geostatistics (also known as spatial statistics) by the South African mining engineer called Krige (who is still alive). The mathematics were further developed by Matheron; see his 1963 article [39]. A classic geostatistics textbook is Cressie’s 1993 book [9]. More recent are the references 17 through 21 in [38].

Later on, Kriging models were applied to the I/O data of deterministic simulation models. These models have k-dimensional input where k is a given positive integer (whereas geostatistics considers only two-dimensional input); see Sacks et al.’s classic 1989 article [43]. More recent publications are [24], [44], [49].

Only in 2003 did Van Beers and Kleijnen [51] start applying Kriging to random simulation models. Although Kriging in random simulation is still rare, the track record that Kriging achieved in deterministic simulation holds great promise for Kriging in random simulation.

Note: Searching for ‘Kriging’ via Google on August 20, 2007 gave 661,000 hits, which illustrates the popularity of this mathematical method. Searching within these pages for ‘Operations Research’ gave 134,000 hits.

The goal of this article is to review the basics of Kriging, and some recent extensions. These basics may convince analysts in deterministic or random simulation of the potential usefulness of Kriging. Furthermore, the review of recent extensions may also interest those analysts who are already familiar with Kriging in simulation.

The rest of this article is organized as follows. Section 2 compares Kriging with linear regression, and covers the basic assumptions and formulas of Kriging. Section 3 presents some relatively new results, including Kriging in random simulation and estimating the variance of the Kriging predictor through bootstrapping. Section 4 includes one-shot and sequential statistical designs for Kriging metamodels, distinguishing between sensitivity analysis and optimization. Section 5 presents conclusions and topics for future research.

Section snippets

Kriging versus linear regression

This section first highlights the differences between classic linear regression—especially low-order polynomial regression—and modern Kriging. This article focuses on a single (univariate, scalar) simulation output, because most published Kriging models also assume such output. In practice, a simulation model has multiple (multivariate, vector) output, but univariate Kriging may then be applied per output variable.

Assuming a single output, a general black-box representation of a simulation

Kriging: New results

This section first summarizes some new results for Kriging applied in random (not deterministic) simulation. Next, it discusses the problems caused by the estimation of the optimal Kriging weights.

Designs for Kriging

Simulation analysts often use Latin Hypercube Sampling (LHS) to generate the I/O simulation data to which they fit a Kriging model. Note that LHS was not invented for Kriging but for Risk Analysis; see [26].
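A basic LHS design can be sketched in a few lines. The version below is a minimal sketch under stated assumptions: all inputs are scaled to the unit interval, each interval is sampled uniformly, and the function name `latin_hypercube` is hypothetical. It is not the specific LHS variant of any reference cited here.

```python
import random


def latin_hypercube(n_points, n_inputs, seed=None):
    """Return n_points input combinations in the unit cube [0, 1)^n_inputs.

    The range of each input is split into n_points equal intervals, and
    every interval is sampled exactly once per input.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_inputs):
        # One uniformly drawn value inside each of the n_points intervals.
        column = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(column)   # random pairing of intervals across inputs
        columns.append(column)
    # Transpose so that each row is one input combination (scenario).
    return [list(point) for point in zip(*columns)]


design = latin_hypercube(n_points=5, n_inputs=2, seed=123)
for point in design:
    print([round(value, 3) for value in point])
```

Each printed row is one scenario to be simulated. Projecting the design onto either input shows exactly one point in each of the five intervals, which is the defining property of LHS.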

LHS assumes that an adequate metamodel is more complicated than a low-order polynomial such as (2), which is assumed by classic designs such as fractional factorials. LHS, however, does not assume a specific metamodel or simulation model. Instead, LHS focuses on the design space formed by

Conclusions and future research

This article may be summarized as follows.

  • The article emphasized the basic assumption of Kriging, namely that old simulation observations closer to the new point to be predicted should receive more weight. This assumption is formalized through a stationary covariance process with correlations that decrease as the distances between the inputs of observations increase.

  • Moreover, the Kriging model is an exact interpolator; i.e., predicted outputs equal observed simulated outputs at old points—which is
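The two properties summarized above, distance-dependent weights and exact interpolation, can be illustrated with a minimal sketch. It assumes ordinary Kriging (constant mean) with a Gaussian correlation function and a fixed correlation parameter `theta`; in practice that parameter is estimated from the I/O data. The data values and all function names are invented for illustration.

```python
import math


def gauss_corr(x1, x2, theta):
    """Gaussian correlation: decreases as the distance between inputs grows."""
    return math.exp(-theta * sum((a - b) ** 2 for a, b in zip(x1, x2)))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def kriging_weights(x_old, x_new, theta):
    """Ordinary Kriging weights for predicting at x_new (they sum to one)."""
    n = len(x_old)
    R = [[gauss_corr(xi, xj, theta) for xj in x_old] for xi in x_old]
    r = [gauss_corr(xi, x_new, theta) for xi in x_old]
    R_inv_r = solve(R, r)
    R_inv_1 = solve(R, [1.0] * n)
    # Correction term that forces the weights to sum to one.
    correction = (1.0 - sum(R_inv_r)) / sum(R_inv_1)
    return [a + correction * b for a, b in zip(R_inv_r, R_inv_1)]


def kriging_predict(x_old, y_old, x_new, theta):
    """Predictor: weighted sum of the old simulation outputs."""
    weights = kriging_weights(x_old, x_new, theta)
    return sum(w * y for w, y in zip(weights, y_old))


# Invented I/O data: four old points with one input each.
x_old = [[0.0], [1.0], [2.0], [4.0]]
y_old = [1.0, 3.0, 2.0, 5.0]
theta = 1.0   # assumed fixed here; normally estimated

weights_near = kriging_weights(x_old, [0.9], theta)     # new point near x = 1.0
pred_at_old = kriging_predict(x_old, y_old, [1.0], theta)

print("weights for new point 0.9:", [round(w, 4) for w in weights_near])
print("prediction at old point 1.0:", pred_at_old)
```

The old point closest to the new point receives the largest weight, and the prediction at an old point reproduces the observed output at that point. Note that individual weights may be negative under this correlation function, although they always sum to one.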

Acknowledgements

I thank the anonymous referee for very useful comments on the previous version, including 20 references. I also thank my colleagues Dick den Hertog and Wim van Beers (both at Tilburg University) for their comments on earlier versions of this article.

References (55)

  • K. Chaloner et al., Bayesian experimental design: A review, Statistical Science (1995)
  • S.B. Crary, D.M. Woodcock, A. Hieke, Designing efficient computer experiments for metamodel generation, in: Proceedings...
  • N.A.C. Cressie, Statistics for Spatial Data: Revised Edition (1993)
  • D. Den Hertog et al., The correct Kriging variance estimated by bootstrapping, Journal of the Operational Research Society (2006)
  • B. Efron et al., An Introduction to the Bootstrap (1993)
  • J.W. Free et al., Approximation of computationally expensive and noisy functions for constrained nonlinear optimization, Journal of Mechanisms, Transmissions, and Automation in Design (1987)
  • S.E. Gano et al., Update strategies for Kriging models for using in variable fidelity optimization, Structural and Multidisciplinary Optimization (2006)
  • P.E. Gill et al., Practical Optimization (2000)
  • A.A. Giunta, J.M. Dudley, R. Narducci, B. Grossman, R.T. Haftka, W.H. Mason, L.T. Watson, Noisy aerodynamic response...
  • A. Giunta, L.T. Watson, A comparison of approximation modeling techniques: Polynomial versus interpolating models, in:...
  • A.A. Giunta, S.F. Wojtkiewicz, M.S. Eldred, Overview of modern design of experiments methods for computational...
  • L. Gu, A comparison of polynomial based regression models in vehicle safety analysis, in: A. Diaz (Ed.), ASME Design...
  • R. Haftka et al., Optimization and experiments: A survey, Applied Mechanics Review (1998)
  • R.K.S. Hankin, Introducing BACCO, an R bundle for Bayesian analysis of computer code output, Journal of Statistical Software (2005)
  • D. Huang, T.T. Allen, W. Notz, R.A. Miller, Sequential Kriging optimization using multiple fidelity evaluation, Working...
  • D. Huang et al., Global optimization of stochastic black-box systems via sequential Kriging meta-models, Journal of Global Optimization (2006)
  • R. Jin, W. Chen, A. Sudjianto, On sequential sampling for global metamodeling in engineering design, in: Proceedings of...