
Ensemble of surrogates with recursive arithmetic average

  • Research Paper
  • Published in Structural and Multidisciplinary Optimization

Abstract

Surrogate models are often used to replace expensive simulations of engineering problems. The common approach is to construct a series of metamodels based on a training set and then, from these surrogates, select the most accurate one as an approximation of the computationally intensive simulation. However, because the choice of approximate model depends on the design of experiments (DOE), this traditional strategy increases the risk of adopting an inappropriate model. Furthermore, in the design of complex product systems, which are typically one-of-a-kind productions, acquiring more samples is very expensive and time-consuming, and sometimes even impossible. Therefore, in order to save sampling cost, it is a reasonable strategy to take full advantage of all the stand-alone surrogates by combining them into an ensemble model. The ensemble technique is an effective way to compensate for the shortfalls of the traditional strategy. Motivated by previous research on ensembles of surrogates, this paper proposes a new technique for constructing a more accurate ensemble of surrogates. The weights are obtained using a recursive process, in which their values are updated in each iteration until the final ensemble achieves a desirable prediction accuracy. The technique has been evaluated on five benchmark problems and one real-world problem. The results show that the proposed ensemble of surrogates with recursive arithmetic average provides better prediction accuracy than the stand-alone surrogates and, for most problems, even exceeds previously presented ensemble techniques. Finally, we should point out that the advantages of combination over selection remain difficult to demonstrate: we are still offering an "insurance policy" rather than significant improvements.


References

  • Acar E, Rais-Rohani M (2009) Ensemble of metamodels with optimized weight factors. Struct Multidisc Optim 37:279–294

  • Bishop C (1995) Neural networks for pattern recognition. Oxford University Press, New York

  • Clarke SM, Griebsch JH, Simpson TW (2005) Analysis of support vector regression for approximation of complex engineering analyses. Trans ASME J Mech Des 127(6):1077–1087

  • Cressie N (1988) Spatial prediction and ordinary kriging. Math Geol 20(4):405–421

  • De Boor C, Ron A (1990) On multivariate polynomial interpolation. Constr Approx 6:287–302

  • Fang H, Horstemeyer MF (2006) Global response approximation with radial basis functions. Eng Optim 38(4):407–424

  • Forrester AIJ, Keane AJ (2009) Recent advances in surrogate-based optimization. Prog Aerosp Sci 45(1–3):50–79

  • Friedman JH (1991) Multivariate adaptive regression splines. Ann Stat 19(1):1–67

  • Goel T, Haftka RT, Shyy W, Queipo NV (2007) Ensemble of surrogates. Struct Multidisc Optim 33:199–216

  • Hardy R (1971) Multiquadric equations of topography and other irregular surfaces. J Geophys Res 76:1905–1915

  • Jin R, Chen W, Simpson TW (2001) Comparative studies of metamodeling techniques under multiple modeling criteria. Struct Multidisc Optim 23(1):1–13

  • Kleijnen JPC, Sanchez SM, Lucas TW, Cioppa TM (2005) A user's guide to the brave new world of designing simulation experiments. INFORMS J Comput 17(3):263–289

  • Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Fourteenth international joint conference on artificial intelligence, pp 1137–1143

  • Langley P, Simon HA (1995) Applications of machine learning and rule induction. Commun ACM 38(11):55–64

  • McDonald D, Grantham W, Tabor W, Murphy M (2000) Response surface model development for global/local optimization using radial basis functions. In: The 8th AIAA symposium on multidisciplinary analysis and optimization, Long Beach, CA

  • Meckesheimer M, Barton R, Simpson T, Limayem F, Yannou B (2001) Metamodeling of combined discrete/continuous responses. AIAA J 39(10):1950–1959

  • Meckesheimer M, Barton R, Simpson T, Booker A (2002) Computationally inexpensive metamodel assessment strategies. AIAA J 40(10):2053–2060

  • Papadrakakis M, Lagaros N, Tsompanakis Y (1998) Structural optimization using evolution strategies and neural networks. Comput Methods Appl Mech Eng 156(1–4):309–333

  • Perrone M, Cooper L (1993) When networks disagree: ensemble methods for hybrid neural networks. In: Mammone RJ (ed) Artificial neural networks for speech and vision. Chapman and Hall, London, pp 126–142

  • Picard R, Cook R (1984) Cross-validation of regression models. J Am Stat Assoc 79(387):575–583

  • Powell M (1987) Radial basis functions for multivariable interpolation: a review. In: Mason JC, Cox MG (eds) Proceedings of the IMA conference on algorithms for the approximation of functions and data. Oxford University Press, London, pp 143–167

  • Queipo NV, Haftka RT, Shyy W, Goel T, Vaidyanathan R, Tucker PK (2005) Surrogate-based analysis and optimization. Prog Aerosp Sci 41:1–28

  • Sacks J, Schiller SB, Welch WJ (1989a) Designs for computer experiments. Technometrics 31(1):41–47

  • Sacks J, Welch WJ, Mitchell TJ, Wynn HP (1989b) Design and analysis of computer experiments. Stat Sci 4(4):409–435

  • Simpson TW, Toropov V, Balabanov V, Viana FAC (2008) Design and analysis of computer experiments in multidisciplinary design optimization: a review of how far we have come or not. In: 12th AIAA/ISSMO multidisciplinary analysis and optimization conference, AIAA 2008-5802, Victoria, BC, Canada

  • Viana FAC, Haftka RT, Steffen V (2009) Multiple surrogates: how cross-validation errors can help us to obtain the best predictor. Struct Multidisc Optim 39:439–457

  • Viana FAC, Gogu C, Haftka RT (2010) Making the most out of surrogate models: tricks of the trade. In: ASME 2010 international design engineering technical conferences and computers and information in engineering conference, DETC2010-8813, Montreal, Canada

  • Wang GG, Shan S (2007) Review of metamodeling techniques in support of engineering design optimization. Trans ASME J Mech Des 129(4):370–381

  • Yang Y (2003) Regression with multiple candidate models: selecting or mixing? Stat Sin 13(5):783–809

  • Zerpa L, Queipo N, Pintos S, Salager J (2005) An optimization methodology of alkaline-surfactant-polymer flooding processes using field scale numerical simulation and multiple surrogates. J Pet Sci Eng 47:197–208


Acknowledgements

The funding provided for this study by the National Natural Science Foundation of China under Grants No. 70931002 and No. 70672088 is gratefully acknowledged.

Author information


Corresponding author

Correspondence to Xiao Jian Zhou.

Additional information

Part of the work was presented at the 2010 2nd International Conference on Industrial Mechatronics and Automation (ICIMA 2010), Wuhan, China.

Appendices

Appendix A: Several metamodeling techniques

Here, four metamodeling techniques (PRS, RBF, Kriging, and SVR) are considered.

1.1 A.1 PRS

For PRS, the maximum order allowed in this paper is 4, but the order actually used for a specific problem is determined by the selected sample set. A polynomial model of order 4 can be expressed as:

$$ \begin{array}{rll} \widetilde{F}(x)&=&a_0 +\sum\limits_{i=1}^N {b_i x_i } +\sum\limits_{i=1}^N {c_{ii} x_i^2 } +\sum\limits_{i<j} {c_{ij} x_i x_j }\\ &&+\,\sum\limits_{i=1}^N {d_i x_i^3 } +\sum\limits_{i=1}^N {e_i x_i^4 } \end{array} $$
(21)

where \(\widetilde{F}\) is the response surface approximation of the actual response function, N is the number of variables in the input vector x, and a,b,c,d,e are the unknown coefficients to be determined by the least squares technique.

Notice that the 3rd- and 4th-order models do not have any mixed polynomial terms (interactions) of order 3 and 4. Only pure cubic and quartic terms are included, to reduce the amount of data required for model construction. A lower-order model (linear, quadratic, or cubic) includes only the corresponding lower-order polynomial terms. A fitting sketch is given below.
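For illustration only, the following minimal Python sketch (ours, not the authors' implementation) fits the reduced quartic model of (21) by least squares; the training data are hypothetical.

```python
import numpy as np
from itertools import combinations

def prs_design_matrix(X):
    """Design matrix for the reduced quartic PRS of (21): intercept,
    linear, pure quadratic, pairwise interaction, pure cubic, and
    pure quartic terms (no mixed cubic/quartic terms)."""
    n, N = X.shape
    cols = [np.ones(n)]                      # a_0
    cols += [X[:, i] for i in range(N)]      # b_i x_i
    cols += [X[:, i] ** 2 for i in range(N)] # c_ii x_i^2
    cols += [X[:, i] * X[:, j]
             for i, j in combinations(range(N), 2)]  # c_ij x_i x_j
    cols += [X[:, i] ** 3 for i in range(N)] # d_i x_i^3
    cols += [X[:, i] ** 4 for i in range(N)] # e_i x_i^4
    return np.column_stack(cols)

X = np.random.rand(30, 2)                    # hypothetical training inputs
y = np.sin(X[:, 0]) + X[:, 1] ** 2           # hypothetical responses
coeffs, *_ = np.linalg.lstsq(prs_design_matrix(X), y, rcond=None)
F_tilde = lambda x: prs_design_matrix(np.atleast_2d(x)) @ coeffs
```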

1.2 A.2 RBF

The general form of the RBF approximation can be expressed as:

$$ f(x)=\sum\limits_{i=1}^m {\beta _i \varphi \big(\big\| {x-x_i } \big\|\big)} $$
(22)

Powell (1987) considers several forms for the basis function φ(·):

  1. Gaussian: \(\varphi (r)=e^{-r^2/c^2}\)

  2. Multiquadrics: \(\varphi (r)=(r^2+c^2)^{1/2}\)

  3. Reciprocal multiquadrics: \(\varphi (r)=(r^2+c^2)^{-1/2}\)

  4. Thin-plate spline: \(\varphi (r)=(r/c)^2\log (r/c)\)

  5. Logistic: \(\varphi (r)=1/\big(1+e^{r/c}\big)\)

where c ≥ 0. In particular, the multiquadric RBF form has been applied by Meckesheimer et al. (2001, 2002) to construct an approximation, following Hardy (1971), who used linear combinations of a radially symmetric function based on the Euclidean distance of the form:

$$ \varphi (x)=\beta _0 +\sum\limits_{i=1}^n {\beta _i \big\| {{\rm {\bf x}}-{\rm {\bf x}}_i } \big\|} $$
(23)

where \(\left\| \cdot \right\|\) represents the Euclidean norm. Replacing φ(x) with the vector of response observations y yields a linear system of n equations in n unknowns, which is solved for β. As described above, this technique can be viewed as an interpolating process. RBF surrogates have produced good fits to arbitrary contours of both deterministic and stochastic responses (Powell 1987). Different RBF forms were compared by McDonald et al. (2000) on a hydrocode simulation, and the authors found that the Gaussian and the multiquadric RBF forms generally performed best.
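As an illustration of this interpolation process, here is a minimal sketch (not from the paper) that fits and evaluates an RBF model with the Gaussian basis from form 1 above; the shape parameter c is a hypothetical, user-chosen value.

```python
import numpy as np

def rbf_fit(X, y, c=1.0):
    """Solve the n x n linear system Phi beta = y described in the
    text (interpolation conditions, no polynomial tail)."""
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Phi = np.exp(-(r / c) ** 2)        # Gaussian basis phi(||x - x_i||)
    return np.linalg.solve(Phi, y)     # coefficients beta_i

def rbf_predict(X_train, beta, X_new, c=1.0):
    """Evaluate f(x) = sum_i beta_i * phi(||x - x_i||) at new points."""
    r = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=-1)
    return np.exp(-(r / c) ** 2) @ beta
```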

1.3 A.3 Kriging

For computer experiments, kriging is viewed from a Bayesian perspective where the response is regarded as a realization of a stationary random process. The general form of this model is expressed as:

$$ Y(x)=\sum\limits_{j=1}^k {\beta _j f_j (x)} +Z(x) $$
(24)

where \(f_j\), j = 1,...,k, are assumed to be known functions, \(\beta _j\) are unknown constants to be estimated, and Z(·) is a stochastic process, commonly assumed to be Gaussian, with mean zero and covariance

$$ \begin{array}{rll} Cov(Z(w),Z(u))&=&\sigma ^2R(w,u)\\ &=&\sigma ^2\exp \left\{-\theta \sum\limits_{i=1}^d{(w_i -u_i )^2} \right\} \end{array} $$

where \(\sigma ^2\) is the process variance. In practice, the linear model component in (24) is often reduced to only an intercept b, since the inclusion of a more complex linear model does not necessarily yield a better prediction.
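The following minimal sketch (an illustration under our own assumptions, not the authors' code) builds the Gaussian correlation matrix above and evaluates the resulting kriging predictor with the linear part reduced to an intercept b; the parameter theta is treated as known here, whereas in practice it is estimated, e.g. by maximum likelihood.

```python
import numpy as np

def ordinary_kriging(X, y, theta=1.0):
    """Kriging predictor of (24) with intercept-only linear part."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    R = np.exp(-theta * d2)                       # correlation matrix R(w, u)
    ones = np.ones(n)
    # Generalized least-squares estimate of the intercept b
    b = ones @ np.linalg.solve(R, y) / (ones @ np.linalg.solve(R, ones))
    resid = np.linalg.solve(R, y - b * ones)      # R^{-1}(y - 1 b)

    def predict(x_new):
        r = np.exp(-theta * ((X - x_new) ** 2).sum(axis=-1))
        return b + r @ resid                      # best linear unbiased predictor

    return predict

X = np.random.rand(20, 2)                         # hypothetical training inputs
y = np.sin(4 * X[:, 0]) * X[:, 1]                 # hypothetical responses
predict = ordinary_kriging(X, y, theta=5.0)
y_hat = predict(np.array([0.5, 0.5]))
```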

1.4 A.4 ε-SVR

Given the data set \(\{(x_1 ,y_1 ),\ldots ,(x_l ,y_l )\}\) (where l denotes the number of samples) and the kernel matrix \(K_{ij} =K(x_i ,x_j )\), if the loss function in SVR is the ε-insensitive loss function

$$ L_\varepsilon \left( {f\left( {\rm {\bf x}} \right)-y} \right)=\left\{ {\begin{array}{ll} 0, & \left| {f\left( {\rm {\bf x}} \right)-y} \right|<\varepsilon \\ \left| {f\left( {\rm {\bf x}} \right)-y} \right|-\varepsilon , & \mathrm{otherwise} \end{array}} \right. $$
(25)

then the ε-SVR is written as:

$$ \min \;\Phi \left( {{\rm {\bf w}},{\boldsymbol \xi }} \right)=\frac{1}{2}{\rm {\bf w}}^T{\rm {\bf w}}+C\sum\limits_{i=1}^l {\big( {\xi _i^- +\xi _i^+ } \big)} $$
(26)
$$ \mathrm{s.t.}\left\{ {\begin{array}{l} f\big( {{\rm {\bf x}}_i } \big)-y_i \le \varepsilon +\xi _i^+ \\ y_i -f\big( {{\rm {\bf x}}_i } \big)\le \varepsilon +\xi _i^- \\ \xi _i^- ,\xi _i^+ \ge 0, \end{array}} \right. \quad i=1,\cdots ,l. $$

The Lagrangian dual of the above model is expressed as:

$$ \begin{array}{rl} \mathop {\min }\limits_{{\boldsymbol \alpha }^{\left( \ast \right)}} & \dfrac{1}{2}\sum\limits_{i,j=1}^l {\left( {\alpha _i -\alpha _i^\ast } \right)\left( {\alpha _j -\alpha _j^\ast } \right)K\left( {{\rm {\bf x}}_i ,{\rm {\bf x}}_j } \right)} -\sum\limits_{i=1}^l {\left( {\alpha _i -\alpha _i^\ast } \right)y_i } +\varepsilon \sum\limits_{i=1}^l {\left( {\alpha _i +\alpha _i^\ast } \right)} \end{array} $$
(27)
$$ \mathrm{s.t.}\left\{ {\begin{array}{l} 0\le \alpha _i ,\alpha _i^\ast \le C,\; i=1,\cdots ,l, \\ \sum\limits_{i=1}^l {\left( {\alpha _i -\alpha _i^\ast } \right)=0.} \end{array}} \right. $$

where \(K\left( {\cdot ,\cdot } \right)\) is the kernel function. Once the parameters \(\alpha ^{\left( \ast \right)}\) have been determined, the regression function f(x) can be obtained.
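In practice the dual problem (27) is handed to an off-the-shelf QP solver. A minimal sketch using scikit-learn's SVR (one possible implementation, with illustrative values of C and ε and an RBF kernel playing the role of K(·,·)):

```python
import numpy as np
from sklearn.svm import SVR

X = np.random.rand(40, 2)             # hypothetical training inputs
y = np.sin(3 * X[:, 0]) + X[:, 1]     # hypothetical responses

# epsilon-SVR as in (26)-(27); C and epsilon are illustrative choices
model = SVR(kernel="rbf", C=100.0, epsilon=0.01)
model.fit(X, y)
y_hat = model.predict(X)              # surrogate predictions f(x)
```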

Appendix B: Box plots

In a box plot, the box is delimited by the lower quartile (25%), median (50%), and upper quartile (75%) values. Two lines (whiskers) extend from each end of the box; their upper and lower limits are defined as follows:

$$ \mathrm{low\_limit} = \max \left\{ Q_1 -1.5\,\mathrm{IQR},\; X_{\mathrm{minimum}} \right\} $$
(28)
$$ \mathrm{up\_limit} = \min \left\{ Q_3 +1.5\,\mathrm{IQR},\; X_{\mathrm{maximum}} \right\} $$
(29)

where \(Q_1\) is the lower-quartile value, \(Q_3\) is the upper-quartile value, \(\mathrm{IQR} = Q_3 - Q_1\), and \(X_{\mathrm{minimum}}\) and \(X_{\mathrm{maximum}}\) are the minimum and maximum values of the data. Points beyond the ends of the lines are outliers and are marked with a "+" sign.
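A minimal sketch (ours, with hypothetical data) that computes these whisker limits and flags the outliers per (28) and (29):

```python
import numpy as np

def whisker_limits(x):
    """Whisker limits per (28)-(29)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    low_limit = max(q1 - 1.5 * iqr, x.min())
    up_limit = min(q3 + 1.5 * iqr, x.max())
    return low_limit, up_limit

x = np.random.randn(200)              # hypothetical error data
low, up = whisker_limits(x)
outliers = x[(x < low) | (x > up)]    # points drawn as "+" in the plot
```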


About this article

Cite this article

Zhou, X.J., Ma, Y.Z. & Li, X.F. Ensemble of surrogates with recursive arithmetic average. Struct Multidisc Optim 44, 651–671 (2011). https://doi.org/10.1007/s00158-011-0655-6

