
Deep Gaussian process for multi-objective Bayesian optimization

  • Research Article
  • Published in Optimization and Engineering

Abstract

Bayesian Optimization has become a widely used approach to optimization involving computationally intensive black-box functions, such as the design optimization of complex engineering systems. It is often based on Gaussian Process regression as a Bayesian surrogate model of the exact functions, and it has been applied to both single- and multi-objective optimization problems. In the multi-objective case, the Bayesian models used in optimization often consider the multiple objectives separately and do not take into account the possible correlations between them near the Pareto front. In this paper, a Multi-Objective Bayesian Optimization algorithm based on Deep Gaussian Processes is proposed in order to model the objective functions jointly. This makes it possible to exploit the correlations (linear and non-linear) between the objectives, improving the exploration of the search space and speeding up the convergence to the Pareto front. The proposed algorithm is compared with classical Bayesian Optimization on four analytical functions and two aerospace engineering problems.

Data Availability Statement

The datasets and analytical problems generated in and supporting the findings of this article are available from the corresponding author upon reasonable request.

References

  • Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, et al. (2016) TensorFlow: a system for large-scale machine learning. In: 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp 265–283

  • Alvarez MA, Rosasco L, Lawrence ND (2011) Kernels for vector-valued functions: a review. arXiv:1106.6251

  • Arias-Montano A, Coello CAC, Mezura-Montes E (2012) Multiobjective evolutionary algorithms in aeronautical and aerospace engineering. IEEE Trans Evolut Comput 16:662–694

  • Auger A, Bader J, Brockhoff D, Zitzler E (2012) Hypervolume-based multiobjective optimization: theoretical foundations and practical implications. Theoret Comput Sci 425:75–103

  • Bader J, Zitzler E (2011) HypE: an algorithm for fast hypervolume-based many-objective optimization. Evol Comput 19:45–76

  • Beume N, Naujoks B, Emmerich M (2007) SMS-EMOA: multiobjective selection based on dominated hypervolume. Eur J Oper Res 181:1653–1669

  • Bradstreet L (2011) The hypervolume indicator for multi-objective optimisation: calculation and use. Ph.D. thesis, University of Western Australia Perth

  • Brevault L, Balesdent M, Hebbal A (2020) Multi-objective multidisciplinary design optimization approach for partially reusable launch vehicle design. J Spacecr Rockets 58:1–17

  • Bui T, Hernández-Lobato D, Hernandez-Lobato J, Li Y, Turner R (2016) Deep Gaussian processes for regression using approximate expectation propagation. In: International conference on machine learning, pp 1472–1481

  • Bussemaker JH, Ciampa PD, Nagel B (2020) System architecture design space exploration: an approach to modeling and optimization. In: AIAA Aviation 2020 Forum, p 3172

  • Castellini F, Riccardi A, Lavagna M, Büskens C (2011) Global and local multidisciplinary design optimization of expendable launch vehicles. In: 52nd AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference

  • Chauhan SS, Martins JRRA (2018) Low-fidelity aerostructural optimization of aircraft wings with a simplified wingbox model using OpenAeroStruct. In: International conference on engineering optimization, pp 418–431

  • Chiplunkar A, Rachelson E, Colombo M, Morlier J (2016) Approximate inference in related multi-output Gaussian process regression. In: International conference on pattern recognition applications and methods, pp 88–103

  • Constantinescu EM, Anitescu M (2013) Physics-based covariance models for Gaussian processes with multiple outputs. Int J Uncertain Quantif 3(1):47–71

  • Cutajar K, Bonilla EV, Michiardi P, Filippone M (2017) Random feature expansions for deep Gaussian processes. In: International conference on machine learning, pp 884–893

  • Cutajar K, Pullin M, Damianou A, Lawrence N, González J (2019) Deep Gaussian processes for multi-fidelity modeling. arXiv:1903.07320

  • Dai Z, Damianou A, González J, Lawrence N (2015) Variational auto-encoded deep Gaussian processes. arXiv:1511.06455

  • Damianou A (2015) Deep Gaussian processes and variational propagation of uncertainty. Ph.D. thesis, University of Sheffield

  • Damianou A, Lawrence N (2013) Deep Gaussian processes. In: Artificial intelligence and statistics (AISTATS), pp 207–215

  • De Matthews AG, Van Der Wilk M, Nickson T, Fujii K, Boukouvalas A, León-Villagrá P, Ghahramani Z, Hensman J (2017) GPflow: a Gaussian process library using TensorFlow. J Mach Learn Res 18:1299–1304

  • Deb K (2001) Multi-objective optimization using evolutionary algorithms, vol 16. Wiley, New York

  • Deb K, Agrawal S, Pratap A, Meyarivan T (2000) A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. In: International conference on parallel problem solving from nature, pp 849–858

  • Deb K, Thiele L, Laumanns M, Zitzler E (2005) Scalable test problems for evolutionary multiobjective optimization. Evolutionary multiobjective optimization. Springer, Berlin, pp 105–145

  • Durantin C, Marzat J, Balesdent M (2016) Analysis of multi-objective Kriging-based methods for constrained global optimization. Comput Optim Appl 63:903–926

  • Emmerich M, Klinkenberg J (2008) The computation of the expected improvement in dominated hypervolume of Pareto front approximations. Rapport Techn Leiden Univ 34:3–7

  • Emmerich M, Giannakoglou KC, Naujoks B (2006) Single-and multiobjective evolutionary optimization assisted by Gaussian random field metamodels. IEEE Trans Evolut Comput 10:421–439

  • Emmerich M, Yang K, Deutz A, Wang H, Fonseca CM (2016) A multicriteria generalization of Bayesian global optimization. In: Advances in stochastic and deterministic global optimization. Springer, pp 229–242

  • Forrester A, Sobester A, Keane A (2008) Engineering design via surrogate modelling: a practical guide. Wiley, New York

  • Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell PAMI-6:721–741

  • Hadka D (2015) Platypus: multiobjective optimization in Python

  • Haibin Y, Chen Y, Low BKH, Jaillet P, Dai Z (2019) Implicit posterior variational inference for deep Gaussian processes. In: Advances in neural information processing systems, pp 14502–14513

  • Hebbal A (2021) Deep Gaussian processes for the analysis and optimization of complex systems-application to aerospace system design. Ph.D. thesis, Université de Lille

  • Hebbal A, Brevault L, Balesdent M, Talbi E-G, Melab N (2019a) Bayesian optimization using deep Gaussian processes. arXiv:1905.03350

  • Hebbal A, Brevault L, Balesdent M, Talbi E-G, Melab N (2019b) Multi-fidelity modeling using DGPs: improvements and a generalization to varying input space dimensions. In: 33rd conference on neural information processing systems, Vancouver

  • Hernández-Lobato D, Hernandez-Lobato J, Shah A, Adams R (2016) Predictive entropy search for multi-objective Bayesian optimization. In: International conference on machine learning, pp 1492–1501

  • Jasa JP, Hwang JT, Martins JRRA (2018) Open-source coupled aerostructural optimization using Python. Struct Multidiscip Optim 57:1815–1827

  • Jeong S, Obayashi S (2005) Efficient global optimization (EGO) for multi-objective problem and data mining. In: 2005 IEEE congress on evolutionary computation, pp 2138–2145

  • Jones DR, Schonlau M, Welch WJ (1998) Efficient global optimization of expensive black-box functions. J Glob Optim 13:455–492

  • Knowles J (2006) ParEGO: a hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems. IEEE Trans Evol Comput 10:50–66

  • Kullback S (1997) Information theory and statistics. Courier Corporation, North Chelmsford

  • Kursawe F (1990) A variant of evolution strategies for vector optimization. In: International conference on parallel problem solving from nature, pp 193–197

  • Liu W, Zhang Q, Tsang E, Liu C, Virginas B (2007) On the performance of metamodel assisted MOEA/D. In: International symposium on intelligence computation and applications, pp 547–557

  • Matheron G (1967) Kriging or polynomial interpolation procedures. CIMM Trans 70:240–244

  • Nebro AJ, Durillo JJ, Garcia-Nieto J, Coello CAC, Luna F, Alba E (2009) SMPSO: a new PSO-based metaheuristic for multi-objective optimization. In: 2009 IEEE symposium on computational intelligence in multi-criteria decision-making, pp 66–73

  • Oliver MA, Webster R (1990) Kriging: a method of interpolation for geographical information systems. Int J Geograph Inf Syst 4:313–332

  • Perdikaris P, Raissi M, Damianou A, Lawrence ND, Karniadakis GE (2017) Nonlinear information fusion algorithms for data-efficient multi-fidelity modelling. Proc Roy Soc A Math Phys Eng Sci 473:20160751

  • Picheny V (2015) Multiobjective optimization using Gaussian process emulators via stepwise uncertainty reduction. Stat Comput 25:1265–1280

  • Ponweiser W, Wagner T, Biermann D, Vincze M (2008) Multiobjective optimization on a limited budget of evaluations using model-assisted S-metric selection. In: International conference on parallel problem solving from nature, pp 784–794

  • Salimbeni H, Deisenroth M (2017) Doubly stochastic variational inference for deep Gaussian processes. In: Advances in neural information processing systems, pp 4588–4599

  • Schonlau M, Welch WJ, Jones D (1996) Global optimization with nonparametric function fitting. In: Proceedings of the ASA, section on physical and engineering sciences, pp 183–186

  • Shah A, Ghahramani Z (2016) Pareto frontier learning with expensive correlated objectives. In: International conference on machine learning, pp 1919–1927

  • Stein ML (1999) Interpolation of spatial data: some theory for kriging. Springer, Berlin

  • Svenson JD, Santner TJ (2016) Multiobjective optimization of expensive-to-evaluate deterministic computer simulator models. Comput Stat Data Anal 94:250–264

  • Talbi E-G (2009) Metaheuristics: from design to implementation, vol 74. Wiley, New York

  • Talbi E-G, Basseur M, Nebro AJ, Alba E (2012) Multi-objective optimization using metaheuristics: non-standard algorithms. Int Trans Oper Res 19:283–305

  • Toal DJJ, Keane AJ (2012) Non-stationary kriging for design optimization. Eng Optim 44:741–765

  • Wang Y, Brubaker M, Chaib-Draa B, Urtasun R (2016) Sequential inference for deep Gaussian process. In: Proceedings of the 19th international conference on artificial intelligence and statistics (AISTATS), JMLR W&CP vol 51, Cadiz, Spain

  • Williams CKI, Rasmussen CE (2006) Gaussian processes for machine learning, vol 2. MIT Press, Cambridge

  • Yang K, Emmerich M, Deutz A, Bäck T (2019) Efficient computation of expected hypervolume improvement using box decomposition algorithms. J Global Optim 75:3–34

  • Zhang Q, Liu W, Tsang E, Virginas B (2010) Expensive multiobjective optimization by MOEA/D with Gaussian process model. IEEE Trans Evol Comput 14:456–474

  • Zitzler E, Laumanns M, Thiele L, Fonseca CM, da Fonseca VG (2002) Why quality assessment of multiobjective optimizers is difficult. In: Proceedings of the 4th annual conference on genetic and evolutionary computation, pp 666–674

Acknowledgements

The work of Ali Hebbal is part of a PhD thesis co-funded by ONERA, the French Aerospace Lab, and the University of Lille.

Author information

Corresponding author

Correspondence to Mathieu Balesdent.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: numerical setup

  • The experiments presented in this paper were carried out using the Grid’5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER, several universities, and other organizations (see https://www.grid5000.fr).

  • All experiments have been performed on Grid’5000 using a Tesla P100 GPU.

  • The code involving GPs and DGPs is based on TensorFlow (Abadi et al. 2016), GPflow (De Matthews et al. 2017) (https://github.com/GPflow/GPflow), and Doubly-Stochastic-DGP (Salimbeni and Deisenroth 2017) (https://github.com/ICL-SML/Doubly-Stochastic-DGP), in Python 3.

  • The data are always standardized (zero mean and unit variance).

  • For all DGPs, ARD RBF (Automatic Relevance Determination Radial Basis Function) kernels are used, with length-scales and variance initialized to 1 unless they inherit an initialization from a previous DGP (see the setup sketch after this list).

  • Four Gibbs sampling iterations are used, with 1000 samples drawn in each iteration.

  • LMC (the Linear Model of Coregionalization) is used with a coregionalization matrix of rank 2 (see the coregionalization sketch after this list).

  • The Python package Platypus (Hadka 2015) (https://github.com/Project-Platypus/Platypus) is used for NSGA-II, with the default parameters proposed in Deb et al. (2000) and a population of 5 candidates. The population evolves until \(15\times d\) evaluations of the objective functions are reached, where d is the dimension of the input space (see the NSGA-II sketch after this list).

  • For MO-DGP, the inducing inputs at the different layers are initialized to \({{\textbf {X}}}\).

  • The mean of the variational distribution of the inducing variables for the layer i is initialized at \({{\textbf {y}}}_i\).

  • MO-DGP is optimized in three stages. In the first, 3000 Adam optimization steps are performed while the variational parameters and the inducing inputs are kept fixed. In the second, the inducing inputs are also optimized over 3000 further Adam steps. Then, 15,000 iterations of the algorithm are performed (see the staged-training sketch below).

Table 7 Total number of hyper-parameters for MO-BO using an ARD RBF kernel and MO-DGP

Cite this article

Hebbal, A., Balesdent, M., Brevault, L. et al. Deep Gaussian process for multi-objective Bayesian optimization. Optim Eng 24, 1809–1848 (2023). https://doi.org/10.1007/s11081-022-09753-0
