Introduction

Compared to conventional optimization algorithms, evolutionary algorithms (EAs) are better suited to handling many complex problems that arise in real-world applications [1, 2]. EAs have therefore been widely applied in practice, including drug design [3], control engineering [4], and wing configuration design [5]. However, EAs generally require thousands of fitness evaluations to reach a satisfactory candidate solution. In many engineering optimization tasks, a single numerical simulation can take minutes, hours, or even days to complete; in computational fluid dynamics (CFD), for example, evaluating a single candidate design generally requires several hours of simulation. Furthermore, the number of required fitness evaluations grows with the dimension of the optimization problem, so running hundreds or thousands of evaluations incurs a high computational cost. To solve such expensive optimization problems, surrogate model-based EAs, in which a surrogate model (also called a meta-model) is used in place of the expensive original function, are often employed.

Over the last few decades, a variety of surrogate-assisted evolutionary algorithms (SAEAs) have been proposed. Existing strategies for employing surrogate models in SAEAs can generally be divided into single-surrogate and multi-surrogate model-based strategies, depending on the number of surrogate models used. Although many generic machine learning methods have been used to build surrogate models, no specific rule has been established for determining which type of model is most suitable as a surrogate [6]. In general, single-surrogate model-based EAs employ the Gaussian process (GP) model, most likely because it predicts a candidate solution's fitness while also providing an estimate of the error of that prediction. Several infill sampling criteria that exploit the GP's prediction and error estimate, including the expected improvement criterion [7, 8] and the lower confidence bound criterion [9, 10], have been proposed as guides toward promising solutions. Most non-GP models, including polynomial regression surface models [11, 12], artificial neural networks [13, 14], radial basis functions (RBFs) [15,16,17], and many others [18], can only provide predictions without error estimates. Because of this limitation, multiple-model-based EAs, which produce multiple predictions from multiple models, are applied to avoid cases in which the algorithm runs into local optima.

For SAEAs, the core problem is how to use surrogate models to guide the optimization process reasonably. When an expensive problem has a high-dimensional decision space, it becomes more challenging for SAEAs to employ surrogate models effectively. First, the number of training samples required by the surrogate model grows exponentially as the problem dimension increases [19]. This implies that more evaluations of the expensive function are required, which is often infeasible in real applications. Given the scarcity of samples on high-dimensional expensive problems, it is difficult to construct a single surrogate model with high accuracy [20], and it is well known that inaccurate surrogates can mislead the optimization process. Second, building a surrogate model takes more time as the problem dimension increases. For example, with the Gaussian process (GP) model, global optimization of high-dimensional acquisition functions is intrinsically hard and can be prohibitively expensive [21]. Generally speaking, existing research on extending SAEAs to high-dimensional expensive problems can be roughly classified into three categories:

The first strategy is to deal with the lack of samples through data processing and data generation methods. DDEA-PES [22] used data perturbation to generate diverse datasets. SAEO [23] trained and activated the surrogate model only after enough data samples were collected. ESAO [24] randomly projected training samples into a set of low-dimensional sub-spaces rather than training in the original high-dimensional space.

The second strategy is to improve the performance of surrogate models. In [25], a GP model was combined with the partial least squares method to solve high-dimensional problems with up to 50 design variables. In our previous work, we proposed a multi-objective infill criterion [26] for GP model management. TR-SADEA [27] employs a self-adaptive GP model for antenna design. An RBF-assisted approach based on granulation was proposed in [28, 29]. In [30], a radial basis function network (RBFN) with a trust-region approach was used as a local model for solving 20-dimensional problems. Wang and Jin [31] employed three widely used models (i.e., PR, RBF, and GP) to construct a global ensemble model and a local ensemble model. Li et al. [32] employed two criteria to balance exploitation and convergence on medium-scale computationally expensive problems. MS-RV [33] transferred knowledge from the coarse surrogate to the fine surrogate in offline data-driven optimization.

The third strategy is to improve optimization efficiency through multiple swarms. Multiple swarms were used in SA-COSO [34] for solving high-dimensional problems ranging from 30 to 200 dimensions. Pan et al. [35] proposed an efficient surrogate-assisted hybrid optimization (SAHO) algorithm that combines two EAs (TLBO and DE) as the basic optimizer for 100-dimensional problems.

This paper proposes a variable surrogate model-based particle swarm optimization (VSMPSO) algorithm for high-dimensional expensive problems. To the best of our knowledge, VSMPSO is the first attempt to extend a single surrogate-assisted EA to 200-dimensional problems. The main contributions of this paper are as follows:

  • The proposed VSMPSO does not focus on improving the accuracy of surrogate models; rather, relying on the blessing of uncertainty [36], it employs only one RBF model as a single surrogate, in combination with the proposed variable surrogate model strategy, to explore different promising areas in different generations and thus avoid model misdirection throughout the optimization process.

  • The prediction ability of the surrogate model is not used solely to predict the current population. To mine the surrogate's prediction information more deeply, the most promising point of the surrogate model is identified and transferred into the optimizer population to accelerate the optimization.

  • The proposed VSMPSO framework can be applied in any surrogate-assisted evolutionary algorithm, irrespective of the surrogate model used.

The remainder of this paper is organized as follows: the next section gives a brief overview of the related techniques used in this paper. The main framework of the proposed algorithm is then presented in the subsequent section. The penultimate section compares several state-of-the-art algorithms on widely used benchmark problems with 30, 50, 100, and 200 dimensions. The final section concludes the paper.

Related techniques in VSMPSO

Particle swarm optimization (PSO)

The canonical PSO, developed by Eberhart and Kennedy in 1995, is a population- or swarm-based intelligent optimization algorithm inspired by the social behaviors of groups of organisms such as flocking birds or schooling fish [37]. Eqs. (1) and (2) describe the evolution of \(x_j\) (the position of the jth individual at generation \((t+1)\)) along the dth dimension in the canonical PSO:

$$\begin{aligned} x_{j}^{d}(t+1) = x_{j}^{d}(t)+\varDelta x_{j}^{d}(t+1) \end{aligned}$$
(1)
$$\begin{aligned} \varDelta x_{j}^{d}(t+1) = \omega \varDelta x_{j}^{d}(t) + c_{1} r_{1} \cdot \left( Pbest_{j}^{d}(t)-x_{j}^{d}(t)\right) + c_{2} r_{2} \cdot \left( Gbest^{d}(t)-x_{j}^{d}(t)\right) . \end{aligned}$$
(2)
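For reference, the update in Eqs. (1) and (2) can be sketched in NumPy as follows; the inertia and acceleration coefficients (`omega`, `c1`, `c2`) are typical default values, not values prescribed by this paper.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, omega=0.7, c1=1.5, c2=1.5, rng=None):
    """One velocity/position update of the whole swarm (Eqs. (1)-(2)).

    x, v  : (N, D) arrays of positions and velocities
    pbest : (N, D) personal best positions
    gbest : (D,)   global best position of the swarm
    """
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)  # element-wise random factors in [0, 1)
    r2 = rng.random(x.shape)
    v_new = omega * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (2)
    x_new = x + v_new                                                  # Eq. (1)
    return x_new, v_new
```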

The feature distinguishing the canonical PSO from other EAs, such as the genetic algorithm (GA) or differential evolution (DE), is that it converges rapidly but easily falls into local optima. To prevent premature convergence, a variety of modified PSOs have been proposed, including the comprehensive learning PSO [38], distance-based locally informed PSO [39], social learning PSO [40], and competitive swarm optimizer (CSO) [41]. Based on the effective performance of social learning particle swarm optimization (SLPSO), we propose a simplified SLPSO to generate candidate solutions; its primary structure is similar to the SLPSO algorithm proposed by Cheng and Jin. In this simplified SLPSO, individuals \(x_j\) are updated using the following formulas:

$$\begin{aligned} x_{j}^{d}(t+1) = \left\{ \begin{array}{ll} x_{j}^{d}(t)+\varDelta x_{j}^{d}(t+1) &{} \text{ if } p_{j}(t) \leqslant P_{j}^{L} \\ x_{j}^{d}(t) &{} \text{ otherwise } \end{array}\right. \end{aligned}$$
(3)
$$\begin{aligned} \varDelta x_{j}^{d}(t+1) = r_{1} \cdot \varDelta x_{j}^{d}(t) + r_{2} \cdot \left( x_{k}^{d}(t)-x_{j}^{d}(t)\right) + r_{3} \cdot \varepsilon \cdot \left( \overline{x}_{d}(t)-x_{j}^{d}(t)\right) , \end{aligned}$$
(4)

where \(1\leqslant j<N\), N is the population size, \(1\leqslant d\leqslant D\), and D is the dimension of the search space. In each generation, the population is sorted by fitness from worst to best, with \(x_1\) and \(x_N\) representing the worst and best solutions, respectively, at the current generation. \(x_k\) is a randomly chosen demonstrator for \(x_j\), with \(j<k\leqslant N\), and \(x_{k}^{d}\left( t \right) \) represents the dth element of \(x_k\); note that a demonstrator is chosen for each element of \(x_j\). \(P_{j}^{L}\) is the learning probability, which is inversely proportional to the fitness of \(x_j\); \(p_j\left( t \right) \) is a randomly generated probability for \(x_j\); \(r_1\), \(r_2\), and \(r_3\) are random numbers in the range \(\left[ 0,1 \right] \); and \(\varepsilon \) is the social influence factor that controls the influence of \(\bar{x}_d\left( t \right) \), the mean position of the population along the dth dimension at generation t in Eq. (4). If a uniform sampling method such as Latin hypercube sampling (LHS) is used for initialization, \(\bar{x}_d\left( t \right) \) can be quite close to the \(1\times D\) zero vector \(o=\left[ 0,\ldots ,0 \right] \). For a test function whose global optimum lies at the zero vector o, the mean-position term could then coincidentally pull the population toward the optimum; to avoid this coincidence, we set the parameter \(r_3\) to zero, removing the effect of \({{\bar{x}}_{d}}(t)\). This simplifies Eq. (4) to

$$\begin{aligned} \varDelta x_{j}^{d}\left( t+1 \right) = r_1\cdot \varDelta x_{j}^{d}\left( t \right) + r_2\cdot \left( x_{k}^{d}\left( t \right) -x_{j}^{d}\left( t \right) \right) . \end{aligned}$$
(5)

In this study, we generate new swarms using Eqs. (3) and (5); in the following sections, this SLPSO variant is referred to as 'PSO' for brevity.
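As an illustration, the simplified SLPSO update of Eqs. (3) and (5) can be sketched as below. The learning-probability formula `p_learn` used here is a simplified placeholder (the paper follows the SLPSO setting of [40]), so treat this as a sketch of the update rule, not the exact parameterization.

```python
import numpy as np

def slpso_step(x, dx, fitness, rng=None):
    """One update of the simplified SLPSO (Eqs. (3) and (5)), minimisation.

    x, dx   : (N, D) positions and position increments
    fitness : (N,) fitness values of the particles
    """
    rng = np.random.default_rng() if rng is None else rng
    N, D = x.shape
    order = np.argsort(-fitness)          # sort from worst to best
    x, dx = x[order], dx[order]
    x_new, dx_new = x.copy(), dx.copy()
    for j in range(N - 1):                # the best particle (index N-1) is not updated
        p_learn = 1.0 - j / N             # placeholder: worse particles learn more often
        if rng.random() <= p_learn:       # Eq. (3): update with probability P_j^L
            k = rng.integers(j + 1, N, size=D)  # one demonstrator per dimension
            r1, r2 = rng.random(D), rng.random(D)
            dx_new[j] = r1 * dx[j] + r2 * (x[k, np.arange(D)] - x[j])  # Eq. (5)
            x_new[j] = x[j] + dx_new[j]
    return x_new, dx_new
```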

RBF network

The functionality of an RBF network, a type of neural network, is described in detail in [42]; its output can be represented in the following form:

$$\begin{aligned} \hat{f}(\textrm{x})=\sum _{i=1}^{M} \omega _{i} \phi \left( \left\| \textrm{x}-x_{i}\right\| \right) , \end{aligned}$$
(6)

where \(x\in \mathbb {R}^D\) is an input vector, \(\phi \) is the basis function of the RBF network, \(\Vert \cdot \Vert \) is the 2-norm (also called the Euclidean norm), \(\omega _i\) are the weights, and M is both the number of input units in the RBF input layer and the number of samples used to build the RBF model. Because the basis function \(\phi \) is one of the key factors affecting model performance, many forms of \(\phi \) have been developed, including multi-quadric, thin plate spline, Gaussian, and cubic forms. A comparison of different choices of \(\phi \) in [43] revealed that the thin plate spline, linear, and cubic RBFs theoretically perform better than either the multi-quadric or Gaussian RBF. Additionally, numerical investigations have demonstrated that the cubic RBF can outperform the thin plate spline and multi-quadric RBFs [44]. Furthermore, cubic RBF-assisted EAs have been successfully used for local function approximation. Based on these previous studies, the proposed method employs the cubic basis function to construct the RBF network, a common machine learning technique for fitness approximation [2, 8]. The cubic basis function employed in this study is \(\phi \left( \Vert x-x_i \Vert \right) =\left( \Vert x-x_i \Vert \right) ^3\).
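A minimal sketch of cubic-RBF interpolation per Eq. (6) with \(\phi(r)=r^3\) is shown below. Practical implementations (e.g., the MATSuMoTo toolbox used later in this paper) typically add a polynomial tail and regularisation; those refinements are omitted here for clarity.

```python
import numpy as np

def rbf_fit(X, y):
    """Solve for weights w_i in f_hat(x) = sum_i w_i * ||x - x_i||^3 (Eq. (6))."""
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    Phi = r ** 3                                                # cubic basis matrix
    return np.linalg.solve(Phi, y)                              # interpolation weights

def rbf_predict(Xq, X, w):
    """Evaluate the fitted cubic RBF surrogate at query points Xq."""
    r = np.linalg.norm(Xq[:, None, :] - X[None, :, :], axis=-1)
    return (r ** 3) @ w
```

By construction the surrogate interpolates the training data exactly, which is the behavior expected of an RBF model built from a small sample set.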

Proposed VSMPSO algorithm

VSMPSO framework

Figure 1 presents the overall framework of VSMPSO. The solid black arrow lines represent the control flow of the algorithm, and the green dotted arrow lines mark the data flow. At the beginning of VSMPSO, the initial individuals are generated, all of which are evaluated and stored in the database DB. In each generation, the variable model management strategy decides how to construct the surrogate model for fitness estimation and how to select promising solutions for fitness evaluation. In Part I of the variable model management strategy in Fig. 1, simple random sampling is used to select samples from the database DB for building an RBF model. The RBF model is then searched to find its most promising point in the global search space, followed by knowledge transfer from the RBF model to the current population. Then, in Part II of the variable model management strategy in Fig. 1, the infill criterion used in VSMPSO selects potential optimum points to be evaluated and placed into DB.

The main components of VSMPSO are detailed in Algorithm 1. The optimizer used in VSMPSO is a variant of SLPSO. Steps 2–4 of Algorithm 1 generate the initial population P(1) by Latin hypercube sampling (LHS) and then create the database DB from all initial individuals in P(1) together with their real fitness values. As shown in steps 6–11, the main optimization loop contains the two main components of the proposed algorithm, described in detail in Algorithms 2 and 3. In steps 8 and 10, the variable model management strategy governs model construction and the infill criterion, as described in the "Variable model management strategy" section. In step 9, knowledge from the RBF model is transferred to P(t), the population at generation t. Finally, the program outputs the most satisfactory solution with its real fitness value and ends. Overall, VSMPSO contains two nested loops: the outer loop uses the variant SLPSO as the optimizer (Algorithm 1), and the inner loop uses the canonical PSO as the optimizer (Algorithm 3). In Algorithm 3, the current RBF model serves as the objective function driving the canonical PSO iterations, and the global best solution found by the canonical PSO is then transferred to the variant SLPSO. The following subsections detail the two main components of the proposed algorithm, i.e., the variable model management strategy and the knowledge transfer from model to population.
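The LHS initialization in steps 2–4 can be sketched as follows; `expensive_f` in the usage comment stands in for the real expensive objective and is purely illustrative.

```python
import numpy as np

def lhs_init(n, lb, ub, rng=None):
    """Draw n points by Latin hypercube sampling in the box [lb, ub]."""
    rng = np.random.default_rng() if rng is None else rng
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    samples = np.empty((n, lb.size))
    for d in range(lb.size):
        # one point per stratum: permute the n strata, then jitter within each
        strata = (rng.permutation(n) + rng.random(n)) / n
        samples[:, d] = lb[d] + strata * (ub[d] - lb[d])
    return samples

# Usage: P1 = lhs_init(N, lb, ub); DB = [(x, expensive_f(x)) for x in P1]
# i.e., every initial individual is evaluated once and archived with its fitness.
```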

Fig. 1

VSMPSO flow diagram

(Algorithm 1 pseudocode figure)

Variable model management strategy

The key issues influencing surrogate performance are model selection and model management. First, regarding model selection: according to previous work, constructing GP models on high-dimensional problems is time-consuming, whereas RBF has been shown to perform better with small samples than other common surrogate models [45]; we therefore choose RBF as the surrogate model. Second, regarding model management: as mentioned in the "Introduction" section, the scarcity of samples on high-dimensional expensive problems makes it difficult to construct a single surrogate model with high accuracy. Given the blessing of uncertainty and the multiple local optima of the original expensive problems, an accurate surrogate model is not always necessary for optimization. Thus, unlike previous work focusing on improving model accuracy, this work trains only one global surrogate model. Furthermore, because different samples may yield models oriented toward different promising areas, a global model management strategy inspired by simple random sampling is proposed to enhance the diversity of the single model between generations and thus avoid persistently misleading the search in a wrong direction throughout the optimization process.

(Algorithm 2 pseudocode figure)

As seen in the pseudo-code of Algorithm 1, the RBF model is updated in every generation. In step 2 of Algorithm 2, \(\lambda =\left[ M\times 80\% \right] \) samples are selected using simple random sampling; the number of selected samples \(\lambda \) accounts for 80% of the total sample size, a common empirical value used in K-fold cross-validation [46] with \(K=5\). The effectiveness of this strategy and sample size is further verified in the "Effects of variable model management strategy" and "Parameter sensitivity analysis" sections. After the fitness of P(t) is estimated by the current RBF model in step 4, the population P(t) is sorted by estimated fitness from worst to best in step 5, with \(x_1\) and \(x_N\) representing the worst solution PGworst and the best solution PGbest, respectively. As shown in steps 6–7, two potential optimum points are selected for evaluation by the original expensive function: PGbest, the global optimum individual of the current population, and MGbest, the most promising point of the current surrogate model, obtained from Algorithm 3.
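A hedged sketch of this model management step follows: a random 80% of the archive trains the model, and the population is ranked by predicted fitness to identify PGworst and PGbest. `fit_rbf` is a placeholder for any RBF trainer returning a callable predictor, not an API from the paper.

```python
import numpy as np

def model_management_step(db_X, db_y, pop, fit_rbf, frac=0.8, rng=None):
    """One pass of Algorithm 2 (sketch): rebuild surrogate, rank population."""
    rng = np.random.default_rng() if rng is None else rng
    M = len(db_y)
    lam = int(round(M * frac))                    # lambda = [M x 80%]
    idx = rng.choice(M, size=lam, replace=False)  # simple random sampling
    model = fit_rbf(db_X[idx], db_y[idx])         # surrogate rebuilt each generation
    pred = np.array([model(p) for p in pop])      # estimated fitness of P(t)
    order = np.argsort(-pred)                     # sort worst -> best (minimisation)
    pop = pop[order]
    pg_worst, pg_best = pop[0], pop[-1]           # PGworst and PGbest
    return pop, pg_worst, pg_best
```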

Knowledge transfer strategy

As described in Algorithm 2, in each iteration of the optimization loop, simple random sampling selects different samples to construct the RBF model, and the resulting RBF models may point toward different promising areas in different generations. Thus, in step 3 of Algorithm 3, to mine the RBF model further, the RBF model is treated as an objective function, defined as follows:

$$\begin{aligned} \min \quad&F_{\textrm{RBF}}\left( \textbf{x} \right) \\ \text {s.t. } \quad&{{\textbf{x}}_{l}}\le \textbf{x}\le {{\textbf{x}}_{u}}, \end{aligned}$$
(7)

where \(\textbf{x}\in {{\mathbb {R}}^{D}}\) is the decision vector, whose feasible region is the same search space as that of the original expensive problem, \(F_{\textrm{RBF}}\left( \textbf{x} \right) \) is the objective function, and \({{\textbf{x}}_{l}}\) and \({{\textbf{x}}_{u}}\) are the lower and upper bounds of the decision variables. In steps 4–9, a canonical PSO is employed to find the global optimum of the current RBF model. In step 6, the canonical PSO updates individuals using Eqs. (1) and (2). In step 8, Pbest (the personal best of each particle) and Gbest (the global best of the swarm) are updated in each generation. After the iterations are completed, in step 11, Gbest, the final best solution of the RBF model found by the canonical PSO, is assigned to MGbest, which is considered the most promising point of the RBF model. In the following steps, by substituting MGbest for PGworst in \(P\left( t \right) \), the knowledge transferred from the RBF model to the population is expected to enhance the search performance of VSMPSO by reducing the likelihood of getting stuck in a local optimum. The effectiveness of the knowledge transfer strategy is further verified in the "Effects of variable model management strategy" section by comparing VSMPSO with SMPSO.
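The inner loop of Algorithm 3 can be sketched as below: a canonical PSO minimizes the current surrogate over the box in Eq. (7) and returns Gbest, which VSMPSO assigns to MGbest. Here `f_hat` is an arbitrary callable standing in for the RBF model, and the swarm size, iteration count, and PSO coefficients are assumed defaults, not the paper's settings.

```python
import numpy as np

def surrogate_minimum(f_hat, lb, ub, n=20, iters=50,
                      omega=0.7, c1=1.5, c2=1.5, rng=None):
    """Canonical PSO search of the surrogate f_hat; returns Gbest (= MGbest)."""
    rng = np.random.default_rng() if rng is None else rng
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    x = rng.uniform(lb, ub, size=(n, lb.size))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.apply_along_axis(f_hat, 1, x)
    gbest = pbest[np.argmin(pbest_f)].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = omega * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (2)
        x = np.clip(x + v, lb, ub)                                     # Eq. (1), in bounds
        f = np.apply_along_axis(f_hat, 1, x)
        better = f < pbest_f                     # update personal bests (step 8)
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[np.argmin(pbest_f)].copy() # update global best (step 8)
    return gbest
```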

Table 1 Description of benchmark functions
Fig. 2

Convergence profiles of MGP-SLPSO, CAL-SAPSO, SAHO, SACOSO, VSMPSO(GP) and VSMPSO(RBF) on 30D F1–F7 with 330 FEs

Table 2 Statistical results of MGP-SLPSO, CAL-SAPSO, SAHO, SACOSO and VSMPSO on 30D F1–F7 with 330 FEs
Fig. 3

Distribution of solutions of MGP-SLPSO, CAL-SAPSO, SAHO, SACOSO and VSMPSO on 30D F1–F7 with 330 FEs

Experimental studies

A series of empirical studies on eleven commonly used benchmark functions (see Table 1 for details) is designed to verify the effectiveness of the proposed VSMPSO. These eleven functions have different characteristics, including unimodal, multimodal, and very complex multimodal landscapes; most are very difficult to optimize. Functions F1–F7 are commonly used by other SAEAs [26, 31]. Functions F8–F11 are adopted from the CEC 2017 test suite [48], which was recently proposed and is comparatively complex. These functions are tested in dimensions from 10 to 200; the CEC 2017 functions are tested only with dimension 100 because that is the highest dimension available in this suite. Statistical results of the compared algorithms are given on 30-D, 50-D, 100-D, and 200-D problems, with the best results highlighted in bold. The proposed VSMPSO is compared with several popular and recently proposed SAEAs, SAHO [35], SACOSO [34], CAL-SAPSO [31], and MGP-SLPSO [26], under different dimensions. Moreover, we treat the best fitness value and the running time as a bi-objective problem to assess the performance of VSMPSO.

Table 3 Statistical results (best fitness and cost time) obtained by MGP-SLPSO, CAL-SAPSO, SAHO, SACOSO, VSMPSO(GP) and VSMPSO(RBF) on 30D F1–F7 with 330 FEs
Table 4 Statistical results (best fitness and cost time) obtained by MGP-SLPSO, SAHO, SACOSO, VSMPSO(GP) and VSMPSO(RBF) on 50D F1–F7 with 550 FEs

Experimental setup in details

All experiments are implemented on a high-performance server with an Intel Xeon E5-2620 v4 processor and 64 GB of RAM. Each algorithm was run 30 times in MATLAB 2020b to eliminate the effect of statistical variation. We applied Friedman's test, using MATLAB's Statistics Toolbox, to determine whether there were significant differences in the best fitness values obtained by the algorithms. The p values were obtained from Friedman's test on pairwise comparisons between VSMPSO and each comparison algorithm. The conventional threshold for statistical significance is 0.05 [49]: \(p\geqslant 0.05\) indicates no significant difference, whereas \(p<0.05\) indicates a significant difference. "\({+}\)" indicates that the labeled algorithm significantly outperformed the others in mean best value. The RBF model used in VSMPSO and the other algorithms was implemented with the MATSuMoTo toolbox [50]. The parameters of the SLPSO optimizer were set as recommended in [40]. For all compared algorithms, the termination condition was the number of consumed function evaluations (FEs); the computational budget was \(11 \cdot D\) function evaluations [31, 32], i.e., the number of fitness evaluations was limited to 11 times the problem dimension.
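The significance-testing protocol can be illustrated with SciPy's Friedman test on per-run best fitness values; the data below are made up for illustration only (the paper's experiments use MATLAB's Statistics Toolbox).

```python
from scipy.stats import friedmanchisquare

# best fitness from 5 runs of three hypothetical algorithms on one problem
alg_a = [0.12, 0.10, 0.15, 0.11, 0.13]
alg_b = [0.45, 0.50, 0.42, 0.47, 0.49]
alg_c = [0.30, 0.28, 0.33, 0.29, 0.31]

# Friedman's test ranks the algorithms within each run, then tests whether
# the mean ranks differ across algorithms.
stat, p = friedmanchisquare(alg_a, alg_b, alg_c)
significant = p < 0.05  # the paper's significance threshold
```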

We describe algorithm performance through two chart types showing convergence profiles and the distribution of solutions, as in Figs. 2 and 3. The original authors provide the source code of SA-COSO and SAHO, and the source codes of MGP-SLPSO and CAL-SAPSO are available online. In [31], CAL-SAPSO is examined only on problems of up to 30 dimensions. We therefore compare statistical results with MGP-SLPSO, CAL-SAPSO, SAHO, and SACOSO on 30-D problems, and with MGP-SLPSO, SAHO, and SACOSO on 50- and 100-D problems. Considering time consumption, we compare statistical results with MGP-SLPSO and SACOSO on 200-D problems.

Numerical results on 30- and 50-D F1–F7 functions

The proposed VSMPSO framework can be applied in any surrogate-assisted evolutionary algorithm irrespective of the surrogate model used. In this paper, we used two widely adopted models, GP and RBF, to predict fitness values; the resulting variants are referred to as VSMPSO(GP) and VSMPSO(RBF). Both are based on the VSMPSO framework of Algorithms 1, 2, and 3, differing only in the surrogate model used.

From Table 2, MGP-SLPSO was the best on F1, VSMPSO(GP) was the best on F4 and F7, CAL-SAPSO was the best on F5, and SAHO was the best on F2, F3, and F6. SAHO thus achieved better optimization results than the other SAEAs; however, as Table 3 shows, SAHO and CAL-SAPSO took hundreds of times longer than the other SAEAs. We therefore treat the optimization result and the running time as a bi-objective problem. In the first row of Table 3, "Best fitness" denotes the average of the best fitness values obtained over 30 independent runs of each algorithm, and "Time" denotes the corresponding mean running time. Treating "Best fitness" and "Time" as two objectives, each result is either a non-dominated solution (NDS) or a dominated solution (DS). We sorted the results of the compared algorithms and marked all NDSs on the Pareto front with a black hollow pentagram in Fig. 3. From Table 3, VSMPSO(RBF) achieved an NDS on F1, F4, F5, F6, and F7, similar to MGP-SLPSO. From Fig. 3, on F1 and F3, the DS achieved by VSMPSO(RBF) is the closest to an NDS. In Fig. 2, the convergence rate of VSMPSO(RBF) is second only to SAHO on F3, and VSMPSO(RBF) has the fastest convergence curve on F7. MGP-SLPSO had the shortest running time because the Matérn32 function was used as its kernel. In our previous work [51], when the commonly used squared exponential (SE) kernel was employed instead of Matérn32, the running time of GP model-assisted SLPSO increased more than tenfold. The SE kernel is the most commonly adopted covariance function in GP-assisted optimization algorithms, for example, in [9, 15, 52, 53].
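Identifying the NDSs used in these comparisons amounts to Pareto filtering of (best fitness, time) pairs, both minimized. A minimal sketch, with made-up example values:

```python
def nondominated(points):
    """Return indices of non-dominated points when minimising both objectives
    (here: mean best fitness and mean running time)."""
    nds = []
    for i, (f_i, t_i) in enumerate(points):
        dominated = any(
            f_j <= f_i and t_j <= t_i and (f_j < f_i or t_j < t_i)
            for j, (f_j, t_j) in enumerate(points) if j != i
        )
        if not dominated:
            nds.append(i)
    return nds

# illustrative (fitness, time) pairs for four hypothetical algorithms
results = [(1.0, 10.0), (2.0, 5.0), (3.0, 20.0), (0.5, 50.0)]
front = nondominated(results)  # the third point is dominated by the first
```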

We compared VSMPSO with MGP-SLPSO, SAHO, and SACOSO on 50D functions. In Table 5, MGP-SLPSO achieved the best mean value on F1 and F4, and SAHO achieved the best mean value on F2, F3, F5, F6, and F7. VSMPSO(RBF) achieved performance similar to SAHO on F7. From Table 4 and Fig. 5, VSMPSO(RBF) obtained NDSs on F3, F5, F6, and F7, one fewer than MGP-SLPSO and SAHO. From Fig. 4, the convergence curve of VSMPSO(RBF) is very close to that of SAHO on F6 and F7. From Fig. 5, the DS achieved by VSMPSO(RBF) is the closest to an NDS on F1, F2, and F4. In Table 4, the running time of VSMPSO(GP) is dozens of times that of VSMPSO(RBF) on F1, F4, and F6, and more than 100 times higher on F2, F3, F5, and F7. This low computational efficiency makes VSMPSO(GP) impractical on higher dimensions. Furthermore, from Table 4 and Fig. 5, VSMPSO(GP) obtained no NDS on the 50D functions, whereas VSMPSO(RBF) obtained four. The overall performance of VSMPSO(RBF) is thus better than that of VSMPSO(GP); hence, in the subsequent comparisons we use the VSMPSO(RBF) algorithm, abbreviated as VSMPSO.

Numerical results and analysis on 50D F1–F7 functions with 2000 FEs

To further observe the optimization ability of VSMPSO, we extended the computational budget to 2000 FEs. As analysed above, VSMPSO(GP) was time-consuming and its optimization performance was inferior to that of VSMPSO(RBF); therefore, in the following comparisons we use only VSMPSO(RBF), referred to as VSMPSO. Although VSMPSO did not achieve strong optimization performance on the 50D problems with 550 FEs, it performs better with 2000 FEs. In Table 6, MGP-SLPSO achieved the best mean value on F1; SAHO achieved the best mean value on F2, F3, F4, and F7. VSMPSO achieved the best mean value on F5 and F6, and performed similarly to SAHO, with no significant difference, on F7. Furthermore, VSMPSO achieved the second-best mean value behind SAHO on F2 and F3, with a shorter running time than SAHO. The MGP-SLPSO algorithm had the shortest running time. In Table 7, MGP-SLPSO achieved 7 NDSs owing to its short running time; VSMPSO achieved 5 NDSs, the same as SACOSO, and SAHO achieved 4. VSMPSO did not achieve an NDS on F1 or F2, but its DS is the closest to the NDS in Fig. 7. The solutions achieved by VSMPSO, MGP-SLPSO, and SAHO are very close to the abscissa (i.e., the minimum fitness value) in Fig. 7. In Fig. 6, the performance of VSMPSO is stable across functions, and its convergence curves achieve good results on F5, F6, and F7.

Fig. 4

Convergence profiles of MGP-SLPSO, CAL-SAPSO, SAHO, SACOSO, VSMPSO(GP) and VSMPSO(RBF) on 50D F1–F7 with 550 FEs

Fig. 5

Distribution of solutions of MGP-SLPSO, CAL-SAPSO, SAHO, SACOSO and VSMPSO on 50D F1–F7 with 550 FEs

Fig. 6

Convergence profiles of MGP-SLPSO, SAHO, SACOSO and VSMPSO on 50D F1–F7 with 2000 FEs

Fig. 7

Distribution of solutions of MGP-SLPSO, SAHO, SACOSO and VSMPSO on 50D F1–F7 with 2000 FEs

Table 5 Statistical results of MGP-SLPSO, SAHO, SACOSO, VSMPSO(GP) and VSMPSO(RBF) on 50D F1–F7 with 550 FEs
Table 6 Statistical results of MGP-SLPSO, SAHO, SACOSO and VSMPSO on 50D F1–F7 with 2000 FEs
Table 7 Statistical results (best fitness and cost time) obtained by MGP-SLPSO, SAHO, SACOSO and VSMPSO on 50D F1–F7 With 2000 FEs

Numerical results on higher dimensional problems

Next, we compared algorithm performance on higher-dimensional problems. The convergence profiles of VSMPSO, SACOSO, MGP-SLPSO, and SAHO on F1–F7 with 100D and 200D are shown in Figs. 8 and 10. VSMPSO is compared with MGP-SLPSO, SAHO, and SACOSO on 100D. In Table 8, MGP-SLPSO obtains the best mean value on F1 and F4; SAHO obtains the best mean value on F2, F3, F5, and F7; and VSMPSO performs best only on F6. However, on F7, the results of VSMPSO are not significantly different from those of SAHO, and its standard deviation is smaller. In Table 9, VSMPSO obtains 5 NDSs, the highest count; SAHO achieves 4 NDSs, MGP-SLPSO achieves 3, and SACOSO achieves 1. On F1 and F4, VSMPSO did not achieve an NDS, but its DS is the closest to the NDS achieved by MGP-SLPSO in Fig. 9.

Because SAHO takes more than a week to obtain optimization results for a single function on 200D problems, comparing VSMPSO with SAHO on 200D is too expensive; VSMPSO is therefore compared with MGP-SLPSO and SACOSO on 200D problems. In Table 10, VSMPSO obtains the best mean value on F2, F3, and F5, and performs similarly to SACOSO on F7. SACOSO obtains the best mean values on F6 and F7. MGP-SLPSO obtains the best mean value on F1 and F4, with VSMPSO second only to MGP-SLPSO on both. There is no order-of-magnitude difference in running time between VSMPSO and MGP-SLPSO, and on F5 VSMPSO even takes less time than MGP-SLPSO. Thus, VSMPSO achieves a better balance between optimization performance and time consumption and obtains better optimization results on higher-dimensional problems (Table 11 and Fig. 11).

Table 8 Statistical results of MGP-SLPSO, SAHO, SACOSO and VSMPSO on 100D F1–F7 with 1000 FEs
Fig. 8

Convergence profiles of MGP-SLPSO, SAHO, SACOSO and VSMPSO on 100D F1–F7 with 1000 FEs

Table 9 Statistical results (best fitness and cost time) obtained by MGP-SLPSO, SAHO, SACOSO and VSMPSO on 100D F1–F7 with 1000 FEs
Fig. 9

Distribution of solutions of MGP-SLPSO, SAHO, SACOSO and VSMPSO on 100D F1–F7 with 1000 FEs

Table 10 Statistical results of MGP-SLPSO, SACOSO and VSMPSO on 200D F1–F7 with 2000 FEs
Fig. 10

Convergence profiles of MGP-SLPSO, SACOSO and VSMPSO on 200-D F1–F7 with 2000 FEs

Table 11 Statistical results (best fitness and cost time) obtained by MGP-SLPSO, SACOSO and VSMPSO on 200-D F1–F7 with 2000 FEs
Fig. 11

Distribution of solutions of MGP-SLPSO, SACOSO and VSMPSO on 200-D F1–F7 with 2000 FEs

Effects of variable model management strategy

To verify the effectiveness of the sample selection strategy in VSMPSO, four different sample selection strategies are compared in empirical studies. To this end, we design a framework of surrogate model-based PSO (SMPSO), in which the RBF model is used as the single surrogate model and SLPSO as the optimizer. SMPSO is described in detail in Algorithm 3. The difference between SMPSO and VSMPSO is the model management strategy: SMPSO only searches the current population for the most promising solution to evaluate, whereas VSMPSO uses the proposed variable model management strategy to search for the most promising solutions from both the current population and the current RBF model. SMPSO-RS, SMPSO-AS, SMPSO-FS1 and SMPSO-FS2 are all based on the SMPSO framework; they differ only in the sample selection strategy in step 7. The different sample selection strategies are as follows:

  1. SMPSO-RS: The sample selection strategy is the same as in VSMPSO; simple random sampling (RS) is used to select \(\lambda \) random samples from DB.

  2. SMPSO-AS: All samples (AS) in DB are chosen for training the model, the same as in [32].

  3. SMPSO-FS1: Fixed selection (FS) of the newest \(\lambda \) samples in DB, the same as in [9].

  4. SMPSO-FS2: Fixed selection of the top \(\lambda \) samples in DB, the same as CAL-SAPSO uses for training its local model.
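The four strategies are easy to express as archive-selection routines. The following Python sketch is illustrative only: the function names and the NumPy layout of DB (rows appended in evaluation order, with fitness vector `y`) are our assumptions, not the authors' implementation.

```python
import numpy as np

def select_rs(X, y, lam, rng):
    """SMPSO-RS: simple random sampling of lam points from the archive DB."""
    idx = rng.choice(len(X), size=lam, replace=False)
    return X[idx], y[idx]

def select_as(X, y, lam=None):
    """SMPSO-AS: all samples in DB are used for training."""
    return X, y

def select_fs1(X, y, lam):
    """SMPSO-FS1: fixed selection of the lam newest samples (DB is append-only)."""
    return X[-lam:], y[-lam:]

def select_fs2(X, y, lam):
    """SMPSO-FS2: fixed selection of the lam best (smallest-fitness) samples."""
    idx = np.argsort(y)[:lam]
    return X[idx], y[idx]
```

In every case the selected pairs are then used to fit the RBF surrogate for the current generation; only the choice of training subset differs.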


SMPSO-FS1 and SMPSO-FS2 use fixed sampling (FS) as the sample selection strategy, with a fixed sample number of \(\lambda \), the same as the sample number in VSMPSO and SMPSO-RS. As noted in “Proposed VSMPSO algorithm” section, the crucial element affecting VSMPSO is the variable model management strategy, which has two steps. In Step 1, a simple random sampling (RS) strategy is used to select \(\lambda \) random samples from DB to construct the RBF model in each generation. In Step 2, the current RBF model information is deeply mined by finding the minimum of the surrogate model. Two potential optimum points (PGbest and MGbest) are considered in the variable model management strategy. PGworst, the individual with the worst predicted fitness value in the current population, is then replaced by MGbest. The following experimental results further demonstrate that this is an effective method to improve model diversity and avoid local optima without using multiple models.
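The two steps above can be sketched as follows. This is a minimal illustration, assuming a SciPy RBF surrogate and L-BFGS-B for the model-minimum search; the function name and archive layout are hypothetical, and the paper's actual optimizer (SLPSO) is not reproduced here.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import minimize

def variable_model_management(db_x, db_y, population, lam, bounds, rng):
    # Step 1: simple random sampling of lam points from DB to build the RBF model.
    idx = rng.choice(len(db_x), size=lam, replace=False)
    model = RBFInterpolator(db_x[idx], db_y[idx])

    # Step 2: mine the current model.
    pred = model(population)
    pgbest = population[np.argmin(pred)]   # PGbest: best predicted individual
    worst = np.argmax(pred)                # PGworst: worst predicted individual

    # Search the surrogate itself for its minimum, starting from PGbest.
    res = minimize(lambda x: float(model(x[None, :])[0]), pgbest, bounds=bounds)
    mgbest = res.x                         # MGbest: minimum of the current model

    new_pop = population.copy()
    new_pop[worst] = mgbest                # replace PGworst with MGbest
    return pgbest, mgbest, new_pop
```

Both PGbest and MGbest are candidates for exact evaluation; re-drawing the training subset each generation is what makes the model (and hence MGbest) vary between generations.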

Fig. 12

Convergence profiles of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2, VSMPSO on 30-D F1–F7 with 330 FEs

Fig. 13

Distribution of solutions of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 30-D F1–F7 with 330 FEs

Table 12 Statistical results of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 30-D F1–F7 with 330 FEs
Table 13 Statistical results of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 50-D F1–F7 with 550 FEs
Table 14 Statistical results (best fitness and cost time) obtained by SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 30-, 50-, 100-D F1–F7 with \(11 \cdot D\) FEs
Fig. 14

Convergence profiles of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 50-D F1–F7 with 550 FEs

Fig. 15

Distribution of solutions of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 50-D F1–F7 with 550 FEs

Table 15 Statistical results of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 100-D F1–F7 with 1100 FEs
Fig. 16

Convergence profiles of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 100-D F1–F7 with 1100 FEs

Fig. 17

Distribution of solutions of SMPSO-RS, SMPSO-AS, SMPSO-FS1, SMPSO-FS2 and VSMPSO on 100-D F1–F7 with 1100 FEs

The contribution of the simple random sampling strategy in VSMPSO can be inferred by comparing the results of SMPSO-RS with those of SMPSO-AS, SMPSO-FS1, and SMPSO-FS2. From Table 12, on 30D functions, SMPSO-RS achieved better optimization results on F3 and F7 than SMPSO-AS, SMPSO-FS1, and SMPSO-FS2, and its results on F2, F4, F5, and F6 were second only to SMPSO-AS. From Fig. 12, the trends of the convergence curves are consistent, except that the F1 and F3 convergence curves are slightly dispersed. From Table 13, on 50D functions, SMPSO-RS achieved better optimization results on F2, F4, F6, and F7 than SMPSO-AS, SMPSO-FS1, and SMPSO-FS2, and its results on F1, F3, and F5 were slightly worse than those of SMPSO-AS but better than those of SMPSO-FS1 and SMPSO-FS2. In addition, from Fig. 14, the convergence curves of SMPSO-RS are very similar to those of SMPSO-AS on most functions.

From Table 15, SMPSO-RS achieved better optimization results on F3, F6, and F7 than SMPSO-AS, SMPSO-FS1, and SMPSO-FS2, and its results on F1, F2, F4, and F5 were slightly worse than those of SMPSO-AS but better than those of SMPSO-FS1 and SMPSO-FS2. Moreover, from Figs. 13, 15 and Table 14, SMPSO-RS achieved non-dominated solutions on both 30D and 50D problems. From Fig. 17, the dominated solution obtained by SMPSO-RS is closest to the non-dominated solution in abscissa value (that is, the minimum fitness value) on F1, F2, F4, and F5. By contrast, SMPSO-FS1 and SMPSO-FS2 have similar results in Tables 12, 13 and 15, and show similar convergence curves in Figs. 12, 14 and 16. Moreover, both SMPSO-FS1 and SMPSO-FS2 perform worse than SMPSO-RS and SMPSO-AS; hence, fixed sampling is inferior to both the all-samples method and the simple random sampling method. The performance of simple random sampling (RS) is significantly better than that of fixed sampling (FS), and SMPSO-RS performs better on complex problems. However, SMPSO-AS takes more time than SMPSO-RS on 30D and 50D functions: using all samples for modeling makes SMPSO-AS time-consuming.

By comparing VSMPSO with SMPSO-RS, SMPSO-AS, SMPSO-FS1 and SMPSO-FS2, we can see the contribution of Step 2. In Step 2, VSMPSO takes advantage of the variable model management strategy to search for the most promising solutions for evaluation from both the current population and the current RBF model. This differs from the strategy of all SMPSO variants, which search only the current population for the most promising solution to evaluate. From Tables 12, 13 and 15, the results of VSMPSO are significantly different from those of the other algorithms. On 30D functions, VSMPSO obtained the best results on F1, F2, F3 and F7, while obtaining slightly worse results than SMPSO-RS on F4, F5 and F6. On 50D functions, VSMPSO achieved better optimization results on F1, F2, F3 and F7, and was slightly worse than SMPSO-RS on F4 and F5. Furthermore, on 100D functions, VSMPSO achieved the best optimization results on F1, F2, F3, F4, F5 and F7, while obtaining slightly worse results than SMPSO-AS on F6. VSMPSO shows clearer advantages on 100D than on 50D and 30D; the improvement becomes more remarkable as the dimension D of the decision space increases.

From the above results, we conclude that the two main innovations of the proposed VSMPSO described in “Proposed VSMPSO algorithm” jointly improve the performance of the proposed algorithm. First, the comparison of SMPSO-RS with the three other algorithms (SMPSO-AS, SMPSO-FS1, and SMPSO-FS2) shows that the random sample selection method, which is the same as the one used in VSMPSO, eliminates the weaknesses of fixed sampling. Second, the comparison of SMPSO-RS and VSMPSO shows that searching for the most promising solution from two different angles, especially from the current RBF model, speeds up convergence.

Parameter sensitivity analysis

The parameters in VSMPSO, such as the parameters of the optimizer SLPSO and the number of consumed function evaluations (FEs), may significantly influence the proposed algorithm’s performance. For a fair comparison, the parameters of the optimizer SLPSO are set to the values recommended in [40]. For all the algorithms compared in this paper, the termination condition depends on the number of consumed FEs. The computational budget is \(11\cdot D\) function evaluations [32]; that is, the number of fitness evaluations is limited to 11 times the dimension of the problem. In VSMPSO, the value of \(\lambda \) is 80% of the total sample number. We therefore only need to analyze the parameter \(\lambda \) of VSMPSO to explore the influence of different sample sizes on optimization performance. In the next experiment, we compared the performance of VSMPSO for different sample sizes. The comparison algorithms VSMPSO-50, VSMPSO-60, VSMPSO-70, VSMPSO-80, VSMPSO-90, and VSMPSO-100 represent VSMPSO with the number of selected samples \(\lambda \) accounting for 50%, 60%, 70%, 80%, 90% and 100% of the total sample size, respectively; VSMPSO-100 uses all samples for modeling. Furthermore, all parameters in VSMPSO-80 are the same as those in VSMPSO, so the results of VSMPSO-80 are identical to those reported for VSMPSO in the experiments above.
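As a minimal worked example of these settings (the helper names are ours, not from the paper), the budget rule and the training-sample count \(\lambda \) used in these ablation experiments can be written as:

```python
def evaluation_budget(dim):
    """Budget rule stated above: 11 * D exact fitness evaluations."""
    return 11 * dim

def lambda_samples(db_size, fraction=0.8):
    """Number of training samples: a fraction (80% in VSMPSO-80) of the archive size."""
    return int(round(fraction * db_size))

# e.g. D = 30 -> 330 FEs, D = 50 -> 550 FEs, D = 100 -> 1100 FEs
```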

Table 16 Statistical results of VSMPSO with different samples on 30-D F1–F7 with 330 FEs
Table 17 Statistical results of VSMPSO with different samples on 50-D F1–F7 with 550 FEs
Table 18 Statistical results of VSMPSO with different samples on 100-D F1–F7 with 1100 FEs
Fig. 18

Convergence profiles of VSMPSO with different samples on 30-D F1–F7 with 330 FEs

Fig. 19

Distribution of solutions for VSMPSO with different samples on 30-D F1–F7 with 330 FEs

Fig. 20

Convergence profiles of VSMPSO with different samples on 50-D F1–F7 with 550 FEs

Fig. 21

Distribution of solutions for VSMPSO with different samples on 50-D F1–F7 with 550 FEs

Fig. 22

Convergence profiles of VSMPSO with different samples on 100-D F1–F7 with 1100 FEs

Fig. 23

Distribution of solutions of VSMPSO with different samples on 100-D F1–F7 with 1100 FEs

Table 19 Obtained solutions of VSMPSO with different samples on 30-, 50-, 100-D F1–F7 with \(11 \cdot D\) FEs
Fig. 24

Convergence profiles of SAHO and VSMPSO on 100-D F8–F11 with 1100 FEs

Fig. 25

Distribution of solutions of SAHO and VSMPSO on 100-D F8–F11 with 1100 FEs

Table 20 Obtained solutions for VSMPSO and SAHO on 100-D F8–F11 with 1100 FEs

From Figs. 18, 19, 20, 21, 22, 23 and Tables 16, 17, 18, VSMPSO-80 performs best on 30D F1 and F3 and on 50D F3, and shows no significant difference from VSMPSO-100 on 100D functions. Table 19 shows that VSMPSO-80 obtained 3 non-dominated solutions on 30D functions, 4 on 50D functions, and 5 on 100D functions. From Figs. 19, 21 and 23, most non-dominated solutions of VSMPSO-80 are essentially knee points. Based on these results, in general, the larger the sample size taken by VSMPSO, the better the algorithm performance, except on F6. However, in terms of the time spent by VSMPSO with different sample sizes, the larger the sample size, the longer the running time. As a compromise between optimization performance and time consumption, we select 80% of the total sample number as the number of training samples for modeling in VSMPSO.

Numerical results on complex problems

To further compare algorithm performance, we compared VSMPSO with SAHO, which achieved the best optimization results in the previous comparison experiments, on the recently proposed and relatively complex CEC 2017 test suite [48]. As can be seen from the convergence curves in Fig. 24, the performance of VSMPSO on F9 and F11 with 50D was slightly worse than that of SAHO; on the other functions, especially on all 100D functions, VSMPSO obtained convergence curves similar to those of SAHO. However, from Fig. 25 and Table 20, the average time spent by SAHO is dozens or even hundreds of times that of VSMPSO, while there is minimal difference between the optimal solutions obtained by the two algorithms. It follows that, even on the more complex CEC 2017 benchmark functions, VSMPSO achieves a better balance between optimization quality and time consumption and obtains good optimization results on high-dimensional problems.

Conclusions

In this paper, a single surrogate-assisted evolutionary algorithm, called VSMPSO, has been proposed for high-dimensional expensive optimization problems. We have considered both optimization results and optimization time consumption as bi-objectives when comparing algorithm performance. The proposed VSMPSO has shown promising performance on high-dimensional test problems with dimensions up to 200. It overcomes a key shortcoming of using a single model in SAEAs, namely easily becoming trapped in local optima, and saves model training time while improving performance. Experimental results show that VSMPSO performs well on high-dimensional problems. In the future, we are interested in improving the performance of the proposed algorithm by considering the relationship between candidate solutions and surrogate-management strategies, and then extending it to higher-dimensional or multi-objective optimization problems.