A Novel Gaussian Process Surrogate Model with Expected Prediction Error for Optimization under Constraints

: Optimization, particularly constrained optimization problems (COPs), is fundamental in engineering, influencing various sectors with its critical role in enhancing design efficiency, reducing experimental costs, and shortening testing cycles. This study explores the challenges inherent in COPs, with a focus on developing efficient solution methodologies under stringent constraints. Surrogate models, especially Gaussian Process Regression (GPR), are pivotal in our approach, enabling the approximation of complex systems with reduced computational demand. We evaluate the efficacy of the Efficient Global Optimization (EGO) algorithm, which synergizes GPR with the Expected Improvement (EI) function, and further extend this framework to Constrained Expected Improvement (CEI) and our novel methodology Constrained Expected Prediction Error (CEPE). We demonstrate the effectiveness of these methodologies by numerical benchmark simulations and the real-world application of optimizing a Three-Bar Truss Design. In essence, the innovative CEPE approach promises a potent balance between solution accuracy and computational prowess, offering significant potential in the broader engineering field


Introduction
In the fields of science and engineering, optimization problems, especially those with constraints, play a crucial role.Constrained optimization problems (COPs) are vital in a wide range of applications, including process control [1,2], reactor design [3], and production scheduling [4].Among these, specific challenges such as tension/compression coil spring design [5] highlight the intricate nature of these problems.The aim of addressing COPs is to enhance design efficiency, minimize experimental costs, and reduce testing cycles.These problems hold intrinsic value across various engineering disciplines, including chemical, mechanical, and electrical engineering [6][7][8], as they seek optimal solutions within the boundaries set by constraints, be they equalities or inequalities [9].To solve these complex issues, a range of methodologies have been developed, from gradient-based methods [10], known for their efficiency in differentiable problems, to population-based strategies like Genetic Algorithms [11], which are capable of addressing non-differentiable and multi-modal challenges, albeit often at a higher computational cost [12].COPs are typically represented through mathematical formulations, underscoring their quantitative nature and the systematic approach required for their resolution.We define the formulation min y = f (x), st : l ≤ g(x) = (g 1 (x), g 2 (x), . . ., g m (x)) ≤ u, where l = (l 1 , l 2 , . . ., l m ), u = (u 1 , u 2 , . . ., u m ), x = (x 1 , x 2 , . . ., x n ) ∈ X, X = {x|x l ≤ x ≤ x u }, x l = (x l 1 , x l 2 , . . ., x l n ), x u = (x u 1 , x u 2 , . . ., x u n ), (1) where x is a solution vector within the solution space X, g(x) represents the constraints, and l and u are the lower and upper bounds of the constraints, respectively.There are two types of scenarios for the solution to this formulation: feasible and infeasible.A solution is considered feasible if the solution vector x ∈ X satisfies all the constraints g(x).Conversely, a solution is deemed infeasible if x ∈ X but does not satisfy all the constraints g(x).At times, the feasibility ratio (the percentage of feasible solutions out of the total number of feasible and infeasible solutions) can be very low, which is equivalent to having very stringent constraints.In such cases, it is imperative to find a solution that satisfies the constraints as efficiently as possible.
For solving complex optimization problems, especially those with constraints, the use of surrogate models has emerged as a highly effective strategy.Surrogate models, pivotal in the realm of optimization, provide simplified yet powerful approximations of complex systems, enabling efficient exploration and optimization of design spaces where direct evaluations are prohibitively expensive.These models, also referred to as metamodels, span a diverse array of methodologies, each tailored to capture the intricacies of different problem domains [13].To encapsulate these varying surrogate model methodologies and their application in constrained optimization problems, Table 1 provides a comprehensive summary of the key literature, highlighting techniques, tested datasets, and their respective advantages and disadvantages.Despite these advancements, a comprehensive review of the literature underscores certain limitations in the current surrogate modelling approaches, particularly when applied to constrained optimization problems [18].A notable challenge is the difficulty in accurately capturing and adapting to the complex constraint boundaries that often define feasible regions in optimization landscapes [19].Many existing surrogate models excel in unconstrained scenarios or situations with simple constraints but may fall short when faced with complex, nonlinear, or high-dimensional constraints [20].Additionally, there is a paucity of methodologies that effectively incorporate constraint handling mechanisms within the surrogate model itself, often relying on external penalty functions or constraint relaxation techniques that may not always yield optimal solutions.This gap in the literature highlights the need for more sophisticated surrogate models that can inherently deal with complex constraints while maintaining the balance between exploration and exploitation in the search space.These insights into the limitations of current surrogate models in handling constrained optimization problems have motivated the development of our proposed methodologies.By addressing these identified gaps, we aim to contribute to the advancement of surrogate-assisted optimization strategies, providing more robust and efficient solutions for complex constrained optimization challenges.
Among the diverse array of surrogate modelling techniques, Gaussian Process Regression (GPR) stands out for its efficiency and effectiveness [17].GPR is renowned for its robustness in capturing the underlying trends of the data with a quantifiable measure of uncertainty, making it particularly suitable for optimization problems where uncertainty plays a critical role [21].The historical development of surrogate models in optimization was notably advanced in 1998 when Jones et al. [22] introduced the Efficient Global Optimization (EGO) algorithm.This algorithm integrates GPR with the Expected Improvement (EI) function, a pivotal concept in Bayesian optimization.The EI function is designed to systematically identify the global optimum of computationally expensive black-box functions, which are often subject to inherent uncertainties [23].This approach prioritizes sampling in regions where the anticipated improvement in the function's value is maximized, thereby enhancing the model's accuracy in critical areas.This methodology exemplifies an active learning strategy, dynamically refining the model with each new data point to efficiently navigate the complex landscape of the optimization problem.However, when dealing with optimization problems that include stringent constraints, the standard EI approach may require modifications to accommodate these limitations.In response to this challenge, recent developments have introduced variants such as the Constrained Expected Improvement (CEI), which adapt the EI principle to handle constraints effectively [24,25].
In addressing the challenges inherent in constrained optimization, this study introduces significant innovations that extend the current state-of-the-art methodologies.Firstly, we propose a novel approach to compute the Expected Prediction Error (EPE) [26] at an untested point by leveraging the cross-validation error from a nearby tested point.This method enhances the prediction accuracy of our surrogate model, particularly in the sparse regions of the design space where traditional interpolation methods might falter.Secondly, we introduce the Constrained Expected Prediction Error (CEPE) criterion, a pioneering metric designed to navigate the intricate landscape of constrained optimization problems efficiently.By integrating CEPE within a Differential Evolution (DE) [27] framework augmented with Gaussian Process (GP) surrogates, our approach not only capitalizes on the strengths of evolutionary algorithms in exploring complex solution spaces but also harnesses the predictive prowess of GP models to make informed decisions during the optimization process.The synergy between these innovations presents a robust framework that promises improved optimization performance, especially in scenarios characterized by stringent constraints and expensive function evaluations.
The remainder of this paper is structured as follows.In Section 2, we introduce the theoretical knowledge about Gaussian Process, Expected Improvement and Expected Prediction Error.In Section 3, we discuss methodologies for Constrained Expected Improvement, introduce our novel method Constrained Expected Prediction Error, and provide the GP surrogate-assisted Differential Evolution (DE) algorithm.In Section 4, we evaluate the efficacy of CEI and CEPE using benchmark problems, and in Section 5, we illustrate their application in a real-world problem.Finally, we conclude this paper with discussions and remarks in Section 6.

Background and Concepts
In this section, we provide a brief theoretical overview of Gaussian Process Regression (GPR), Expected Improvement (EI), and Expected Prediction Error (EPE).

Gaussian Process Regression
We aim to use the Gaussian Process Regression (GPR) framework to model an unknown function f (x), by assuming that f (x) is a sample from a Gaussian process denoted as y(x).Within this context, for any input point x, the function value f (x) is viewed as a sample from a Gaussian random variable y(x) and y(x) is distributed as N (µ, σ 2 ), where µ and σ 2 are constants independent of x.The covariance of y(x) with another random variable y(x ′ ), where x ′ represents another point in the input space, is given by cov[y(x), y(x ′ )] = σ 2 Corr(y(x), y(x ′ )). ( The correlation function Corr(y(x), y(x ′ )) is defined as where p is the dimension of the input space and θ k are the hyperparameters.Considering a set of N input points x 1 , x 2 , . . ., x N , let f N = ( f (x 1 ), f (x 2 ), . . ., f (x N )) T and y N = (y(x 1 ), y(x 2 ), . . ., y(x N )) T .The parameters µ, σ 2 , θ 1 , . . ., θ p can be estimated by maximizing the joint Gaussian probability density function of where R is an N × N matrix with elements R i,j = Corr(y(x i ), y(x j )), and 1 is an Ndimensional column vector of ones.The log-likelihood function is then Now, let us differentiate the log-likelihood with respect to µ: Setting this derivative to zero and solving for µ gives the maximum likelihood estimate μ = Next, let us differentiate the log-likelihood with respect to σ 2 Setting this derivative to zero and solving for σ 2 gives the maximum likelihood estimate There is no analytical form for the estimates of θ 1 , . . ., θ p , so the maximization of the log-likelihood function with respect to θ 1 , . . ., θ p is usually completed by conjugate descent.
With the maximum likelihood estimates of the parameters, we can now derive expressions for predicting f (x) at a new point x.Let r be the vector of correlations between the new point and the N data points, where the i-th element of r is Corr(y(x), y(x i )).The best linear unbiased predictor (BLUP) of f (x) can be written as: where μ is the estimate of the mean in (7) and R is the correlation matrix calculated using the estimates of θ i .
Similarly, the prediction variance can be derived as: This forms the basis for Gaussian Process Regression, where we model the unknown function f (x) as a Gaussian process and make predictions at new points by combining information from the observed data.

Expected Improvement
In this section, we introduce a widely-adopted infill sampling method known as Expected Improvement (EI), developed by Jones et al. [22] for the optimization of expensive black-box functions.Expected Improvement is particularly useful once a surrogate model, typically Gaussian Process Regression, is constructed.
As previously defined in Equation ( 1), let f (x) denote the objective function with a surrogate model y(x) that follows a Gaussian process.Consider a set of test inputs {x 1 , x 2 , . . ., x N } with corresponding observed function values We define f min as the best observed value of the function among the test inputs, i.e., f min = min(f N ).The improvement of f (x) at a new input point x is defined as Expected Improvement is then defined as the expected value of this improvement, given the observed data Through some non-trivial mathematical manipulations, we can express the Expected Improvement in terms of the predictive mean and standard deviation of the surrogate model as follows: where f (x) and s(x) are the predictive mean and standard deviation of the surrogate model at x, computed according to Equations ( 10) and ( 11), and ϕ(•) and Φ(•) denote the probability density function (PDF) and cumulative distribution function (CDF) of the standard normal distribution, respectively.

Expected Prediction Error
In this section, we introduce the Expected Prediction Error (EPE), an active learning strategy employed to evaluate the accuracy of a predictive model.EPE quantifies the discrepancy between the predicted and actual values.Specifically, it is the expected value of the squared prediction error over a sample of data.EPE is sometimes referred to as the mean squared error (MSE).
Within the context of the GPR methodology, the posterior distribution of the response at a given location x is denoted as ŷ(x), which follows a Gaussian distribution expressed as ŷ(x) ∼ N (µ x , σ 2 x ).In this setting, the mean µ x symbolizes the predictive estimate given by f (x), while the variance σ 2 x reflects the uncertainty associated with the prediction, denoted as s 2 x .Given this framework, we can define the prediction error, often referred to as the loss function, L, for a point x as Here, y(x) embodies the true underlying response, perturbed by inherent observational noise.
Then, the overall generalization error of the GPR-based surrogate model is given by where E[L(x)] denotes the expected value of the prediction error, and D is a subset of R p .This expectation can be decomposed into where the first term represents the squared bias, capturing the average difference between the predicted and observed responses; the second term is the prediction variance of the surrogate model; and the third term is the variance of the noise Var(ϵ), which is often negligible.
The EPE quantifies the discrepancy between the GPR model's predictions and the true function values.In this context, f (x) denotes the true function we aim to approximate, and f (x) represents the GPR model's prediction at x. Thus, EPE at a point x is given by However, the true response f (x) is unknown in the bias term.Following the approach by Liu et al. [26], we employ leave-one-out cross-validation for estimation.Initially, we estimate the cross-validation errors at all training sample locations: where x i represents the i-th training sample, and f −i denotes the GP model trained using all training samples except (x i , f (x i )).Next, for any point x, we locate the nearest training sample to x, denoted as x i , and assign its associated cross-validation error to e 2 CV (x): Finally, incorporating the cross-validation error, we obtain a refined expression for the EPE: The Expected Prediction Error (EPE) plays a crucial role in assessing the accuracy and reliability of our surrogate model, particularly in constrained optimization problems.By calculating the EPE, we can identify regions in the design space where the model's predictions are less reliable, informing us where additional sampling might be beneficial.Moreover, as EPE comprises both bias and variance, it aids in striking a balance between the two, ensuring that the model is neither too simplistic nor too complex.This information is vital for efficient exploration of the search space and making informed decisions in the optimization process.

Methodology
In this section, we first present the CEI method proposed by Jiao et al. [25], then we introduce a novel method CEPE for solving constrained optimization problems (COP).To solve a COP, the objective is to minimize the objective function subject to several constraints.Before delving into the details of our proposed method, we will define some notations in the following subsection.
In this work, we treat x as an untested point, while x 1 , x 2 , . . ., x N are considered tested points.Let y(x) denote the objective function, and G i (x) represent the constraint functions for i = 1, 2, . . ., m.For a given point x, we define a Gaussian random vector G(x) as We introduce G(x) as a Gaussian random vector to represent the constraints' evaluations at the point x.This probabilistic modelling approach allows us to account for and manage the uncertainties associated with constraint evaluations, especially when exact determinations of these constraints are not feasible or when they are subject to variability due to measurement errors, model inaccuracies, or other sources of uncertainty.
For a set of N tested points x 1 , x 2 , . . ., x N , we define the following Gaussian random vectors: , and G mN represent the function values at the tested points.In contrast, y(x) and G(x) are the quantities to be predicted.The evaluated function values are , and G mN .Essentially, by using the known function values, we can predict y(x) and G i (x) for i = 1, 2, . . ., m at the tested points x 1 , x 2 , . . ., x N without additional function evaluations.

Constrained Expected Improvement
In the exploration of constrained optimization problems, we often encounter a combination of both feasible and infeasible solutions, as categorized by Jiao et al. [25].The feasibility of a solution x ∈ X is determined by whether it satisfies all constraints g(x).An essential aspect of optimization lies in efficiently transitioning from an infeasible to a feasible situation to reduce computational costs.In our methodology, we initially identify infeasible solutions by evaluating them against all defined constraints G i (x).
In the following, we introduce the concept of Constrained Expected Improvement to aid this process.For a solution x, we measure the extent of constraint violation by defining where l i and u i denote the lower and upper constraint boundaries, respectively.We represent the constraint violation of a solution as the maximum violation across all constraints, This value is zero for feasible solutions and positive for infeasible ones.A solution x is deemed infeasible if it violates any of these constraints, i.e., if G + i (x) > 0 for at least one constraint.This preliminary step is crucial for distinguishing between solutions that require improvement to meet feasibility criteria and those that are already feasible.
Let g +N min denote the smallest observed constraint violation among all evaluated points up to the current iteration, effectively representing the 'best' constraint violation scenario.This value is crucial for assessing the relative improvement of constraint satisfaction for new candidate solutions compared to the existing ones.To quantify the improvement in constraint violation at an untested point x relative to the current best solution, we define the constrained improvement for infeasible solutions as Here, I c,N (x) quantifies how much the constraint violation at an untested point x improves relative to the current best solution.
We can now derive the solution x as follows: where p G + (x)|g mN (•) represents the probability density function of the random variable conditional on G mN = g mN , and P G + (x)|g mN (•) denotes the cumulative distribution function under the same condition.For z ≤ 0, and for z > 0, the computation is as follows: where p G(x)|g mN (g 1 , . . ., g m ) is the joint Gaussian probability function of G(x) conditioned on g mN .Since the objective and all constraints are mutually independent, p G(x)|g mN (g 1 , . . ., g m ) can be computed by where ĝi (x) and s g i (x) (i = 1, . . ., m) are the expectations and variances of the constraints respectively.By substituting the result p G(x)|g mN (g 1 , . . ., g m ) obtained from Equation (29) into Equation ( 28), we obtain the cumulative distribution function for P G + (x)|g mN (•) under the condition G mN = g mN as follows Finally, the constrained expected improvement for a solution x is given by with ) )] calculated as mentioned above.In situations where a feasible point has already been identified, it is desired to maximize the Expected Improvement (EI) of the objective, ensuring that the feasibility conditions are met.This is essentially a quest for a feasible solution that provides the best possible objective value.This strategy is denoted as the Constrained Expected Improvement (CEI), represented by E[I c,N (x)|f N , g mN ].We define the improvement in the objective function subject to the satisfaction of constraints as I c,N (x).Assuming that y(x) and G i (x) for i = 1, 2, . . ., m are mutually independent, the constrained expected improvement based on y N = f N and G mN = g mN is given by where represents the probability of feasibility (PF), and E[I(x)|f N ] denotes the expected improvement at the point x.
In the CEI method, the constrained improvement can be defined as Here, f min is the best value of y(x) over all the test values f N , and Using this, we can now define the Constrained Expected Improvement as Here, E[I(x)|f N ] is the same with Equation ( 14), and the CEI of a solution x is given by As previously defined in Equation ( 1), the special case for COP formula has a lower constraint bound of l = −∞ and an upper constraint bound of u = 0. Equations ( 31) and (36) for CEI can be summarized as follows: For an infeasible situation: For a feasible situation: ) . (38)

Constrained Expected Prediction Error
Building upon the preceding subsection, which discussed the workings of Constrained Expected Improvement (CEI) under both infeasible and feasible situations, we now move our focus towards the Constrained Expected Prediction Error (CEPE).It is our novel method for constrained optimization problems.
In terms of infeasible situations, our approach with CEPE aligns with the methods used in the preceding CEI subsection.In terms of feasible situations of the CEPE method, we introduce the concept of prediction error, denoted as PE c,N (x), which is defined as follows: Here, f (x * ) is the observed output sample at the sample point x * , which is the nearest evaluated point to the candidate point x.This approach assumes that f (x) exhibits moderate continuity in the vicinity of x * , an assumption that is generally valid for smooth functions commonly encountered in engineering and optimization applications.
In this formulation, f (x * ) is the surrogate model's prediction at the point x * , and f (x) is the model's prediction at the candidate point x.The term s 2 (x) represents the prediction variance at x, reflecting the model's uncertainty.
Then we can obtain the formulation of the CEPE of a solution x below, This formulation takes into account the constraints by incorporating the product of the cumulative distribution functions for the constraints.It effectively combines the expected prediction error with the feasibility of the solution, guiding the optimization process towards regions of the design space where the model uncertainty is high and potential improvements in the objective function are likely, while still adhering to the constraints.
In summary, the CEPE method is underpinned by the assumption of moderate function continuity near x * , allowing the use of surrogate model predictions at nearby points to estimate the prediction error at untested points.This approach enables the efficient exploration of the design space, especially in the context of expensive function evaluations and stringent constraints.
Similarly, as previously defined in Equation ( 1), the special case for COP formula has a lower constraint bound of l = −∞ and an upper constraint bound of u = 0. Equations ( 31) and ( 41) for CEPE can be summarized as follows: For an infeasible situation: For a feasible situation:

The GP Surrogate-Assisted DE Algorithm
In this section, we outline and describe the steps of the GP surrogate-assisted Differential Evolution (DE) algorithm.The algorithm applies to problems with any number of dimensions n and any number of constraints m.The steps are as follows: • Step 1: Design of Experiments (DOE) Initialize the algorithm by assigning samples using the Latin Hypercube Design (LHD).The steps are as follows: 1.
Divide each dimension into popsize intervals of equal probability.

2.
Randomly assign one value to each interval within each dimension.

3.
For each dimension, permute the vector of assigned values randomly.

4.
Combine the permuted vectors of assigned values across dimensions into a popsize × n matrix.

•
Step 2: Identify the Best Point and Assess Feasibility Extract the objective function values and constraint violation values from the initial samples.Identify the feasible solutions (those with no constraint violations) and select the one with the optimal objective function value as the best feasible solution.If no feasible solutions exist, select the solution with the least constraint violation as the best infeasible solution.Update the data structure with information about the best solution discovered thus far, and return this information along with a feasibility flag.

•
Step 3: Clustering-Organize Tested Points into Clusters Given N tested points x 1 , x 2 , . . ., x N , if N ≤ L 1 (where L 1 is the maximum number of points a local model can contain), use all the points to build a single local model.If N > L 1 , apply fuzzy clustering to divide the points into c size clusters, where and L 1 > L 2 (where L 2 is the number of points for adding one more local model).The clustering minimizes the following function J: where α is a constant greater than 1, v j is the centre of the cluster j, u ij represents the membership degree of x i in cluster j, and ∥ • ∥ denotes the Euclidean norm.v j can be calculated as . ., N and j = 1, 2, . . ., c size , and set t = 0. Compute u t+1 ij using terminate and output v j and u ij = u t+1 ij ; otherwise, increment t and recalculate v j and u t+1 ij until the stopping criterion ϵ is satisfied.

•
Step 4: Modelling-Build Local GP Surrogate Models For each cluster, construct local Gaussian Process (GP) surrogate models for the objective function and the constraint functions separately.

•
Step 5: Differential Evolution-Generate and Evaluate Candidate Points Use Differential Evolution (DE) [27] to generate NP candidate points and evaluate them using the surrogate models.DE is a population-based optimization algorithm that aims to find the global minimum of a given objective function within a highdimensional search space.The algorithm employs mutation, crossover, and selection strategies to evolve the population towards the optimal solution.During the mutation step, new candidate solutions are generated by combining existing solutions with a weighted difference vector, where the scaling factor is denoted by F. In the crossover step, these candidates are combined with the original population to form a trial population, with a defined crossover probability CR.
Incorporate the Constrained Expected Improvement (CEI) and Constrained Expected Prediction Error (CEPE) criteria, derived from the Gaussian Process (GP) surrogate models, into the evaluation process of candidate points.By computing the CEI and CEPE values for each candidate point, the algorithm leverages these metrics to guide the selection process within DE.Candidate points are prioritized based on higher CEI values, indicating a greater expected improvement, or lower CEPE values, signifying a lower prediction error.This strategic prioritization ensures the exploration of regions in the search space with the highest potential for improvement while adhering to the constraints.This iterative process continues until a termination criterion is met, such

Simulation
In this section, we illustrate the efficacy of the proposed CEI (Constrained Expected Improvement) and CEPE (Constrained Expected Prediction Error) methods for addressing constrained optimization problems.We begin by introducing traditional benchmark problems, and then undertake a comparison and evaluation of the two methods by analyzing statistics for each problem.

Benchmark Problem Description
We employ the classic set of problems presented in CEC 2006 by Liang et al. [28], which was proposed as part of the IEEE Congress on Evolutionary Computation in 2006, and all problems we solve including the modified parts can be found by Chaiyotha and Krityakierne [24].This suite encompasses 25 real-parameter optimization problems, rep-resenting a diverse array of characteristics.GP (Gaussian Process) is advantageous in low-dimensional settings (i.e., when the number of dimensions n ≤ 10) [29].Consequently, we select problems with suitable dimensions n and numbers of constraints m, as higher dimensions would entail significantly extended computation times with our GP surrogateassisted DE algorithm.Nine problems are chosen, including two that we modify.The characteristics of these problems can be found in Table 2, where Prob denotes the name of the problems, Dim represents the number of decision variables or dimensions, f (x * ) signifies the best-known results, and ρ is the approximate ratio of feasible solutions within the search space (inclusive of both infeasible and feasible solutions).

Experimental Settings
As discussed in Section 3.3 regarding the GP surrogate-assisted DE algorithm, we will specify certain parameters for solving these numerical simulations, follows closely the methodologies and values reported in existing literature, particularly the work by Jiao et al. [25].This reference provided a foundational basis for our initial parameter settings, ensuring consistency and comparability with established practices in the field.The problems have n dimensions and m constraints.In the Latin Hypercube Design (LHD), we generate (11n − 1) initial samples, which serve as our population size (popsize).The parameters utilized in the fuzzy clustering are L 1 = 80, L 2 = 20, α = 2, and ϵ = 0.5.In the Differential Evolution (DE) process, we set the population size (NP) to 30, the number of generations to 500, the crossover probability (CR) to 0.9, and the scaling factor (F) to 0.5.In the CEPE method, the number of candidates is set to 50 × n.The number of function evaluations (FEs) is also set to 50 × n.Each benchmark problem is repeated 100 times using both methodologies (CEI and CEPE) with identical parameters to ensure fairness in the comparison.
The experiments were conducted on a personal computer with an Intel(R) Core(TM) i5-8265U CPU @ 1.60 GHz (turbo up to 1.80 GHz).The Intel Corporation is based in Santa Clara, California, USA.The computational analysis and simulations were performed using MATLAB version 2020b.This setup provided a consistent and controlled environment for all computational experiments, ensuring the reliability and reproducibility of our results.

Results
Figure 2 showcases box plots for nine different benchmark problems.Each box plot displays the distribution of values obtained from 100 independent experiments using two methods: CEI and CEPE.Additionally, Table 3 lists the mean, standard deviation (sd), best, and worst values of the objective function obtained by these methods after 100 iterations.The mean and standard deviation values indicate the quality of the best feasible solutions across the 100 independent experiments.The best and worst entries signify the lowest and highest values of the best feasible solution obtained from the 100 independent experiments for each method, respectively.Examining Table 3, we observe that in six of the benchmark problems (G02mod, G04, G06, G11, G12, and G24), the CEPE method yields lower mean values compared to the CEI method, indicating better performance.Conversely, the CEI method performs better in the other three problems.Moreover, the standard deviation values mostly align with the mean values, with seven benchmark problems (G02mod, G03mod, G04, G06, G08, G12, and G24) exhibiting lower standard deviations for the method that has the lower mean.Regarding the best values, CEPE outperforms or matches CEI in eight out of nine problems, with the exception being G11, where CEI has a slight edge.For the worst values, CEPE is superior in seven out of nine problems, with CEI prevailing only in G08 and G11.
To statistically compare the performance differences between CEI and CEPE, a nonparametric Wilcoxon rank-sum test was employed.This test is particularly suitable when the data is not normally distributed, and it assumes equal variances between groups.The null hypothesis posits that the two methods yield the same distribution of values, whereas the alternative hypothesis contends that the distributions differ.A test statistic representing the sum of the ranks of the values is computed.A significant test statistic indicates a notable difference between the two methods, and a p-value below 0.05 is deemed statistically significant.The notation X < Y indicates that method X is significantly better than method Y at the 5% significance level, whereas X ≈ Y signifies that there is no significant difference between the two methods.The results are compiled in Table 4.The table reveals that for most benchmark problems, CEI and CEPE exhibit similar performance, with CEPE being significantly better than CEI only for problem G02.

Application
In this section, we will apply the GP surrogate-assisted DE algorithm to the Three-Bar Truss Design problem.The Three-Bar Truss Design problem [30,31] is a classical optimization problem that is widely studied within the engineering and mathematical communities.The objective is to design a truss structure, composed of three bars, capable of withstanding a specified load while minimizing weight.This problem is particularly significant in the design of lightweight and efficient structures in fields such as aerospace and civil engineering.Moreover, it serves as a quintessential example of how optimization algorithms can address complex societal and civilizational challenges.The problem can be formulated as follows: minimize f (x) = (2 subject to : where 0 ≤ x 1 ≤ 1 and 0 ≤ x 2 ≤ 1; l = 100 cm, P = 2 KN/cm 2 , and σ = 2 KN/cm 2 . For the experiment, we employ the same parameter settings as in the numerical simulations.The problem has two dimensions and three constraints.Using Latin hypercube design (LHD), we generate 21 initial samples for our population size.The parameters in the fuzzy cluster are set to L 1 = 80, L 2 = 20, α = 2, and ϵ = 0.5.In Differential Evolution (DE), the population size is assumed to be NP = 30, the number of generations is set to 500, the crossover probability is CR = 0.9, and the scaling factor is F = 0.5.In the CEPE method, we set the candidate number to 100.The number of function evaluations (FEs) is 100.The experiment is repeated 100 times using two different methodologies (CEI and CEPE), and the parameters are kept the same to ensure a fair comparison.
Figure 3 depicts box plots that display the distribution of the objective function values obtained using the CEI and CEPE methods across 100 independent experiments.Table 5 lists the mean, standard deviation (sd), best, and worst of the objective function values obtained by both methods over 100 iterations.The best-known result for this problem is approximately 263.8958.From the table, it is evident that CEPE performs better in terms of mean, standard deviation, and worst objective function values, while CEI marginally outperforms CEPE in the best objective function value.
Additionally, we employ the Wilcoxon rank-sum test to statistically compare the performance of the CEI and CEPE methods.With a p-value of 0.0538%, the test indicates that the CEPE method is significantly superior to the CEI method.

Discussion and Remarks
In this paper, we introduced the formulation of constrained optimization problems, the methodologies of Constrained Expected Improvement (CEI) and Constrained Expected Prediction Error (CEPE) in conjunction with the Gaussian Process (GP) surrogate-assisted Differential Evolution (DE) algorithm, and presented numerical benchmark simulations and applications.Additionally, we would like to emphasize the potential application of these methodologies in the field of engineering, where optimization plays a crucial role.
as reaching the maximum number of generations Maxt or achieving a predetermined level of convergence.The flowchart of the DE algorithm, highlighting the integration of CEI and CEPE, is depicted in Figure 1.• Step 6: Evaluation-Use Original Objective and Constraint Functions Evaluate the NP candidate points using the original objective and constraint functions.This step can also be considered as the final selection step of Differential Evolution.• Step 7: Iteration-Update Data and Repeat Steps 3 to 6 Incorporate new data from the observations into the model, and repeat Steps 3 to 6 until the termination condition is reached.The termination condition is typically based on the number of function evaluations, denoted as FEs.

Figure 1 .
Figure 1.Flowchart of the Differential Evolution Algorithm.

Figure 2 .
Figure 2. Box plots of the best feasible solution after 100 repetitions by CEI and CEPE in nine benchmark problems.

Figure 3 .
Figure 3. Box plots representing the distribution of the best feasible solutions obtained after 100 repetitions using the CEI and CEPE methods in the Three-Bar Truss Design problem.

Table 1 .
Summary of surrogate model methodologies in constrained optimization problems.

Table 2 .
Summary of benchmark characteristics.

Table 3 .
Function values (mean, standard deviation, best, and worst) achieved by CEI and CEPE methods after 100 independent repetitions.Values in bold indicate the superior results across comparisons.

Table 4 .
Wilcoxon signed-ranks test results at the 5% significance level.

Table 5 .
Function values (mean, standard deviation, best and worst) achieved by CEI and CEPE methods after 100 independent repetitions.Values in bold indicate the superior results across comparisons.