Dimensionality Reduction, Modelling, and Optimization of Multivariate Problems Based on Machine Learning

Simulation-based design optimization is becoming increasingly important in engineering. However, multi-point, multi-variable, and multi-objective optimization suffers from the "Curse of Dimensionality": it is highly time-consuming and often limited by the computational burden, as in aerodynamic optimization problems. In this paper, an active subspace dimensionality reduction method and an adaptive surrogate model were proposed to reduce the computational cost while maintaining high precision. The method combines an active subspace dimensionality reduction technique, a three-layer radial basis neural network, and a polynomial fitting process. For evaluation, a NASA standard test function problem and RAE2822 airfoil drag reduction optimization were investigated in the experimental design. The efficacy of the method was demonstrated by both examples, in which an adaptive surrogate model in a dominant one-dimensional active subspace was obtained and the optimization efficiency was improved by two orders of magnitude. Furthermore, the results show that the constructed surrogate model reduced the dimensionality and alleviated the complexity of conventional multivariate surrogate modeling while retaining high precision.


Introduction
Optimal design usually involves multiple disciplines, multidimensional variables, and complex, time-consuming calculation models [1,2]. Thus, carrying out multi-point, multi-variable, and multi-objective optimization work suffers from the "Curse of Dimensionality" [3]. Combining big data analysis and machine learning, intelligent optimization methods based on dimensionality reduction of the variable parameters are the current development trend; they can reduce time and space complexity, save the overhead of unnecessary features, and improve the optimization effect [4][5][6][7].
In complex physical systems, scientists and engineers study the relationships between a model's inputs and outputs. They employ computer models to estimate the parameters and their effects on the system. However, the process becomes intricate, if not impossible, when the simulation is expensive and the model has several inputs. To enable such studies, engineers may attempt to reduce the dimension of the model's input space.
Active subspaces are an emerging set of dimension reduction tools that identify important directions in the parameter space. Reducing the dimension can enable otherwise infeasible parameter studies [8]. Hence, the research contribution of this paper is based on active subspaces to explore and exploit the dimensionality reduction structure present in optimization problems. The main active features were extracted from samples of inputs and outputs, transforming the high-dimensional optimization problem into a low-dimensional subspace. Artificial Neural Networks (ANNs) and other machine learning algorithms were then employed to build high-efficiency and high-precision surrogate models, which both preserve the design precision and boost the optimization speed. The main contribution is to establish a lower-dimensional, high-fidelity surrogate with an ANN and polynomial fitting for efficient optimization design.
The rest of the paper is structured as follows: Section 2 presents related works and essential backgrounds. Section 3 presents the details of the proposed mechanism for constructing the surrogate model based on the radial basis function (RBF) with its pre and post-phases. The experimental design is presented in Section 4, and the obtained results and discussions are listed under Section 5. Finally, Section 6 includes the paper's conclusion.

Related Work
Surrogate models are used to perform simulations of complex systems. The cost of constructing accurate surrogate models grows rapidly with the number of input parameters, especially for time-consuming optimization problems. In some situations, the dimensionality of the input samples makes the surface-fitting process computationally intractable, leading to an efficiency bottleneck.
Dimensionality reduction of variables [9] and the corresponding fitting of surrogate models in the reduced dimensions need to be studied, transforming high-dimensional problems into low-dimensional ones. A simple dimensionality reduction method is to carry out sensitivity analyses [10] to determine which design parameters have a larger influence on the system response. The less sensitive parameters are then neglected to reduce the dimensionality considered in the surrogate, though discarding input variables in this way may lead to a low-fidelity model [11]. Such a low-fidelity surrogate model cannot satisfy the precision requirements of nonlinear multivariate problems, e.g., the transonic airfoil [12].
Another dimensionality reduction strategy for establishing an effective surrogate model is to find an active subspace of lower dimension within the entire variable space, which represents the directions of maximum change of the system response with respect to the input variables. Compared with traditional dimensionality reduction methods, this approach retains the maximum influence of the design variables on the objective, thus providing an equivalent high-fidelity surrogate model of much lower dimension.
The concept of the active subspace was introduced by Russi [13] and formalized by Constantine et al. [14]. Owing to the efficiency of these dimensionality reduction techniques, active subspaces have been used and studied in several engineering and mathematical problems [4,8,15-17]. Recently, many studies have tackled real-world and industrial problems, such as the optimization of an industrial copper burdening system [18], a surrogate model for low-Reynolds-number airfoils based on transfer learning [19], and parameter reduction of the composite load model proposed by the Western Electricity Coordinating Council [20]. Moreover, in the research proposed by Wang et al. [21], twenty-two-dimensional functions related to airfoil manufacturing errors were approximated through a response surface in a one-dimensional active subspace. The work demonstrated that the uncertainty of the aerodynamic performance can be significantly reduced using a measurement information function that selects a small number of inspection points on the airfoil surface.

Active Subspace
Based on active subspaces, parameter dimension reduction can take place to reduce the time and space complexity, save the overhead of unnecessary features, and improve the optimization effect. The mapping from input samples to model outputs can be seen as a multivariate function, and engineering models mostly contain several parameters. The active subspace is a low-dimensional subspace of the input sample space that captures the majority of the variability in the objective function. In other words, active subspaces identify a set of important directions: along these directions the simulation prediction changes fastest, while the remaining directions hardly affect the prediction and can therefore be neglected.
From the perspective of dimensional analysis, the relationships between physical quantities do not change when the measurement units are changed. For example, the relationship between the speed at which an object falls to the ground and the height from which it falls does not depend on whether the height is measured in feet or meters. Many different learning algorithms can capture such a relationship, and each specific learning algorithm generates a model with a specific inductive bias. The inductive bias of the learning algorithm is very important: with the same training data, different learning algorithms will generate different results. Therefore, the selection of the learning algorithm depends on the real problem to be solved.
There are some requirements to determine whether a simulation model is suitable for an active subspace analysis. The simulation model and its inputs should be well defined, each input should have a range, and there should be enough resources to run the simulation model. These requirements include the following.
Normalized inputs: a vector x should include m components, each in [−1, 1]. The purpose of the normalization is to prevent some components from being too large, which would dominate the result, while others are too small, so that their effect could be ignored [8]. If x_ℓ is the vector of lower bounds and x_u the vector of upper bounds of the raw input x_raw, the input is normalized component-wise as
x = 2(x_raw − x_ℓ)/(x_u − x_ℓ) − 1.
Sampling density ρ: random sampling is very useful for checking the dimension reduction in high-dimensional spaces. Engineers must choose ρ, where different ρ values yield different results; which ρ is better depends on how well it represents the parameter variability in each situation. There is no universal prescription for choosing ρ; the only constraint is that ρ must be a probability density.
Gradients: the active subspace arises from the gradient ∇f(x), so it is necessary to be able to evaluate the gradient. As in (2), the matrix
C = ∫ ∇f(x) ∇f(x)^T ρ dx = WΛW^T    (2)
produces eigenvectors and eigenvalues. First, the gradient ∇f(x_i) corresponding to each sample x_i is computed; then the eigenvalue decomposition is calculated. W represents the orthogonal matrix of eigenvectors, and Λ = diag(λ_1, ..., λ_n) represents the diagonal matrix of nonnegative eigenvalues [8].
From the above, eigenvectors are used to represent the active subspace dimension, and the eigenvalues correspond to the eigenvectors, which are used to determine the active subspace. So, finding the gap in eigenvalues and choosing the first k eigenvalues to represent the active subspace is needed.
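The normalization and eigenvalue-gap steps above can be sketched as follows (a minimal Python/NumPy sketch; the function names are ours, not from the paper):

```python
import numpy as np

def normalize_inputs(X, lower, upper):
    """Map each input component from [lower, upper] to [-1, 1]."""
    return 2.0 * (X - lower) / (upper - lower) - 1.0

def choose_active_dimension(eigvals):
    """Pick k at the largest gap between consecutive eigenvalues
    (eigvals assumed sorted in descending order)."""
    gaps = eigvals[:-1] - eigvals[1:]
    return int(np.argmax(gaps)) + 1
```

For example, an eigenvalue spectrum [10, 9, 1, 0.5] has its largest gap after the second eigenvalue, so k = 2 would be chosen.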
In complex systems, the active subspace works well provided that two features hold. The first is that the output changes monotonically with respect to the parameters when scientists informally describe the influence of the parameters on the model. The second is that the model has a nominal parameter value in theoretical or actual measurement situations. Back-of-the-envelope estimates of the variation of the input parameters usually adopt a perturbation of 5-10% relative to the nominal value; such a perturbation often produces little change in the model. If the input perturbations are within this range, a linear model can represent the global trends well. Many complex models do not have these features, and in such systems using an active subspace is not helpful [22].
In the quick-and-dirty check procedure, the least-squares method is used to obtain the coefficients of a linear regression model. The method that solves the model by minimizing the mean square error is called the least-squares method [23]. Given the data set D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, with x_i = (x_i1; x_i2; ...; x_id) and y_i ∈ R, each sample has d attributes. Writing X for the matrix whose rows are the samples and y = (y_1; y_2; ...; y_m), the most suitable coefficient ŵ minimizes
E_ŵ = (y − Xŵ)^T (y − Xŵ).
To minimize E_ŵ, let ∂E_ŵ/∂ŵ = 2X^T(Xŵ − y) = 0, which gives the closed-form estimate
ŵ = (X^T X)^(−1) X^T y.
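The closed-form least-squares estimate above can be computed numerically; the sketch below uses NumPy's `lstsq`, which minimizes the same squared error without forming the explicit inverse (the helper name is illustrative):

```python
import numpy as np

def least_squares_coefficients(X, y):
    """Minimize E = (y - X w)^T (y - X w).
    The minimizer is w = (X^T X)^{-1} X^T y; lstsq is used instead of
    an explicit matrix inverse for numerical stability."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Example: recover w = [2, 3] from noiseless data.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([2.0, 3.0, 5.0])
w = least_squares_coefficients(X, y)
```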

Surrogate Models
Surrogate models are very helpful for optimization problems that require multiple evaluations of the objective function. When each objective function evaluation takes a long time, the optimization itself becomes difficult. Population-based evolutionary techniques can, in particular, query black-box simulators in parallel. However, in the presence of hardware constraints, a better approach is to first approximate the objective function with a surrogate model and then perform the optimization on the surrogate [24].
Constructing a surrogate model means using known data of the independent and dependent variables to build a mathematical relation expression; this differs from finding unknown parameters in a given mathematical relation expression. Therefore, an appropriate relational assumption is needed to obtain an accurate surrogate model. A complete mathematical relation expression must be set up in the end, and generally this expression is continuous; although many changes are discontinuous, the surrogate model is better suited to analyzing continuous change problems. The surrogate model is used to predict unknown sample points based on known sample points. Under different assumptions, the construction of the surrogate model can differ, and surrogate models can be divided into two types: regression models and interpolation models.
The sample points usually contain a certain amount of "noise" in the regression model, so this kind of surrogate model does not need to pass through each sample point; if it did, it could not capture the true mathematical relation. The regression model can not only predict the value of an unknown point but also filter the noise in the sample data. In contrast, the interpolation model passes through each sample point and assumes each sample point to be correct; the value of an unknown point is generated through interpolation. Since the surrogate model is an estimation of unknown points from known points, it has an error term. Hence, while constructing the surrogate model, different assumptions and requirements on the error lead to different surrogate models [25].
The aerodynamic airfoil optimization process is very important in aircraft design. With the development of Computational Fluid Dynamics (CFD) techniques and the vast research on optimization algorithms in recent years, researchers have made great progress in aerodynamic shape optimization [25]. Using high-performance computers, the design period is shortened considerably. However, useful parameterizations of airfoil or wing shapes for engineering models often result in high-dimensional design variables, so it is a great challenge to search for the optimum design due to the large solution space [9,26]. Aerodynamic designers now demand a quick and efficient approach to airfoil selection in burdensome engineering projects [10,11]. The key issue is to construct a high-fidelity, efficient surrogate model that can obtain the desired solution as soon as possible [12].
There are several aspects to evaluating the performance of a surrogate model, such as the generalization ability (the ability of the trained prediction model to correctly reflect samples that do not appear in the training set), ease of implementation, and the training speed. While training the surrogate model, both the prediction accuracy on the training set and the generalization ability are very important. However, these two properties do not increase together. In some situations, such as for the regression surrogate model, once the prediction accuracy on the training set reaches a certain level, increasing it further decreases the accuracy at sample points outside the training set; the generalization ability therefore decreases.
A simple way to improve the generalization ability is to determine the parameters of the surrogate model several times, improving the prediction accuracy on the training set, comparing it with the prediction accuracy on samples outside the training set, and eventually choosing the better surrogate model among the trials. Another method is to cross-validate the sample points while training the surrogate model. Cross-validation means that, when constructing a surrogate model, the sample points are divided into a model-building part and a part for verifying the prediction error. For the same sample set, different divisions are performed to obtain different prediction errors, which are processed to obtain the prediction accuracy over the sample set. Through cross-validation, a more stable and reliable estimate of the prediction accuracy can be obtained [25].
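The cross-validation scheme described above can be sketched as a generic k-fold loop (an illustrative Python sketch; `fit` and `predict` are placeholder callables, not functions from the paper):

```python
import numpy as np

def kfold_cv_error(fit, predict, X, y, k=5, seed=0):
    """Estimate the prediction error of a surrogate by k-fold cross-validation.
    `fit(X, y)` returns a model; `predict(model, X)` returns predictions.
    Each fold is held out once for verification while the rest build the model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        errs.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return float(np.mean(errs))

# Example: a degree-1 polynomial surrogate on noiseless linear data.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
err = kfold_cv_error(lambda a, b: np.polyfit(a, b, 1),
                     lambda m, a: np.polyval(m, a), x, y, k=4)
```

On exact linear data, the cross-validated error is essentially zero; with noisy data it provides the stability estimate discussed above.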

Artificial Neural Networks
One of the most popular surrogate models is the Artificial Neural Network (ANN) [27]. ANN methods became famous as universal function approximators because they provide good results on unseen data and their evaluation is relatively computationally inexpensive. The back propagation (BP) neural network is the essence of current ANNs. It is a network containing multiple hidden layers that can deal with linearly inseparable problems, but it easily falls into local optima and is strongly sensitive to the initial weights. On the other hand, ANNs have strong learning capabilities, fault tolerance, large-scale parallel computing capabilities, fast computing speed, distributed storage capabilities, and very strong nonlinear mapping capabilities. The radial basis function (RBF) method is a traditional method for multivariate function interpolation. Broomhead and Lowe [28] applied RBFs to the design of neural networks to construct RBF neural networks. The RBF network simulates a neural network structure in the human brain: for a certain local area of the network input space, only a few nodes affect the output of the network, i.e., it is a local approximation network. Compared with other types of ANNs, RBF neural networks have a deep physiological basis, fast learning, excellent approximation performance, and a simple network structure, and they have been widely used in many fields [29,30].

Finding an Active Space
To reduce the dimensionality of the design variables and accelerate the optimizers, the methodology introduces the active subspace fundamentals. The active subspace can be used as an effective dimension reduction strategy. It is built from the eigenvectors of a symmetric, positive semidefinite matrix formed by averaging products of the partial derivatives of the objective of interest. If a lower-dimensional active subspace can be found, the active coordinates of the input samples can be used to construct an equivalent surrogate model of reduced dimension. The main feature of the active subspace is that it consists of the important directions of the input space, so it has a lower dimension than the input space. If an active subspace exists in a multivariate problem, the objective changes most along these directions, so it is useful to find the reduced coordinates. The matrix of averaged derivative products of f is obtained as
C = ∫ ∇_x f ∇_x f^T ρ dx = WΛW^T,    (6)
where x ∈ [−1, 1]^n includes all the normalized input variables, ρ is the sampling density on the input space, ∇_x f is the gradient of f with respect to x, Λ = diag(λ_1, ..., λ_n) is the diagonal matrix of nonnegative eigenvalues, and W is the orthogonal matrix of eigenvectors.
The eigenvalues measure how f changes on average [31]:
λ_i = ∫ (∇_x f^T w_i)^2 ρ dx,  i = 1, ..., n.
Any x can be represented as x = W_1 y + W_2 z, where y denotes the active variables and z the inactive variables. If Λ_1 contains the r < n comparatively larger eigenvalues, i.e., there exists a large gap between λ_r and λ_(r+1), then W_1 contains the first r eigenvectors and W_1^T x is the reduced coordinate y, i.e., the active variable [13]. The approximation can then be expressed as
f(x) ≈ g(W_1^T x) = g(y).
When the computational fluid dynamics (CFD) simulations for aerodynamic design have gradient capabilities (e.g., adjoint-based derivatives or algorithmic differentiation [32]), the matrix in (6) and its eigenpairs can be obtained numerically using the Monte Carlo method or quadrature rules [8]:
C ≈ (1/M) Σ_(j=1)^M ∇_x f(x_j) ∇_x f(x_j)^T.
For the more common optimization designs that do not have subroutines for gradients, a diagnostic regression surface with least squares is employed to uncover one-dimensional activity. M simulation samples are drawn independently according to the density ρ, and
â = argmin_a Σ_(j=1)^M (a_0 + a_1 x_j1 + ... + a_n x_jn − f_j)^2,    (12)
where â is the coefficient vector describing the subspace combination structure. No eigenvalues are computed in (12), but the vector â_(−0) = [â_1, ..., â_n]^T linearly approximates a one-dimensional active subspace as
C ≈ â_(−0) â_(−0)^T ∫ ρ dx = â_(−0) â_(−0)^T,  w = â_(−0)/‖â_(−0)‖.
When a linear one-dimensional active subspace exists, this least-squares regression surface method is much more effective than the finite-difference approach in (10), since it avoids the large computational cost and solver oscillation issues.
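When gradients are available, the Monte Carlo estimate of C and its eigendecomposition can be sketched as follows (an illustrative NumPy sketch under the assumption that `grad_f` and a sampler for ρ are supplied by the user):

```python
import numpy as np

def active_subspace(grad_f, sample, n_samples=1000, seed=0):
    """Monte Carlo estimate of C = E[grad_f grad_f^T] and its eigenpairs.
    `grad_f(x)` returns the gradient at x; `sample(rng)` draws x ~ rho.
    Returns eigenvalues (descending) and the eigenvector matrix W."""
    rng = np.random.default_rng(seed)
    grads = np.array([grad_f(sample(rng)) for _ in range(n_samples)])
    C = grads.T @ grads / n_samples          # averaged outer products
    eigvals, W = np.linalg.eigh(C)           # C is symmetric PSD
    order = np.argsort(eigvals)[::-1]        # sort descending
    return eigvals[order], W[:, order]

# Example: f(x) = sin(x1 + x2) in 3-D; the gradient never points along x3,
# so a one-dimensional active subspace along [1, 1, 0]/sqrt(2) is expected.
grad_f = lambda x: np.cos(x[0] + x[1]) * np.array([1.0, 1.0, 0.0])
sample = lambda rng: rng.uniform(-1.0, 1.0, 3)
vals, W = active_subspace(grad_f, sample, n_samples=200)
```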

Equivalent Surrogate Construction
The RBF neural network is a single-hidden-layer feed-forward neural network. The input layer is composed of sensing units, which transfer the information to the hidden layer. A radial basis function is used as the activation function of the hidden layer, which has a larger number of neurons and performs the nonlinear transformation from the input space to the hidden-layer space. The output layer is the linear combination of the hidden layer's outputs. As shown in Figure 1, there is an RBF neural network structure with I inputs, K hidden layer neurons, and M outputs. x_p = [x_1p, x_2p, ..., x_Ip]^T is the p-th input sample, where p = 1, ..., P indexes the input samples and I represents the input dimensionality.
φ_p = [φ(x_p, c_1), ..., φ(x_p, c_K)]^T ∈ R^(K×1) is the vector of hidden-layer outputs corresponding to the p-th input sample, where c_k (k = 1, ..., K) is the center of the k-th neuron. W = [w_1, w_2, ..., w_M] ∈ R^((K+1)×M) represents the output weight matrix, in which w_m = [w_0m, w_1m, ..., w_Km]^T represents the connection weights between the hidden layer and the m-th output node. Y = [y_1, y_2, ..., y_P]^T is the output matrix corresponding to the P input samples, and Σ represents the linear activation function.
The RBF network model is a flexible, reliable, and short-calculation-time surrogate model that can solve high-dimensional and high-order nonlinear problems well. The nature of the RBF network model is determined by the selected basis function. The Gaussian function is employed, which has the following formula:
φ(x, c_k) = exp(−‖x − c_k‖^2 / (2σ_k^2)),
where σ_k is the width of the k-th hidden layer neuron and ‖x − c_k‖ is the Euclidean norm of x − c_k, denoting the radial distance between the sample x and the center c_k.
Therefore, considering all the input samples, the m-th node output is calculated as
y_pm = Σ_(k=1)^K w_km φ(x_p, c_k) + w_0m,  p = 1, ..., P,
and the actual output matrix of the RBF network is obtained as Y = ΦW.
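Putting the pieces together, the forward pass Y = ΦW of the Gaussian RBF network can be sketched as follows (an illustrative NumPy sketch; the bias weights w_0m are handled by prepending a column of ones to Φ):

```python
import numpy as np

def rbf_forward(X, centers, widths, W):
    """Forward pass of a Gaussian RBF network: Y = Phi W.
    X: (P, I) inputs; centers: (K, I); widths: (K,); W: (K+1, M) with bias row."""
    # Squared radial distances ||x_p - c_k||^2, shape (P, K).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2.0 * widths ** 2))           # Gaussian basis outputs
    Phi = np.hstack([np.ones((X.shape[0], 1)), Phi])  # bias column for w_0m
    return Phi @ W

# Example: one center at the origin, unit width, zero bias, unit output weight.
X = np.zeros((1, 2))
centers = np.zeros((1, 2))
widths = np.array([1.0])
W = np.array([[0.0], [1.0]])
Y = rbf_forward(X, centers, widths, W)   # phi(0, 0) = exp(0) = 1
```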

Polynomial Fitting
The least-squares method, the most common way to generate a polynomial equation from a given data set, is used. The idea of polynomial fitting is to use a polynomial expansion to fit all the input samples in a specified analysis area and determine the expansion coefficients. This method is useful for simple models and easy to use. A polynomial function has the form
y = a_0 + a_1 x + a_2 x^2 + ... + a_k x^k,
and the residual is expressed as
R^2 = Σ_(i=1)^N (y_i − (a_0 + a_1 x_i + ... + a_k x_i^k))^2.
To find suitable expansion coefficients that minimize the residual, partial derivatives with respect to each coefficient are set to zero; solving the resulting system, as in (17), leads to the coefficients that minimize the residual.
Then, by performing some matrix transformations [33], a simplified Vandermonde matrix equation can be obtained and expressed as X^T y = X^T X a. Finally, the expansion coefficients are given by a = (X^T X)^(−1) X^T y.
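The normal-equation solution a = (X^T X)^(−1) X^T y can be sketched with an explicit Vandermonde matrix (an illustrative Python sketch; for ill-conditioned problems a QR-based solver such as `numpy.polyfit` is preferable):

```python
import numpy as np

def polyfit_normal_equations(x, y, degree):
    """Fit y ~ a0 + a1 x + ... + a_k x^k by solving X^T X a = X^T y,
    where X is the Vandermonde matrix with columns x^0 .. x^k."""
    X = np.vander(x, degree + 1, increasing=True)
    a = np.linalg.solve(X.T @ X, X.T @ y)
    return a  # coefficients in ascending powers

# Example: recover the coefficients of y = 1 + 2x + 3x^2.
x = np.linspace(-1.0, 1.0, 10)
y = 1.0 + 2.0 * x + 3.0 * x ** 2
a = polyfit_normal_equations(x, y, 2)
```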

The Standard Test Function
For the experimental design, the proposed method was tested with a standard test function problem and RAE2822 airfoil drag reduction optimization. The standard test function is analytical and is a well-known standard test problem in the NASA Langley MDO Test Suite with seven design variables [22]. The variable bounds for the minimization problem are 2.6 ≤ x_1 ≤ 3.6, 0.7 ≤ x_2 ≤ 0.8, 7.3 ≤ x_4 ≤ 8.3, 7.3 ≤ x_5 ≤ 8.3, 2.9 ≤ x_6 ≤ 3.9, and 5.0 ≤ x_7 ≤ 5.5. Figure 2 shows what each of the x variables refers to.

Finding One-Dimensional Active Subspace
We can run a quick-and-dirty check to find whether this standard test function has a one-dimensional active subspace by the following steps:
(1) Draw N input samples randomly within the variable bounds.
(2) Normalize each sample to [−1, 1] component-wise, where the dot operation is a component-wise multiplication.
(3) Evaluate the function value f_j for each normalized sample x_j.
(4) Come up with the coefficients of the linear regression model f_j ≈ â_0 + â^T x_j using least squares, where x_j is the j-th normalized sample and f_j is the corresponding function value [22].
After the dimensional reduction has been performed, the equivalent surrogate model construction is the next step. It starts by initializing m to 6, the number of variables, and N as the number of samples. Usually, the N samples are initialized randomly within the range of each input variable x_1, x_2, x_4, x_5, x_6, x_7. All variables are represented by an N × 6 matrix that holds the total initialization value set, where x_ℓ represents the lower bound and x_u represents the upper bound of the sample values. After initialization, X is obtained as the value set after the normalization process, where each element of X belongs to [−1, 1]. Then, the value of f(x) corresponding to each input sample is calculated. The next step is to generate the coefficients of the linear regression model using the least-squares method, the basic technique discussed earlier. The coefficient equation is
ŵ = (X^T X)^(−1) X^T y = X^+ y,
where X^+ denotes the Moore-Penrose pseudo-inverse. Lastly, plot the figure with x = Xŵ and y = f. The result of Xŵ is an N × 1 matrix; therefore, the input x, which has six dimensions, is converted to one dimension. If the figure shows a linear relationship between x and y, a one-dimensional active subspace has been found.
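The whole quick-and-dirty check can be sketched end to end (an illustrative NumPy sketch; the test function `f` and its bounds are supplied by the caller, and the projected pairs would then be plotted to inspect linearity):

```python
import numpy as np

def quick_and_dirty_check(f, lower, upper, n_samples=300, seed=0):
    """Gradient-free 1-D active-subspace check: fit a linear model
    f ~ a0 + a^T x on normalized samples and project onto w = a/||a||.
    A near-linear scatter of (X w, f) suggests a 1-D active subspace."""
    rng = np.random.default_rng(seed)
    m = len(lower)
    X_raw = rng.uniform(lower, upper, size=(n_samples, m))
    X = 2.0 * (X_raw - lower) / (upper - lower) - 1.0   # normalize to [-1, 1]
    y = np.array([f(x) for x in X_raw])
    A = np.hstack([np.ones((n_samples, 1)), X])          # prepend intercept a0
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    a = coef[1:]
    w = a / np.linalg.norm(a)                            # subspace direction
    return X @ w, y, w

# Example: a perfectly linear 2-D function has direction [1, 1]/sqrt(2).
f = lambda x: 3.0 * x[0] + 3.0 * x[1]
lower, upper = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
proj, y, w = quick_and_dirty_check(f, lower, upper, n_samples=100)
```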

The RBF Network Model
The RBF network model is a surrogate model that can solve high-dimensional and high-order nonlinear problems well. In general, two steps are used to train an RBF network. The first is to determine the centers of the neurons c_i. The next is to use the Back Propagation (BP) algorithm to determine the parameters w_i and β_i. It is worth mentioning that with this standard test function, it is not easy to use the RBF function. Hence, f(j) = sin(x_1(j) + x_2(j)) is employed in the dimensional reduction process, and then the RBF is used to construct the surrogate model. In Figure 3, the black crosses represent the input samples and the red line with points is the surrogate model.


Performing a Polynomial Fit
After finding the active subspace, the next step is to construct the equivalent surrogate model. The Co = polyfit(yy, f, 5) function in Matlab is used to perform a polynomial fit. Both yy and f are the results of the one-dimensional active subspace procedure. Co = polyfit(yy, f, 5) finds the coefficients of a degree-five polynomial that best fits the data. The coefficients in Co are in descending powers, and the length of Co is equal to 6. After that, the obtained coefficients and the input yy are used to obtain the corresponding f using the polynomial y = a_0 + a_1 x + ... + a_5 x^5. Eventually, the surrogate model can be plotted as in Figure 3, which shows that the surrogate model achieved an excellent fit.
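For readers without Matlab, `numpy.polyfit` mirrors `polyfit(yy, f, 5)` and also returns coefficients in descending powers (the sample data below are illustrative, not the paper's):

```python
import numpy as np

# Degree-5 fit analogous to Matlab's Co = polyfit(yy, f, 5).
yy = np.linspace(-1.0, 1.0, 50)       # reduced 1-D coordinate (illustrative)
f = 0.5 * yy ** 3 - yy + 2.0          # stand-in for the sampled objective
Co = np.polyfit(yy, f, 5)             # len(Co) == 6, descending powers
f_hat = np.polyval(Co, yy)            # evaluate the polynomial surrogate
```

Since the stand-in objective here is itself a cubic, the degree-five fit reproduces it essentially exactly.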


Optimization Using the Genetic Algorithm
Lastly, the optimization using the genetic algorithm is performed. The goal of the standard test function is to find the maximum of f(x) within the specified range. First, the dimensionality reduction part and the polynomial fit part are combined into a single function, the fitness function. This step performs the optimization based on the surrogate model; compared with optimizing the original function directly, using the surrogate model reduces the computational cost. The fitness function tells the selection function which samples x are likely to be retained and which should be discarded. Through multiple iterations and selections, the final result x and the corresponding function value (fval) are generated by the genetic algorithm. By default, the genetic algorithm searches for the global minimum, while the standard test function aims to find the maximum of f(x). Therefore, negating the objective function and finding its minimum is equivalent to finding the maximum of the original objective function.
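The negate-and-minimize idea can be sketched with a toy genetic algorithm. The surrogate, the bounds, and the GA operators below (truncation selection, arithmetic crossover, Gaussian mutation) are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def surrogate(x):
    # Stand-in fitness; the reduced 1-D surrogate model would go here.
    return np.sin(x)

def neg_fitness(x):
    # The GA below minimizes, so negate the objective to maximize f(x).
    return -surrogate(x)

def ga_minimize(fun, lo, hi, pop_size=40, gens=60, mut=0.1):
    pop = rng.uniform(lo, hi, size=pop_size)
    for _ in range(gens):
        order = np.argsort(fun(pop))
        parents = pop[order[: pop_size // 2]]          # truncation selection
        kids = (parents + parents[::-1]) / 2.0          # arithmetic crossover
        kids += rng.normal(0.0, mut, size=kids.shape)   # Gaussian mutation
        pop = np.clip(np.concatenate([parents, kids]), lo, hi)
    best = pop[np.argmin(fun(pop))]
    return best, -fun(best)   # undo the negation when reporting fval

x_best, f_best = ga_minimize(neg_fitness, -np.pi, np.pi)
print(x_best, f_best)   # near the maximizer of sin(x), i.e. x ~ pi/2
```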

RAE2822 Airfoil Drag Reduction Optimization
RAE2822 airfoil drag reduction optimization is more complex, and a general optimization workflow was designed, as shown in Figure 4. A Latin hypercube sampling (LHS) [22] strategy based on the maximin criterion was employed to generate the training samples for identifying active subspaces and constructing an RBF neural network in the reduced dimension. To evaluate the aerodynamic performance, the baseline airfoil was fed as an input and the class-shape transformation (CST) parameterization was used to obtain the required parameters. After that, an NS solver was run on a high-fidelity C-mesh. The experimental optimization formulation is as follows: the optimization objective is to minimize the drag coefficient C_d, and the maximum airfoil thickness t_max is taken as the constraint. In the optimization process, a one/two-step optimization, i.e., gradient-based optimizers combined with the Multi-Island Genetic Algorithm (MIGA) [22], can be chosen, and the constraints are handled by a penalty method [34]. The distinguishing feature of the MIGA is that the population of candidate solutions is divided into several subpopulations called "islands". On each island, the selection, crossover, and mutation operations are performed separately. During this process, some individuals are selected from each island and migrated to other islands in each iteration, with roulette-wheel selection used to carry out the selection. The advantage of the MIGA is that it can avoid local maxima/minima and suppress premature convergence. This work also considers a sequential least-squares quadratic programming (SLSQP) algorithm for the optimum search; SLSQP is suited to solving nonlinear programming problems.
The feasible optimum solution can be obtained efficiently by using the lower-dimensional RBF neural network and the two optimization algorithms as shown in Figure 5, which fully verifies the efficacy of the proposed method.
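The island model described above can be sketched roughly as follows. This is a minimal illustration assuming a simple 1-D objective, truncation selection in place of roulette, and a ring migration topology; only the "islands plus periodic migration" mechanism is the point here.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    # Stand-in objective to minimize (illustrative only)
    return (x - 1.3) ** 2

def evolve(pop, gens=10, mut=0.05):
    # One island: selection, crossover, and mutation performed locally.
    for _ in range(gens):
        order = np.argsort(f(pop))
        parents = pop[order[: len(pop) // 2]]
        kids = (parents + rng.permutation(parents)) / 2.0
        kids += rng.normal(0.0, mut, size=kids.shape)
        pop = np.concatenate([parents, kids])
    return pop

# Multi-island sketch: independent islands with periodic migration.
islands = [rng.uniform(-5.0, 5.0, size=20) for _ in range(4)]
for epoch in range(5):
    islands = [evolve(p) for p in islands]
    # Migration: each island sends its best individual to the next
    # island in a ring, replacing that island's worst individual.
    bests = [p[np.argmin(f(p))] for p in islands]
    for i, p in enumerate(islands):
        p[np.argmax(f(p))] = bests[i - 1]

best = min((p[np.argmin(f(p))] for p in islands), key=f)
print(best)   # close to the minimizer x = 1.3
```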

Finding One-Dimensional Active Subspace and Polynomial Fitting
To evaluate the aerodynamic performance, the baseline airfoil is fed as an input and then the class-shape transformation (CST) parameterization is applied to obtain the required parameters. After that, a high-fidelity C-mesh is solved with an NS solver. This yields thirty groups of airfoil samples; each X has ten variables and a corresponding drag coefficient f_m.
A function was designed to find the active subspace and perform the polynomial fit to construct the surrogate model. It is worth mentioning that using the RBF alone is not sufficient and does not provide a good fit. The process starts by loading the data shown in Table 1 and initializing m and N, which represent the original dimension and the number of samples, respectively. Next, both the lower and upper bounds of X are initialized. Notably, the code is the same as in the standard test function part; the only difference is the data.
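The active subspace identification itself can be sketched as follows, under the assumption that gradient samples are available (here, analytic gradients of the toy function f = sin(x1 + x2) stand in for CFD-based ones). The eigendecomposition of C = E[∇f ∇fᵀ], as in Constantine's active subspace method, yields the dominant direction w1 onto which the inputs are projected.

```python
import numpy as np

def active_subspace(grads):
    """Eigendecompose C = E[grad f grad f^T] from gradient samples."""
    C = grads.T @ grads / len(grads)
    eigval, eigvec = np.linalg.eigh(C)       # ascending eigenvalues
    return eigval[::-1], eigvec[:, ::-1]     # reorder to descending

# Toy model with a known 1-D active direction: f(x) = sin(x1 + x2),
# so grad f = cos(x1 + x2) * (1, 1) at every sample.
rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
grads = np.cos(X[:, 0] + X[:, 1])[:, None] * np.ones((50, 2))

lam, W = active_subspace(grads)
w1 = W[:, 0]        # dominant direction, proportional to (1, 1)
yy = X @ w1         # reduced 1-D coordinates for the surrogate fit
print(lam, w1)      # one large eigenvalue, one near zero
```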


Optimization with Constraint
The goal of this optimization process is to use the genetic algorithm with a defined constraint |Δt_max/t_0,max| ≤ 5%. Hence, a constraint is added to the genetic algorithm function to satisfy this condition. The function is designed to obtain maxthick from the geometry function and impose it as an inequality constraint. Furthermore, in the geometry function, the baseline airfoil data and the thickness data are loaded to generate maxthick from the input x. The first standard test function had only one objective function; for the airfoil, however, the thickness must also be considered. Under this constraint, finding the optimal result is more complex and costly.
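A minimal sketch of penalty-style constraint handling is given below. The maxthick and drag-surrogate functions are hypothetical stand-ins for the paper's geometry routine and reduced-dimension surrogate, and the baseline thickness value is illustrative, not the RAE2822 value.

```python
import numpy as np

T0 = 0.121   # assumed baseline maximum thickness (illustrative value)

def maxthick(x):
    # Hypothetical stand-in for the geometry routine that rebuilds
    # the airfoil from CST parameters and returns maximum thickness.
    return T0 * (1.0 + 0.5 * np.tanh(x.sum()))

def cd_surrogate(x):
    # Hypothetical stand-in for the reduced-dimension drag surrogate.
    return 0.011 + 0.01 * (x ** 2).sum()

def penalized_objective(x, weight=1e3):
    """Drag plus a penalty when |delta t_max / t_0,max| exceeds 5%."""
    violation = max(0.0, abs(maxthick(x) - T0) / T0 - 0.05)
    return cd_surrogate(x) + weight * violation

x = np.zeros(10)
print(penalized_objective(x))   # feasible point: penalty term is zero
```

Any unconstrained minimizer (such as the genetic algorithm above) can then be run on penalized_objective directly.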

Results and Discussion
Firstly, the aim of the speed reducer design problem (as in (17)) is to minimize the weight f(x). Fifty training samples were used for active subspace identification. A scatter plot of w_1^T x versus f is given in Figure 6, which reveals a dominant one-dimensional active subspace. In Figure 7, the blue hollow points represent the input samples after dimensionality reduction, and the red line represents the polynomial fit that generates the surrogate model. This demonstrates the good fitting ability of the polynomial fit and confirms that a good surrogate model is generated after fitting.
After optimization, the obtained results are shown in Table 2. Ten results for f(x) and the corresponding x_1, x_2, x_4, x_5, x_6, x_7 values were recorded. The data show that the minimum of f(x) was around 3915.55, and the variance of each variable x_i was less than 0.05, which means that the optimization result is very stable.
Secondly, the RAE2822 airfoil optimization is considered as lift-constrained drag minimization. The optimization objective is to minimize the drag coefficient C_d. The optimization is performed under the conditions Ma = 0.729, Re = 7.0 × 10^7, α = 2°.
The optimization formulation is as in (23), where the variation of t_max within 5% is also taken as a constraint. In the experiment, a high-fidelity C-mesh of the RAE2822 airfoil with 66,521 grid cells was generated and 30 training samples were used for active subspace identification. A scatter plot of w_1^T x versus f is given in Figure 8, which reveals a one-dimensional active subspace that enables a high-fidelity quadratic response surface model. LOOCV was also employed to check the approximation error, and the obtained result was a small MAE of 1.14 × 10^-4, as shown in Table 3. Furthermore, the optimization results for the drag coefficient, the thickness, and the corresponding variable x are shown in Table 4. The input baseline conforms to the standard RAE2822 airfoil, whose CFD solution is shown in Figure 9.
The convergence of the GA yields the optimized airfoil illustrated in Figure 9b; the optimality criterion was set as 10^-8. In addition, the surface pressure coefficient distributions and the shapes of the RAE2822 baseline and optimized airfoils are compared in Figures 10 and 11, respectively. Concerning the objective, the drag coefficient was reduced from 0.01123 to 0.0082, an approximately 27% reduction. Table 4 compares the CFD simulation and the surrogate prediction of the optimized airfoil and verifies the design accuracy. As the design space [−0.01, 0.01]^10 is not sufficiently large, the shock wave effects were not optimized to the minimum. However, the discovered active subspace improved the optimization efficiency by more than two orders of magnitude compared with traditional optimization using several hundred CFD simulations.
Lastly, this work aimed to find the existing active subspace and employ the reduced-dimension samples to generate the surrogate model, from which the optimization result was obtained for the multivariate problem. In some other research works, a sensitivity analysis is traditionally employed [10] to determine which design parameters have a larger influence on the system response, and the less influential parameters are ignored. In the proposed method, however, all design variables are considered in generating the one-dimensional active subspace, which gives comparatively higher precision. In addition, other approaches usually combine the dimensionality reduction with the polynomial fit; in this work, the problem dimension was reduced first, and the RBF was then used to generate the surrogate model.
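The LOOCV error check mentioned above can be sketched as follows for a one-dimensional polynomial surrogate; the yy and f data are synthetic stand-ins for the reduced airfoil samples.

```python
import numpy as np

def loocv_mae(yy, f, degree=5):
    """Leave-one-out cross-validation MAE for a polynomial surrogate."""
    errors = []
    for i in range(len(yy)):
        mask = np.arange(len(yy)) != i
        Co = np.polyfit(yy[mask], f[mask], degree)   # fit without sample i
        errors.append(abs(np.polyval(Co, yy[i]) - f[i]))
    return float(np.mean(errors))

# Synthetic 1-D reduced coordinates and responses for illustration
yy = np.linspace(-2.0, 2.0, 30)
f = np.sin(yy)
print(loocv_mae(yy, f))   # small mean absolute error on this data
```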

To further investigate the proposed method's performance, Table 5 compares some of the latest methods with the proposed one. These results are quoted as reported in each original work. The comparison shows that the proposed method achieves higher accuracy in optimizing the problem coefficients. The results show that both the standard test function and the RAE2822 airfoil drag reduction optimization had a one-dimensional active subspace, which enabled an efficient surrogate model and accelerated the optimization design. The high precision of the second case was proved, and the result was improved after the optimization process. Therefore, combining big data analysis and machine learning, the intelligent optimization method based on an adaptive surrogate model constructed in reduced dimensionality follows the current development trend. Hence, the proposed technique is of great significance: it can reduce time and space complexity, save the overhead of unnecessary features, and improve the optimization effect.

Conclusions and Recommendations
In this paper, the high effectiveness of using a surrogate model in active subspaces was proved on both the standard test function and the RAE2822 airfoil drag reduction optimization. Compared with traditional optimization using hundreds of time-consuming simulations, the optimization efficiency is improved by about two orders of magnitude, which is greatly significant for real-world applications. The approximation error was rather small, which means that the process achieves high precision. The surrogate model constructed in reduced dimensionality also largely alleviates the complexity of conventional multivariate surrogate modelling. Future work can extend the advantages of the proposed method to more aerodynamic shape optimization or other related problems. Although a quick check for a one-dimensional active subspace has limitations, the average derivative of f and the eigenvalues can be computed for r-dimensional active subspaces, which would be widely applicable. This method is of great significance to the current high-dimensional reduction field and also provides a beneficial reference for other methods in tackling such multivariate problems more effectively.