An Efﬁcient Kriging Modeling Method Based on Multidimensional Scaling for High-Dimensional Problems

: Kriging-based modeling has been widely used in computationally intensive simulations. However, the Kriging modeling of high-dimensional problems not only takes more time, but also leads to the failure of model construction. To this end, a Kriging modeling method based on multidimensional scaling (KMDS) is presented to avoid the “dimensional disaster”. Under the condition of keeping the distance between the sample points before and after the dimensionality reduction unchanged, the KMDS method, which mainly calculates each element in the inner product matrix due to the mapping relationship between the distance matrix and the inner product matrix, completes the conversion of design data from high dimensional to low dimensional. For three benchmark functions with different dimensions and the aviation ﬁeld problem of aircraft longitudinal ﬂight control, the proposed method is compared with other dimensionality reduction methods. The KMDS method has better modeling efﬁciency while meeting certain accuracy requirements.


Introduction
With the improvement of design levels and application requirements [1][2][3], the optimal performance of products has attracted much attention. Although the computing power of computers continues to increase, both the computer simulation and analysis of complex problems are still very time consuming. The traditional modeling method requires more expensive simulation evaluation, so it is not suitable for solving black-box problems. Therefore, using Kriging surrogate models with higher approximation accuracy instead of complex simulation sequential modeling methods has become a research hotspot in the field of product design and optimization [4][5][6][7][8][9]. Approximate construction and space exploration are mainly completed through experimental design in Kriging models, given their potential to estimate information. The Kriging model, which could also help to shorten the design cycle, reduce research and development costs, and improve design accuracy, is suitable for solving complex black-box problems.
Although Kriging has demonstrated its great advantages in many fields [10][11][12][13][14], there are still bottlenecks that need to be overcome when using Kriging to solve highdimensional modeling problems. There are two main solutions. One is the use of parallel computing [15][16][17]; the other is to use dimensionality reduction methods to reduce the dimensionality of Kriging modeling, thereby improving the efficiency of Kriging modeling. In this work, we prefer the latter.
Depending on the different commonly used benchmark functions, it takes about 2000 s to 3500 s to complete the modeling of the 80-dimensional problem by sampling 200 training design points through sequential Kriging optimization according to our test. Therefore, the construction of a Kriging surrogate under high-dimensional problems is time consuming. Most of the time is spent on the inversion of the covariance correlation matrix and the solving of the Kriging correlation parameter (also known as the hyper-parameter).
Kriging modeling [18][19][20] not only considers the relationship between the location of the estimated point and the known data location, but also considers the spatial correlation of the variables. The relationship between positions is mainly coordinated and improved by the distance between data points. Moreover, the spatial correlation of variables [21,22], which is the key to constructing the covariance correlation matrix, is manifested in the Kriging correlation parameters obtained by maximizing the likelihood function through the pattern search algorithm. For Kriging sequential modeling with an increasing number of data points, the dimensions of the correlation parameter vectors are consistent with the dimensions of the design variables without considering dimensionality reduction, while the optimal correlation parameters are obtained by maximizing the complex multimodal likelihood functions. The estimation of the likelihood function in each loop demands the inversion of the covariance matrix. In addition, the dimensionality of the covariance matrix increases with the increase in the amount of sample data, which will cause the complexity of the inversion operation of the covariance matrix to also increase, so that the time spent obtaining the optimal correlation parameters will also increase. The above observations may explain why the Kriging modeling of high-dimensional problems is time consuming.
If the dimensionality of the design variables is high, the optimization of the correlation parameters becomes a complex and multimodal high-dimensional optimization problem, which will exceed our preset calculation amount. Many of the latest works have been focused on solving the "curse of dimensionality" problems that affect Kriging modeling. There are usually three ways to deal with such high-dimensional problems: one is to simplify the calculation of the inverse of the correlation matrix by dividing the high-dimensional space into several subspaces, and then build a high-dimensional model through integration; one is to improve the efficiency of Kriging modeling by reducing the calculation time of hyper-parameters under the condition that the dimensionality does not change; the other way is to establish a Kriging approximation model after dimensionality reduction.
Applying gradient enhancement Kriging (GEK) [23][24][25][26] using cheap gradient information to solve high-dimensional problems can effectively enhance the accuracy of the Kriging model. To this end, solving the large correlation matrices problem that affects the GEK models applied to high-dimensional problems enables GEK to be efficient and accurate. For example, the weighted gradient enhancement Kriging method [27] reduces the time spent decomposing large correlation matrices in high-dimensional gradient enhancement Kriging by establishing a series of sub-models with smaller correlation matrices and integrating into these sub-models appropriate weight coefficients. Chen et al. proposed a screening-based gradient-enhanced Kriging modeling method [28], which uses feature selection technology to evaluate and rank the impact of each input variable on the output, and proposes empirical screening criteria to select the gradient, and finally uses the selected partial gradient to construct the Kriging model, thereby improving the modeling efficiency of high-dimensional problems. By combining recursive decomposition and sequential sampling for the establishment of accurate Kriging problems in high-dimensional spaces, Kang et al. [29] perform variable decomposition using an interactive estimation of the full-dimensional Kriging meta-model, and divide the whole space into several subspaces according to the result of variable decomposition. The conversion from high-dimensional space sampling to low-dimensional space sampling is realized in each subspace.
Bouhlel et al. [30,31] proposed a method (KPLS) whereby the partial least squares (PLS) technique is combined with the Kriging model. This method uses the least squares dimensionality reduction method in the process of establishing the Kriging model, which reduces the number of hyper-parameter calculations of the model to be consistent with the number of principal components retained by the PLS, thereby accelerating the construction of the Kriging model. Boublel et al. [32] combined KPLS and GEK to propose a new GE-KPLS method. It adds gradient information to the correlation matrix. Although the size of the correlation matrix is slightly increased, it greatly reduces the number of hyperparameter calculations. Zhao et al. [33] proposed a new Kriging modeling method (KMIC) that combines Kriging and the maximum information coefficient (MIC). This method uses the MIC to estimate the relative size of hyper-parameters according to the characteristics of the hyper-parameters after optimization. Then, the maximum likelihood estimation is reconstructed to reduce the calculation amount of hyper-parameter estimation. A distance correlation-based Kriging modeling method (KDIC) [34], which is somewhat similar to the KMIC method, uses distance correlation (DIC) to estimate the relative size of the hyperparameters in the new correlation function, and transforms the hyper-parameter tuning process into a one-dimensional optimization problem.
For the dimensionality reduction methods used to improve modeling efficiency, Liu et al. [35] used Sammon mapping [36] to transform the training data from the original space to the lower dimensional space by minimizing the distance error between the sample points in the two data sets before and after dimensionality reduction to achieve the purpose of dimensionality reduction. However, this mapping method does not consider the relationship between input and response when reducing dimensionality. Zhou et al. [37] proposed a modeling method that combines Kriging and sliced inverse regression (SIR). The method uses SIR to construct a new projection vector to achieve dimensionality reduction. The projection vector reduces the dimensionality of the original input vector without losing the basic information of the model response. In low-dimensional space, a new Kriging correlation function is constructed using the tensor product of several correlation functions in each projection direction.
The above-mentioned dimensionality reduction method reduces the modeling time while ensuring that certain model accuracy requirements are met. After all, things have two sides. The improvement of modeling efficiency will lead to the loss of accuracy to a certain extent. Therefore, how to improve the modeling efficiency while minimizing the loss of precision still needs further research.
For this purpose, a new Kriging modeling method (KMDS) based on multidimensional scaling (MDS) is proposed in this work. The basic idea is as follows: firstly, highdimensional design data are converted to low-dimensional data through MDS; then, the reduced dimensional data are used to establish a low-dimensional Kriging model. After that, new high-dimensional design points can be obtained via Latin hypercube sampling and estimated using expensive functions, and then the sampling points can be added to the high-dimensional data set. Finally, we can go back to the first step and repeat the process until the number of sample points meets the stopping criterion.
The remaining arrangements for this work are as follows. The background of the Kriging model is introduced Section 2. The third states the key technology and outlines the realization process of the proposed method. The benchmark functions are tested and analyzed in Section 4. The fifth section is the study of the simulation examples, which are based on aircraft longitudinal flight control data. Conclusions are given in Section 6.
The KMDS algorithm addresses the difficulties of achieving high-dimensional modeling based on the Kriging model, and greatly improves the modeling efficiency on the basis of meeting certain accuracy requirements. It guides a new direction for the modeling of high-dimensional engineering problems. Therefore, it is suitable for publication in the Algorithms journal.

Kriging Model
Kriging is widely used to build surrogate models in applied mathematics, statistics, and machine learning. The introduction of its concise theory is given below.
For d-dimensional design data, X = [x 1 , . . ., x m ] T X ∈ m×d and objective response Y = [y 1 , . . ., y m ] T , Y ∈ m×1 , the Kriging model is expressed as follows: The formula consists of two parts. The first part µ(x) is the trend function (also called the expectation function). When µ(x) is equal to 0, µ, and ∑ p i=1 β i f i (x), the models are called simple Kriging, ordinary Kriging, and standard Kriging models, respectively, where the function f i (x) is the regression model, and the parameter β i is the coefficient of the i-th regression model. Standard Kriging uses the regression function to determine the change in the process mean [38].
The second part in Formula (1) is a stochastic process model established by observing the data and quantifying the correlation of the data. It is often considered to be the realization of the stochastic process Z(x) with a mean of 0 and a variance of σ 2 . The covariance between design points is shown in Formula (2): where process variance σ 2 is a scalar and θ is a relative parameter (or called hyperparameter) of the kernel function. By optimizing θ, the correlation between the design points can be adjusted adaptively. The expression of the spatial correlation function R θ, x i , x j is shown in Equation (3): There are many options for spatial correlation functions. However, the most widely used correlation function is the Gaussian function. The expression of the Gaussian model is shown in Equation (4): Under the above assumptions, the best linear unbiased estimate of y(x) is: Parameter r and R in Formula (5) are: Among them, the least square estimationβ of β can be calculated using Equation (7): In addition, the predicted mean square errorŝ(x) at any point x can be obtained by calculating the formula in Equation (8): It can be seen from Equations (3), (7) and (8)   on the maximum likelihood estimation theory, the unconstrained optimization algorithm is used to maximize Equation (10) to obtain the optimal θ value [39]: In fact, the best approximation curve is not necessarily obtained from the optimal value. As long as the obtained theta value is close to the optimal value, the Kriging model will have a good approximation result [39].

Multidimensional Scaling (MDS) Method
In most cases, although the sample data that people observe or collect are highdimensional, they may only be low-dimensionally distributed and closely related to the learning task. Generally speaking, a good dimensionality reduction method can usually greatly reduce the computational cost while preserving most of the function information. Dimensionality reduction means transforming the original high-dimensional attribute space into a low-dimensional "subspace" through a certain mathematical transformation. Multidimensional scaling (MDS) [40][41][42][43] can alleviate the problems associated with sparse sample data and the difficulty of distance calculations in high-dimensional situations. Different from the idea of linear dimensionality reduction, multidimensional scaling pays more attention to the internal characteristics of high-dimensional data. It focuses on preserving the "similarity" of information in the high-dimensional space, and the "similarity" is usually defined by the Euclidean distance. That is to say, the main idea of this work is to keep the distance between samples in the original space unchanged in the low-dimensional space.
Assume that the distance matrix of m samples in the original space is matrix T ∈ R m×m (Equation (11)), where the element t ij in the i-th row and j-th column is the distance from sample x i to sample x j .
The goal of dimensionality reduction is to obtain the representation Z ∈ R d ×m of the sample in the d (d ≤ d) dimensional space, and the Euclidean distance of any two samples in the d dimensional space is equal to the distance in the original space; that is, where B is the inner product matrix of the sample data after dimensionality reduction, and the calculation formula of each element in B is b ij = z T i z j . Therefore, the relationship between T and B can be stated by Equation (12): Center the sample Z after dimensionality reduction; that is, ∑ m i=1 z i = 0 (0 ∈ R d is a vector of all zeros). Obviously, the sum of all rows and the sum of all columns in matrix B are both zero, i.e., ∑ m i=1 b ij = ∑ m j=1 b ij = 0. Under this condition, Equations (13)-(15) can be generated from Equation (12): Algorithms 2022, 15, 3 where the symbol tr(·) refers to the trace of the matrix, and the trace of the inner prod- Assume that the following Formulas (16)-(18) hold, Equation (19) can be obtained due to Equations (12)- (18): The inner product matrix B can be obtained through the derivation of Equation (19) and the distance matrix T, which remains unchanged before and after dimensionality reduction.
The eigenvalue decomposition of matrix B is performed in Equation (20) to complete the dimensionality reduction: where parameter Λ = diag(λ 1 , λ 2 , . . ., λ d ) is the diagonal matrix composed of eigenvalues and According to Equation (12) and B = Z T Z , the Equations (21) and (22) can be formed: Assuming that there are d * non-zero eigenvalues, they form a diagonal matrix Λ * = diag(λ 1 , λ 2 , . . ., λ d * ). Let V * denote the corresponding eigenvector matrix, then Z can be expressed as follows: In practical applications, in order to reduce the dimensionality effectively, it is often only necessary that the distance in the space after dimensionality reduction is as close as possible to the distance in the original space, and does not have to be strictly equal. Next, arrange the d * non-zero eigenvalues in order from largest to smallest (it is worth noting that the order has been arranged during eigenvalue decomposition in MATLAB), and take the first d (d << d * ) eigenvalues as a low-dimensional diagonal matrix Λ = diag(λ 1 , λ 2 , . . ., λ d ). At this time, let V denote the corresponding eigenvector matrix, then Z can be expressed by Equation (24): Based on the above analysis, the MDS method can be simply described in Algorithm 1.

Determination of the Space Dimension d after Dimensionality Reduction
Multidimensional scaling can freely determine the low-dimensional space dimension d when performing dimensionality reduction on high-dimensional data. In this work, the dimension d = 3 is selected. For specific reasons, we use the 50-dimensional Griewank function as an example to explain. The expression of the Griewank function is shown in Equation (25): Table 1 is the modeling result of the 50-D Griewank function through Kriging and KMDS. Among them, KMDS-1, KMDS-2, KMDS-3, KMDS-4 and KMDS-5 refer to reducing the high-dimensional data to 1-D (one dimension), 2-D, 3-D, 4-D and 5-D data, respectively. During the modeling process, the Latin hypercube sampling (LHS) method is used to generate N (N = 10 or N = 20) initial sample points. The modeling time and accuracy of Kriging and KMDS methods when adding 10N sample points are compared. Each method was run 10 times separately, and finally the average value was taken as the result so as to ensure the accuracy of the results. The RMSE (root mean square error) values in Table 1 are obtained by using leave-oneout cross-validation [44]. The concrete expression for finding the RMSE is as follows: The parameter k here represents the number of samples in the current data. If the Kriging model is used to estimate the variance of point x i , we first need to reconstruct the Kriging model with the remaining k − 1 sampling points, except point x i . Then, we calculate the estimated varianceŝ 2 i of point x i by using the newly built Kriging model and Formula (8). After repeating k times to complete the variance estimation of these k sampling points, the average value can be calculated to obtain the RMSE following Equation (26).
To visually prove the stability and effectiveness of KMDS, the Griewank function was tested 10 times with different sample points, and the results are shown in the box plots in Figures 1 and 2. The middle line (solid red line) of each box plot shows the median value of the 10 data. The blue solid line above the box is the upper quartile line, and the blue solid line below the box is the lower quartile line. The top horizontal line is drawn on the basis of the upper quartile line at a distance of 1.5 times the interquartile range (the upper quartile minus the lower quartile) or the limit of the data (if the limit of the data is within 1.5 times the interquartile range). The bottom horizontal line is drawn in the same way. Abnormal data points outside the horizontal line are represented by "+". Judging from the median time (red solid line) and the size of the box (the area formed by the upper quartile and the lower quartile), the median value of the KMDS method is very low, and the box area is also very small. It can be seen from the RMSE box plot that the RMSE of the other four methods are ideal except for the larger median value and frame area of the KMDS-1 and KMDS-2 methods. This proves that the KMDS method can guarantee accuracy and has a greater advantage with regard to speed. In addition, this method has very little fluctuation in terms of the time spent on these ten modeling scenarios, and the error convergence of KMDS-3, KMDS-4 and KMDS-5 is relatively stable. The following characteristics can be further explained on the basis of the data shown in Algorithm 1 and Figures 1 and 2. (1) The running time of the KMDS method is greatly shortened compared to Kriging modeling. When adding 100 sample points, its modeling time is already 10-60 times faster than the Kriging method. When 200 sample points are added, the time spent during the KMDS method is about 18-94 times shorter than that of Kriging method. (2) In the low-dimensional space after dimensionality reduction, the The following characteristics can be further explained on the basis of the data shown in Algorithm 1 and Figures 1 and 2. (1) The running time of the KMDS method is greatly shortened compared to Kriging modeling. When adding 100 sample points, its modeling time is already 10-60 times faster than the Kriging method. When 200 sample points are added, the time spent during the KMDS method is about 18-94 times shorter than that of  The following characteristics can be further explained on the basis of the data shown in Algorithm 1 and Figures 1 and 2. (1) The running time of the KMDS method is greatly shortened compared to Kriging modeling. When adding 100 sample points, its modeling time is already 10-60 times faster than the Kriging method. When 200 sample points are added, the time spent during the KMDS method is about 18-94 times shorter than that of Kriging method. (2) In the low-dimensional space after dimensionality reduction, the higher the dimensionality, the longer the modeling time required by the KMDS method. For example, KMDS-5 took 15.5642 s to model the 220 sampling points, while KMDS-1 only took 2.9883 s. However, compared with the 279.4006 s required by the Kriging method, the actual time they took is not worth mentioning. (3) In terms of error, it is obvious that as the dimensionality of the low-dimensional space increases, the number of errors produced by the KMDS model becomes lower and lower. The error score of KMDS-1 and KMDS-2 is very large compared with the Kriging model, but when d is 3, the error score of KMDS-3 is very close to that of the Kriging model. Even though the RMSE will become lower and lower as the dimension d increases, the rate of decline is almost invisible. In addition, these figures also show that KMDS-1 and KMDS-2 not only have large number of errors, but also have poor stability. However, KMDS-3, KMDS-4 and KMDS-5 not only produce fewer errors, but they are also more stable in comparison to KMDS-1 and KMDS-2.
Based on the above examples and analysis, and balancing the requirements of accuracy and efficiency, it is reasonable to reduce the high-dimensional problem to a threedimensional problem. The reason is that when d is 3, the model error score of KMDS-3 is very close to that of the Kriging model, which can meet the accuracy requirements, and the time taken is dozens of times faster than the Kriging model. When d is greater than 3, the KMDS-4 and KMDS-5 methods do have fewer errors, but compared to KMDS-3, the error score is only reduced a little, while the time has doubled.
In addition, Sammon mapping [36] was used to further verify the dimensionality reduction effect of the KMDS method in this work. Sammon mapping achieves dimensionality reduction by minimizing the distance difference between corresponding points in the two spaces of the original space and the low-dimensional space; that is, by minimizing the error function of Equation (27): The above error is calculated for the original data and low-dimensional data of each modeling process, and the average error is taken as the final error. Accordingly, Table 2 shows the results of the error given by Equation (27). It can be seen from Table 2 that when d is 3, the range of the distance error score between the two spaces is 0.4 to 0.5, which is very small. It also proves the effectiveness of the KMDS dimensionality reduction method.

Flow Chart and Specific Steps of the KMDS Method
The flowchart of the KMDS method is shown in Figure 3, and the specific process is completed in the following seven steps.
Step 1: Initial experimental design. Set the sets X, X 1 and Y to be empty sets. The N initial sample points x i (i = 1, · · · , N) are generated using the Latin hypercube sampling (LHS) method, and they are added to the set X. Estimate the expensive function of the initial sample points to obtain y i (i = 1, · · · , N), which is added to the set Y.
Step 2: Dimensionality reduction. The multidimensional scaling method is used to reduce the dimension of data set X. Section 3.1 shows that the ideal low-dimensional space dimension d after dimensionality reduction is 3; that is, the original high-dimensional data are reduced to three-dimensional data. See Section 3.1 for the specific dimensionality reduction process. The low-dimensional data are stored in the set X 1 .
Step 3: Build the Kriging model. The Kriging model is constructed by using new sample data sets X 1 and Y. At this time, the Kriging model is three-dimensional.
Step 4: Obtain a new sample point. Obtain a new sample point x i through the LHS, and calculate its expensive function value y i .
Step 5: Determine whether the stopping criterion is met. In this work, the estimated number of sampling points is used as the stop criterion, and the maximum number of evaluations is set to N max . If the number of evaluations is less than N max , proceed to Step 6; if the number of evaluations is greater than or equal to N max , skip to Step 7.
Step 6: Update the data set. The new sample point x i is added to the sample set X, and y i is added to the sample set Y. Go back to Step 2.
Step 7: End. Output the Kriging model.  Step 1: Initial experimental design. Set the sets X , 1 X and Y to be empty sets.
The N initial sample points are generated using the Latin hypercube sampling (LHS) method, and they are added to the set X . Estimate the expensive function of the initial sample points to obtain , which is added to the set Y .
Step 2: Dimensionality reduction. The multidimensional scaling method is used to reduce the dimension of data set X . Section 3.1 shows that the ideal low-dimensional space dimension d′ after dimensionality reduction is 3; that is, the original high-dimensional data are reduced to three-dimensional data. See Section 3.1 for the specific dimensionality reduction process. The low-dimensional data are stored in the set 1 X .
Step 3: Build the Kriging model. The Kriging model is constructed by using new sample data sets and Y . At this time, the Kriging model is three-dimensional.

Numerical Test
The KMDS method is compared with the ordinary Kriging model and the KPLS method to prove the effectiveness of the KMDS method. The Kriging model has been explained in detail in the second section. The KPLS method proposed by Bouhlel in 2016 [30,31] is a combination of Kriging and the partial least square method. The core idea is to improve the modeling speed by reducing the complexity of hyper-parameter calculations. Therefore, the comparison between the KPLS method and the KMDS method will further demonstrate and verify the effectiveness of the KMDS method from the perspective of dimensionality reduction.
We use three different types and different dimensions of benchmark functions to test the effectiveness of the KMDS method. Different test conditions are used for each benchmark function to meet most of the situations encountered by the Kriging surrogate model in the modeling field. All tests are performed in MATLAB 2019a using a RedmiBook Pro machine with an AMD Ryzen 7700U 1.8 GHz CPU and 16 GB RAM.

Griewank Funciton Test
The first test function used is the Griewank with greater complexity, and its specific expression is shown in Equation (28): For the case where the dimensions of the function are 30 and 70, 110 and 220 sampling points are used for testing in order to effectively verify the effect and stability of the KMDS method in different dimensions and with different sample sizes. The Kriging method, KMDS, KPLS-1, KPLS-2 and KPLS-3 were used to test the modeling time and convergence accuracy of the Griewank function. Table 3 shows the average modeling time and its standard deviation (STD). Furthermore, the average modeling error and its standard deviation were obtained by testing the Griewank function in five different modeling methods, each of which represents the average of 10 test results. The test results show that the KMDS method always takes the shortest time to build the Kriging model under the conditions of different dimensions and different samples. With the increase in the number of dimensions and the number of sample points, the efficiency advantage of the KMDS method becomes more obvious. The main reason is because the KMDS method transforms high-dimensional modeling into three-dimensional modeling, so the amount of time spent calculating the correlation matrix and hyper-parameters in the modeling process is greatly reduced, thereby improving the modeling efficiency of Kriging. In terms of accuracy, the accuracy of the KMDS method is very close to that of ordinary Kriging. Compared with the KPLS method, when the condition of the test function is 30D and 110 samples, the test time of the KMDS method is only 0.1188 s faster than that of the KPLS-1 method, but the error score of KPLS-1 is 0.707 lower than that of KMDS. The KPLS-1 method is better at this time. The reason may be that the KPLS method considers the relationship between design variables and response variables to make the accuracy higher. In addition, retaining a principal component greatly reduces the amount of calculation of hyper-parameters, thereby improving the efficiency of Kriging modeling. However, when the Griewank function is 30D and 220 sample points, KMDS is faster than KPLS-1 by 4.077 s, but the difference in the error score is only 1.2166. For the Griewank function of 70D and 110 sample points, KMDS is 2.3 times faster than KPLS-1, and the difference in the error score is only 0.1328. Therefore, with the increase in the number of sample points or the increase in the number of test dimensions, the test time of the KPLS-1 method is much longer than that of the KMDS method. At this time, the advantage of the KMDS method in terms of time is far greater than the disadvantage in terms of accuracy. For KPLS-2 and KPLS-3, although the accuracy of these two methods is a bit higher than that of KMDS, the modeling efficiency cannot match the KMDS method. The reason may be that the calculation burden of the correlation matrix and hyper-parameters in the KMDS method is greatly reduced during the modeling process, while the KPLS method mainly reduces the calculation cost of hyper-parameters to improve efficiency. Therefore, the efficiency of the KPLS method is lower than that of KMDS. Therefore, the KMDS method can effectively improve modeling efficiency while maintaining a small loss of accuracy. In addition, the standard deviation also reflects the stability of the KMDS method, especially in terms of modeling efficiency. Figure 4 shows the average change trend of the modeling time and modeling error in ten tests of five different modeling methods for the 70D Griewank function. For each test, the Kriging model is firstly established through 10 initial sample points, and the corresponding modeling time is used as the first time. In the next iteration process, whenever a new sampling point is added, the Kriging model is updated again, and the total time spent executing the algorithm is recorded. When the number of newly added sampling points reaches 100, the entire iterative process is stopped. Figure 4 clearly shows how as the number of sample points increases, the modeling error scores of the five modeling methods gradually decrease, and the time it takes for KMDS is much less than that of the other four methods. In addition, when the modeling accuracy of the five different methods gradually stabilized, their RMSE values differed slightly, while the RMSE obtained by the KMDS method was generally small at first. Therefore, the KMDS method is superior to the other four methods in terms of efficiency under the premise of ensuring accuracy for high-dimensional problems.   The red and blue curves in Figures 5 and 6 are error bar graphs for the ten test results. The point on the solid line is the average of the test results of each method. The upper end of the error bar is the maximum value of the test result of each method. Similarly, the lower end of the error bar is minimum of the test result of each method. The green and purple curves are the standard deviations of ten tests with different methods. Compared with Kriging, KPLS-1, KPLS-2 and KPLS-3, the proposed method uses the least modeling time, and the error accuracy obtained is not much different from that of KPLS1, KPLS-2 and KPLS-3, which is exactly what we expect. Moreover, the standard deviation chart also reflects the stability of the proposed method.  The red and blue curves in Figures 5 and 6 are error bar graphs for the ten test results. The point on the solid line is the average of the test results of each method. The upper end of the error bar is the maximum value of the test result of each method. Similarly, the lower end of the error bar is minimum of the test result of each method. The green and purple curves are the standard deviations of ten tests with different methods. Compared with Kriging, KPLS-1, KPLS-2 and KPLS-3, the proposed method uses the least modeling time, and the error accuracy obtained is not much different from that of KPLS1, KPLS-2 and KPLS-3, which is exactly what we expect. Moreover, the standard deviation chart also reflects the stability of the proposed method.

Rothyp Funciton Test
The second test function is the Rothyp function, whose expression is shown in Equation (29):  (29) For the 50-dimensional and 100-dimensional problems of this function, 110 and 220 sampling points were used for testing. Its testing process is consistent with that of the first function. Table 4 shows the average modeling time and its standard deviation and the average modeling error score and its standard deviation, which were obtained by testing the Rothyp function under different modeling methods. Each result is the average of 10

Rothyp Funciton Test
The second test function is the Rothyp function, whose expression is shown in Equation (29): For the 50-dimensional and 100-dimensional problems of this function, 110 and 220 sampling points were used for testing. Its testing process is consistent with that of the first function. Table 4 shows the average modeling time and its standard deviation and the average modeling error score and its standard deviation, which were obtained by testing the Rothyp function under different modeling methods. Each result is the average of 10 experiments. Figure 7 shows the average change trend for the modeling time and modeling error in the ten-time modeling process of the five different modeling methods for 100D Rothyp function. Figures 8 and 9 are the error bar graphs and standard deviation curve graphs of the ten test results.     From Table 4 and Figures 7-9, the following characteristics can be observed. (1) The advantage of the modeling time for the KMDS method is particularly obvious. For example, in the case of 100 dimensions, the modeling time of KPLS-3 is approximately 33.28 times that of KMDS. Even the KPLS-1 method that retains only one principal component takes 5.19 times as much time as the proposed method, which fully illustrates the advantages of the KMDS method in terms of modeling efficiency. (2) Although the accuracy of the KMDS method is not as good as that of the KPLS methods, it is already very close to them. After all, any method has two sides. Increased modeling speed usually leads to lower accuracy. Significantly, this work has greatly improved the modeling efficiency under the condition of close accuracy, which is what we expect. (3) From the perspective of the error bar graph for modeling time and modeling accuracy and the trend chart for the standard deviation, the KMDS method also has good stability and effectiveness compared to the other four methods. This shows that the proposed dimensionality reduction method is effective in actual tests.

Michalewicz Function Test
The third test function is the Michalewicz function, which is expressed in Equation (30). This function is tested in 40 and 80 dimensions with 110 and 220 data points. The testing process of the third function is consistent with the testing process of the first function. Table 5 shows the average modeling time and its standard deviation and the average modeling error and its standard deviation, which were obtained by testing the Michalewicz function under different modeling methods. Each result is the average of 10 experiments. Figure 10 shows the average change trend for the modeling time and modeling error in the ten-time modeling process for the different modeling methods for the 40D Michalewicz function. Figures 11 and 12 are error bar graphs and standard deviation curve graphs of the ten test results.  From Table 5, it is easy to see that the test results for the Michalewicz function are similar to those of the Griewank function and the Rothyp function. However, in the case of 40D and 110 sampling points, the test results of the Michalewicz function are slightly different: the modeling time of KMDS is still the shortest, while the modeling error of the ordinary Kriging method is slightly smaller than that of the KMDS method. The complexity of the function itself and the uniformity of the proportions of each dimension may ordinary Kriging method is slightly smaller than that of the KMDS method. The complexity of the function itself and the uniformity of the proportions of each dimension may cause this problem. Figures 10-12 also show the advantages of the KMDS method. Comparing KMDS and KPLS, it can be seen that KMDS is better than KPLS in terms of modeling time and its standard deviation. In some cases, the model accuracy or the average accuracy of the proposed method is better than the KPLS method, which further illustrates the effectiveness and stability of the KMDS method. In summary, it can be seen from the test results of all benchmark functions that the modeling efficiency of the KMDS method is the highest. In terms of modeling accuracy, test results for the KMDS method are superior to those of Kriging method in most cases. Compared with the KPLS methods, the accuracy of KMDS method is slightly worse, but it is also relatively close, and even better than them in some cases. Moreover, with the increase in the dimensionality and sample size, the KMDS method can show its absolute advantage in terms of time spent with a small loss of precision. Therefore, the KMDS method is effective at solving high-dimensional problems.

Test on Aircraft Longitudinal Flight Control Problem
Since the advent of the aircraft, flight safety has always been the highest goal of aircraft design, and all the design goals of aircraft must obey this highest goal. A good flight control system should be able to ensure the safe flight of the aircraft. The aircraft functions and flight control requirements that modern aircraft need to meet are becoming more and more complex, resulting in more and more complex flight control systems. Therefore, flight simulation systems, as an important tool for aircraft design and simulated flight training, have received more and more attention. As the core component of modern aircraft and flight simulators, the flight control system is used to transmit control signals from the pilot. Through the control system, the control panel of the aircraft is deflected according to the signal law, so as to realize the control and stability of the various functions of the aircraft.
The aircraft longitudinal flight control system [45,46] can stabilize and control the From Table 5, it is easy to see that the test results for the Michalewicz function are similar to those of the Griewank function and the Rothyp function. However, in the case of 40D and 110 sampling points, the test results of the Michalewicz function are slightly different: the modeling time of KMDS is still the shortest, while the modeling error of the ordinary Kriging method is slightly smaller than that of the KMDS method. The complexity of the function itself and the uniformity of the proportions of each dimension may cause this problem. Figures 10-12 also show the advantages of the KMDS method. Comparing KMDS and KPLS, it can be seen that KMDS is better than KPLS in terms of modeling time and its standard deviation. In some cases, the model accuracy or the average accuracy of the proposed method is better than the KPLS method, which further illustrates the effectiveness and stability of the KMDS method.
In summary, it can be seen from the test results of all benchmark functions that the modeling efficiency of the KMDS method is the highest. In terms of modeling accuracy, test results for the KMDS method are superior to those of Kriging method in most cases. Compared with the KPLS methods, the accuracy of KMDS method is slightly worse, but it is also relatively close, and even better than them in some cases. Moreover, with the increase in the dimensionality and sample size, the KMDS method can show its absolute advantage in terms of time spent with a small loss of precision. Therefore, the KMDS method is effective at solving high-dimensional problems.

Test on Aircraft Longitudinal Flight Control Problem
Since the advent of the aircraft, flight safety has always been the highest goal of aircraft design, and all the design goals of aircraft must obey this highest goal. A good flight control system should be able to ensure the safe flight of the aircraft. The aircraft functions and flight control requirements that modern aircraft need to meet are becoming more and more complex, resulting in more and more complex flight control systems. Therefore, flight simulation systems, as an important tool for aircraft design and simulated flight training, have received more and more attention. As the core component of modern aircraft and flight simulators, the flight control system is used to transmit control signals from the pilot. Through the control system, the control panel of the aircraft is deflected according to the signal law, so as to realize the control and stability of the various functions of the aircraft.
The aircraft longitudinal flight control system [45,46] can stabilize and control the aircraft's pitch angle, altitude, speed, etc. When pulling the control stick backwards, the elevator deflects upwards by an angle, which produces a downward pitching force on the horizontal tail, forming a pitching operating moment on the center of gravity of the aircraft, forcing the nose to tilt up, and the angle of attack increases. Similarly, when pulling the control stick forward, the elevator deflects an angle downward, generating upward momentum on the horizontal tail, forming a pitching operating moment on the center of gravity of the aircraft, forcing the nose to descend, and the angle of attack increases. Since the flight process of the aircraft is affected by the pitch angle, angle of attack, lift, drag, acceleration of gravity, sideslip angle, aircraft gravity, etc., it is necessary to continuously adjust the pitch angle of the aircraft to keep the aircraft in balance.
MATLAB simulation software is widely used in the simulation of flight control systems to aid their design. This paper uses the Simulink simulation software in MATLAB to simulate the aircraft longitudinal flight control system. What the aircraft longitudinal flight control system needs to complete is the simulation of the pilot stick force, and at the same time calculate the corresponding pitch angle of the aircraft in real time. The pitch angle of the aircraft is closely related to the force of the control stick, the flight state of the aircraft, the control mode and the structural parameters of the control system. These parameters (see Figure 13 for details) are used as the design variables of the KMDS modeling method, and the aircraft longitudinal flight control simulation is thereby combined with KMDS. In the KMDS modeling process, each sample point is used as the parameter value of the aircraft's longitudinal flight control, so that the simulation result can be obtained with Simulink. The simulation results are time dependent. In order to better illustrate the superiority of the KMDS method, under the same parameter conditions, we take the pitch angle of the aircraft at 3s as the simulation result and output it to the MATLAB workspace. We use the data obtained from the simulation to establish Kriging and record the modeling time and modeling error through the KMDS modeling method. In addition, the Kriging model is directly established with the data obtained from the simulation, and the modeling time and modeling error are also recorded.
with Simulink. The simulation results are time dependent. In order to better illustr superiority of the KMDS method, under the same parameter conditions, we take th angle of the aircraft at 3s as the simulation result and output it to the MATLAB work We use the data obtained from the simulation to establish Kriging and record the ing time and modeling error through the KMDS modeling method. In additio Kriging model is directly established with the data obtained from the simulation, a modeling time and modeling error are also recorded.  Figure 14 shows the amount of time spent on and the model error scores for th eling process. In order to better compare the time for the KMDS method to es Kriging and the direct establishment of the Kriging model, the time in Figure 14 moved the time used for simulation. In this modeling process, there are 10 initial points, and the corresponding expensive estimates of the sample points are ob through simulation. The Kriging model is established via two methods, and the mo time (excluding the time for simulation estimation) is recorded as the first-time va each iteration process, a sample point is added, and the corresponding expensive es is obtained through simulation, and the modeling time at this time (excluding the the simulation estimate) is recorded as a time value. The iterative process is repeate the final number of sample points is 110, and then the iterative process is stopped.  Figure 14 shows the amount of time spent on and the model error scores for the modeling process. In order to better compare the time for the KMDS method to establish Kriging and the direct establishment of the Kriging model, the time in Figure 14 has removed the time used for simulation. In this modeling process, there are 10 initial sample points, and the corresponding expensive estimates of the sample points are obtained through simulation. The Kriging model is established via two methods, and the modeling time (excluding the time for simulation estimation) is recorded as the first-time value. In each iteration process, a sample point is added, and the corresponding expensive estimate is obtained through simulation, and the modeling time at this time (excluding the time of the simulation estimate) is recorded as a time value. The iterative process is repeated until the final number of sample points is 110, and then the iterative process is stopped. The following two conclusions can be drawn from Figure 14. (a) It can be seen from Figure 14 that the time required for KMDS to establish Kriging and to directly establish the Kriging model gradually increases as the number of sample points increases. However, the time required to directly establish the Kriging model is more and more than the time required for KMDS modeling with the increase in the number of sampling points, and it finally reaches a result where the time difference is more than ten times. (b) The figure shows that the model error is gradually decreasing as the number of sample points increases. However, the modeling error of KMDS to establish Kriging has always been smaller than that of directly establishing Kriging, which shows that the KMDS method The following two conclusions can be drawn from Figure 14. (a) It can be seen from Figure 14 that the time required for KMDS to establish Kriging and to directly establish the Kriging model gradually increases as the number of sample points increases. However, the time required to directly establish the Kriging model is more and more than the time required for KMDS modeling with the increase in the number of sampling points, and it finally reaches a result where the time difference is more than ten times. (b) The figure shows that the model error is gradually decreasing as the number of sample points increases. However, the modeling error of KMDS to establish Kriging has always been smaller than that of directly establishing Kriging, which shows that the KMDS method has higher modeling accuracy. In summary, for the simulation of an aircraft longitudinal flight control system, the KMDS method can improve the modeling efficiency and modeling accuracy of the Kriging model.

Conclusions
The high dimensionality and complexity of engineering design problems leads to expensive calculation costs. The Kriging model can alleviate this bottleneck problem. However, the Kriging model is also very time consuming when approximating highdimensional problems, solutions to some of which it even fails to construct. The main cost is in the inversion of the covariance correlation matrix and the solution of the hyperparameters of the Kriging model. To this end, this work proposes a Kriging modeling method based on multidimensional scaling to deal with high-dimensional problems. Firstly, the dimensionality of high-dimensional sample data was reduced via the MDS method to convert high-dimensional problems to low-dimensional problems. Then, a new sample point in the high-dimensional space was selected through Latin hypercube sampling upon which to perform an evaluation. The new sample point was added to the high-dimensional sample data X, and the dimensionality of the sample set was reduced through MDS again. Finally, a Kriging model was established using the new sample data X 1 after dimensionality reduction. The test results from the benchmark functions and simulation problem prove that the proposed method can greatly improve the modeling efficiency of Kriging while ensuring certain accuracy requirements.
Unluckily, the method proposed in this work sometimes loses some model accuracy when solving high-dimensional problems. For further research, we will start with the following three aspects: (1) based on the existing modeling efficiency, we will study how to choose more promising samples to further improve the modeling accuracy of the Kriging model; (2) in order to improve the accuracy of the model after dimensionality reduction, we will study how to combine the characteristic parameters of the Kriging model with mutual entropy or a t-test to select the dimension with the largest amount of information that is worthy of in-depth study; (3) for non-stationary, high-dimensional engineering problems, we will study how to achieve dimensionality reduction through the non-stationary random Kriging model while ensuring the accuracy requirement is also one of the future research directions.