A Multi-Fidelity Uncertainty Propagation Model for Multi-Dimensional Correlated Flow Field Responses

: Given the randomness inherent in fluid dynamics problems and limitations in human cognition, Computational Fluid Dynamics (CFD) modeling and simulation are afflicted with non-negligible uncertainties, casting doubts on the credibility of CFD. Scientifically and rigorously quantifying the uncertainty of CFD is paramount for assessing its credibility and informing engineering decisions. In order to quantify the uncertainty of multidimensional flow field responses stemming from uncertain model parameters, this paper proposes a method based on Gappy Proper Orthogonal Decomposition (POD) for supplementing high-fidelity flow field data within a framework that leverages POD and surrogate models. This approach enables the generation of corresponding high-fidelity flow fields from low-fidelity ones, significantly reducing the cost of high-fidelity flow field computation in uncertainty propagation modeling. Through an analysis of the impact of uncertainty in the coefficients of the Spalart–Allmaras (SA) turbulence model on the distribution of wall friction coefficients for the NACA0012 airfoil and pressure coefficients for the M6 wing, the proposed multi-fidelity modeling approach is demonstrated to offer significant advancements in both accuracy and efficiency compared to single-fidelity methods, providing a robust and efficient prediction model for large-scale random sampling.


Introduction
With the continuous advancements in mathematical models, numerical algorithms, mesh technologies, and high-performance computing, CFD has emerged as a pivotal tool in numerous critical engineering fields, including aerospace, energy and power, transportation, and beyond.Nevertheless, the credibility of CFD has been a subject of contention due to the non-negligible uncertainties inherent in its models, parameters, and numerical solutions [1].A comprehensive and rigorous quantification of these uncertainties is imperative for assessing and enhancing the credibility of CFD.
Parameters serve as a significant source of uncertainty in CFD.Due to limitations in human cognition or the inherent randomness in fluid problems, model parameters or inflow conditions may possess a certain degree of uncertainty.To quantify the uncertainty introduced by parameters, scholars have developed various methods, such as Monte Carlotype random sampling and polynomial chaos [2].While Monte Carlo-type methods are straightforward, they require a large number of samples to obtain stable and accurate statistical results.Polynomial chaos methods rely on orthogonal polynomial expansions, with coefficients obtained through numerical integration or regression analysis.However, as the dimension of input parameters and the order of expansion increase, the number of samples required for polynomial chaos methods grows drastically, leading to the "curse of dimensionality".For complex engineering problems, each CFD simulation is computationally expensive, limiting the practical application of these methods due to high computational

Uncertainty Propagation Model Framework and Key Algorithms
In the field of CFD, using different grid densities, time step sizes, etc., can generate computational data of varying fidelities.To reduce the overall modeling cost of the uncertainty propagation model, this study adopts a multi-fidelity modeling framework that combines flow field reduction and surrogate modeling.The overall implementation process is shown in Figure 1.The left part of the figure represents the main algorithms used in each stage, and the right part is the model training process.Below, we focus on introducing the model training process.

Uncertainty Propagation Model Framework and Key Algorithms
In the field of CFD, using different grid densities, time step sizes, etc., can generate computational data of varying fidelities.To reduce the overall modeling cost of the uncertainty propagation model, this study adopts a multi-fidelity modeling framework that combines flow field reduction and surrogate modeling.The overall implementation process is shown in Figure 1

Model Training Process
The entire model training process can be implemented using the following six steps: (1) In the input parameter space, Latin hypercube sampling is employed to generate the inputs for the ntrain training samples.(2) Based on the inputs of the training samples, the exchange algorithm [12] based on the Morris-Mitchell criterion [13] is used to divide the ntrain training samples into two parts: M complete samples and ntrain m incomplete samples.

Model Training Process
The entire model training process can be implemented using the following six steps: (1) In the input parameter space, Latin hypercube sampling is employed to generate the inputs for the n train training samples.(2) Based on the inputs of the training samples, the exchange algorithm [12] based on the Morris-Mitchell criterion [13] is used to divide the n train training samples into two parts: M complete samples and n train m incomplete samples.(3) For the M complete samples, their corresponding high-and low-fidelity outputs are obtained through CFD calculations.For the n train m incomplete samples, only their corresponding low-fidelity outputs are obtained through CFD calculations, while their high-fidelity outputs are considered unknown.Of course, we also obtain the high-fidelity outputs of n train m incomplete samples through CFD calculations.However, these data are not used for model training but only for testing the prediction ability of the Gappy POD method described below.(4) Using the Gappy POD method [14,15], the high-fidelity outputs of n train m incomplete samples are predicted.At this point, the high-fidelity output results for all n train training samples can be obtained.The low-fidelity output results of these samples will no longer be used hereinafter.( 5) POD [16,17] is performed on the high-fidelity outputs of the n train training samples to obtain an orthogonal basis function space.Through projection, the basis function coefficients for each training sample are obtained, which are considered the new outputs.(6) Based on the n train training samples, a Kriging model [18,19] is constructed between the input parameters and the basis function coefficients.Since the orthogonal basis functions obtained through POD are mutually orthogonal, an individual model can be constructed for each basis function coefficient.When given new input parameters, the corresponding basis function coefficients can be predicted based on the models, and the complete flow field response can be reconstructed using the bidirectional expression of POD.

Model Testing Process
In the input parameter space, Latin hypercube sampling is again employed to obtain the inputs for the n test training samples.The CFD program is run to obtain high-fidelity sample outputs, which are used to evaluate the prediction accuracy of the overall model.

Exchange Algorithm
After obtaining n train training samples, we select M samples from them to perform complete high-and low-fidelity simulations using CFD.Theoretically, the M samples should ideally have good space-filling property in the parameter space.Here, the criterion proposed by Morris and Mitchell [13] is adopted to measure the space-filling property of the sample set, i.e., a good space-filling design should maximize the minimum intersite distance: max : min where d ij is the Euclidean distance between the two samples x i and x j , and n represents the number of samples in the sample set.
To achieve a better space-filling property, the exchange algorithm proposed by Cook and Nachtsheim [12] is employed.Specifically, M samples are randomly selected from the training sample set as the complete sample set X e , with the remaining samples forming the incomplete sample set X r .The minimum inter-site distance of X e is calculated.We then exchange the first sample in X e (denoted as x (1) e ) with each of the samples in X r and retain the exchange that maximizes the minimum inter-site distance of X e .The process is repeated for each sample in X e (x ).It should be emphasized that the exchange algorithm involves a degree of randomness, necessitating multiple repeated trials to verify its reliability and stability.

POD
POD is a powerful mathematical technique used for analyzing and reducing the complexity of high-dimensional systems.This method is widely applied in various fields such as fluid dynamics, structural dynamics, and signal processing, enabling the efficient extraction of important characteristics, the reduction in computational costs, and the enhancement of simulation efficiency.POD provides a valuable tool for the analysis and optimization of complex systems.
The POD method was proposed by Lumley [16], and its basic principle is to identify a new set of bases for the space spanned by a group of vectors in high-dimensional space, such that the projection of the original vectors onto the set of basis functions is as large as possible.It involves projecting data onto a set of orthogonal basis functions, allowing for the extraction of dominant modes and features.
In this paper, the input and output of the k-th sample are denoted as θ k and s k separately.The output of M samples, {s k } M k=1 , constitutes a snapshot set.The average value and fluctuation part of the sample are defined as: The POD algorithm finds a set of optimal orthogonal bases Here, α k is the vector composed of orthogonal polynomial expansion coefficients.
To improve the efficiency and accuracy of POD, Sirovich [17] proposed the snapshot method for the efficient extraction of orthogonal basis vectors.This method assumes that the basis functions are linear combinations of sample snapshots, i.e.: Constructing the snapshot covariance matrix C with element C ij is the inner product of s ′ i and s ′ j , that is: Solving its eigenvalues and eigenvectors, M non-negative eigenvalues, In the context of POD analysis, the significance of a basis mode is gauged by the magnitude of its eigenvalue.The generalized energy of the first l basis modes is formulated as: The generalized energy criterion is commonly used to retain the prevailing basis modes, specifically considering the first l basis modes whose generalized energy just exceeds a predefined threshold as the dominant basis modes.Upon determining the dominant basis modes and truncating the basis function space, any given sample snapshot can be approximated as a linear combination of these prevailing basis modes, expressed as: where is obtained through least-squares method, which means: Through POD, the original high-dimensional response is reduced to an l-dimensional response, effectively alleviating the challenge of constructing a surrogate model.In subsequent surrogate modeling, the output is no longer the original response s k , but a low-dimensional vector → α composed of basis mode coefficients.It is noteworthy that Formula (3) facilitates a bidirectional process, enabling not only the dimensionality reduction in the known s k to obtain → α but also the utilization of → α to reconstruct the original output response s k .

Gappy POD
The Gappy POD method enables the generation of corresponding high-fidelity flow fields from low-fidelity ones, avoiding time-consuming high-fidelity CFD calculations.
Regarding the i-th sample in the complete sample set, we denote its model input parameters as θ i , its low-fidelity output vector as T with a dimension of n L , and its high-fidelity output vector as T with a dimension of n H .The high-and low-fidelity outputs are combined into a multi-fidelity sample snapshot denoted as Classical POD decomposition is performed for the multi-fidelity snapshots {s i } M i=1 .Based on the generalized energy criterion, truncation is performed to obtain an orthogonal space Φ = [φ 1 , φ 2 , • • • , φ n ], which is composed of n orthogonal basis functions.Therefore, any multi-fidelity snapshot can be represented as: For the j-th sample in the incomplete sample set, we denote its model input parameters as θ j and its multi-fidelity sample snapshot as s j = s L j s H j .Here, its low-fidelity output T is obtained with CFD, while its high-fidelity output vector s H j is regarded as unknown and needs to be predicted.The projection operator is defined as follows: where I n L is the unit matrix with dimension n L , and the expansion coefficient vector → β of the incomplete sample snapshot in the orthogonal basis function space is obtained by solving the following extremal problem: After simple matrix operations, we can obtain: The (n L + 1)-th to (n L + n H )-th elements of the vector Φ → β are the prediction of s H j .

Kriging Model
The Kriging model, a surrogate modeling approach widely used in academia and industry, is founded on the theory of random processes.It assumes the presence of spatial correlation among the response values at different inputs and interpolates the response value at prediction location using the existing observations.
The Kriging model can be expressed as the summation of a linear regression model f T (x)β and a stochastic process Z(x), that is: is the unknown regression parameters.f T (x)β contributes a global model, and Z(x) accounts for local deviations.
It is usually assumed that Z(x) is a Gaussian random process with zero mean.The covariance between Z(x) and Z(w) is denoted as: For a given experimental design, denoted as S = [s 1 , • • • , s N ], the corresponding ob- servation is denoted as y s = [y(s 1 ), • • • , y(s N )].The maximum likelihood estimates (MLEs) of the unknown parameters β and σ 2 , derived from these sample data, are as follows: where the regression design matrix F and the correlation function matrix R are, respectively, defined as: The best linear unbiased predictor for the prediction location x is given by: where α is defined as α = R −1 (y s − F β), and the correlation vector, denoted by r, represents the correlation between the training samples S and the prediction sample x, that is:

Case Description
To assess the efficacy of the proposed approach, this paper examines the influence of uncertainty in the coefficients of the SA turbulence model on the prediction of airfoil aerodynamics.The SA model is extensively utilized in aerospace and related fields [20].The computation presumes fully turbulent flow and disregards the transition term in the original model, resulting in nine coefficients denoted as c b1 , σ, c b2 , κ, c w2 , c w3 , c v1 , c t3 , c t4 .It is assumed that the model coefficients are subject to epistemic uncertainty, which can be characterized using probabilistic methods from a credibility perspective.The model parameters are assumed to follow uniform distributions with the intervals specified in Table 1, with parameter ranges referenced from the pertinent literature [21].It is worth noting that the mathematical representation of model uncertainty parameters, including distribution types and parameters, necessitates thorough consultation with model developers.The present study primarily focuses on the propagation of uncertainty given a mathematical description of the uncertain parameters, rather than on the precise quantification of uncertainty for the model parameters.Numerous investigations have been conducted on the quantification of uncertainty in SA turbulence model parameters, primarily employing polynomial chaos methods, with an emphasis on overall aerodynamic outputs such as airfoil or aircraft forces [21,22].The two numerical examples investigated are the distribution of the wall friction coefficient in the low-speed flow over an NACA0012 airfoil and the distribution of the wall pressure coefficient in the transonic flow over an M6 wing.These two cases are presented separately in the following subsections.

Low-Speed Flow around an NACA0012 Airfoil
The NACA0012 airfoil, with a symmetric profile and 12% thickness, is considered under the computational condition of M ∞ = 0.15, α = 5.0 • , T ∞ = 288.15K,Re ∞ = 6 × 10 6 .Two sets of grids with different densities, shown in Figure 2, are utilized for the generation of high-and low-fidelity samples.The number of cells is 3584 and 57,344, respectively.The coarse grid has a grid size that is one-sixteenth of the dense grid, and typically requires fewer iteration steps to reach convergence.However, for the purpose of uniformly assessing computational cost, it is assumed that the low-fidelity samples incur one-sixteenth of the computational cost of the high-fidelity samples.

Low-Speed Flow around an NACA0012 Airfoil
The NACA0012 airfoil, with a symmetric profile and 12% thickness, is considered under the computational condition of sets of grids with different densities, shown in Figure 2, are utilized for the generation of high-and low-fidelity samples.The number of cells is 3584 and 57,344, respectively.The coarse grid has a grid size that is one-sixteenth of the dense grid, and typically requires fewer iteration steps to reach convergence.However, for the purpose of uniformly assessing computational cost, it is assumed that the low-fidelity samples incur one-sixteenth of the computational cost of the high-fidelity samples.The wall friction coefficient distribution obtained using standard parameters of the SA model is presented in Figure 3.The high-and low-fidelity results exhibit similar trends, characterized by local abrupt variations on the upper and lower surfaces of the airfoil leading edge, followed by a gradual stabilization.However, significant differences in magnitude are observed, particularly on the upper surface where the dense grid prediction is notably higher.Accurately capturing these local abrupt variations in friction coefficients poses a significant challenge and demands high accuracy from the prediction model.The wall friction coefficient distribution obtained using standard parameters of the SA model is presented in Figure 3.The high-and low-fidelity results exhibit similar trends, characterized by local abrupt variations on the upper and lower surfaces of the airfoil leading edge, followed by a gradual stabilization.However, significant differences in magnitude are observed, particularly on the upper surface where the dense grid prediction is notably higher.Accurately capturing these local abrupt variations in friction coefficients poses a significant challenge and demands high accuracy from the prediction model.trends, characterized by local abrupt variations on the upper and lower surfaces of the airfoil leading edge, followed by a gradual stabilization.However, significant differences in magnitude are observed, particularly on the upper surface where the dense grid prediction is notably higher.Accurately capturing these local abrupt variations in friction coefficients poses a significant challenge and demands high accuracy from the prediction model.

Transonic Flow around an M6 Wing
The transonic flow over the M6 wing serves as a benchmark test case for assessing transonic flow solvers.The wing features a root chord of approximately 0.8 m and a half-span of approximately 1.2 m, with a symmetric airfoil section.Two sets of grids with different densities, shown in Figure 4, are utilized for the generation of low-and high-fidelity samples.The number of cells is 990,360 and 3,594,863, respectively.Therefore, it is assumed that the acquisition cost of one high-fidelity sample is 3.5 times that of one low-fidelity sample.The output of this case constitutes a vector comprising the pressure coefficients of all grid points covering the entire wing surface, with a dimensionality of 13,638 for the coarse grid and 29,684 for the dense grid.

Transonic Flow around an M6 Wing
The transonic flow over the M6 wing serves as a benchmark test case for assessing transonic flow solvers.The wing features a root chord of approximately 0.8 m and a halfspan of approximately 1.2 m, with a symmetric airfoil section.Two sets of grids with different densities, shown in Figure 4, are utilized for the generation of low-and high-fidelity samples.The number of cells is 990,360 and 3,594,863, respectively.Therefore, it is assumed that the acquisition cost of one high-fidelity sample is 3.5 times that of one lowfidelity sample.The output of this case constitutes a vector comprising the pressure coefficients of all grid points covering the entire wing surface, with a dimensionality of 13,638 for the coarse grid and 29,684 for the dense grid.shock structure develops on the upper surface of the wing, as shown in Figure 5, posing substantial challenges for the prediction model.The pressure distributions obtained by using two sets of grids are consistent, but near the suction peak and shock wave on the upper surface, the dense grid results are in better agreement with the reference experiment results shown in Figure 6.The computational condition is set at M ∞ = 0.8395, α = 3.06 • , T ∞ = 255.56K,Re ∞ = 1.172 × 10 7 .Under this condition, a λ-shaped shock structure develops on the upper surface of the wing, as shown in Figure 5, posing substantial challenges for the prediction model.The pressure distributions obtained by using two sets of grids are consistent, but near the suction peak and shock wave on the upper surface, the dense grid results are in better agreement with the reference experiment results shown in Figure 6.The sample data in this study are generated using the in-house unstructured grid solver Flowstar [23], which is founded on a cell-centered finite volume methodology and is adept at handling diverse element types, including hexahedra, tetrahedra, prisms, pyramids, and other polyhedra generated via geometrical multigrid techniques.Second-order accuracy in space is attained through linear reconstruction within cells.The vertexbased Green-Gauss approach [24] is employed for gradient computations to uphold accuracy and robustness.To mitigate oscillations in regions of high gradients, Venkatakrishnan's limiter [25] is utilized.The Roe scheme is engaged for inviscid flux computations.A first-order backward Euler time-differencing scheme with local time stepping is implemented to approximate a steady state, facilitating convergence.The flux Jacobian is de-  The sample data in this study are generated using the in-house unstructured grid solver Flowstar [23], which is founded on a cell-centered finite volume methodology and is adept at handling diverse element types, including hexahedra, tetrahedra, prisms, pyramids, and other polyhedra generated via geometrical multigrid techniques.Second-order accuracy in space is attained through linear reconstruction within cells.The vertexbased Green-Gauss approach [24] is employed for gradient computations to uphold accuracy and robustness.To mitigate oscillations in regions of high gradients, Venkatakrishnan's limiter [25] is utilized.The Roe scheme is engaged for inviscid flux computations.A first-order backward Euler time-differencing scheme with local time stepping is implemented to approximate a steady state, facilitating convergence.The flux Jacobian is de- The sample data in this study are generated using the in-house unstructured grid solver Flowstar [23], which is founded on a cell-centered finite volume methodology and is adept at handling diverse element types, including hexahedra, tetrahedra, prisms, pyramids, and other polyhedra generated via geometrical multigrid techniques.Second-order accuracy in space is attained through linear reconstruction within cells.The vertex-based Green-Gauss approach [24] is employed for gradient computations to uphold accuracy and robustness.To mitigate oscillations in regions of high gradients, Venkatakrishnan's limiter [25] is utilized.The Roe scheme is engaged for inviscid flux computations.A firstorder backward Euler time-differencing scheme with local time stepping is implemented to approximate a steady state, facilitating convergence.The flux Jacobian is derived from a first-order upwind scheme, with the divided convective flux Jacobian composed of the convective flux Jacobian and its spectral radius.The viscous flux Jacobian is approximated using its spectral radius.
In Kriging modeling, the regression function is a constant function, and the covariance function, which determines the spatial correlation structure, is a Gaussian function.In the POD method, the generalized energy criterion is set at 99.9%.

Results and Discussion
The prediction accuracy of the model can be assessed through the prediction error of the high-fidelity output vector s H .For incomplete samples within the training dataset, the error vector is defined as CFD , serving as a metric for evaluating the accuracy of the Gappy POD method.For the testing sample, the error vector is defined as CFD , serving to evaluate the accuracy of the overall model.For each sample, we define dimensionless errors based on the 1-norm, 2-norm, and infinity norm of the error vector:

Low-Speed Flow around an NACA0012 Airfoil
In this case, we fix n train = 32 and n test = 49 and assign M values of 11, 13, 15, and 17.This configuration facilitates the analysis of the cost and accuracy trade-offs of the overall model in this paper relative to modeling with high-fidelity samples only.Specifically, the computational cost of 32 low-fidelity samples is deemed equivalent to that of 2 highfidelity samples.
(1) Accuracy analysis of Gappy POD method Given that high-fidelity sample data serve as the basis for subsequent POD and Kriging modeling, it is crucial to first analyze the accuracy of the Gappy POD method in reconstructing the corresponding high-fidelity output from low-fidelity output.In the analysis, M is set to 11, with similar results observed for M values of 13, 15, and 17.
Randomly selecting an incomplete sample from the training dataset, Figure 7 presents a comparison between the wall friction coefficient distribution predicted using the Gappy POD method and high-fidelity CFD computation.Visually, the disparity between the two is minimal.
To further evaluate the accuracy of the Gappy POD method, the prediction error was computed for all incomplete samples; the resulting error distribution is depicted in Figure 8. Across all incomplete samples, it can be noted that ε 1 remains below 0.25%, ε 2 remains below 0.6%, and ε inf remains below 2.5%.Considering the randomness of the Latin hypercube sampling and exchange algorithm, steps (a) through (d) in the training process were repeated a total of 10 times.Figure 9 presents the statistical analysis of errors for all incomplete samples, revealing median values of 0.0007, 0.0015, and 0.0070 for ε 1 , ε 2 , and ε inf , respectively.The results demonstrate that the Gappy POD approach is capable of reconstructing high-fidelity outputs using a limited amount of complete sample data and low-fidelity outputs, effectively substituting for computationally expensive high-fidelity CFD calculations and significantly reducing computational costs.To further evaluate the accuracy of the Gappy POD method, the prediction error was computed for all incomplete samples; the resulting error distribution is depicted in Figure 8. Across all incomplete samples, it can be noted that ε1 remains below 0.25%, ε2 remains below 0.6%, and εinf remains below 2.5%.Considering the randomness of the Latin hypercube sampling and exchange algorithm, steps (a) through (d) in the training process were repeated a total of 10 times.Figure 9 presents the statistical analysis of errors for all incomplete samples, revealing median values of 0.0007, 0.0015, and 0.0070 for ε1, ε2, and εinf, respectively.The results demonstrate that the Gappy POD approach is capable of reconstructing high-fidelity outputs using a limited amount of complete sample data and lowfidelity outputs, effectively substituting for computationally expensive high-fidelity CFD calculations and significantly reducing computational costs.To further evaluate the accuracy of the Gappy POD method, the prediction error was computed for all incomplete samples; the resulting error distribution is depicted in Figure 8. Across all incomplete samples, it can be noted that ε1 remains below 0.25%, ε2 remains below 0.6%, and εinf remains below 2.5%.Considering the randomness of the Latin hypercube sampling and exchange algorithm, steps (a) through (d) in the training process were repeated a total of 10 times.Figure 9 presents the statistical analysis of errors for all incomplete samples, revealing median values of 0.0007, 0.0015, and 0.0070 for ε1, ε2, and εinf, respectively.The results demonstrate that the Gappy POD approach is capable of reconstructing high-fidelity outputs using a limited amount of complete sample data and lowfidelity outputs, effectively substituting for computationally expensive high-fidelity CFD calculations and significantly reducing computational costs.(2) Accuracy analysis of overall model Given that this study employs the Monte Carlo method to perform extensive random sampling on the overall model, it is imperative to validate the predictive capability of the model.Figure 10 presents a comparison between the wall friction coefficient distribution predicted by the overall model and the high-fidelity CFD computation for a randomly (2) Accuracy analysis of overall model Given that this study employs the Monte Carlo method to perform extensive random sampling on the overall model, it is imperative to validate the predictive capability of the model.Figure 10 presents a comparison between the wall friction coefficient distribution predicted by the overall model and the high-fidelity CFD computation for a randomly selected sample from the test dataset with M = 11.Visually, the two exhibit excellent agreement.(2) Accuracy analysis of overall model Given that this study employs the Monte Carlo method to perform extensive random sampling on the overall model, it is imperative to validate the predictive capability of the model.Figure 10 presents a comparison between the wall friction coefficient distribution predicted by the overall model and the high-fidelity CFD computation for a randomly selected sample from the test dataset with M = 11.Visually, the two exhibit excellent agreement.The prediction error was computed for all testing samples, and the resulting error distribution is depicted in Figure 11.Across all testing samples, it can be noted that ε1 remains below 0.25%, ε2 remains below 0.5%, and εinf remains below 2.5%.The results strongly validate the predictive capability of the proposed model in accurately estimating the wall friction coefficient distribution for new samples, thus providing robust support for large-scale random sampling processes.The prediction error was computed for all testing samples, and the resulting error distribution is depicted in Figure 11.Across all testing samples, it can be noted that ε 1 remains below 0.25%, ε 2 remains below 0.5%, and ε inf remains below 2.5%.The results strongly validate the predictive capability of the proposed model in accurately estimating the wall friction coefficient distribution for new samples, thus providing robust support for large-scale random sampling processes.(3) Analysis of the influence of sample size In the context of multidimensional correlated responses, while the multi-fidelity modeling approach presented in this study is applicable, an alternative approach could involve utilizing solely high-fidelity data for POD decomposition and surrogate model construction.Given that low-fidelity samples, despite their lower acquisition costs, still require CFD computations, a rigorous examination is warranted to determine the precise impact of incorporating additional low-fidelity calculations on the overall modeling accuracy.
To this end, we analyzed the prediction error performance of the model using both the overall model developed in this paper and the approach of using only high-fidelity data for modeling.The results are shown in Figures 12-14.To eliminate the randomness of the sampling algorithm, all processes were repeated 10 times.The settings for the POD and the Kriging model were consistent in the comparison.The vertical axis in the three (3) Analysis of the influence of sample size In the context of multidimensional correlated responses, while the multi-fidelity modeling approach presented in this study is applicable, an alternative approach could involve utilizing solely high-fidelity data for POD decomposition and surrogate model construction.Given that low-fidelity samples, despite their lower acquisition costs, still require CFD computations, a rigorous examination is warranted to determine the precise impact of incorporating additional low-fidelity calculations on the overall modeling accuracy.
To this end, we analyzed the prediction error performance of the model using both the overall model developed in this paper and the approach of using only high-fidelity data for modeling.The results are shown in Figures 12-14.To eliminate the randomness of the sampling algorithm, all processes were repeated 10 times.The settings for the POD and the Kriging model were consistent in the comparison.The vertical axis in the three graphs represents the statistical error, while the horizontal axis indicates the models and sample size.Specifically, "11High" represents a single-fidelity model using 11 high-fidelity samples, and "11High32Low" represents a multi-fidelity model using 11 high-fidelity samples and 32 low-fidelity samples.The vertical green solid line divides the horizontal space into several regions, with computational costs in the same region roughly considered equivalent.The figures reveal that as the number of high-fidelity samples used in the multifidelity modeling increases from 11 to 17, the overall prediction error remains relatively stable and low, indicating the robustness of the multi-fidelity model.For the model using 11 high-fidelity samples and 32 low-fidelity samples, the median values for ε 1 , ε 2 , and ε inf are 0.0012, 0.0018, and 0.0065, respectively, all maintaining a very low error level.For the single-fidelity model, the error decreases significantly as the sample size increases, and it still shows a downward trend even when using 17 high-fidelity samples, indicating that it has not yet stabilized.Furthermore, even when the computational cost is higher than that of the multi-fidelity modeling, the prediction error remains larger.The statistical results fully demonstrate the efficiency and accuracy advantages of the multi-fidelity modeling approach.(4) Statistical results analysis Latin hypercube sampling was employed again within the nine-dimensional input space to generate 10 6 samples, which were sequentially passed to the overall model to derive the corresponding distribution of wall friction coefficients.The statistical analysis of the sample data yielded the mean value and 99% confidence interval of the wall friction coefficient, as depicted in Figure 15.The results indicate significant uncertainty in the friction coefficient across the entire airfoil.(4) Statistical results analysis Latin hypercube sampling was employed again within the nine-dimensional input space to generate 10 6 samples, which were sequentially passed to the overall model to derive the corresponding distribution of wall friction coefficients.The statistical analysis of the sample data yielded the mean value and 99% confidence interval of the wall friction coefficient, as depicted in Figure 15.The results indicate significant uncertainty in the friction coefficient across the entire airfoil.The relative significance of the nine model parameters on the distribution of the friction coefficients can be ascertained through the sensitivity analysis approach, as presented in Table 2.A global sensitivity analysis approach proposed by Gamboa et al., which decomposes the covariance of multi-output variables [26], is utilized.As shown, κ emerges as the most influential parameter, which can be expected from a physical standpoint.In the constant Reynolds stress layer of plane shear turbulence, the eddy viscosity coefficient is directly proportional to κ, i.e., ν t = κu τ d.The eddy viscosity coefficient characterizes the relationship between Reynolds stress and mean flow, serving a role analogous to molecular viscosity in RANS simulations.Hence, κ is intimately linked with the viscosity of the flow, and the friction coefficient is directly correlated with the flow viscosity.
Latin hypercube sampling was employed again within the nine-dimensional input space to generate 10 6 samples, which were sequentially passed to the overall model to derive the corresponding distribution of wall friction coefficients.The statistical analysis of the sample data yielded the mean value and 99% confidence interval of the wall friction coefficient, as depicted in Figure 15.The results indicate significant uncertainty in the friction coefficient across the entire airfoil.The relative significance of the nine model parameters on the distribution of the friction coefficients can be ascertained through the sensitivity analysis approach, as presented in Table 2.A global sensitivity analysis approach proposed by Gamboa et al., which decomposes the covariance of multi-output variables [26], is utilized.As shown, κ emerges as the most influential parameter, which can be expected from a physical standpoint.In the constant Reynolds stress layer of plane shear turbulence, the eddy viscosity coefficient

Transonic Flow around an M6 Wing
(1) Accuracy analysis of Gappy POD method In the analysis, n train is set to 21 and M is 6.Randomly selecting an incomplete sample from the training dataset, Figure 16 presents a comparison between the pressure coefficient distribution predicted by the Gappy POD method and the high-fidelity CFD computation.Visually, the disparity between the two is minimal.
To further evaluate the accuracy of the Gappy POD method, the prediction error was computed for all incomplete samples, and the resulting error distribution is depicted in Figure 17.Across all incomplete samples, it can be noted that ε 1 remains below 0.0035%, ε 2 remains below 0.03%, and ε inf remains below 1.5%.The results demonstrate that the Gappy POD approach is capable of reconstructing high-fidelity outputs using a limited amount of complete sample data and low-fidelity outputs, effectively substituting for computationally expensive high-fidelity CFD calculations and significantly reducing computational costs.
(2) Accuracy analysis of overall model The overall model was constructed using n train high-fidelity responses, of which M responses were calculated using CFD simulations, and the rest were reconstructed using the Gappy POD algorithm from its low-fidelity responses.Figure 18 presents a comparison between the pressure coefficient predicted by the overall model and the high-fidelity CFD computation for a randomly selected sample from the test dataset.Visually, the two exhibit excellent agreement.

Transonic Flow around an M6 Wing
(1) Accuracy analysis of Gappy POD method In the analysis, ntrain is set to 21 and M is 6.Randomly selecting an incomplete sample from the training dataset, Figure 16 presents a comparison between the pressure coefficient distribution predicted by the Gappy POD method and the high-fidelity CFD computation.Visually, the disparity between the two is minimal.To further evaluate the accuracy of the Gappy POD method, the prediction error was computed for all incomplete samples, and the resulting error distribution is depicted in Figure 17.Across all incomplete samples, it can be noted that 1 ε remains below 0.0035%, 2 ε remains below 0.03%, and inf ε remains below 1.5%.The results demonstrate that the Gappy POD approach is capable of reconstructing high-fidelity outputs using a limited amount of complete sample data and low-fidelity outputs, effectively substituting for computationally expensive high-fidelity CFD calculations and significantly reducing computational costs.(2) Accuracy analysis of overall model The overall model was constructed using ntrain high-fidelity responses, of which M responses were calculated using CFD simulations, and the rest were reconstructed using the Gappy POD algorithm from its low-fidelity responses.Figure 18 presents a comparison between the pressure coefficient predicted by the overall model and the high-fidelity CFD computation for a randomly selected sample from the test dataset.Visually, the two exhibit excellent agreement.(2) Accuracy analysis of overall model The overall model was constructed using ntrain high-fidelity responses, of which M responses were calculated using CFD simulations, and the rest were reconstructed using the Gappy POD algorithm from its low-fidelity responses.Figure 18 presents a comparison between the pressure coefficient predicted by the overall model and the high-fidelity CFD computation for a randomly selected sample from the test dataset.Visually, the two exhibit excellent agreement.The prediction error was computed for all testing samples, and the resulting error distribution is depicted in Figure 19.Across all testing samples, it can be noted that ε 1 remains below 0.009%, ε 2 remains below 0.035%, and ε inf remains below 2%.The results strongly validate the predictive capability of the proposed model in accurately estimating the pressure coefficient distribution for new samples, thus providing robust support for large-scale random sampling processes.(3) Analysis of the influence of sample size To demonstrate the advantages of the proposed multi-fidelity model over the singlefidelity model using only high-fidelity samples, we analyzed the prediction performance of the two approaches.The computational cost of the low-fidelity sample is converted into the equivalent cost of the high-fidelity sample, with a ratio of 1 to 3.5.The mean prediction error of the two approaches is shown in Figure 20, demonstrating that with the inclusion of low-fidelity samples, the multi-fidelity model can achieve a much lower prediction error with approximately the same computational cost compared with the single-fidelity model.(4) Statistical results analysis Latin hypercube sampling was employed again within the nine-dimensional input space to generate 10 6 samples.These inputs were sequentially fed into the fast prediction model to acquire the corresponding pressure coefficients.The statistical analysis of the sample data yielded the mean value and standard deviation of the wall pressure coefficient, as depicted in Figure 21.Notably, the uncertainty of the pressure coefficient is concentrated at the λ -shaped shock on the upper surface, whereas the uncertainty at other locations is minimal.This observation aligns with expectations, given that the prediction of shocks is highly sensitive to turbulence model parameters.(3) Analysis of the influence of sample size To demonstrate the advantages of the proposed multi-fidelity model over the singlefidelity model using only high-fidelity samples, we analyzed the prediction performance of the two approaches.The computational cost of the low-fidelity sample is converted into the equivalent cost of the high-fidelity sample, with a ratio of 1 to 3.5.The mean prediction error of the two approaches is shown in Figure 20, demonstrating that with the inclusion of low-fidelity samples, the multi-fidelity model can achieve a much lower prediction error with approximately the same computational cost compared with the single-fidelity model.(3) Analysis of the influence of sample size To demonstrate the advantages of the proposed multi-fidelity model over the singlefidelity model using only high-fidelity samples, we analyzed the prediction performance of the two approaches.The computational cost of the low-fidelity sample is converted into the equivalent cost of the high-fidelity sample, with a ratio of 1 to 3.5.The mean prediction error of the two approaches is shown in Figure 20, demonstrating that with the inclusion of low-fidelity samples, the multi-fidelity model can achieve a much lower prediction error with approximately the same computational cost compared with the single-fidelity model.(4) Statistical results analysis Latin hypercube sampling was employed again within the nine-dimensional input space to generate 10 6 samples.These inputs were sequentially fed into the fast prediction model to acquire the corresponding pressure coefficients.The statistical analysis of the sample data yielded the mean value and standard deviation of the wall pressure coefficient, as depicted in Figure 21.Notably, the uncertainty of the pressure coefficient is concentrated at the λ -shaped shock on the upper surface, whereas the uncertainty at other locations is minimal.This observation aligns with expectations, given that the prediction of shocks is highly sensitive to turbulence model parameters.(4) Statistical results analysis Latin hypercube sampling was employed again within the nine-dimensional input space to generate 10 6 samples.These inputs were sequentially fed into the fast prediction model to acquire the corresponding pressure coefficients.The statistical analysis of the sample data yielded the mean value and standard deviation of the wall pressure coefficient, as depicted in Figure 21.Notably, the uncertainty of the pressure coefficient is concentrated at the λ-shaped shock on the upper surface, whereas the uncertainty at other locations is minimal.This observation aligns with expectations, given that the prediction of shocks is highly sensitive to turbulence model parameters.Through the application of the multi-output global sensitivity analysis approach grounded in covariance decomposition, the relative significance of model parameters on the pressure coefficient distribution can be ascertained, as presented in Table 3. Again, κ emerges as the most influential parameter.Following the same analysis as the first case, κ is closely linked with the viscosity of the flow, which significantly impacts the kinetic energy loss of fluid micro-groups within the boundary layer and the capacity to resist adverse pressure gradients, ultimately influencing the prediction of shock wave positions.

Conclusions
This study addresses the requirement for uncertainty quantification in multidimensional correlated responses within flow fields.Building on the previously established modeling framework, which utilizes Proper Orthogonal Decomposition for flow field reduction and surrogate models, we introduce a multi-fidelity modeling framework that integrates high-and low-fidelity sample data.The findings of this investigation demonstrate the following: (1) The Gappy POD-based method for supplementing missing data in flow fields enables the restoration of high-fidelity outputs from a limited amount of complete sample data, utilizing the low-fidelity outputs of incomplete samples.This approach effectively avoids the need for computationally expensive high-fidelity CFD calculations on a large number of samples, significantly reducing computational costs.(2) The multi-fidelity modeling approach demonstrates a marked improvement in prediction accuracy and model stability compared to single-fidelity methods while incurring approximately the same computational cost for sample processing.This methodology offers an efficient and robust prediction model for large-scale random sampling.Through the application of the multi-output global sensitivity analysis approach grounded in covariance decomposition, the relative significance of model parameters on the pressure coefficient distribution can be ascertained, as presented in Table 3. Again, κ emerges as the most influential parameter.Following the same analysis as the first case, κ is closely linked with the viscosity of the flow, which significantly impacts the kinetic energy loss of fluid micro-groups within the boundary layer and the capacity to resist adverse pressure gradients, ultimately influencing the prediction of shock wave positions.

Conclusions
This study addresses the requirement for uncertainty quantification in multidimensional correlated responses within flow fields.Building on the previously established modeling framework, which utilizes Proper Orthogonal Decomposition for flow field reduction and surrogate models, we introduce a multi-fidelity modeling framework that integrates high-and low-fidelity sample data.The findings of this investigation demonstrate the following: (1) The Gappy POD-based method for supplementing missing data in flow fields enables the restoration of high-fidelity outputs from a limited amount of complete sample data, utilizing the low-fidelity outputs of incomplete samples.This approach effectively avoids the need for computationally expensive high-fidelity CFD calculations on a large number of samples, significantly reducing computational costs.(2) The multi-fidelity modeling approach demonstrates a marked improvement in prediction accuracy and model stability compared to single-fidelity methods while incurring approximately the same computational cost for sample processing.This methodology offers an efficient and robust prediction model for large-scale random sampling.
The cases investigated in this study demonstrate a relatively stable variation in the flow field under differing turbulence model coefficients.In such cases, it is commonly accepted that Proper Orthogonal Decomposition effectively captures the fundamental modes of the flow field.However, if the flow field exhibits rapid changes in response to model parameters, such as smooth flow fields under certain parameters and discontinuous flow fields under others, further validation is required to ascertain the accuracy of the developed model in predicting flow field variables.Autoencoder and deep neural network approaches represent potential solutions to these challenges and will be the focus of future investigations.
. The left part of the figure represents the main algorithms used in each stage, and the right part is the model training process.Below, we focus on introducing the model training process.

Figure 1 .
Figure 1.Overall flow chart for multi-fidelity uncertainty propagation model.

Figure 1 .
Figure 1.Overall flow chart for multi-fidelity uncertainty propagation model.
and corresponding eigenvectors, b i (i = 1, 2, • • • M), are obtained.The optimal orthogonal base φ i can be expressed by

Figure 3 .
Figure 3. Wall friction coefficient distribution for NACA0012 airfoil under standard SA model parameters.

23 Figure 3 .
Figure 3. Wall friction coefficient distribution for NACA0012 airfoil under standard SA model parameters.

23 Figure 5 .
Figure 5. Pressure contours for M6 wing under standard model parameters using fine grid.

Figure 6 .
Figure 6.Pressure coefficient distribution for M6 wing under original SA model parameter at 65% wingspan.

Figure 5 . 23 Figure 5 .
Figure 5. Pressure contours for M6 wing under standard model parameters using fine grid.

Figure 6 .
Figure 6.Pressure coefficient distribution for M6 wing under original SA model parameter at 65% wingspan.

Figure 6 .
Figure 6.Pressure coefficient distribution for M6 wing under original SA model parameter at 65% wingspan.

Figure 7 .
Figure 7.Comparison between the wall friction coefficients predicted using the Gappy POD method and high-fidelity CFD computation for the NACA0012 case.

Figure 8 .
Figure 8. Error analysis of Gappy POD method for predictions on incomplete samples for the NACA0012 case.

Figure 7 .
Figure 7.Comparison between the wall friction coefficients predicted using the Gappy POD method and high-fidelity CFD computation for the NACA0012 case.

Figure 7 .
Figure 7.Comparison between the wall friction coefficients predicted using the Gappy POD method and high-fidelity CFD computation for the NACA0012 case.

Figure 8 .
Figure 8. Error analysis of Gappy POD method for predictions on incomplete samples for the NACA0012 case.

Figure 8 . 23 Figure 9 .
Figure 8. Error analysis of Gappy POD method for predictions on incomplete samples for the NACA0012 case.Aerospace 2024, 11, x FOR PEER REVIEW 14 of 23

Figure 9 .
Figure 9. Statistical error analysis of Gappy POD method for predictions on incomplete samples (repeated 10 times).

Figure 9 .
Figure 9. Statistical error analysis of Gappy POD method for predictions on incomplete samples (repeated 10 times).

Figure 10 .
Figure 10.Comparison between the wall friction coefficients predicted by the overall model and the high-fidelity CFD computation.

Figure 10 .
Figure 10.Comparison between the wall friction coefficients predicted by the overall model and the high-fidelity CFD computation.

Aerospace 2024 , 23 Figure 11 .
Figure 11.Error analysis of overall model for predictions on testing samples for the NAC0012 case.

Figure 11 .
Figure 11.Error analysis of overall model for predictions on testing samples for the NAC0012 case.

Figure 14 .
Figure 14.Statistical error analysis for predictions on testing samples (ε inf ).

Figure 15 .
Figure 15.Mean value and 99% confidence interval of the wall friction coefficient.

Figure 15 .
Figure 15.Mean value and 99% confidence interval of the wall friction coefficient.

Figure 16 .
Figure 16.Comparison between the pressure coefficients predicted using the Gappy POD method and the high-fidelity CFD computation for the M6 case.

Figure 16 . 23 Figure 17 .
Figure 16.Comparison between the pressure coefficients predicted using the Gappy POD method and the high-fidelity CFD computation for the M6 case.Aerospace 2024, 11, x FOR PEER REVIEW 19 of 23

Figure 17 . 23 Figure 17 .
Figure 17.Error analysis of Gappy POD method for predictions on incomplete samples for the M6 case.

Figure 18 .
Figure 18.Comparison between the pressure coefficients predicted by the overall model and the high-fidelity CFD computation for the M6 case.The prediction error was computed for all testing samples, and the resulting error distribution is depicted in Figure19.Across all testing samples, it can be noted that 1 ε

Figure 18 .
Figure 18.Comparison between the pressure coefficients predicted by the overall model and the high-fidelity CFD computation for the M6 case.

Aerospace 2024 , 23 Figure 19 .
Figure 19.Error analysis of overall model for predictions on testing samples for the M6 case.

Figure 20 .
Figure 20.Comparison of prediction error on testing samples between multi-fidelity and high-fidelity models for the M6 case.

Figure 19 .
Figure 19.Error analysis of overall model for predictions on testing samples for the M6 case.

Aerospace 2024 , 23 Figure 19 .
Figure 19.Error analysis of overall model for predictions on testing samples for the M6 case.

Figure 20 .
Figure 20.Comparison of prediction error on testing samples between multi-fidelity and high-fidelity models for the M6 case.

Figure 20 .
Figure 20.Comparison of prediction error on testing samples between multi-fidelity and high-fidelity models for the M6 case.

Aerospace 2024 ,Figure 21 .
Figure 21.Contours of mean value and standard deviation of the wall pressure coefficients.

Figure 21 .
Figure 21.Contours of mean value and standard deviation of the wall pressure coefficients.

Table 1 .
Interval and standard value of SA model parameters.

Table 2 .
Sobol indicators of SA model parameters.

Table 3 .
Sobol indicators of SA model parameters for M6 case.

Table 3 .
Sobol indicators of SA model parameters for M6 case.