1. Introduction
Engineering problems are inherently uncertain, and reliability analysis is an essential means of addressing these uncertainties [1,2]. The modelling of engineering systems has become increasingly precise with the advent of greater computational power, but this has also led to a significant increase in computation time and the associated costs [3]. Surrogate models are widely used in reliability analysis because they can significantly reduce the number of expensive model evaluations [4,5,6].
A number of surrogate models have been proposed and employed in reliability analysis [7,8,9,10,11,12]. Among the various models, the Kriging model is prominently used because it includes a built-in error measure for the predicted values, which facilitates active learning methods [13]. Two representative works are the active Kriging Monte Carlo simulation (AK-MCS) [14] and the efficient global reliability analysis (EGRA) [15]. A considerable number of active learning methods have been proposed based on the AK-MCS method, collectively referred to as AK methods [4]. These methods have been widely applied in various engineering fields, including aerospace [16], marine [17], machinery [18], and civil engineering [19].
Surrogate models are generally trained with single-fidelity data, predominantly high-fidelity (HF) samples. The term “fidelity” refers to a model’s accuracy in capturing the essential state and behavioral traits of a real-world target (e.g., object, feature, or condition) [5]. In general, detailed simulations and physical experiments yield HF samples, and simple simulations produce low-fidelity (LF) samples. Nevertheless, obtaining HF samples is frequently time-consuming and costly. The multi-fidelity (MF) surrogate model is a promising approach that integrates HF and LF samples to reduce the number of HF samples required [20,21,22]. NASA indicated that the MF surrogate model required substantial research and would offer a powerful methodology for mitigating overall risk in the design of aerospace systems [23].
The MF method is widely employed in reliability analysis, with MF Kriging being particularly prevalent. Yi et al. [24] proposed an active learning method for MF reliability analysis called AMK-MCS+AEFF. The MF model embedded within AMK-MCS+AEFF is a dis-Kriging model, which consists of two principal components: the LF Kriging model and the discrepancy Kriging model, the latter of which links the HF model and the LF model. Yi et al. [25] put forth an active learning-based MF Kriging method called MF-BSC-Believer, which also employs the dis-Kriging model as the MF model. Like the AMK-MCS+AEFF approach, the MF-BSC-Believer method selects HF and LF samples in a single step. Feng et al. [26] developed a two-stage active learning approach (AHK-MCS), wherein the LF and HF models are constructed one after another and the hierarchical Kriging model is employed as the MF model. Using the hierarchical Kriging model, Che et al. [27] developed the MF model by combining HF data with a fused LF model derived from non-hierarchical LF samples. With the Co-Kriging model, Lima et al. [28] performed a reliability evaluation of the ultimate strength of stiffened panels. By integrating the Co-Kriging model with fast Fourier transform filtering, Dong et al. [29] put forward a conditional random field strategy for slope reliability evaluation.
While MF Kriging surrogate modeling has emerged as a powerful tool for accelerating computationally expensive engineering analyses, systematic comparative studies of different MF Kriging models within MF active learning frameworks remain relatively scarce. This gap is particularly noteworthy given the critical role that model selection plays in determining the efficiency and accuracy of reliability analysis. By systematically evaluating the performance of different MF Kriging models in reliability analysis, researchers can establish guidelines for model selection, ultimately contributing to more efficient and dependable reliability analysis methods.
In this work, a comparative analysis of the performance of the hierarchical Kriging, dis-Kriging, and Co-Kriging models in the context of active learning MF reliability is presented. Two frameworks are adopted for different multi-fidelity (MF) Kriging models: the one-stage AMK-MCS+AEFF method and the two-stage AHK-MCS method. The subsequent sections are structured as outlined below.
Section 2 offers a concise review of the core principles underlying the key models and methods, including the Kriging model, hierarchical Kriging model, dis-Kriging model, Co-Kriging model, the AMK-MCS+AEFF method, and the AHK-MCS method.
Section 3 delivers a comparative analysis of the performance of different MF Kriging models in two distinct MF reliability frameworks. Discussion and conclusions are presented in Section 4 and Section 5, respectively.
3. Comparative Results of Different MF Kriging Models
In this section, five numerical examples are employed to compare the performance of different MF Kriging models in two reliability analysis frameworks. The hierarchical Kriging, dis-Kriging, and Co-Kriging models are each embedded in the frameworks of the AHK-MCS and AMK-MCS+AEFF methods, giving six methods to compare. For brevity, they are designated as AHK-MCS, ADK-MCS, ACK-MCS, AHK-MCS+AEFF, ADK-MCS+AEFF, and ACK-MCS+AEFF, where the second letter of each name is the first letter of the MF Kriging model used (H for hierarchical Kriging, D for dis-Kriging, C for Co-Kriging). The initial number of the candidate samples for the six methods is
. The number of initial HF samples for the six methods is five. The stopping threshold can be taken as the percentage of wrong sign estimations of the limit state function. An excessive threshold value may result in incorrect failure probability estimation [34]. In this study, the stopping threshold is set to 0.01, 0.02, and 0.03 to evaluate the performance of different MF Kriging models under varying threshold conditions [25]. If a method requires more than 100 HF samples while still failing to meet the stopping threshold, it is deemed to have failed the reliability analysis. Each method is run 30 times to assess the stability of failure probability estimation.
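The stopping rule described above can be illustrated with a minimal sketch. It assumes, which the text does not state, that the percentage of wrong sign estimations is approximated from the Kriging mean and standard deviation at each candidate sample as Φ(−|μ|/σ) and averaged over the candidate pool. The function names and the failure flag are hypothetical and are not taken from the cited methods.

```python
import math


def wrong_sign_probability(mu, sigma):
    """Probability that the true sign differs from sign(mu) for a Gaussian prediction."""
    if sigma <= 0.0:
        return 0.0  # no predictive uncertainty: the sign is treated as certain
    z = -abs(mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF at z


def expected_wrong_sign_fraction(means, sigmas):
    """Average wrong-sign probability over the candidate pool (assumed criterion)."""
    probs = [wrong_sign_probability(m, s) for m, s in zip(means, sigmas)]
    return sum(probs) / len(probs)


def should_stop(means, sigmas, threshold, n_hf, max_hf=100):
    """Return (stop, failed): stop when the threshold is met, fail past max_hf HF samples."""
    if expected_wrong_sign_fraction(means, sigmas) <= threshold:
        return True, False
    if n_hf > max_hf:
        return True, True  # threshold still unmet after more than max_hf HF samples
    return False, False
```

With a threshold of 0.01, a pool whose predictions all lie three standard deviations from zero stops immediately, whereas a pool of predictions at zero keeps sampling until the limit of 100 HF samples is exceeded.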
3.1. Multimodal Function
The first example involves a two-dimensional, highly nonlinear function. Referenced from [15,35], the limit state function is expressed as
where the normally distributed input variables have a mean of [1.5, 2.5] and a variance of [1, 1]. The multimodal function has a cost ratio of 5.
The comparative results are presented in Table 1, which reports the number of HF samples, the number of LF samples, the total cost, and the failure probability. The standard deviation of the failure probability is also reported and is used to assess how stably each MF Kriging model predicts the failure probability. A smaller standard deviation indicates that the estimated failure probabilities are more tightly clustered; in other words, the MF Kriging model predicts the failure probability more stably. Furthermore, this paper presents the distribution of
.
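The two summary quantities used in the tables, the total cost and the standard deviation of the failure probability over repeated runs, can be sketched as follows. Both definitions are assumptions, since the exact expressions are not reproduced here: the total cost is taken as the number of HF samples plus the number of LF samples divided by the cost ratio (a common convention in MF studies), and the sample standard deviation is used. The function names are hypothetical.

```python
import statistics


def total_cost(n_hf, n_lf, cost_ratio):
    """Cost in equivalent HF evaluations (assumed form: n_hf + n_lf / cost_ratio)."""
    return n_hf + n_lf / cost_ratio


def summarize_runs(pf_estimates):
    """Mean and sample standard deviation of the failure probability over repeated runs."""
    return statistics.mean(pf_estimates), statistics.stdev(pf_estimates)
```

Under this assumption, 20 HF samples and 50 LF samples at a cost ratio of 5 give `total_cost(20, 50, 5) == 30.0`.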
As shown in Table 1 and Figure 3, all methods provide an accurate estimation of the failure probability, with the exception of the ACK-MCS method when the stopping threshold is set to 0.02 and 0.03. Overall, the two-stage methods are more efficient than the one-stage methods when using the same MF Kriging model. The
and the
of the ACK-MCS+AEFF method are maximal, while its accuracy is moderate. It should be noted that the
of ACK-MCS+AEFF with a stopping threshold of 0.02 is higher than that of 0.01, since a prohibitive number of HF samples is sometimes required. This also indicates that the active learning process is unstable. Despite the fact that the
of the AHK-MCS+AEFF method is minimal, the
is excessive, thereby resulting in a low level of efficiency. Overall, the dis-Kriging and hierarchical Kriging models perform well in both one-stage and two-stage frameworks.
3.2. Four-Boundary Series System
The second example involves a two-dimensional, four-boundary series system. Referenced from [14,25], the limit state function is expressed as
where both input variables follow a standard normal distribution. The four-boundary series system has a cost ratio of 5.
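As an illustration of how a series system of this kind is evaluated, the sketch below uses the four-branch function that is widely used in the AK-MCS literature, together with a crude Monte Carlo estimate of the failure probability. This is an assumption for illustration only: the expression, the constant `k`, and the absence of an LF counterpart may differ from the HF and LF functions used in this study. Failure is taken as a limit state value less than or equal to zero.

```python
import math
import random


def four_branch(x1, x2, k=7.0):
    """Series system: the limit state is the minimum over four branch functions."""
    s = math.sqrt(2.0)
    b1 = 3.0 + 0.1 * (x1 - x2) ** 2 - (x1 + x2) / s
    b2 = 3.0 + 0.1 * (x1 - x2) ** 2 + (x1 + x2) / s
    b3 = (x1 - x2) + k / s
    b4 = (x2 - x1) + k / s
    return min(b1, b2, b3, b4)


def crude_mcs(limit_state, n_samples, seed=0):
    """Fraction of standard normal samples with limit_state <= 0 (failure)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        if limit_state(x1, x2) <= 0.0:
            failures += 1
    return failures / n_samples


if __name__ == "__main__":
    print(crude_mcs(four_branch, 100_000))
```

Because the system fails as soon as any one branch is violated, the four boundaries jointly enclose the safe region around the origin, which is what makes this example demanding for a surrogate that must resolve all four boundaries.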
For the four-boundary series system, the results obtained with different MF Kriging models are reported in Table 2 and Figure 4. With the hierarchical Kriging model, the two-stage method is more efficient and more stable than the one-stage approach. The dispersion of the failure probability of the ACK-MCS method remains high at stopping thresholds of 0.02 and 0.03, while
shows a substantial increase at 0.01, demonstrating that the ACK-MCS method exhibits excessive sensitivity to the stopping threshold. The
and
of the ADK-MCS method when the stopping threshold is set to 0.01 are excessively high, indicating that an excessively low stopping threshold is unnecessary. The results of the four-boundary series system show that both the Co-Kriging and dis-Kriging models perform poorly in the two-stage framework, and they offer no significant advantage over hierarchical Kriging in the one-stage framework. The hierarchical Kriging method demonstrates a superior efficiency and stability for this example.
3.3. Four-Dimensional Park Function
The third example involves a four-dimensional Park function, a test case for high-dimensional problems [36]. The limit state function of the four-dimensional Park function is expressed by
where the variables are uniformly distributed in
. The four-dimensional Park function has a cost ratio of 10.
For the four-dimensional Park function, the results obtained with different MF Kriging models are reported in Table 3 and Figure 5. In terms of precision, the two-stage methods outperform the one-stage methods. The stability of the one-stage methods is also unsatisfactory. Notably, the two-stage methods utilise 71 LF samples, whereas the one-stage methods employ fewer than 55 LF samples. An insufficient number of LF samples might account for the inferior performance of the one-stage methods, particularly for the hierarchical Kriging model [26]. The ADK-MCS and ACK-MCS methods show a significant increase in
when the stopping threshold is set to 0.03. The hierarchical Kriging model outperforms the other MF Kriging models within the two-stage framework by requiring the fewest HF samples to correctly construct the HF model.
3.4. Nonlinear Oscillator System
The fourth example considers a six-dimensional nonlinear oscillator system [25,37], depicted in Figure 6, whose limit state function is given by
where
. The statistics of the variables are summarized in Table 4. The nonlinear oscillator system has a cost ratio of 10.
Table 5 and Figure 7 show the results of the nonlinear oscillator system with different MF Kriging models. With the stopping threshold configured at 0.01, the AHK-MCS and ACK-MCS methods demonstrate excessively high
values, although they maintain low
values. The ADK-MCS method fails to reach the stopping threshold when it is set to 0.01. This indicates that employing a stringent stopping threshold may be unnecessary for the nonlinear oscillator system. The ADK-MCS method underperforms among the two-stage methods and the
of the ADK-MCS+AEFF method is the highest among the one-stage methods, which indicates that the dis-Kriging model has limited capability in handling complex problems. Co-Kriging yields a relatively low
in the one-stage framework, whereas its performance deteriorates in the two-stage framework, resulting in a substantially higher cost. It is evident that the
of the ACK-MCS+AEFF method is less in the one-stage methods, whilst the
of the ACK-MCS method is more in the two-stage methods. Moreover, the stability of the ACK-MCS+AEFF method remains unsatisfactory. When the ACK-MCS+AEFF method is applied to the four-dimensional Park function and the nonlinear oscillator system, the stopping threshold is frequently reached prematurely, which leads to inaccurate failure probability estimation. The hierarchical Kriging model performs well in both one-stage and two-stage frameworks.
3.5. Vehicle Side Impact Problem
The fifth example involves a seven-dimensional vehicle side impact problem. Referenced from [24,38], the problem is expressed as
In the vehicle side impact problem, the statistical parameters of the variables have been adjusted compared to those in [38]. This modification aims to increase the failure probability, thereby reducing the number of candidate samples required. The statistics of the variables are summarized in Table 6. The cost ratio of this problem is changed from 20 to 5 to avoid too many LF samples.
The results of the vehicle side impact problem are set out in Table 7 and Figure 8. The ACK-MCS and ACK-MCS+AEFF methods are unable to satisfy the stopping thresholds. This indicates that the Co-Kriging model performs poorly in handling complex, multivariable problems. In general, the two-stage methods outperform the one-stage methods. Although the
index when using the dis-Kriging model is lower than when using the hierarchical Kriging model, the
and the
for the dis-Kriging model are significantly greater than those for the hierarchical Kriging model. This indicates that the dis-Kriging model is deficient in its ability to address complex problems. The hierarchical Kriging model has been shown to exhibit a superior performance in both one-stage and two-stage frameworks when compared with the dis-Kriging model.
4. Discussion
In this work, three MF Kriging models are compared in two types of MF active learning frameworks, i.e., six active learning MF methods are employed. The performances of the hierarchical Kriging model, the dis-Kriging model, and the Co-Kriging model are explored through five numerical examples. In general, the three MF Kriging models are able to predict the failure probability, except in the vehicle side impact problem. Overall, the two-stage methods outperform the one-stage methods in estimating the failure probability, in terms of both efficiency and stability. The primary reason is that the two-stage methods make full use of the LF model to reduce the required number of HF samples [26]. The influence of the stopping threshold is also studied in this work. As the stopping threshold decreases, the number of required HF samples and the stability of the failure probability estimate generally increase. Nevertheless, an excessively low stopping threshold is not necessary, and in some cases it may prevent the threshold from being met at all. It should be noted that extreme values of
appear rarely. These cases mainly occur when thresholds are set at 0.01 or 0.03, indicating overfitting or underfitting. This highlights the need for a more robust stopping criterion.
To summarise the results of the five numerical examples, the hierarchical Kriging model performs relatively well in both the one-stage and two-stage frameworks. Its advantages are more apparent in the two-stage framework because the constructed high-precision LF model is used to predict the basis function of the observed samples. Consequently, the trend of the LF function is effectively mapped to the sampled HF data, thereby yielding a more accurate surrogate model for the HF function. Furthermore, the hierarchical Kriging model estimates the MSE better than both the dis-Kriging and Co-Kriging methods. It has been posited that the hierarchical Kriging model is an efficient, accurate, and robust model [31]. The dis-Kriging model performs strongly in the multimodal function and four-boundary series system examples. Nevertheless, it is less effective in the four-dimensional Park function, nonlinear oscillator system, and vehicle side impact problem. This indicates that the dis-Kriging model lacks sufficient capacity to address intricate, highly nonlinear problems. The dis-Kriging model is a concise form of the MF Kriging model. However, its MSE is simply the sum of the MSE of the LF Kriging model and that of the discrepancy Kriging model, which fails to fully capture the uncertainty of the predicted values. The Co-Kriging model performs unstably across the five numerical examples, especially for multi-dimensional nonlinear problems. Furthermore, the ACK-MCS and ACK-MCS+AEFF methods do not reach the stopping threshold in the vehicle side impact problem. The five examples thus demonstrate the limited robustness of the Co-Kriging model. A plausible explanation is that the Co-Kriging model incorporates a significantly larger correlation matrix than both hierarchical Kriging and dis-Kriging. As the dimensionality of the problem increases, the complexity of the correlation matrix grows accordingly. This makes it harder to obtain the model parameters through optimization, which in turn diminishes robustness.
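The remark above that the dis-Kriging MSE is the sum of the two component MSEs can be made concrete with a minimal sketch. It assumes an additive form without a scaling factor, i.e., the HF prediction is the LF Kriging prediction plus the discrepancy Kriging prediction; the function names are hypothetical. The `u_value` helper is the U learning function from AK-MCS and is included only to show how the combined MSE would enter sample selection; the frameworks compared here use their own learning functions.

```python
import math


def dis_kriging_predict(mu_lf, mse_lf, mu_disc, mse_disc):
    """Combine LF Kriging and discrepancy Kriging predictions (assumed additive form)."""
    mu_hf = mu_lf + mu_disc      # predicted HF response
    mse_hf = mse_lf + mse_disc   # MSE taken as the plain sum of the two component MSEs
    return mu_hf, mse_hf


def u_value(mu, mse):
    """AK-MCS U function: |mean| / standard deviation (smaller means a less certain sign)."""
    sigma = math.sqrt(mse)
    return math.inf if sigma == 0.0 else abs(mu) / sigma
```

Because the two MSEs are simply added, any correlation between the LF prediction error and the discrepancy prediction error is ignored in this form, which is one way to read the statement that the summed MSE does not fully capture the predictive uncertainty.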
Based on the results of the five examples, the hierarchical Kriging model is suggested for MF reliability analysis, and the dis-Kriging model is recommended for simple situations. Because the robustness of the Co-Kriging model is not satisfactory, it is not currently recommended for MF reliability analysis. It should be pointed out that the Co-Kriging model utilized is derived from the ooDACE toolbox [33]. It is reasonable to expect that improved optimization methods for the Co-Kriging model could enhance its robustness in MF active learning methods.
It is important to note that the examples employed in this study are primarily numerical simulations, although the last two examples originate from engineering backgrounds. Real-world scenarios typically involve greater uncertainty and noise within samples. To evaluate the performance of different Kriging-based metamodels more comprehensively, authentic engineering problems should be incorporated into subsequent comparative analyses. Several new Kriging-based models have also been proposed [9,39], and further comparative research on a broader range of Kriging models is suggested.