Article

Deep-Reinforcement-Learning-Enhanced Kriging Modeling Method with Limit State Dominant Sampling for Aeroengine Structural Reliability Analysis

1 Department of Aeronautics and Astronautics, Fudan University, Shanghai 200433, China
2 AECC Hunan Power Machinery Research Institute, Zhuzhou 412002, China
* Author to whom correspondence should be addressed.
Aerospace 2025, 12(9), 752; https://doi.org/10.3390/aerospace12090752
Submission received: 24 July 2025 / Revised: 16 August 2025 / Accepted: 19 August 2025 / Published: 22 August 2025

Abstract

Reliability analysis of aeroengine structures is a critical task in aerospace engineering, but traditional methods often face challenges of low computational efficiency and insufficient accuracy when dealing with complex, high-dimensional, and nonlinear problems. This paper proposes a novel reliability assessment method (AC-Kriging) based on the Actor–Critic network and Kriging surrogate models to address these issues. The Actor network optimizes the sampling strategy for design variables, making sampling more efficient; the Critic network evaluates the contribution of these samples to ensure accurate results; and a Kriging surrogate model replaces expensive finite element simulations and cuts computational cost. Three case studies demonstrate that AC-Kriging significantly outperforms traditional methods in both sampling efficiency and reliability-estimation accuracy, requiring only 52–147 samples to achieve comparable accuracy while maintaining the relative failure probability error within 0.87–7.27%. This research provides an efficient and reliable solution for the reliability analysis of aeroengine structures, with important theoretical and engineering application value.

1. Introduction

Structural reliability analysis of aeroengine components represents one of the most critical challenges in aerospace engineering, directly impacting the flight safety, operational efficiency, and economic viability of modern aviation systems [1]. Quantifying the structural failure probability under complex working conditions is essential for ensuring structural reliability and safety. Over the past few decades, many works have addressed structural failure probability estimation [1,2,3]. As the critical power components of aircraft, aeroengine structures such as turbine blisks are typically exposed to uncertain environments of high temperature, severe pressure differentials and alternating loads, which pose challenges to the accurate assessment of fatigue failure probability [4,5]. To quantify the structural failure probability, the following equation is defined [6]:
P_f = \int_{g(x) \le 0} f(x) \, \mathrm{d}x
where x = [x1, x2, x3, …, xn] represents the input vector of random variables, including material properties, load settings and fatigue parameters; g(·) is the structural limit state function (LSF) or failure surface, with g(x) ≤ 0 indicating structural failure; and f(·) denotes the joint probability density function of the random variables x. However, calculating the failure probability of structures under complex operating conditions through the integral in Equation (1) is analytically intractable. Therefore, numerical surrogate methods are needed to balance computational efficiency with estimation accuracy.
Traditional reliability analysis methods, including the first-order reliability method (FORM), second-order reliability method (SORM), etc., approximate the LSF by Taylor series expansion around the design of points (DoPs) [7,8]. Despite the computational efficiency advantages, FORM and SORM are limited to precision accuracy in high-dimensional nonlinear problems by low-order polynomial approximations. Moreover, the fatigue failure mechanisms of turbine blisks result in non-convex failure domains, which violate the assumptions of FORM/SORM [9]. Monte Carlo Simulation (MCS), as one of the most significant numerical simulation methods, provides a gold standard for failure probability estimation by directly sampling the probability space [10]. Based on the MCS, the estimation of failure probability is described as follows:
\hat{P}_f \approx \frac{1}{N} \sum_{i=1}^{N} I[g(x_i) \le 0] = \frac{N_f}{N}, \quad \text{s.t.} \quad I[g(x)] = \begin{cases} 1, & g(x) \le 0 \\ 0, & g(x) > 0 \end{cases}
where I[·] is the indicator function of g(x), which equals 1 in case of structural failure and 0 otherwise; N is the total number of simulation samples; and Nf is the number of failure samples. The theoretical minimum number of simulations required to achieve a relative error ε of Pf is as follows:
N \ge \frac{z_{\alpha/2}^2 \left( 1 - \hat{P}_f \right)}{\varepsilon^2 \hat{P}_f}
where z_{α/2} is the α/2 quantile of the standard normal distribution, and 1 − α defines the confidence level. However, the random sampling process of MCS imposes prohibitively high computational demands, which grow rapidly as the target failure probability decreases. The typical structural failure probability of aeroengines is usually less than 1 × 10−5, which would require millions of high-fidelity finite element (FE) simulations in MCS, far beyond the acceptable range in practical projects. To address this issue, advanced reliability analysis methods, such as importance sampling (IS), directional sampling (DS) and subset simulation (SS), have been developed to estimate small failure probabilities [11,12,13]. However, IS faces challenges in accurately selecting the importance function and resolving high-dimensional complex failure regions, and SS suffers from high computational cost in each subset and sensitivity to initial samples.
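For illustration, the sample-size requirement of Equation (3) can be evaluated directly; the following minimal Python sketch (function name and parameter values are ours, for illustration only) shows the order of magnitude involved:

```python
from scipy.stats import norm

def min_mcs_samples(pf_hat, eps, alpha=0.05):
    """Minimum MCS sample size from Eq. (3): N >= z_{a/2}^2 (1 - Pf) / (eps^2 Pf)."""
    z = norm.ppf(1.0 - alpha / 2.0)          # two-sided standard-normal quantile
    return z ** 2 * (1.0 - pf_hat) / (eps ** 2 * pf_hat)

# A failure probability of 1e-5 with 10% relative error at 95% confidence
# needs on the order of 4e7 limit-state evaluations:
print(f"N >= {min_mcs_samples(pf_hat=1e-5, eps=0.10):.2e}")
```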
To address the above-mentioned problems, machine learning (ML)-based surrogate modeling has been applied to fit complex implicit LSFs and calculate failure probabilities in structural reliability analysis, which has become a current research hotspot [14,15,16,17]. Recent studies demonstrate extensive applications of surrogate models in structural reliability analysis, such as polynomial chaos expansions [18], support vector machines [19], artificial neural networks (ANNs) [20], and Kriging models [21,22]. Sun et al. developed a method using LIF, MCMC and MC to update the Kriging model and improve efficiency in structural reliability analysis [21]. Qian et al. proposed a time-variant reliability method for an industrial robot rotate vector reducer with multiple failure modes using a Kriging model [22]. Vazirizade et al. introduced an ANN-based method to reduce the computational effort required for reliability analysis and damage detection [23]. Yang et al. enhanced sparse polynomial chaos expansion accuracy through sequential sample point extraction, successfully applying it to slope reliability analysis [24]. Gaussian processes have been particularly well investigated for structural reliability analysis due to their statistical learning capabilities [25,26,27]. However, the selection of DoPs by a sampling method is critical for training the surrogate models: a lack of training samples near the LSF limits their accuracy and efficiency. For complex structures such as aeroengines, it is difficult to construct an accurate model of the LSF through one-time sampling of design points. Therefore, how to optimize sampling and improve model accuracy near the LSF has become a key research issue in structural reliability analysis.
Recently, active sampling methods have attracted increasing interest in the field of structural reliability analysis due to their advantages in enhancing surrogate model accuracy and computational efficiency [20,28,29]. The active selection of samples greatly reduces the computational cost by maximizing each sample's contribution to the failure probability evaluation. For example, Ling et al. [6] proposed a method combining adaptive Kriging with MCS to efficiently estimate the failure probability function; Wang et al. developed a new active learning method for estimating the failure probability based on a penalty learning function [30]; and Yuan et al. proposed an efficient reliability method for structural systems with multiple failure modes, developing a new learning function based on the system structure function to select added points from a system perspective [20]. These methods only require calling the real model once in each iteration, significantly improving computational efficiency. However, active-learning-based modeling easily falls into local optima, because variance estimation bias near nonlinear failure-domain boundaries causes redundant samples or misses key areas. To address this problem, failure probability sensitivity-based sampling methods have been proposed [31,32]. Dang et al. proposed a Bayesian active learning line sampling method to reduce the epistemic uncertainty about the failure probability [33]. Moustapha et al. developed an active learning strategy designed to handle multiple failure modes and their uneven contributions to failure [34]. Liu et al. integrated the classical active Kriging-MCS and adaptive linked importance sampling to establish a novel reliability analysis method for extremely small failure probabilities [35]. However, current active sampling approaches face two significant challenges that limit their effectiveness: (1) the majority of existing techniques rely on fixed learning functions that cannot dynamically adjust their sampling strategies based on evolving problem understanding or provide adaptive feedback for sample quality assessment; (2) the computational burden grows substantially with increasing problem dimensionality, creating practical barriers for implementation in the high-dimensional engineering applications typical of complex structural systems.
Deep reinforcement learning (DRL), through its unique "state-action-reward" interactive framework, achieves autonomous optimization and iterative evolution of strategies, providing a new paradigm for active learning in structural reliability analysis. Recently, DRL has made significant breakthroughs in various fields, including the successes of AlphaGo [36], Deep Q-Networks (DQNs) [37] and large language models (like ChatGPT-3, Deepseek-R1) [38,39]. Many studies on DRL-based structural reliability analysis have emerged; for instance, Xiang et al. proposed a DRL-based sampling method for structural reliability assessment [40]; Guan et al. developed high-accuracy structural dominant failure mode and self-play strategy searching methods based on DRL [41,42]; Wei et al. introduced a general DRL framework for structural maintenance policy [43]; and Li et al. proposed an efficient optimization method for base isolation systems and shape memory alloy inverters using different DRL algorithms [44]. DRL-based reliability analysis methods have shown potential, but several challenges remain in the LSF modeling process: (1) the training of the deep networks does not converge easily, and (2) the reward function settings still need improvement to select DoPs reasonably. Overall, DRL is still at an exploratory stage in structural reliability analysis, but its potential has been demonstrated.
The critical gap in the current literature lies in the absence of a theoretically grounded, computationally efficient, and practically robust method that can adaptively optimize sampling strategies based on an evolving understanding of the limit state function, provide theoretical convergence guarantees for failure probability estimation accuracy, scale effectively to the high-dimensional problems typical of aerospace engineering, balance multiple objectives including accuracy, efficiency, and computational budget constraints, and integrate seamlessly with established surrogate modeling frameworks. To address these issues, an Actor–Critic network-enhanced Kriging method (AC-Kriging) is proposed to obtain informative DoPs through DRL-based active searching and establish the surrogate LSF for structural reliability evaluation. Specifically, the application of AC-Kriging proceeds as follows: (1) First, an initial Kriging model is constructed using Latin hypercube sampling (LHS)-generated DoPs to approximate the structural LSF. (2) The Actor network identifies candidate points within the design space, while the Critic network evaluates their potential contributions through global–local reward functions based on the Kriging model's prediction errors, thereby selecting the next optimal sampling point. (3) The accuracy of the Kriging model is iteratively evaluated, and the convergence of the reliability analysis is monitored until the dual convergence criteria (model precision and algorithmic stability) are satisfied. The AC-Kriging method aims to provide an efficient and accurate structural reliability analysis framework through the seamless integration of Kriging modeling and Actor–Critic reinforcement learning networks.
The primary contributions of this paper to the field of structural reliability analysis are as follows:
(1)
This study introduces a comprehensive framework that combines Actor–Critic reinforcement learning with Kriging surrogate modeling for structural reliability analysis, enabling dynamic and intelligent sample selection that adapts to the evolving understanding of limit state boundaries throughout the analysis process.
(2)
The proposed continuous state–action representation effectively addresses the curse of dimensionality that affects traditional active sampling methods, achieving improved computational scaling with problem size and making it more suitable for complex aerospace engineering applications.
(3)
This research develops dual convergence criteria that monitor both Kriging model precision and algorithmic stability, providing a systematic approach for determining when sufficient sampling accuracy has been achieved in the reliability analysis process.
(4)
The AC-Kriging method optimizes multiple competing objectives by its reward function design, addressing the real-world constraints faced by aerospace engineers while providing a theoretical foundation for integrating deep reinforcement learning principles with established surrogate modeling frameworks.
The remainder of this paper is structured as follows: Section 2 introduces the theoretical basis of active sampling based on AC networks, Section 3 elaborates on the proposed AC-Kriging method, Section 4 verifies the accuracy and efficiency of the method through theoretical and engineering aeroengine structural reliability analysis, and Section 5 provides the conclusions.

2. Deep Reinforcement Learning

DRL is a hybrid algorithmic framework integrating deep learning with reinforcement learning principles to enable autonomous decision-making in high-dimensional state spaces [45]. It excels at solving sequential decision problems through the trial and error of an agent in a complex environment. Guided by the learned strategy, the agent achieves the final objective through the continuous actions output by the DRL model. To evaluate the actions of the agent, a reward computed by a specific reward function is assigned to each action. DRL aims to maximize the cumulative rewards by combining the reward of each action and the final reward upon task completion, along with a punishment function for undesired actions. Unlike supervised learning, the DRL model is trained on interactive data with rewards instead of diverse labeled data. In addition, the agent is represented by a deep neural network trained over multiple episodes, and each episode contains numerous steps.
As illustrated in Figure 1, DRL is governed by a Markov decision process (MDP) during training, which provides a basic mathematical framework for solving decision-making problems with uncertainty and long-term cumulative rewards [46]. An MDP usually consists of five elements (S, A, R, P, γ), in which S is the set of all possible states of the agent in the environment; A is the set of actions of the agent; R is the reward obtained by the agent after taking an action a ∈ A in a state s ∈ S; P is the state transition probability function, which determines the next state S′ given state S = s and action A = a; and γ is the discount rate, which determines the importance of future rewards.
In reinforcement learning, at time step t, st represents the state of one agent, and at represents the action adopted by one agent in state st. According to the reward function r(s,a), the agent receives the reward rt and reaches the next state st+1. The trajectory of the agent in a game is recorded as follows:
S_1, A_1, R_1; \; S_2, A_2, R_2; \; \ldots; \; S_t, A_t, R_t; \; \ldots; \; S_n, A_n, R_n
where n is the number of steps in one episode; St, At and Rt are the t-th state, action and reward of the agent, respectively. In addition, state transitions are assumed to have the Markov property, namely:
P(S_{t+1} \mid S_t, A_t) = P(S_{t+1} \mid S_1, A_1, S_2, A_2, \ldots, S_t, A_t)
The action taken by the agent is determined by the policy function π(a|s). The probability of the agent taking action a in state s is π(a|s) = P[A = a | S = s]. The agent is trained to maximize the expectation of the cumulative discounted return Ut, which is defined as follows:
U_t = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \gamma^3 r_{t+3} + \cdots + \gamma^{T-t} r_T
where T is the last step of an episode, and ri is the reward obtained by the agent at time step i. Ut is a random variable that depends on future actions and states.
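As a brief numerical illustration with arbitrarily chosen values, taking γ = 0.9 over a three-step horizon (T = t + 2) with rewards r_t = 1, r_{t+1} = 0 and r_{t+2} = 2 gives

U_t = 1 + 0.9 \times 0 + 0.9^2 \times 2 = 2.62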
Taking the expectation of Ut yields the action-value function Qπ:
Q_\pi(s_t, a_t) = \mathbb{E}_{S_{t+1}, A_{t+1}, \ldots, S_n, A_n} \left[ U_t \mid S_t = s_t, A_t = a_t \right]
where St = st and At = at are the observed values of St and At, respectively. Qπ depends on the current state st and action at but not on the states and actions from step t + 1 onward, because those random variables are eliminated by the expectation. To eliminate the influence of the policy π, the optimal action-value function Q* is defined as follows:
Q^*(s_t, a_t) = \max_\pi Q_\pi(s_t, a_t), \quad \forall s_t \in S, \; a_t \in A
where the best policy function is selected as follows:
\pi^* = \operatorname*{argmax}_\pi Q_\pi(s_t, a_t), \quad \forall s_t \in S, \; a_t \in A
which shows that Q* depends only on st and at.
To quantify the expected return of the agent in a given state, the state-value function is defined as follows:
V_\pi(s_t) = \mathbb{E}_{A_t \sim \pi(\cdot \mid s_t)} \left[ Q_\pi(s_t, A_t) \right] = \sum_{a \in A} \pi(a \mid s_t) \, Q_\pi(s_t, a)
where the expectation eliminates the random variable At. Equivalently, the state value can be written as follows:
V_\pi(s_t) = \mathbb{E}_{A_t, S_{t+1}, A_{t+1}, \ldots, S_n, A_n} \left[ U_t \mid S_t = s_t \right]
where the expectation eliminates the dependence of Ut on the random variables At, St+1, At+1, …, Sn, An. The greater the state value, the higher the expected return. State value can be used to evaluate the quality of policy π and state st.
Reinforcement learning aims to learn a policy function π(a|s) to maximize cumulative discounted rewards. Because directly computing the action-value function Qπ(s,a) and state-value function Vπ(s) is often infeasible, DRL employs a deep neural network (DNN) to approximate these functions. For instance, DQNs use neural networks to estimate Qπ(s,a), policy gradient methods directly optimize the policy function parameters, and Actor–Critic methods combine both by using neural networks to approximate both the policy and value functions.

3. Proposed AC-Kriging Method

In this study, the AC-Kriging method is proposed for efficient structural reliability analysis, integrating the Actor–Critic reinforcement learning framework with the Kriging model to optimize the selection of experimental sampling points, aiming to accurately approximate the limit state surface while reducing computational cost. As illustrated in Figure 2, the AC-Kriging method establishes correspondences between reliability analysis and the reinforcement learning paradigm: the sampling space represents the environment state, the deep neural network serves as the agent, and the selection of experimental points corresponds to actions.
Step 1: The Kriging model provides essential information about the current approximation of the limit state function.
Step 2: The AC-Kriging method consists of two key networks: (1) an Actor network that determines the optimal location for the next experimental point based on current information, and (2) a Critic network that evaluates the expected contribution of selected points to reliability assessment accuracy. Both networks share initial convolutional layers that extract relevant features from the sampling domain representation, and these features are subsequently processed through fully connected layers to generate either sampling decisions or value estimations.
Step 3: The method operates iteratively, with each cycle involving the extraction of state information from the current Kriging model, selection of a new experimental point, evaluation of the true limit state function at this point, updating of the Kriging surrogate model, and calculation of rewards to refine the neural network parameters.
Through this process, the AC-Kriging method overcomes the limitations of traditional sampling methods by adaptively learning optimal sampling strategies for various structural reliability problems.

3.1. Environment and State Definition

To effectively transform structural reliability analysis into a reinforcement learning problem, the environment is represented as the n-dimensional design space x ∈ Rn, where the random variables follow their respective probability distributions. The limit state function (LSF) g(x) partitions this space into failure (g(x) ≤ 0) and safe (g(x) > 0) domains.

3.1.1. State Space Design

In traditional reinforcement learning frameworks, state representations often rely on discretized sampling spaces, which can lead to excessive computational complexity when dealing with high-dimensional problems. To overcome this limitation, the AC-Kriging method adopts a continuous state representation that efficiently captures the essential information needed for reliability analysis. The state space is defined as an n-dimensional continuous space, where each state s ∈ Rn represents the spatial coordinates of the agent. Specifically, s = [x1, x2, …, xn], with xi denoting the agent's position along the i-th axis.

3.1.2. Action Space Design

The action space is a continuous space in Rn, corresponding to the displacement vector the agent can execute. Formally, an action a = [Δx1, Δx2, …, Δxn], where Δxi ∈ [−1, 1], as illustrated in Figure 3. This range allows flexible movement in any direction within the environment while maintaining control over the scale of displacement, ensuring that the agent can perform both fine and coarse adjustments to its position as dictated by the boundary exploration strategy.

3.1.3. Reward Function Design

As shown in Figure 4, the AC-Kriging sampling process consists of two phases: in the searching phase (a), the agent starts from random positions to locate the LSF and identify high quality (HQ) points; in the tracking phase (b), the agent initializes from the discovered HQ point to systematically extract more sampling points along the LSF, achieving efficient boundary-following for structural reliability analysis.
At each step, the reward function is a composite objective function designed to guide the agent's learning process through a multi-faceted incentive structure, mathematically formulated as follows:
R_k(s, a) = a_1 r_{\mathrm{edge}}(s) + a_2 r_{\mathrm{movement}}(s, a) + a_3 r_{\mathrm{exploration}}(s) + a_4 r_{\mathrm{keep}}(s)
where the coefficients a1, a2, a3 and a4 are weighting factors that balance the influence of each reward component, determined through empirical tuning to optimize the agent’s performance in the boundary exploration task.
The reward function comprises four distinct components, each addressing a specific aspect of the boundary-following behavior, which are defined as follows:
(1)
The edge proximity reward is defined as follows:
r_{\mathrm{edge}}(s) = \exp(-|g(s)|)
which decays exponentially with the agent’s distance from the boundary, encouraging the agent to remain near the boundary. This exponential decay ensures that the reward is highest when the agent is precisely on the boundary and diminishes rapidly as the agent moves away.
(2)
To promote progress along the boundary, the movement reward is defined as follows:
r_{\mathrm{movement}}(s, a) = \begin{cases} \exp(-|g(s) - g_{\mathrm{last}}|), & \text{if } |g(s)| < 0.5 \\ 0, & \text{otherwise} \end{cases}
which rewards the agent for making progress along the boundary by comparing the current boundary proximity g(s) with the previous value glast. This component is activated only when the agent remains sufficiently close to the boundary, encouraging consistent boundary traversal rather than random movements.
(3)
Exploration of uncharted regions is incentivized through the exploration reward, defined as follows:
r exploration s = constant
if the distance from s to the nearest boundary point in the shared boundary database exceeds 0.3; otherwise, it is 0. This mechanism encourages the agent to explore uncharted regions of the boundary, enhancing the completeness of the boundary mapping process and preventing the agent from repeatedly traversing already mapped sections.
(4)
The keep reward is defined as follows:
r_{\mathrm{keep}}(s) = \begin{cases} \max(0, \; 0.5 - |g(s)|), & \text{if } |g(s)| < 0.5 \\ 0, & \text{otherwise} \end{cases}
which incentivizes the agent to maintain a consistent distance from the boundary, promoting stable boundary-following behavior. This component provides a graduated reward that is maximized when the agent maintains an optimal distance from the boundary, balancing the need to stay close to the boundary.
This meticulously designed reward function serves as the cornerstone for the reinforcement learning framework, guiding the agent to efficiently map the boundary while balancing exploration and exploitation, and ensuring adherence to the environmental constraints.
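To make the composite reward concrete, the following minimal Python sketch evaluates Equations (12)–(16); the weights and the exploration constant are illustrative placeholders, since the paper determines them through empirical tuning:

```python
import numpy as np

def reward(g_s, g_last, dist_to_known_boundary,
           weights=(1.0, 1.0, 1.0, 1.0), c_explore=1.0):
    """Composite reward of Eq. (12); weights a1..a4 and c_explore are
    illustrative values, not the paper's tuned settings."""
    a1, a2, a3, a4 = weights
    r_edge = np.exp(-abs(g_s))                                       # Eq. (13)
    r_move = np.exp(-abs(g_s - g_last)) if abs(g_s) < 0.5 else 0.0   # Eq. (14)
    r_expl = c_explore if dist_to_known_boundary > 0.3 else 0.0      # Eq. (15)
    r_keep = max(0.0, 0.5 - abs(g_s)) if abs(g_s) < 0.5 else 0.0     # Eq. (16)
    return a1 * r_edge + a2 * r_move + a3 * r_expl + a4 * r_keep
```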

3.2. AC-Kriging-Based Active Sampling Method

The AC-Kriging method integrates the Actor–Critic network architecture with Kriging surrogate modeling to achieve efficient and accurate structural reliability analysis. This section details the key components of this integrated approach, focusing on the network structure and the adaptive sampling strategy.

3.2.1. Kriging Model

In the realm of structural engineering simulation and optimization design, the Kriging model stands as a widely utilized meta-model. Its construction process is as follows:
Given a sample set x = {x1, x2, …, xm}, xiRn and the corresponding response values g = [g(x1), g(x2), …, g(xm)], the Kriging model is expressed as follows:
\tilde{g}_{\text{Kriging}}(x) = f^T(x) \beta + z(x)
where f(x) is an n × 1 constant vector of ones, β is the regression coefficient vector, and z(x) represents a Gaussian process with a mean of 0 and a variance of σ2. The covariance is defined as follows:
\operatorname{cov}[z(x_i), z(x_j)] = \sigma^2 R(\alpha, x_i, x_j) = \sigma^2 \exp\left( -\sum_{k=1}^{n} \alpha_k \left( x_i^k - x_j^k \right)^2 \right)
where R(α,xi,xj) indicates the correlation function between sample points xi and xj. A Gaussian function is employed as the correlation function in this study. α(n × 1) is the correlation parameter vector. According to the maximum likelihood method, α can be obtained by solving the following optimization problem:
\max_\alpha \left\{ -\frac{1}{2} \left( \ln |R| + m \ln \sigma^2 \right) \right\}
where R = [Rij]m×m (Rij = R(α,xi,xj)) is the correlation matrix. The estimators of σ2 and β, namely, σ ^ 2 and β ^ , are defined as follows:
\hat{\beta} = (F^T R^{-1} F)^{-1} F^T R^{-1} g, \qquad \hat{\sigma}^2 = \frac{1}{m} (g - F\hat{\beta})^T R^{-1} (g - F\hat{\beta})
where F = [Fij]m×n (Fij = fj(xi)) is the regression matrix. Consequently, the predictor and prediction variance at an unknown point x are defined as follows:
\hat{g}_{\text{Kriging}}(x) = f^T(x) \hat{\beta} + r^T(x) R^{-1} (g - F\hat{\beta})
\hat{\sigma}_g^2(x) = \hat{\sigma}^2 \left[ 1 + u^T(x) (F^T R^{-1} F)^{-1} u(x) - r^T(x) R^{-1} r(x) \right]
where r(x) is an m-dimensional vector with entries ri = R[z(xi), z(x)], defined as follows:
r(x) = \left[ R(\alpha, x_1, x), R(\alpha, x_2, x), \ldots, R(\alpha, x_m, x) \right]^T
The auxiliary vector u(x) is calculated as follows:
u(x) = F^T R^{-1} r(x) - f(x)
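The Kriging equations above can be assembled into a compact implementation. The following Python sketch is a simplified ordinary-Kriging variant with a constant basis f(x) = 1 and a fixed, pre-tuned correlation vector α (rather than the full MLE of Equation (19)); it illustrates Equations (17)–(24) but is not the paper's implementation:

```python
import numpy as np

def gaussian_corr(X1, X2, alpha):
    """Gaussian correlation of Eq. (18): R_ij = exp(-sum_k alpha_k (x1_ik - x2_jk)^2)."""
    d2 = (X1[:, None, :] - X2[None, :, :]) ** 2
    return np.exp(-(d2 * alpha).sum(axis=2))

def fit_kriging(X, g, alpha, nugget=1e-10):
    """Fit with a constant basis; a small nugget stabilizes the matrix inverse."""
    m = X.shape[0]
    F = np.ones((m, 1))
    R = gaussian_corr(X, X, alpha) + nugget * np.eye(m)
    R_inv = np.linalg.inv(R)
    beta = np.linalg.solve(F.T @ R_inv @ F, F.T @ R_inv @ g.reshape(-1, 1))  # Eq. (20)
    resid = g.reshape(-1, 1) - F @ beta
    sigma2 = float(resid.T @ R_inv @ resid) / m                              # Eq. (20)
    return {"X": X, "F": F, "R_inv": R_inv, "beta": beta,
            "resid": resid, "sigma2": sigma2, "alpha": alpha}

def predict_kriging(model, x):
    """Predictor of Eq. (21) and variance of Eq. (22) at a new point x."""
    r = gaussian_corr(x[None, :], model["X"], model["alpha"]).T              # Eq. (23)
    mean = float(model["beta"] + r.T @ model["R_inv"] @ model["resid"])
    u = model["F"].T @ model["R_inv"] @ r - 1.0                              # Eq. (24), f(x) = 1
    FtRF = model["F"].T @ model["R_inv"] @ model["F"]
    var = model["sigma2"] * float(1.0 + u.T @ np.linalg.inv(FtRF) @ u
                                  - r.T @ model["R_inv"] @ r)                # Eq. (22)
    return mean, max(var, 0.0)
```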

3.2.2. Actor–Critic Network

As shown in Figure 5, the Actor–Critic network is a reinforcement learning architecture that combines policy gradient and value function estimation, in which (1) the Actor network uses a neural net π(a|s;θ) to approximate π(a|s), where θ denotes the trainable parameters of the neural net, aiming to learn an optimal policy that maximizes the expected cumulative reward; and (2) the Critic network uses a neural net q(s,a;ω) to approximate Qπ(s,a), where ω denotes the trainable parameters of the neural net, aiming to compare the expected return of the selected actions with the actual rewards received and to provide feedback to the Actor for policy improvement.
The training of the policy network uses the approximate policy gradient ∇θJ(θ) to update the parameter θ. The unbiased estimate of the policy gradient is described as follows:
\hat{g}(s, a; \theta) \triangleq q(s, a; \omega) \, \nabla_\theta \ln \pi(a \mid s; \theta)
where q(s,a;ω) is the approximation of the action-value function Qπ(s,a). Then, the parameter θ of the policy network is updated by gradient ascent:
\theta \leftarrow \theta + \rho \, \hat{g}(s, a; \theta)
where ρ is the learning rate of the policy network.
Under this update strategy, the Actor is driven toward actions with higher scores, so the quality of the policy update depends on the evaluation ability of the Critic network. The state-value function Vπ(s) can be approximated as follows:
v(s; \theta) = \mathbb{E}_{A \sim \pi(\cdot \mid s; \theta)} \left[ q(s, A; \omega) \right]
where v(s;θ) is the expectation of the Critic score.
At step t, the output of the value network is as follows:
\hat{q}_t = q(s_t, a_t; \omega)
which is the estimate of the action-value function Qπ(st,at). At step t + 1, the temporal-difference (TD) target is calculated with the observed rt, st+1 and at+1, defined as follows:
\hat{y}_t \triangleq r_t + \gamma \, q(s_{t+1}, a_{t+1}; \omega)
which is also an estimate of the action-value function Qπ(st,at). However, the latter estimate is closer to the truth because it incorporates the actually observed reward rt. To update the parameters of the value network, the loss function and its gradient are defined as follows:
L(\omega) \triangleq \frac{1}{2} \left[ q(s_t, a_t; \omega) - \hat{y}_t \right]^2, \qquad \nabla_\omega L(\omega) = \underbrace{(\hat{q}_t - \hat{y}_t)}_{\text{TD error } \delta_t} \nabla_\omega q(s_t, a_t; \omega)
Then, gradient descent is conducted to update ω:
\omega \leftarrow \omega - \alpha \, \nabla_\omega L(\omega)
where α is the learning rate of the value network.
The training process of Actor–Critic network is defined as follows:
Assume the current policy network parameters are θnow and the value network parameters are ωnow; the following steps update the parameters to θnew and ωnew:
Step 1: Observe the current state st and make a decision based on the policy network: at~π(·|st; θnow); then, let the agent perform the action at.
Step 2: Receive the reward rt and observe the next state st+1 from the environment.
Step 3: Make a decision based on the policy network: a ˜ t + 1 ~ π ( | s t + 1 ; θ now ) , but do not let the agent perform action a ˜ t + 1 .
Step 4: Evaluate the value network:
\hat{q}_t = q(s_t, a_t; \omega_{\text{now}}), \qquad \hat{q}_{t+1} = q(s_{t+1}, \tilde{a}_{t+1}; \omega_{\text{now}})
Step 5: Compute the TD target and TD error:
\hat{y}_t = r_t + \gamma \hat{q}_{t+1}, \qquad \delta_t = \hat{q}_t - \hat{y}_t
Step 6: Update the value network:
\omega_{\text{new}} \leftarrow \omega_{\text{now}} - \alpha \, \delta_t \, \nabla_\omega q(s_t, a_t; \omega_{\text{now}})
Step 7: Update the policy network:
\theta_{\text{new}} \leftarrow \theta_{\text{now}} + \rho \, \hat{q}_t \, \nabla_\theta \ln \pi(a_t \mid s_t; \theta_{\text{now}})
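A minimal sketch of one such update step is given below, using PyTorch with a Gaussian policy; the network architectures, learning rates, and policy standard deviation are illustrative assumptions, not the paper's settings:

```python
import torch
import torch.nn as nn

n = 2                                        # design-space dimension (illustrative)
actor = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n), nn.Tanh())
critic = nn.Sequential(nn.Linear(2 * n, 64), nn.ReLU(), nn.Linear(64, 1))
opt_actor = torch.optim.Adam(actor.parameters(), lr=1e-4)    # rho, policy learning rate
opt_critic = torch.optim.Adam(critic.parameters(), lr=1e-3)  # alpha, value learning rate
gamma, sigma = 0.99, 0.1                     # discount rate, fixed policy std (assumed)

def ac_update(s, r, s_next):
    """One Actor-Critic update following Steps 1-7 above."""
    # Steps 1-3: sample a_t from the Gaussian policy and propose (not execute) a_{t+1}
    dist = torch.distributions.Normal(actor(s), sigma)
    a = dist.sample()
    a_next = torch.distributions.Normal(actor(s_next), sigma).sample()
    # Step 4: evaluate the value network at (s_t, a_t) and (s_{t+1}, a_{t+1})
    q_t = critic(torch.cat([s, a]))
    q_next = critic(torch.cat([s_next, a_next]))
    # Step 5: TD target and TD error
    y_t = r + gamma * q_next.detach()
    delta = q_t - y_t
    # Step 6: gradient descent on the squared TD error updates the Critic
    opt_critic.zero_grad()
    (0.5 * delta.pow(2)).backward()
    opt_critic.step()
    # Step 7: policy-gradient ascent, weighted by q_t, updates the Actor
    opt_actor.zero_grad()
    (-dist.log_prob(a).sum() * q_t.detach()).backward()
    opt_actor.step()

# Hypothetical usage with state tensors and a scalar reward:
# ac_update(torch.zeros(n), 0.5, torch.ones(n))
```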

3.3. The Framework of AC-Kriging-Based Structural Reliability Analysis

The proposed AC-Kriging method introduces a novel active sampling and modeling framework for efficient and accurate structural reliability analysis by integrating the DRL-based Actor–Critic network and the Kriging model. A detailed overview of the framework is illustrated in Figure 6. The computational implementation of the AC-Kriging framework is detailed in Algorithm 1, which systematically describes the integration of reinforcement-learning-based active sampling with Kriging surrogate modeling for efficient structural reliability analysis. The AC-Kriging framework operates through three distinct phases: initialization, iterative optimization through Actor–Critic network-guided sampling, and final structural reliability assessment.
Algorithm 1. Pseudocode of the AC-Kriging Method for Structural Reliability Analysis
1. Input: Design space X n , initial sample size m , convergence tolerance τ , reward weights of Actor-Critic network a 1 , a 2 , a 3 , a 4 .
2. Output: Failure probability P f , reliability degree R .
3. Initialization Phase.
4. Generate initial samples using Latin Hypercube Sampling: X 0 = { x 1 , x 2 , , x m } ;
5. Evaluate limit state function: G 0 = { g ( x 1 ) , g ( x 2 ) , , g ( x m ) } ;
6. Build initial Kriging model with samples ( X 0 , G 0 ) ;
7.  Compute correlation matrix R with parameters α = { α 1 , α 2 , , α n } ;
8.  Estimate regression coefficients: β̂ = (F^T R^{−1} F)^{−1} F^T R^{−1} G_0;
9.  Estimate process variance: σ̂^2 = (G_0 − Fβ̂)^T R^{−1} (G_0 − Fβ̂) / m;
10. Initialize Actor network parameters θ 0 and Critic network parameters ω 0 ;
11. Randomly select starting point: x k X 0 , set iteration counter k = 1 .
12. Main Iteration Loop.
13. While convergence criteria not satisfied Do
14.  Construct state s_k based on current position x_k and Kriging model: s_k = [x_k, ĝ(x_k), σ̂_g^2(x_k)];
15.   Generate displacement action using Actor network: a_k ~ π(·|s_k; θ_k);
16.   Compute new sample point location: x_{k+1} = x_k + a_k (ensure it stays within the design space);
17.   Evaluate the true limit state function: g_{k+1} = g(x_{k+1});
18.   Update sample sets: X ← X ∪ {x_{k+1}}, G ← G ∪ {g_{k+1}}.
19.  Update Kriging model parameters.
20.   Update correlation matrix R and correlation parameters α ;
21.   Recompute regression coefficients β ^ and process variance σ ^ 2 .
22.  Calculate reward R k based on multiple criteria:
23.    r_edge = exp(−|g_{k+1}|) (boundary proximity reward);
24.    r_movement = exp(−|g_{k+1} − g_last|) if |g_{k+1}| < 0.5, else 0;
25.    r_exploration = constant if the distance to the nearest known boundary point > 0.3, else 0;
26.    r_keep = max(0, 0.5 − |g_{k+1}|) if |g_{k+1}| < 0.5, else 0;
27.    R_k = a_1 r_edge + a_2 r_movement + a_3 r_exploration + a_4 r_keep.
28.  Update Actor-Critic networks.
29.   Update Critic network: ω_{k+1} ← ω_k − α ∇_ω L(ω_k);
30.   Update Actor network: θ_{k+1} ← θ_k + ρ ∇_θ J(θ_k).
31.  Check and Update.
32.   Criteria 1: Kriging model convergence M A E K r i g i n g < τ 1 ;
33.   Criteria 2: Failure probability convergence |P_{f,k} − P_{f,k+1}| / P_{f,k+1} < τ_2;
34.   Update: x k x k + 1 , k k + 1 .
35.  End While
36. Structural Reliability Analysis.
37. Generate N samples based on Monte Carlo: X M C = { x 1 , x 2 , , x N } ;
38. Count failures: N_f = Σ_{i=1}^{N} I[g(x_i) ≤ 0];
39. Calculate failure probability: P_f = N_f / N;
40. Return P f

3.3.1. Initialization

The initialization phase establishes the computational foundation for the AC-Kriging method by creating both the initial Kriging surrogate model and the reinforcement learning environment. This phase begins with the generation of initial design points using Latin Hypercube Sampling (LHS) to ensure uniform coverage across the n-dimensional design space. The LHS approach generates m initial sample points X0 = {x1, x2, …, xm}, where each point represents a realization of the random variables in the structural reliability problem.
Following sample generation, the limit state function g(xi) is evaluated at each initial sample point through high-fidelity finite element analysis or other appropriate computational methods. These evaluations yield the response dataset G0 = {g(x1), g(x2), …, g(xm)}, which forms the basis for constructing the initial Kriging surrogate model.
The initial Kriging model construction involves the estimation of three critical parameter sets. First, the correlation matrix R is computed, along with the correlation parameters α = {α1, α2, …, αn}, which control the spatial correlation structure of the Gaussian process. Second, the regression coefficients and process variance are estimated by Equation (20). The initialization concludes by establishing the reinforcement learning components. The Actor network parameters θ0 and Critic network parameters ω0 are randomly initialized, creating the foundation for the subsequent active learning process. One starting point xk is randomly selected from the initial sample set X0, which serves as the agent’s initial position. The iteration counter k is set to 1, marking the beginning of the iterative optimization phase.
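A minimal sketch of this initialization is shown below; the sample size, variable bounds, and the placeholder limit state function are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.stats import qmc

def limit_state(x):
    # Placeholder for the high-fidelity FE evaluation of g(x); here, one
    # branch of the Case-1 series system is used purely for illustration.
    return 3.0 + 0.1 * (x[0] - x[1]) ** 2 - (x[0] + x[1]) / np.sqrt(2.0)

m, n = 20, 2                                   # illustrative initial sample size
sampler = qmc.LatinHypercube(d=n, seed=0)
unit_samples = sampler.random(n=m)             # stratified points in [0, 1]^n
X0 = qmc.scale(unit_samples, l_bounds=[-5.0] * n, u_bounds=[5.0] * n)
G0 = np.array([limit_state(x) for x in X0])    # responses for the initial Kriging fit
```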

3.3.2. Actor–Critic Network-Based Active Sampling and Kriging Modeling Method

The iterative optimization phase represents the core innovation of the AC-Kriging method. This phase operates through a continuous feedback loop between the Actor–Critic networks and the evolving Kriging surrogate model, with each iteration strategically selecting new sample points to maximize information gain for reliability analysis. The details are defined as follows:
Step 1: Each iteration begins with state construction, where the current system state sk is formulated based on the agent's current position xk and the Kriging model's predictions. The state vector sk = [xk, ĝ(xk), σ̂g²(xk), boundary proximity] encapsulates essential information including the current location, the Kriging model's prediction at that location, the associated prediction uncertainty, and proximity indicators to the limit state boundary. This comprehensive state representation enables the Actor network to make informed decisions about the next sample.
Step 2: The Actor network processes the state information to generate a displacement action ak~π(·|sk; θk), which represents the optimal direction and magnitude for moving to the next sample point. The action space is designed as a continuous domain, allowing for precise positioning of sample points anywhere within the feasible design space. The new sample point location is then computed using xk+1 = xk + ak, with appropriate boundary constraints to ensure the new point remains within the design space.
Step 3: The newly selected sample point undergoes evaluation, where the true limit state function g(xk+1) is computed through high-fidelity analysis. This new information is incorporated into the sample database by updating both the sample set Xk+1 = Xk ∪ {xk+1} and the response set Gk+1 = Gk ∪ {gk+1}.
Step 4: The Kriging model update process ensures that the surrogate model continuously improves its approximation of the limit state function. The correlation matrix R and correlation parameters α are updated to accommodate the new sample point, followed by recomputation of the regression coefficients β ^ and process variance σ ^ 2 . This incremental update strategy maintains computational efficiency while ensuring model accuracy.
Step 5: The Actor–Critic network updates employ temporal difference learning for the Critic network by Equation (31) and policy gradient methods for the Actor network by Equation (26). These updates enable the networks to learn optimal sampling strategies specific to the current reliability problem.
Step 6: During the iterative process, dual convergence criteria are continuously monitored. The first criterion evaluates Kriging model precision by checking whether the model error falls below the threshold τ1. The second criterion assesses the stability of the failure probability estimate by monitoring the relative change between successive iterations. The iteration completes by updating the agent's position (xk ← xk+1) and incrementing the iteration counter, preparing for the next cycle of the optimization process. A consolidated sketch of one iteration is given below.
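The following Python sketch ties Steps 1–4 together into a single iteration, reusing the helper functions sketched earlier in this section (limit_state, fit_kriging, predict_kriging); the policy wrapper actor_policy is a hypothetical stand-in for the trained Actor network:

```python
import numpy as np

def iterate_once(x_k, X, G, model, lb, ub, actor_policy):
    # Step 1: state from the current position and the Kriging predictions
    g_hat, var_hat = predict_kriging(model, x_k)
    s_k = np.concatenate([x_k, [g_hat, var_hat]])
    # Step 2: displacement action from the Actor, clipped to the design space
    a_k = actor_policy(s_k)                   # hypothetical trained-policy wrapper
    x_next = np.clip(x_k + a_k, lb, ub)
    # Step 3: evaluate the true limit state function at the new point
    g_next = limit_state(x_next)
    X = np.vstack([X, x_next])
    G = np.append(G, g_next)
    # Step 4: refit the Kriging surrogate with the enlarged sample set
    model = fit_kriging(X, G, model["alpha"])
    return x_next, X, G, model
```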

3.3.3. Structural Reliability Analysis

The sample points selected through the AC-Kriging method are utilized to construct the Kriging model for structural reliability analysis. The training process follows standard procedures with samples divided into training and validation sets. During the iterative optimization phase, the process monitors dual convergence criteria.
The first criterion is the mean absolute error (MAE) focusing on Kriging model precision, which is defined as follows:
\mathrm{MAE}_{\text{Kriging}} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| < \tau_1
where n is the number of test samples; τ1 is the first convergence threshold; yi is the i-th true response; and ŷi is the i-th Kriging prediction. Then, the coefficient of variation (COV) of the failure probability estimate of the Kriging model is calculated to evaluate the convergence of the reliability analysis:
\mathrm{COV}(\hat{P}_f) = \sqrt{\frac{1 - \hat{P}_f}{N \hat{P}_f}}
where N is the number of samples based on MCS. Finally, the trained Kriging model is applied to structural reliability assessment.
Subsequently, the convergence of the failure probability estimate is checked against a pre-determined threshold, which is defined as follows:
\frac{\left| P_f(k) - P_f(k-1) \right|}{P_f(k-1)} < \tau_2
where Pf(k) is the failure probability estimate at iteration k, and τ2 is the convergence threshold of criterion 2. In this study, τ1 and τ2 are set to 0.05 and 0.01, respectively.
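A minimal sketch of this dual convergence check (criteria and thresholds as defined above; function and argument names are ours) is as follows:

```python
import numpy as np

def converged(y_true, y_pred, pf_history, tau1=0.05, tau2=0.01):
    """Dual convergence check: MAE criterion and relative-change criterion above."""
    mae = np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
    if len(pf_history) < 2 or pf_history[-2] == 0.0:
        return False
    rel_change = abs(pf_history[-1] - pf_history[-2]) / pf_history[-2]
    return mae < tau1 and rel_change < tau2
```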

4. Case Studies

In this section, three cases are studied to validate the AC-Kriging method. The first case involves a series system with four failure modes. The second case examines a two-degree-of-freedom dynamic system. The third case analyzes the radial deformation reliability of a turbine blade. Five methods are selected for comparison with AC-Kriging: IS [11], DS [12], SS [13], Kriging with expected improvement infill sampling (Kriging-EI) [47], and an active sampling method combining Kriging and MCS (AK-MCS) [48].

4.1. Nonlinear Series System Reliability Analysis

This case is a two-dimensional reliability assessment problem with four failure modes [49,50]. The LSF is described as follows:
g(x_1, x_2) = \min \begin{cases} 3 + 0.1 (x_1 - x_2)^2 - \dfrac{x_1 + x_2}{\sqrt{2}} \\ 3 + 0.1 (x_1 - x_2)^2 + \dfrac{x_1 + x_2}{\sqrt{2}} \\ (x_1 - x_2) + \dfrac{k}{\sqrt{2}} \\ (x_2 - x_1) + \dfrac{k}{\sqrt{2}} \end{cases}
where k is a constant equal to 7. The random variables x1 and x2 are normally distributed and their statistical characteristics are shown in Table 1.
To determine the required number of samples, N is calculated based on Equation (37) for a failure probability of 10−3 and COV < 5%. The minimum required sample size is N > (1 − 10−3)/(0.05² × 10−3) ≈ 399,600. To establish a reference solution, direct MCS is performed with 5 × 105 samples to obtain the benchmark failure probability. Various active sampling methods are employed to construct surrogate models, which are then used with MCS to estimate the failure probability and evaluate the efficiency of different methods. The relative error εr compared with direct MCS is defined as follows:
\varepsilon_r = \left| \frac{\hat{P}_f - P_{f,\mathrm{MCS}}}{P_{f,\mathrm{MCS}}} \right|
where Pf,MCS is the true failure probability based on direct MCS.
To establish the reference solution for comparison, a Monte Carlo convergence study is conducted, as shown in Table 2. The failure probability estimates exhibit significant variation at smaller sample sizes (1.000 × 10−3 at 103 samples, 2.440 × 10−3 at 104 samples) and gradually stabilize as the sample size increases. The estimate stabilizes at 2.316 × 10−3 for 5 × 105 samples and remains at 2.308 × 10−3 with 106 samples; therefore, the failure probability obtained from 5 × 105 samples is taken as the converged value.
Figure 7 shows the true limit state surface of Case 1. The comparison of sampling point distributions among different methods using the same number of experimental points is shown in Figure 8. The blue points represent samples in the failure domain, while the red points represent samples in the safe domain. It can be observed that traditional methods such as IS, DS, and SS tend to distribute points more uniformly across the entire design space. In addition, the Kriging-EI method shows some adaptive behavior by placing more samples near the boundary compared to traditional methods but still maintains a relatively scattered distribution across the design space. In contrast, the proposed AC-Kriging method demonstrates superior adaptive sampling by concentrating points strategically along the LSF boundary. This targeted strategy enables more accurate LSF approximation with the same computational budget, achieving a better classification of safe and failure domains.
Table 3 presents the performance comparison of different active sampling methods for failure probability estimation. Using direct MCS as the reference (5 × 105 samples, Pf = 2.316 × 10−3), the proposed AC-Kriging achieves superior efficiency with only 52 samples and a 0.87% relative error. Other methods require significantly more samples: AK-MCS (96 samples, 3.72% error), Kriging-EI (156 samples, 6.24% error), DS (2219 samples, 5.13% error), IS (3562 samples, 10.71% error), and SS (2.8 × 104 samples, 4.09% error). The results clearly demonstrate that AC-Kriging achieves the highest computational efficiency while maintaining excellent accuracy in failure probability evaluation.

4.2. A Two-Degree-of-Freedom Dynamic System

As shown in Figure 9, a two-degree-of-freedom dynamic system is considered, whose LSF compares the force capacity Fs of the secondary spring of stiffness Ks with the peak force acting on it, as defined below [49]:
g(M_p, M_s, K_p, K_s, \xi_p, \xi_s, S_0, F_s) = F_s - 3 K_s \left[ \frac{\pi S_0}{4 \xi_s \omega_s^3} \cdot \frac{\xi_a \xi_s}{\xi_p \xi_s (4 \xi_a^2 + \theta^2) + \gamma \xi_a^2} \cdot \frac{(\xi_p \omega_p^3 + \xi_s \omega_s^3) \, \omega_p}{4 \xi_a \omega_a^4} \right]^{1/2}
where γ represents the mass ratio Ms/Mp, ωa denotes the average natural frequency (ωp + ωs)/2, ξa indicates the average damping coefficient (ξp + ξs)/2, θ signifies the tuning parameter (ωp − ωs)/ωa, and S0 represents the white noise intensity. The stochastic parameters follow log-normal probability distributions, with their statistical characteristics detailed in Table 4.
To establish the reference solution for comparison, a Monte Carlo convergence study is conducted, as shown in Table 5. The failure probability estimates exhibit significant variation at smaller sample sizes (2.572 × 10−2 at 103 samples, 3.245 × 10−2 at 2.5 × 105 samples) and gradually stabilize as the sample size increases. The estimate converges to 3.164 × 10−2 at 5 × 105 samples and remains consistent at 3.216 × 10−2 with 106 samples, confirming that 5 × 105 samples provide a reliable reference solution with sufficient convergence for this problem.
Table 6 presents the performance comparison of different active sampling methods for failure probability estimation in Case 2. Using direct MCS as the reference (5 × 105 samples, Pf = 3.164 × 10−2), the proposed AC-Kriging method achieves superior efficiency with only 101 samples and 2.13% relative error. Other methods require significantly more samples: AK-MCS (172 samples, 6.22% error), Kriging-EI (264 samples, 10.51% error), DS (3534 samples, 15.81% error), SS (3.352 × 104 samples, 7.04% error), and IS (4636 samples, 22.21% error). The results demonstrate that AC-Kriging achieves the highest computational efficiency while maintaining excellent accuracy in failure probability evaluation.

4.3. Radial Deformation Reliability Analysis of Aeroengine Turbine Blade

To assess the robustness of AC-Kriging in practical engineering scenarios, this research examines the reliability analysis of maximum radial displacement for an aeroengine’s high-pressure turbine blade. The structural reliability is determined by comparing the blade’s radial deformation with its allowable deformation threshold, with a detailed analysis provided in [1]. Figure 10a illustrates the blade’s geometric configuration and applied boundary conditions, and Figure 10b shows the finite element model of turbine blade.
In this case, the turbine blade root remains constrained while accounting for centrifugal acceleration effects arising from high-speed rotation. The finite element model of the turbine blade consists of 26,735 nodes and 7557 elements. The selected material is K417 with a density of 8210 kg/m3, an elastic modulus of 2.15 × 105 MPa, and a Poisson’s ratio of 0.28. Five distinct convective heat transfer boundary conditions are applied to the root, lower section, middle section, upper section, and blade tip regions. These boundary conditions are characterized by four sets of convection parameters: temperatures T1, T2, T3, T4 and corresponding heat transfer coefficients a1, a2, a3, a4. Nine stochastic variables follow normal distribution patterns, as detailed in Table 7, where indices 1–4 represent the progression from exterior to interior surfaces. Finite element simulations are conducted under these prescribed boundary conditions, with the results presented in Figure 11 as baseline data for a subsequent reliability assessment.
For this case, the LSF is given as follows:
g(X) = u_0 - u_{\max}(X)
where u0 represents the allowable radial displacement of 1.40 mm, umax(X) denotes the peak radial displacement obtained from finite element analysis, and X is the input random parameters including ω, T1, T2, T3, T4, a1, a2, a3, and a4.
Figure 12 demonstrates that the reliability degree rapidly converges to approximately 0.9968 within the first 104 samples and remains stable thereafter.
To establish the reference solution for comparison, a Monte Carlo convergence study is conducted, as shown in Table 8. The failure probability estimates exhibit significant variation at smaller sample sizes (1 × 10−2 at 103 samples, 0.68 × 10−2 at 3 × 103 samples) and gradually stabilize as the sample size increases. The estimate converges to 0.31 × 10−2 at 1.5 × 105 samples and remains consistent at 0.33 × 10−2 with 5 × 105 samples, confirming that 1.5 × 105 samples provide a reliable reference solution with sufficient convergence for this problem.
Table 9 compares the computational efficiency of six probability estimation methods for the turbine blade reliability analysis. The results show that while Direct MCS requires 1.5 × 105 samples to achieve a failure probability of 3.10 × 10−3, advanced sampling techniques significantly reduce the computational costs. AC-Kriging demonstrates the highest efficiency with only 147 samples and a 7.27% relative error, followed by AK-MCS requiring 283 samples with 15.67% error, and Kriging-EI with 421 samples and 27.94% error. Traditional methods require substantially more samples: DS (4552 samples, 30.25% error), SS (4.2 × 104 samples, 24.50% error), and IS (6894 samples, 39.01% error). The results demonstrate that AC-Kriging achieves the highest computational efficiency and accuracy in the complex structural reliability analysis of turbine blade.
AC-Kriging achieves higher efficiency by replacing uniform sampling with a targeted search that concentrates evaluations near the limit state surface. Traditional methods are inefficient because they search the whole design space evenly, while the critical g(x) ≈ 0 region is small. For failure probabilities on the order of 10−3, approximately 99.9% of uniformly drawn samples lie in the safe domain and contribute little to the estimate. AC-Kriging overcomes this limitation through the Actor–Critic network, where the reward function (Equations (12)–(16)) drives the agent toward high-score regions, ensures exploration at little computational cost, and maintains consistent boundary tracking. The temporal-difference learning mechanism provides continuous feedback to refine sampling strategies, unlike static learning functions that cannot adapt to problem-specific characteristics. The continuous state–action representation enables precise boundary following with adaptive step sizes based on local LSF curvature, scaling linearly with dimension rather than exponentially. Therefore, the AC-Kriging method reduces the computational cost by two to three orders of magnitude while maintaining high precision in reliability analysis.

5. Conclusions

This paper proposes an Actor–Critic network-enhanced Kriging method (AC-Kriging) for efficient structural reliability analysis of aeroengine components by integrating deep reinforcement learning with adaptive surrogate modeling. The main conclusions are drawn as follows:
(1)
AC-Kriging demonstrates superior sampling efficiency compared to conventional methods, achieving comparable accuracy with only 52–147 samples versus the thousands required by traditional methods like IS, DS, and SS while maintaining relative errors below 8%.
(2)
The Actor–Critic framework effectively optimizes sampling strategies through intelligent reward function design, which concentrates sampling in reliability-critical boundary regions and achieves adaptive sequential decision-making rather than uniform space-filling.
(3)
The continuous state–action representation enables precise boundary following with adaptive step sizes, scaling linearly with dimension and addressing the curse of dimensionality that limits traditional approaches in high-dimensional problems.
(4)
Case studies validate AC-Kriging’s effectiveness across diverse applications, from nonlinear systems to practical turbine blade reliability assessment, with dual convergence criteria ensuring both model precision and algorithmic stability.
(5)
The method successfully balances exploration and exploitation through its reinforcement learning framework, enabling effective handling of complex aerospace engineering problems with non-convex, multi-modal failure boundaries.
The current study only considers steady-state conditions and does not address the added complexity of time-varying environmental loads or component degradation. Future research could focus on extending AC-Kriging to time-dependent reliability problems and investigating its applicability in real-time reliability assessment scenarios for safety-critical aerospace systems.

Author Contributions

Conceptualization, J.W. and B.Z.; Data Curation, Y.S. and H.F.; Formal Analysis, A.C. and J.L.; Funding Acquisition, J.L.; Investigation, J.W., Y.S. and A.C.; Methodology, J.W., Y.S. and B.Z.; Project Administration, J.W.; Resources, B.Z. and H.F.; Software, Y.S. and H.F.; Supervision, J.L.; Validation, J.W., Y.S. and H.F.; Visualization, J.W. and Y.S.; Writing—Original Draft, J.W.; Writing—Review and Editing, J.W. and Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Science and Technology Major Project (grant No. J2022-IV-0012).

Data Availability Statement

The data used to support the findings of this study are included within this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature

LSF	limit state function
FORM	first-order reliability method
SORM	second-order reliability method
DoPs	design of points
MCS	Monte Carlo simulation
FE	finite element
IS	importance sampling
DS	directional sampling
SS	subset simulation
AK-MCS	active sampling method combining Kriging and MCS
ML	machine learning
SVM	support vector machine
ANN	artificial neural network
DRL	deep reinforcement learning
LHS	Latin hypercube sampling
MDP	Markov decision process
DNN	deep neural network

References

1. Li, C.; Wen, J.; Wan, J.; Taylan, O.; Fei, C. Adaptive directed support vector machine method for the reliability evaluation of aeroengine structure. Reliab. Eng. Syst. Saf. 2024, 246, 110064.
2. Wen, J.; Zheng, B.; Fei, C. Prioritized experience replay-based adaptive hybrid method for aerospace structural reliability analysis. Aerosp. Sci. Technol. 2025, 163, 110257.
3. Sun, Y.P.; Feng, H.; Zheng, B.; Wen, J.R.; Chao, A.F.; Fei, C.W. Multi-Agent Reinforcement Symbolic Regression for the Fatigue Life Prediction of Aircraft Landing Gear. Aerospace 2025, 12, 718.
4. Zhen, H.; Cheung, C.; Leung, C.; Choy, Y. A comparison of the emission and impingement heat transfer of LPG-H2 and CH4-H2 premixed flames. Int. J. Hydrogen Energy 2012, 37, 10947–10955.
5. Choy, Y.; Huang, L. Drum silencer with shallow cavity filled with helium. J. Acoust. Soc. Am. 2003, 114, 1477–1486.
6. Ling, C.; Lu, Z.; Zhang, X. An efficient method based on AK-MCS for estimating failure probability function. Reliab. Eng. Syst. Saf. 2020, 201, 106975.
7. Cho, S. First-order reliability analysis of slope considering multiple failure modes. Eng. Geol. 2013, 154, 98–105.
8. Lim, J.; Lee, B.; Lee, I. Post optimization for accurate and efficient reliability-based design optimization using second-order reliability method based on importance sampling and its stochastic sensitivity analysis. Int. J. Numer. Methods Eng. 2016, 107, 93–108.
9. Zhang, A.; Chen, Z.; Pan, Q.; Li, X.; Feng, P.; Gan, X.; Chen, G.; Gao, L. Reliability analysis method for multiple failure modes with overlapping failure domains. Probabilistic Eng. Mech. 2025, 79, 103741.
10. Papadrakakis, M.; Papadopoulos, V.; Lagaros, N. Structural reliability analysis of elastic-plastic structures using neural networks and Monte Carlo simulation. Comput. Methods Appl. Mech. Eng. 1996, 136, 145–163.
11. Ren, W.; Chen, H. Finite element model updating in structural dynamics by using the response surface method. Eng. Struct. 2010, 32, 2455–2465.
12. Jafari-Asl, J.; Seghier, M.; Ohadi, S.; Correia, J.; Barroso, J. Reliability analysis based improved directional simulation using Harris Hawks optimization algorithm for engineering systems. Eng. Fail. Anal. 2022, 135, 106148.
13. Ni, P.; Xia, Y.; Li, J.; Hao, H. Using polynomial chaos expansion for uncertainty and sensitivity analysis of bridge structures. Mech. Syst. Signal Process. 2019, 119, 293–311.
14. Fei, C.; Han, Y.; Wen, J.; Li, C.; Han, L.; Choy, Y. Deep learning-based modeling method for probabilistic LCF life prediction of turbine blisk. Propuls. Power Res. 2024, 13, 12–25.
15. Fei, C.; Tang, W.; Bai, G.; Ma, S. Dynamic probabilistic design for blade deformation with SVM-ERSM. Aircr. Eng. Aerosp. Technol. Int. J. 2015, 87, 312–321.
16. Roy, A.; Chakraborty, S. Support vector machine in structural reliability analysis: A review. Reliab. Eng. Syst. Saf. 2023, 233, 109126.
17. Sun, Y.; Wen, J.; Li, J.; Cao, A.; Fei, C. Novel integrated model approach for high cycle fatigue life and reliability assessment of helicopter flange structures. Aerospace 2025, 12, 78.
18. Cheng, J.; Li, Q. Reliability analysis of structures using artificial neural network based genetic algorithms. Comput. Methods Appl. Mech. Eng. 2008, 197, 3742–3750.
19. Zhang, L.; Lu, Z.; Wang, P. Efficient structural reliability analysis method based on advanced Kriging model. Appl. Math. Model. 2015, 39, 781–793.
20. Yuan, K.; Xiao, N.; Wang, Z.; Shang, K. System reliability analysis by combining structure function and active learning kriging model. Reliab. Eng. Syst. Saf. 2020, 195, 106734.
21. Sun, Z.; Wang, J.; Li, R.; Tong, C. LIF: A new Kriging based learning function and its application to structural reliability analysis. Reliab. Eng. Syst. Saf. 2017, 157, 152–165.
22. Qian, H.; Li, Y.; Huang, H. Time-variant reliability analysis for industrial robot RV reducer under multiple failure modes using Kriging model. Reliab. Eng. Syst. Saf. 2020, 199, 106936.
23. Vazirizade, S.M.; Nozhati, S.; Zadeh, M.A. Seismic reliability assessment of structures using artificial neural network. J. Build. Eng. 2017, 11, 230–235.
24. Yang, T.; Zou, J.F.; Pan, Q. A sequential sparse polynomial chaos expansion using Voronoi exploration and local linear approximation exploitation for slope reliability analysis. Comput. Geotech. 2021, 133, 104059.
25. Su, G.; Yu, B.; Xiao, Y.; Yan, L. Gaussian process machine-learning method for structural reliability analysis. Adv. Struct. Eng. 2014, 17, 1257–1270.
26. Zhou, K.; Hegde, A.; Cao, P.; Tang, J. Design optimization toward alleviating forced response variation in cyclically periodic structure using Gaussian process. J. Vib. Acoust. 2017, 139, 011017.
27. Su, G.; Peng, L.; Hu, L. A Gaussian process-based dynamic surrogate model for complex engineering structural reliability analysis. Struct. Saf. 2017, 68, 97–109.
28. Xiang, Z.; Chen, J.; Bao, Y.; Li, H. An active learning method combining deep neural network and weighted sampling for structural reliability analysis. Mech. Syst. Signal Process. 2020, 140, 106684.
29. Meng, Y.; Zhang, D.; Shi, B.; Wang, D.; Wang, F. An active learning Kriging model with approximating parallel strategy for structural reliability analysis. Reliab. Eng. Syst. Saf. 2024, 247, 110098.
30. Wang, Y.; Pan, H.; Shi, Y.; Wang, R.; Wang, P. A new active-learning estimation method for the failure probability of structural reliability based on Kriging model and simple penalty function. Comput. Methods Appl. Mech. Eng. 2023, 410, 116035.
31. Zhang, J.; Xiao, M.; Gao, L.; Chu, S. A combined projection-outline-based active learning Kriging and adaptive importance sampling method for hybrid reliability analysis with small failure probabilities. Comput. Methods Appl. Mech. Eng. 2019, 344, 13–33.
32. Wang, P.; Zhang, Z.; Huang, X.; Zhou, H. An application of active learning Kriging for the failure probability and sensitivity functions of turbine disk with imprecise probability distributions. Eng. Comput. 2022, 38, 17–37.
33. Dang, C.; Valdebenito, M.A.; Faes, M.G.R.; Song, J.; Wei, P.; Beer, M. Structural reliability analysis by line sampling: A Bayesian active learning treatment. Struct. Saf. 2023, 104, 102351.
34. Moustapha, M.; Parisi, P.; Marelli, S.; Sudret, B. Reliability analysis of arbitrary systems based on active learning and global sensitivity analysis. Reliab. Eng. Syst. Saf. 2024, 248, 110150.
35. Liu, F.; Wei, P.; Zhou, C.; Yue, Z. Reliability and reliability sensitivity analysis of structure by combining adaptive linked importance sampling and kriging reliability method. Chin. J. Aeronaut. 2020, 33, 1218–1227.
36. Wang, F.Y.; Zhang, J.J.; Zheng, X.; Wang, X.; Yuan, Y.; Dai, X.; Zhang, J.; Yang, L. Where does AlphaGo go: From Church-Turing thesis to AlphaGo thesis and beyond. IEEE/CAA J. Autom. Sin. 2016, 3, 113–120.
37. Osband, I.; Blundell, C.; Pritzel, A.; Van Roy, B. Deep exploration via bootstrapped DQN. Adv. Neural Inf. Process. Syst. 2016, 29, 4033–4041.
38. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901.
39. Guo, D.; Yang, D.; Zhang, H.; Song, J.; Zhang, R.; Xu, R.; Zhu, Q.; Ma, S.; Wang, P.; Bi, X.; et al. DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv 2025, arXiv:2501.12948.
40. Xiang, Z.; Bao, Y.; Tang, Z.; Li, H. Deep reinforcement learning-based sampling method for structural reliability assessment. Reliab. Eng. Syst. Saf. 2020, 199, 106901.
41. Guan, X.; Xiang, Z.; Bao, Y.; Li, H. Structural dominant failure modes searching method based on deep reinforcement learning. Reliab. Eng. Syst. Saf. 2022, 219, 108258.
42. Guan, X.; Sun, H.; Hou, R.; Xu, Y.; Bao, Y.; Li, H. A deep reinforcement learning method for structural dominant failure modes searching based on self-play strategy. Reliab. Eng. Syst. Saf. 2023, 233, 109093.
43. Wei, S.; Bao, Y.; Li, H. Optimal policy for structure maintenance: A deep reinforcement learning framework. Struct. Saf. 2020, 83, 101906.
44. Li, C.; Zhao, J.; Pan, H.; Cao, L.; Guan, Q.; Xu, Z. Deep reinforcement learning based performance optimization of hybrid system for base-isolated structure and shape memory alloy-inerter. Eng. Struct. 2025, 334, 120244.
45. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
46. Zhang, S.; Li, Y.; Dong, Q. Autonomous navigation of UAV in multi-obstacle environments based on a deep reinforcement learning approach. Appl. Soft Comput. 2022, 115, 108194.
47. Forrester, A.; Sobester, A.; Keane, A. Engineering Design via Surrogate Modelling: A Practical Guide; John Wiley & Sons: Hoboken, NJ, USA, 2008; pp. 89–91.
48. Bichon, B.J.; Eldred, M.S.; Swiler, L.P.; Mahadevan, S.; McFarland, J.M. Efficient global reliability analysis for nonlinear implicit performance functions. AIAA J. 2008, 46, 2459–2468.
49. Echard, B.; Gayton, N.; Lemaire, M. AK-MCS: An active learning reliability method combining Kriging and Monte Carlo simulation. Struct. Saf. 2011, 33, 145–154.
50. Zhou, T.; Guo, T.; Dang, C.; Beer, M. Bayesian reinforcement learning reliability analysis. Comput. Methods Appl. Mech. Eng. 2024, 424, 116902.
Figure 1. Reinforcement learning framework showing the interaction between agent and environment.
Figure 2. The framework of AC-Kriging method for structural reliability analysis including the Actor–Critic network and the Kriging model.
Figure 3. Design of points space and sampling environment.
Figure 4. Active sampling process of AC-Kriging method.
Figure 5. The framework of the Actor–Critic network.
Figure 6. The detailed overview of the AC-Kriging method for structural reliability analysis.
Figure 7. True limit state surface of Case 1. (Note: the red region is the safe domain and the blue region is the failure domain.)
Figure 8. Comparison of sampling point distributions for different reliability analysis methods in Case 1.
Figure 9. Two-degree-of-freedom dynamic system of Case 2.
Figure 10. Three-dimensional geometry and finite element model of the turbine blade. (a) Three-dimensional geometry and boundary conditions. (b) Finite element model.
Figure 11. Turbine blade deformation.
Figure 12. Reliability convergence as a function of the number of samples. (Note: the red star marks the reliability degree.)
Table 1. The distribution of random variables in Case 1.

Parameter | Lower Bound | Upper Bound | Mean | Standard Deviation | Distribution
x1 | −6 | 6 | 0 | 1 | Normal
x2 | −6 | 6 | 0 | 1 | Normal
Table 2. MCS study for Case 1: nonlinear series system.

Samples | 10³ | 10⁴ | 10⁵ | 2.5 × 10⁵ | 5 × 10⁵ | 10⁶
Pf | 1.000 × 10⁻³ | 2.440 × 10⁻³ | 2.250 × 10⁻³ | 2.250 × 10⁻³ | 2.316 × 10⁻³ | 2.308 × 10⁻³
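As a cross-check we add here (the article itself reports only the value), the 2.9% coefficient of variation quoted for the direct-MCS benchmark in Table 3 follows from the standard MCS error formula:

$$\mathrm{COV}(P_f)=\sqrt{\frac{1-P_f}{N\,P_f}}=\sqrt{\frac{1-2.316\times10^{-3}}{5\times10^{5}\times 2.316\times10^{-3}}}\approx 2.9\%$$

The same formula reproduces the 4.63% quoted for the Case 3 benchmark in Table 9 (N = 1.5 × 10⁵, Pf = 3.10 × 10⁻³).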
Table 3. Performance comparison of different active sampling methods in Case 1.

Method | N (Number of Samples) | Pf | εr (%)
Direct MCS | 5 × 10⁵ (COV(Pf) = 2.9%) | 2.316 × 10⁻³ | -
IS | 3562 | 2.092 × 10⁻³ | 10.71
DS | 2219 | 2.203 × 10⁻³ | 5.13
SS | 2.8 × 10⁴ | 2.225 × 10⁻³ | 4.09
Kriging-EI | 156 | 2.180 × 10⁻³ | 6.24
AK-MCS | 96 | 2.233 × 10⁻³ | 3.72
AC-Kriging | 52 | 2.296 × 10⁻³ | 0.87
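The relative errors tabulated above appear to be normalized by each method's own estimate rather than by the MCS benchmark; this is our reading of the numbers, not a definition stated in the table, but it reproduces the reported figures. For AC-Kriging, for instance:

$$\varepsilon_r=\frac{\lvert \hat{P}_f - P_{f,\mathrm{MCS}}\rvert}{\hat{P}_f}\times 100\%=\frac{\lvert 2.296-2.316\rvert\times 10^{-3}}{2.296\times 10^{-3}}\times 100\%\approx 0.87\%$$

The same convention also matches the entries in Tables 6 and 9.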
Table 4. The distribution of random variables in Case 2.

Parameter | Lower Bound | Upper Bound | Mean | Standard Deviation | Distribution
Mp | 0.9 | 1.1 | 1 | 0.1 | logNormal
Ms | 0.009 | 0.011 | 0.01 | 0.001 | logNormal
Kp | 0.8 | 1.2 | 1 | 0.2 | logNormal
Ks | 0.00998 | 0.0102 | 0.01 | 0.002 | logNormal
ξp | 0.03 | 0.07 | 0.05 | 0.02 | logNormal
ξc | 0.01 | 0.03 | 0.02 | 0.01 | logNormal
S0 | 90 | 110 | 100 | 10 | logNormal
Fs | 13.5 | 16.5 | 15 | 1.5 | logNormal
Table 5. MCS study for Case 2: two-degree-of-freedom dynamic system.

Samples | 10³ | 10⁴ | 10⁵ | 2.5 × 10⁵ | 5 × 10⁵ | 10⁶
Pf | 2.572 × 10⁻² | 2.788 × 10⁻² | 3.014 × 10⁻² | 3.245 × 10⁻² | 3.164 × 10⁻² | 3.216 × 10⁻²
Table 6. Performance comparison of different active sampling methods in Case 2.

Method | N (Number of Samples) | Pf | εr (%)
Direct MCS | 5 × 10⁵ (COV(Pf) = 2.9%) | 3.164 × 10⁻² | -
IS | 4636 | 2.589 × 10⁻² | 22.21
DS | 3534 | 2.732 × 10⁻² | 15.81
SS | 3.352 × 10⁴ | 2.956 × 10⁻² | 7.04
Kriging-EI | 264 | 2.863 × 10⁻² | 10.51
AK-MCS | 172 | 3.374 × 10⁻² | 6.22
AC-Kriging | 101 | 3.098 × 10⁻² | 2.13
Table 7. The distribution of random variables in Case 3.

Parameter | Lower Bound | Upper Bound | Mean | Standard Deviation | Distribution
ω (rad/s) | 1062.88 | 1273.12 | 1168 | 35.04 | Normal
T1 (°C) | 937.0 | 1123.0 | 1030 | 31.00 | Normal
T2 (°C) | 891.8 | 1068.2 | 980 | 29.40 | Normal
T3 (°C) | 746.2 | 893.8 | 820 | 24.60 | Normal
T4 (°C) | 491.4 | 588.6 | 540 | 16.20 | Normal
α1 (W·m⁻²·K⁻¹) | 10,697.96 | 12,814.04 | 11,756 | 352.68 | Normal
α2 (W·m⁻²·K⁻¹) | 7513.23 | 8992.77 | 8253 | 246.59 | Normal
α3 (W·m⁻²·K⁻¹) | 5957.77 | 7136.23 | 6547 | 196.41 | Normal
α4 (W·m⁻²·K⁻¹) | 2848.3 | 3411.7 | 3130 | 93.90 | Normal
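The tabulated bounds are consistent with truncation at three standard deviations about the mean (our inference from the values, not stated in the table); for the rotor speed, for instance:

$$\mu \pm 3\sigma = 1168 \pm 3 \times 35.04 = [1062.88,\ 1273.12]\ \text{rad/s}$$

The same relation holds for every row of Table 7.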
Table 8. MCS study for Case 3: radial deformation reliability analysis of aeroengine turbine blade.

Samples | 10² | 10³ | 3 × 10³ | 8 × 10³ | 1 × 10⁴ | 5 × 10⁴ | 1.5 × 10⁵
Pf | 1 × 10⁻² | 0.5 × 10⁻² | 0.68 × 10⁻² | 0.36 × 10⁻² | 0.32 × 10⁻² | 0.33 × 10⁻² | 0.31 × 10⁻²
Table 9. Performance comparison of different active sampling methods in Case 3.

Method | N (Number of Samples) | Pf | εr (%)
Direct MCS | 1.5 × 10⁵ (COV(Pf) = 4.63%) | 3.10 × 10⁻³ | -
IS | 6894 | 2.23 × 10⁻³ | 39.01
DS | 4552 | 2.38 × 10⁻³ | 30.25
SS | 4.2 × 10⁴ | 2.49 × 10⁻³ | 24.50
Kriging-EI | 421 | 2.42 × 10⁻³ | 27.94
AK-MCS | 283 | 2.68 × 10⁻³ | 15.67
AC-Kriging | 147 | 2.89 × 10⁻³ | 7.27