The prediction accuracy of the BP neural network for sample points near the failure surface largely determines the accuracy of the calculated failure probability. To improve the prediction accuracy of the BP model near the failure surface, this paper proposes a sampling method that densifies the sample points near the failure surface. For low-dimensional problems (number of variables < 4), key points generated by a “*”-shaped layout and by the random integer method cover the whole design space evenly; for high-dimensional problems (number of variables ≥ 4), the whole design space is evenly covered by key points generated by the random integer method together with random sampling of the variables. The failure state of each key point is determined by finite element calculation, and the region containing the failure surface is identified. In this region, additional sample points are generated with a reduced standard deviation, and a BP neural network is trained on the resulting set. Finally, sample points near the failure surface are predicted by the BP model. Using the distance from a sample point to the failure surface as the screening criterion, the closest sample points are evaluated with the FEM and added to the training set. Training is repeated iteratively until the BP neural network predicts sample points near the failure surface with high accuracy.
To address the long calculation time of the implicit structural performance function constructed by the BP neural network in MCS reliability analysis, this paper proposes an improved BP-MCS calculation method. To reduce the number of BP model calls when calculating the MCS failure probability, the product of the probability density functions of the random variables is defined as the weight of a sample point. Because the probability of a failure point of an engineering structure is very low, that is, its weight is very small, there must be a critical weight value. When the weight of a sample point extracted by MCS is greater than this critical value, the sample point lies in the reliable region and there is no need to call the BP model to calculate the structural response. When the weight does not exceed the critical value, the sample point must be substituted into the BP model to judge whether it has failed. This method uses the probability of occurrence to filter out most of the sample points located in the reliable region, which greatly reduces the number of BP model calls and thus shortens the calculation time.
3.1. Definition of Weight
In the design space, a sample point $\mathbf{x}$ contains multiple random variables, denoted as $\mathbf{x} = (x_1, x_2, \ldots, x_m)$, where $m$ is the number of random variables. Each random variable in a sample point may obey a different probability distribution, and the probability density function value of the sample point in the design space is used as its weight. Because each random variable has its own probability density function, when the variables are independent of one another, the weight $w(\mathbf{x})$ of the sample point can be defined as the product of the probability density functions, that is:

$$w(\mathbf{x}) = f_1(x_1)\, f_2(x_2) \cdots f_m(x_m) = \prod_{i=1}^{m} f_i(x_i) \qquad (3)$$

When the random variables are not independent of one another, the corresponding factors are replaced by the joint probability density function of the dependent variables, for example:

$$w(\mathbf{x}) = f_1(x_1)\, f_2(x_2) \cdots f_{m-1,m}(x_{m-1}, x_m) \qquad (4)$$

where $f_i(x_i)$ is the probability density function of the $i$-th random variable ($i = 1, \ldots, m$), and $f_{m-1,m}(x_{m-1}, x_m)$ is the joint probability density function of the $(m-1)$-th and $m$-th random variables.
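For an $m$-dimensional independent normal distribution, the weight of each sample point is:

$$w(\mathbf{x}) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{(x_i - \mu_i)^2}{2\sigma_i^2}\right] \qquad (5)$$

where $\mu_i$ is the mean of the $i$-th random variable and $\sigma_i$ is the standard deviation of the $i$-th random variable.

For an $m$-dimensional standard normal distribution, the weight of each sample point is:

$$w(\mathbf{x}) = \frac{1}{(2\pi)^{m/2}} \exp\left(-\frac{1}{2}\sum_{i=1}^{m} x_i^2\right) \qquad (6)$$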
It can be seen from Equations (5) and (6) that the weight of a sample point is largest at the center point, where every variable takes its mean value (the origin in the standard normal case). The exponential term shows that the sum of the squared standardized deviations of the variables from their means essentially measures the distance between the sample point and the center point. The closer the sample point is to the center point, the greater the weight; the farther away it is, the smaller the weight. Because of the normal distribution law, a large number of sample points randomly selected by MCS cluster around the center point, so their distribution is not uniform enough. To solve this problem, this paper uses the random integer method to distribute the sample points evenly.
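The weight defined in Equations (5) and (6) can be sketched as follows. This is a minimal illustration for independent normal variables only; the function name `sample_weight` and its arguments are hypothetical, and the product of densities is accumulated in log space purely for numerical convenience.

```python
import numpy as np

def sample_weight(x, mu=None, sigma=None):
    """Weight of each sample point: product of independent normal PDFs.

    x     : array of shape (n_points, m) or a single point of length m
    mu    : means of the m variables (defaults to 0, the standard normal case)
    sigma : standard deviations of the m variables (defaults to 1)
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    m = x.shape[1]
    mu = np.zeros(m) if mu is None else np.asarray(mu, dtype=float)
    sigma = np.ones(m) if sigma is None else np.asarray(sigma, dtype=float)
    z = (x - mu) / sigma                      # standardized deviations
    log_w = (-0.5 * np.sum(z**2, axis=1)      # exponential term
             - np.sum(np.log(sigma))          # 1 / sigma_i factors
             - 0.5 * m * np.log(2.0 * np.pi)) # 1 / sqrt(2*pi) factors
    return np.exp(log_w)

# The weight is largest at the center point and decays with distance from it.
w_center = sample_weight([[0.0, 0.0]])[0]
w_far = sample_weight([[3.0, 3.0]])[0]
print(w_center, w_far)
```

The log-space accumulation avoids underflow of intermediate products when the number of variables is large or a point lies far from the center.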
3.2. Construction of Sample Points
For a one-dimensional normal distribution, an interval of a few standard deviations around the mean contains almost all sample points in the whole design space. Therefore, when sampling low-dimensional problems (number of variables $m < 4$) by the random integer method, the sampling range can be defined as all integer points within [−5, 5], and the sample points can then be constructed from the mean $\mu_i$ and standard deviation $\sigma_i$ of each variable, that is:

$$x_i = \mu_i + k_i \sigma_i \qquad (7)$$

where $k_i$ is the integer value extracted by the random integer method.

Since $k_i$ is an integer point in [−5, 5], the constructed sample points are very evenly distributed in the global design space. However, as the variable dimension increases, the clustering of a large number of sample points around the center point becomes more and more pronounced. It is therefore necessary to examine the number of samples and the extraction range of the random integer method.
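The construction of sample points by the random integer method can be sketched as below: an integer multiplier is drawn uniformly from the chosen range for every variable and scaled by the standard deviation about the mean. The function name, the argument `k_max`, and the use of a uniform integer draw are assumptions made for illustration.

```python
import numpy as np

def random_integer_points(mu, sigma, n_points, k_max=5, rng=None):
    """Build sample points as mean + integer * standard deviation.

    Integers are drawn uniformly from [-k_max, k_max] (both ends included).
    Returns the sample points and the integer multipliers used.
    """
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Generator.integers excludes the upper bound, hence k_max + 1.
    k = rng.integers(-k_max, k_max + 1, size=(n_points, mu.size))
    return mu + k * sigma, k

# Example: a two-variable problem with 10 x m = 20 key points in [-5, 5].
points, multipliers = random_integer_points([10.0, 20.0], [1.0, 2.0], 20, rng=0)
print(points[:3])
```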
Because the random integer method used to construct sample points resembles a multi-factor, multi-level orthogonal experiment, a multi-factor, 7-level orthogonal experimental design can be carried out when the sampling range of the random integer method is [−3, 3]. Based on the number of test groups used in the orthogonal test method, when the number of variables $m$ is 2~8, the orthogonal test table can be $L_{49}(7^8)$, and the number of test groups is 49. Taking the three-dimensional standard normal distribution as the research object, 49 groups of sample points were generated by the random integer method and by the orthogonal test method, respectively, and the distribution of sample points is shown in Figure 2.
Figure 2 shows that the sample points obtained by the orthogonal test method are stratified in space: the factors are pairwise orthogonal and each level occurs the same number of times. As a result, the weights calculated by Equation (6) are discontinuous and hierarchically distributed. The sample points obtained by the random integer method avoid this phenomenon: they are evenly distributed in the design space and their weights are continuously distributed, which shows that the random integer method is essentially an optimization of the orthogonal test method. The orthogonal test method mainly selects representative sample points to reflect the global design space, so its number of experimental groups can be used as the number of samples for the random integer method. The orthogonal tests for different variable dimensions are shown in Table 1. When the number of variables $m$ is 9, the orthogonal test table can be $L_{64}(8^9)$, and the pseudo-level method can then be used to replace the excess level with an existing level. Considering that the number of samples of the random integer method is related to the dimensionality of the random variables and should cover the number of experimental groups of the orthogonal test method, the number of samples of the random integer method can be taken as 10 × $m$ groups.
Since a normal distribution can be normalized into the standard normal distribution, this paper takes the $m$-dimensional standard normal distribution as the research object and uses MCS to carry out random sampling at different orders of magnitude. The resulting weight orders of the sample points are shown in Table 2. The random integer method is also used to collect statistics of the weight orders for different extraction ranges, and the results are shown in Table 3.
As can be seen from Table 2 and Table 3, when the random integer method is used to construct sample points for low-dimensional problems ($m < 4$), the weight order of sample points constructed in [−5, 5] can reach the minimum order of magnitude obtained when MCS is used to extract $10^7$ samples. For high-dimensional problems ($m ≥ 4$), the weight order of sample points constructed in [−3, 3] can reach the minimum order of magnitude obtained with $10^7$ MCS extractions. When the value range is [−4, 4] or [−5, 5], the minimum weight order of the sample points is too small compared with MCS; that is, many sample points with extremely low probability are added. As the number of random variables increases, the maximum weight order of the sample points extracted by the random integer method also decreases, and it cannot reach the maximum order of magnitude obtained by MCS with $10^7$ extractions. Therefore, the missing area at the center point can be covered by sample points drawn randomly from the normal distribution. Thanks to the characteristics of uniform design, the sample points constructed by the random integer method are evenly distributed in the whole design space.
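The kind of statistics summarized in Table 2 and Table 3 can be reproduced in outline with the sketch below, which compares the range of weight orders for MCS samples and for random-integer samples of a standard normal distribution. Taking the "weight order" as the floor of the base-10 logarithm of the weight is an assumption, as are the function name and the reduced MCS sample size used to keep the example fast.

```python
import numpy as np

def weight_orders(points):
    """Smallest and largest order of magnitude of standard-normal weights.

    The order is taken as floor(log10(weight)); the weight follows Equation (6).
    """
    points = np.atleast_2d(np.asarray(points, dtype=float))
    m = points.shape[1]
    log10_w = (-0.5 * np.sum(points**2, axis=1)
               - 0.5 * m * np.log(2.0 * np.pi)) / np.log(10.0)
    orders = np.floor(log10_w).astype(int)
    return int(orders.min()), int(orders.max())

rng = np.random.default_rng(0)
m = 3
mcs_points = rng.standard_normal((10**5, m))                 # plain MCS samples
integer_points = rng.integers(-5, 6, size=(10 * m, m))       # random integers in [-5, 5]
print("MCS            :", weight_orders(mcs_points))
print("random integer :", weight_orders(integer_points))
```

Repeating the comparison for other values of `m` and other integer ranges shows how the two sets of weight orders overlap or diverge.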
In view of the above statistical results, this paper proposes a sampling method that covers the global design space with an even distribution. For low-dimensional problems, the range of the random integer method is [−5, 5]. This range contains 11 integer points, and the 10 × $m$ key points generated from it are not enough to reflect the whole design space. Therefore, following the key point method proposed in reference [28], “*”-shaped key points are combined with the 10 × $m$ groups of key points generated by the random integer method to evenly cover the global design space. The distribution of the “*”-shaped key points is shown in Figure 3; there are 25 groups for two-dimensional problems and 43 groups for three-dimensional problems. As can be seen from Figure 3, the “*”-shaped key points are evenly distributed in the design space. However, as the dimensionality of the variables increases, the number of required “*”-shaped key points increases dramatically, which limits the applicability of this layout.
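The source defines the “*” shape only through Figure 3 and reference [28]. One layout that reproduces the stated counts (25 points in two dimensions, 43 in three) places the center point plus three levels along every coordinate axis and every full diagonal. This layout, the function name, and the level values `(1, 3, 5)` are assumptions made here for illustration and are not taken from the source; the points are expressed as multiples of the standard deviation about the mean.

```python
import itertools
import numpy as np

def star_key_points(m, levels=(1, 3, 5)):
    """Assumed '*'-shaped layout in standardized coordinates (m >= 2).

    Directions: +/- each coordinate axis (2*m) and every full diagonal (2**m).
    Each direction carries one point per level, plus a single center point.
    """
    directions = []
    for i in range(m):                                   # coordinate axes
        for sign in (-1.0, 1.0):
            d = np.zeros(m)
            d[i] = sign
            directions.append(d)
    for signs in itertools.product((-1.0, 1.0), repeat=m):   # full diagonals
        directions.append(np.array(signs))
    points = [np.zeros(m)]                               # center point
    for d in directions:
        for level in levels:
            points.append(level * d)
    return np.array(points)

print(len(star_key_points(2)), len(star_key_points(3)))
```

The count is 1 + 3 × (2m + 2^m), which also illustrates why the number of “*”-shaped key points grows rapidly with the dimension.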
For high-dimensional problems, the value range of the random integer method is [−3, 3], and the global design space is uniformly covered by combining the key points generated by the random integer method with random sampling of the variables. In this case, the number of sample points generated by random sampling of the variables needs to be determined. As in the determination of the number of samples of the random integer method, the weights of the sample points need to remain continuous and the number of groups of sample points is related to the dimension of the random variables. Random sampling of 3~7 × $m$ groups from the $m$-dimensional standard normal distribution is therefore carried out, and the weights of the sample points are calculated. The weights obtained are shown in Table 4. According to Table 3 and Table 4, the minimum weight orders of the randomly selected sample points are all smaller than the maximum weight orders of the 10 × $m$ sample points extracted by the random integer method, which indicates that combining the two effectively covers the missing area left by the random integer method at the center point. It is also observed that the weight order tends to be stable when the number of random variables is 4~5 and the number of samples is 6 × $m$ groups, and when the number of random variables is 6~8 and the number of samples is 4 × $m$ groups. In both cases the number of randomly selected sample points is about 30 groups, so the number of randomly selected groups can be set to 30.
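For high-dimensional problems, the combination of 10 × m random-integer key points in [−3, 3] with 30 randomly sampled normal points can be sketched as follows. The function name and arguments are hypothetical, and the uniform draw of the integers is an assumption.

```python
import numpy as np

def initial_key_points_high_dim(mu, sigma, n_random=30, rng=None):
    """Initial key points for m >= 4: random-integer points plus normal samples."""
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    m = mu.size
    # 10 x m points at integer multiples of sigma within [-3, 3] about the mean.
    k = rng.integers(-3, 4, size=(10 * m, m))
    integer_points = mu + k * sigma
    # n_random points drawn from the normal distribution of each variable;
    # these fill the region near the center point.
    random_points = rng.normal(mu, sigma, size=(n_random, m))
    return np.vstack([integer_points, random_points])

key_points = initial_key_points_high_dim(np.zeros(5), np.ones(5), rng=0)
print(key_points.shape)
```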
3.3. High-Precision BP Neural Network Prediction Model
On the limit state surface, the structural performance function value is 0. In the reliable region, the structural performance function value is positive. In the failure region, the structural performance function value is negative. When the BP model is used to calculate MCS reliability, only the sign of the response value of a sample point matters (whether the point is reliable or failed), not its magnitude. Therefore, on the basis of uniform coverage of the global region, it is only necessary to ensure and improve the prediction accuracy of the BP model near the limit state surface. The BP model can then be used to judge whether the sample points extracted by MCS have failed, which completes the failure probability calculation of the structure.
The training steps of the high-precision BP model proposed in this paper are as follows:
(1) Extracting key points. For low-dimensional problems (m < 4), 10 × m groups of key points are first extracted from [−5, 5] by the random integer method, and then the “*”-shaped key points are selected. The combination of the two forms the initial sample point set. For high-dimensional problems (m ≥ 4), 10 × m groups of key points are first extracted from [−3, 3] by the random integer method, and then sample points of the m-dimensional random normal distribution are extracted according to the mean and standard deviation of each random variable (the number can be 30 groups). The combination of the two forms the initial sample point set. The initial sample point set is substituted into the finite element model for calculation, and the response set of the structural performance function is obtained. The initial sample point set and the response set together constitute the initial set.
(2) Constructing the training set. For low-dimensional problems, the region where the failure surface is located can be determined from the failure state of the “*”-shaped key points. A reduced standard deviation is then taken at some key points close to the failure surface to generate 10 groups of sample points at each of them, which are substituted into the FEM to calculate the response values. The newly added sample points and the initial set are used together as the training set. For high-dimensional problems, the initial set is directly used as the training set.
(3) Calculating the training set’s weights and constructing the prediction set. The weights of all sample points in the training set are calculated and arranged in descending order. After the first failure point (a point with a negative performance function value) is found, its weight is taken as the boundary: m sample points with response values close to 0 and relatively large weight are found upward, and (m − 1) sample points with response values close to 0 and relatively large weight are found downward. Together with the boundary failure point itself, this gives m + (m − 1) + 1 = 2m generating points. A reduced standard deviation is taken at each of these points to generate 10 groups of sample points, which form the prediction set, so the number of sample points in the prediction set is 2m × 10. At this stage, these sample points do not need to be substituted into the FEM for calculation.
(4) Training the BP model. Taking the sample point of the training set as input and the response value as output, the initial BP model is trained by the MATLAB R2021a Neural Network toolbox, and the response of the prediction set is calculated by the BP model.
(5) Verifying the model accuracy. The sample points of the prediction set are arranged in ascending order of the absolute value of the predicted response, and the sample points whose predicted response is close to 0 are screened out. For the first iteration, all the sample points with a predicted response close to 0 can be taken as the validation set and substituted into the FEM to calculate the response, and the error between the predicted value and the true value can be evaluated. For subsequent iterations, the number of sample points in the validation set can be 15% of the size of the training set each time [24], and the validation set takes the sample points with large weight and a predicted response close to 0. If the BP model meets the required prediction accuracy, the model is output. Otherwise, the sample points of the validation set are added to the training set, the procedure returns to step (3) to generate a new prediction set, and the BP model is retrained. These steps are repeated until the prediction accuracy meets the requirements.
When training the BP model, its fitting and prediction accuracy can be evaluated using the determination coefficient ($R^2$), mean absolute error (MAE), and mean bias error (MBE). Higher model accuracy is indicated by $R^2$ values approaching 1, with MAE and MBE values close to 0 [29]. Before the BP model is used for prediction, it must demonstrate robust fitting capability on the training set. Therefore, the BP model should satisfy the prescribed thresholds on $R^2$, MAE, and MBE on the training set before proceeding to further accuracy testing. When verifying the model’s accuracy on the validation set, the BP model must likewise meet the prescribed thresholds on $R^2$, MAE, and MBE to ensure reliable failure probability calculations.
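The three accuracy measures can be computed as in the sketch below, which uses their common textbook definitions (the bias error is taken as prediction minus true value). The function name and the dictionary keys are hypothetical, and no acceptance thresholds are implied.

```python
import numpy as np

def fit_metrics(y_true, y_pred):
    """Determination coefficient, mean absolute error, and mean bias error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_pred - y_true
    ss_res = np.sum(residual**2)                     # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean())**2)     # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(residual)),
        "MBE": np.mean(residual),
    }

# FEM responses (true) versus BP predictions for a small validation set.
print(fit_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```

A near-zero MBE together with a noticeably larger MAE indicates errors that cancel on average, which is why both measures are reported alongside the determination coefficient.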
3.4. The Steps of Reliability Calculation for Improved BP-MCS Method
Because the BP model constructs an implicit structural performance function, it takes longer to predict the structural response than an explicit structural performance function. When the failure probability is small, the number of MCS calculations exceeds $10^6$, which makes the calculation time increase dramatically. Because the failure probability of an actual engineering structure is very low, meaning that the weight of its failure points is very small, there must be a critical weight value. When the weight of a sample point extracted by MCS is greater than the critical weight value, the sample point is considered reliable and there is no need to call the BP model to calculate the structural response. When the weight does not exceed the critical weight value, the sample point lies in the region that contains all failure points, so the BP model must be called to determine whether failure occurs. In this way, most of the sample points located in the reliable domain are filtered out according to their probability of occurrence, which significantly reduces the number of BP model calls and shortens the calculation time.
The detailed implementation steps of the improved BP-MCS reliability analysis method proposed in this paper are as follows:
(1) Training the BP model. A BP model with high prediction accuracy for the sample points near the failure surface is obtained by training; see Section 3.3 for details.
(2) Selecting the weight-critical value. The critical weight value $w_c$ can be selected from the sample points of the training set and the validation set after each BP model accuracy test: arrange the sample points of the training set and the validation set in descending order of weight, find the first failure point (negative performance function value), and return the weight $w_0$ of the preceding reliable sample point (positive performance function value). If there is a sample point whose response is very close to $0^+$ before the first failure point, the weight of this sample point is returned as $w_0$ instead. To avoid missing any failure points, an amplification factor $\lambda$ can be applied. According to the follow-up research, $\lambda$ can be 1.1~1.3, and the critical weight value is then:

$$w_c = \lambda\, w_0$$
(3) Using the improved MCS method to calculate the failure probability. First, $N$ sample points are extracted by MCS, and the weight of each sample point is calculated. When the weight of a sample point is greater than the critical weight value, the sample point is reliable and there is no need to call the BP model. When the weight does not exceed the critical weight value, the BP model is called to judge whether the sample point has failed. If the predicted value of the performance function is negative, the sample point has failed. If the total number of failure points is $N_f$, the structural failure probability is:

$$P_f = \frac{N_f}{N}$$
(4) Testing the variation coefficient of the failure probability according to Equation (2) to ensure that the number of sample points meets the requirements. If the variation coefficient satisfies the convergence requirement, the failure probability is output. Otherwise, the procedure returns to step (3) to generate more MCS sample points for calculation; the initial number of samples can be $10^6$. This iterative process is repeated until the convergence requirement is met, and the final failure probability is output.
Combined with the training steps of the high-precision BP model, the specific calculation flow of the improved BP-MCS method is shown in Figure 4.
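Steps (2) to (4) can be sketched for standard normal variables as follows. This is a minimal illustration under stated assumptions: the trained BP model is replaced by a hypothetical explicit `surrogate` function, the special handling of points very close to $0^+$ in step (2) is omitted, and the variation coefficient is taken in its usual MCS form, which is assumed here to match Equation (2). All function and argument names are hypothetical.

```python
import numpy as np

def std_normal_weight(x):
    """Weight of each row of x under the m-dimensional standard normal PDF."""
    m = x.shape[1]
    return np.exp(-0.5 * np.sum(x**2, axis=1)) / (2.0 * np.pi) ** (m / 2.0)

def select_critical_weight(weights, g_values, amplification=1.2):
    """Step (2): weight of the reliable point preceding the first failure
    point in descending-weight order, multiplied by the amplification factor."""
    weights = np.asarray(weights, dtype=float)
    g_values = np.asarray(g_values, dtype=float)
    order = np.argsort(-weights)                  # descending weights
    w_sorted, g_sorted = weights[order], g_values[order]
    failed = np.flatnonzero(g_sorted < 0)
    if failed.size == 0:
        raise ValueError("no failure point in the supplied set")
    first = failed[0]
    # Fall back to the failure point's own weight if nothing precedes it.
    w_prev = w_sorted[first - 1] if first > 0 else w_sorted[first]
    return amplification * w_prev

def filtered_mcs(surrogate, m, w_crit, n_samples=10**6, rng=None):
    """Steps (3)-(4): call the surrogate only where weight <= w_crit."""
    rng = np.random.default_rng(rng)
    x = rng.standard_normal((n_samples, m))
    w = std_normal_weight(x)
    candidates = x[w <= w_crit]                   # possible failure points
    n_fail = int(np.sum(surrogate(candidates) < 0))
    p_f = n_fail / n_samples
    # Usual MCS coefficient of variation (assumed form of Equation (2)).
    cov = np.sqrt((1.0 - p_f) / (n_samples * p_f)) if n_fail > 0 else np.inf
    return p_f, cov, len(candidates)

# Toy stand-in for the BP model: a linear limit state at distance 3 from the origin.
toy_surrogate = lambda x: 3.0 - x.sum(axis=1) / np.sqrt(2.0)
w_crit = np.exp(-4.5) / (2.0 * np.pi)             # weight at distance 3 (m = 2)
p_f, cov, n_calls = filtered_mcs(toy_surrogate, 2, w_crit, n_samples=200_000, rng=1)
print(p_f, cov, n_calls)
```

In this toy case every failure point lies at least three standard deviations from the origin, so the filter discards the large majority of samples without changing the failure count. With a critical weight obtained from `select_critical_weight`, the same holds only if the training and validation sets are dense enough near the failure surface, which is what the amplification factor is meant to safeguard.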