Fatigue Life Prediction of 2024-T3 Al Alloy by Integrating Particle Swarm Optimization—Extreme Gradient Boosting and Physical Model

Zhaoji Li; Haitao Yue; Ce Zhang; Weibing Dai; Chenguang Guo; Qiang Li; Jianzhuo Zhang

doi:10.3390/ma17215332

Abstract

The multi-parameter characteristics of the physical model pose a challenge to the fatigue life prediction of 2024-T3 aluminum (Al) alloy. In response to this issue, a parameter-solving method that integrates particle swarm optimization (PSO) with extreme gradient boosting (XGBoost) is proposed in this study. The fatigue performance and failure mechanism of the 2024-T3 Al alloy are analyzed. Furthermore, the fatigue life prediction physical model of the 2024-T3 Al alloy is established by using the energy method of fracture mechanics. The physical model incorporates critical physical parameters. Meanwhile, the PSO algorithm optimizes the hyperparameters of the XGBoost model based on fatigue data of the 2024-T3 Al alloy. Eventually, the optimized XGBoost model is used to solve the parameters of the physical model. Furthermore, the analytical equation of the fatigue life prediction model is obtained. This paper provides a new method for solving the parameters of the fatigue life prediction model, which reduces the error and cost of obtaining the model parameters in the experiment and shortens the time required.

Keywords:

physical model; multi-parameter; Al alloy; particle swarm optimization; extreme gradient boosting; fatigue life

1. Introduction

Due to their excellent strength-to-weight ratio and corrosion resistance, Al alloys are widely used in the aerospace, automotive, and construction industries [1,2]. However, Al alloy structural components are subjected to alternating loads in service, which presents a risk of fatigue fracture [3]. This poses a threat to the safety of transport equipment. A fatigue life prediction model is therefore of great significance for the fatigue reliability design of Al alloy structural parts. However, the construction of such a model faces several challenges, including the complexity of the material’s microstructure, the diversity of manufacturing processes, and variations in actual service environments [4]. Zhou et al. [5] and Nowell et al. [6] demonstrated that microstructural defects in Al alloys, such as pores and inclusions, significantly impact the initiation and propagation of fatigue cracks, which makes fatigue life difficult to predict. Additionally, the anisotropy of Al alloys and the mechanical property variations resulting from different heat treatment histories make fatigue life prediction more complex [7]. The fatigue performance is also significantly affected by environmental factors, such as temperature, humidity, and corrosive media, further complicating the prediction [8]. Therefore, predicting the fatigue life of Al alloys requires comprehensive consideration of multiple factors, which makes it difficult to improve prediction accuracy and reliability.

In recent years, significant progress has been made in predicting the fatigue life of Al alloys using traditional physical material research methods. Researchers have combined experimental tests and analytical modeling to investigate the behavior of Al alloys under cyclic loading, thereby enhancing the accuracy of fatigue life prediction [9]. Chen et al. [10] proposed a fatigue life prediction method for a 6061-T6 Al alloy based on defect analysis. However, the constructed life prediction model did not consider the fatigue failure mechanism in depth. Chabouk et al. [11] used the Manson–Coffin–Basquin equation to estimate the fatigue life of a 2024-T351 Al alloy. Because no analytical solution of the life prediction equation could be obtained, the application of the model is limited. Moreover, the lack of a fatigue failure mechanism analysis results in poor interpretability of the life prediction model. Consequently, the fatigue life prediction models in the above studies have not fully integrated the impact of microstructural defects and their evolution on fatigue behavior, and their accuracy and applicability are limited. Cauthen et al. [12] studied the microstructural fatigue crack growth behavior of AA7065 and AA2099 Al alloys. The results show that crack initiation in AA7065 is mainly due to voids and intermetallic particles, while persistent slip bands and intermetallic particles lead to the fatigue failure of AA2099. Wisner et al. [13] examined the impact of specimen geometry and loading schemes on particle fracture in Al alloys. These studies enhance the understanding of fatigue crack initiation and provide a theoretical basis for establishing a fatigue life prediction model. However, fatigue life prediction models based on physical models contain multiple parameters, which must be fitted to extensive experimental data [14]. Obtaining the model parameters is therefore time-consuming. Moreover, the parameter values depend on the accuracy of the test equipment and the data processing method. An explainable physical equation for fatigue life prediction and a method for solving its parameters are thus two important issues in realizing the accurate prediction of fatigue life.

Advanced machine learning methods demonstrate significant advantages in handling high-dimensional data and complex nonlinear relationships. Specifically, based on elastoplastic fatigue damage and machine learning models, Zhan et al. [15] proposed a novel approach for predicting the fatigue life of aerospace alloy components. The results showed that this method outperformed traditional approaches in terms of predictive accuracy and generalizability. In addition, Pałczy et al. [16] demonstrated the successful application of machine learning techniques in multiaxial fatigue life prediction, proving that these techniques are highly adaptable under complex multiaxial loading conditions. Furthermore, Raja et al. [17] confirmed the effectiveness of machine learning in predicting fatigue crack growth behavior in aluminum alloys, especially under high-stress and complex-stress conditions. Methods such as Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost) exhibit superior performance in handling high-dimensional data and capturing complex relationships within the data [18,19]. These techniques utilize large datasets to identify patterns and make predictions, significantly enhancing the capabilities beyond traditional empirical methods. Despite the potential of machine learning, data-driven models alone have limitations [20,21]. These models typically lack physical interpretability and may not reveal the underlying fatigue failure mechanism of specimens [9,22]. In addition, the predictive performance of machine learning models is highly dependent on the quality and quantity of the training data. Moreover, the results are affected by data noise and imbalance [23]. Thus, the key to improving the accuracy of fatigue life prediction is to study the fatigue life prediction method by combining machine learning and physical models.

In this study, integrating machine learning with physical modeling is investigated to address the issues in the construction of traditional fatigue life prediction models. This hybrid approach aims to leverage the advantages of both methodologies, providing accurate predictions along with physically meaningful insights [24]. By incorporating physical principles into the machine learning framework, these models can offer better generalization and robustness, particularly in scenarios with limited data [25]. This approach not only improves the predictive performance but also enhances the model’s transparency, making it more acceptable for practical engineering applications [26]. Foremost, the tension–tension fatigue life of the 2024-T3 Al alloy is tested under high and low cycle fatigue conditions. The fatigue failure mechanism of the specimen is analyzed. Furthermore, based on crack propagation theory and energy release theory, a physical model for predicting the fatigue life of the Al alloy is proposed. According to the fatigue life data, PSO is used to optimize the hyperparameters of the XGBoost model, enhancing its predictive accuracy and robustness. Finally, the parameters of the physical model are determined by the fatigue life predictions from the optimized machine learning model. By integrating machine learning with a physical model, this study presents a comprehensive fatigue life prediction model for the Al alloys. The model determines the precise values of parameters within the physical model based on prediction from machine learning, offering not only accurate predictions but also physical interpretations, thereby meeting the complex requirements of practical applications.

2. Theoretical Framework and Materials

2.1. Material and Experiments

The experimental material used in this study is 2024-T3 bare Al alloy (Ra = 0.8 µm). The Al alloy is provided by Shenyang Aerospace University, and the size of each Al alloy plate is 915 mm × 380 mm × 1.6 mm. The chemical composition and mechanical properties of the Al alloy are shown in Table 1 and Table 2, respectively. The results come from our previous studies [27].

Table 1. Chemical composition of 2024-T3 Al alloy (wt.%).

Table 2. Mechanical properties of 2024-T3 Al alloy.

For the preparation of fatigue specimens, wire electrical discharge machining was used to prevent deformation of the plate. The geometry of the specimens is shown in Figure 1. The specimens obtained by wire cutting were promptly cleaned ultrasonically in deionized water to avoid surface corrosion by the cutting fluid, followed by drying. The wire-cut surfaces were then polished step by step with 400# to 2000# sandpaper. Finally, the specimens were cleaned with alcohol in an ultrasonic cleaner and dried with a hair dryer.

Figure 1. Geometry of the fatigue specimens.

The fatigue tests were conducted using an electro-hydraulic fatigue testing machine (EHF-EV200K2-040-1A, Shimadzu, Kyoto, Japan). The fatigue testing machine primarily consists of a controller, a hydraulic workstation, hydraulic fixtures, and a cooling machine, as shown in Figure 2. All fatigue test data were recorded by a data analysis and recording system when fatigue failure occurred in the Al alloy specimens. The tests were run under force control with a sinusoidal waveform, a stress ratio R of 0.1, and a frequency of 20 Hz. The fatigue life was obtained under maximum cyclic stresses σmax of 200, 220, 240, 260, 310, 350, 370, and 390 MPa. At each stress level, three to four parallel tests were carried out. The results are shown in Figure 3. The statistical analysis of the fatigue life data is presented in Table 3.

Figure 2. Fatigue test machine in this study.

Figure 3. Fatigue life under the eight stress levels.

Table 3. Statistical analysis of fatigue life.
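For reference, the stress ratio R = σmin/σmax = 0.1 fixes the minimum stress, the stress amplitude, and the mean stress at each tested level. The short sketch below tabulates these quantities for the eight stress levels; the function name `cycle_parameters` is illustrative and is not taken from the original study.

```python
# Maximum cyclic stress levels tested in this study (MPa).
SIGMA_MAX_LEVELS = [200, 220, 240, 260, 310, 350, 370, 390]
STRESS_RATIO = 0.1  # R = sigma_min / sigma_max


def cycle_parameters(sigma_max, stress_ratio=STRESS_RATIO):
    """Return (sigma_min, amplitude, mean) in MPa for one load level."""
    sigma_min = stress_ratio * sigma_max
    amplitude = (sigma_max - sigma_min) / 2.0
    mean = (sigma_max + sigma_min) / 2.0
    return sigma_min, amplitude, mean


for sigma_max in SIGMA_MAX_LEVELS:
    s_min, s_amp, s_mean = cycle_parameters(sigma_max)
    print(f"sigma_max={sigma_max:3d}  sigma_min={s_min:5.1f}  "
          f"amplitude={s_amp:6.2f}  mean={s_mean:6.2f}")
```

Because R is positive, the specimen remains in tension throughout the cycle, which corresponds to the tension–tension loading described in the Introduction.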

2.2. Overview of Fatigue Mechanics and Physical Models

Revealing the fatigue failure mechanism of the 2024-T3 Al alloy is an important step in establishing the fatigue life prediction model [28]. Figure 4 shows a scanning electron microscope (SEM) image of the fracture surface of a 2024-T3 Al alloy specimen that failed by fatigue at a maximum cyclic stress of 220 MPa. The results indicate that a fatigue crack initiated at the specimen surface and extended into the Al alloy at approximately 45 degrees. For specimens subjected to tensile stress, the maximum shear stress τ₁ acts in the 45-degree direction. Since the surface roughness Ra of the 2024-T3 Al alloy is 0.8 µm, there are peaks and valleys on its surface, as shown in Figure 5a. This results in local stress concentration on the specimen surface. The stress concentration coefficient k_t is related to the depth d of the valley [29], as shown in Equation (1).

k_{t} = 1 + \sqrt{\frac{d}{ρ}}

(1)

where ρ is the valley root radius. An increase in surface roughness increases k_t. In addition, the stress intensity factor range ΔK is another important parameter for evaluating fatigue performance. ΔK is expressed as follows [30]:

Δ K = Y σ \sqrt{π a}

(2)

where Y is a shape factor, σ is the amplitude of the stress, and a is the crack length. An increase in k_t increases ΔK. Thus, fatigue cracks form easily on surfaces with high roughness, and the fatigue life is poor [31].
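As a numerical illustration of Equations (1) and (2), the following sketch evaluates k_t for a shallow and a deep valley and ΔK for a single crack length. All input values (valley depths, root radius, shape factor, stress, and crack length) are assumed for illustration only and are not measurements from this study.

```python
import math


def stress_concentration(valley_depth, root_radius):
    """Equation (1): k_t = 1 + sqrt(d / rho); both lengths in the same unit."""
    return 1.0 + math.sqrt(valley_depth / root_radius)


def stress_intensity_range(shape_factor, stress, crack_length):
    """Equation (2): delta_K = Y * sigma * sqrt(pi * a)."""
    return shape_factor * stress * math.sqrt(math.pi * crack_length)


# Assumed, illustrative inputs (not data from this study).
ROOT_RADIUS_UM = 4.0
for depth_um in (1.0, 4.0):
    k_t = stress_concentration(depth_um, ROOT_RADIUS_UM)
    print(f"d = {depth_um} um -> k_t = {k_t}")

# Stress in MPa and crack length in m give delta_K in MPa*sqrt(m).
delta_k = stress_intensity_range(shape_factor=1.12, stress=100.0,
                                 crack_length=1.0e-3)
print(f"delta_K = {delta_k:.3f} MPa*sqrt(m)")
```

With a fixed root radius, the deeper valley yields the larger k_t, which matches the trend described above for rougher surfaces.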

Figure 4. Fatigue fracture of the 2024-T3 Al alloy.

Figure 5. Schematic diagram of fatigue failure mechanism of the 2024-T3 Al alloy based on dislocation slip.

In addition to the surface roughness, the intrusion caused by dislocation slip also leads to the stress concentration on the surface of the 2024-T3 Al alloy. As shown in Figure 5b, the equilibrium condition of dislocations can be expressed as follows:

τ_{1}^{D} + τ_{1} - k = 0

(3)

where τ₁^D is the back stress due to dislocations piled up on the boundary and k is the frictional stress. Because the free surface of the specimen offers little hindrance, the dislocations slip towards the surface, as shown in Figure 5c. Because k, the grain size 2a′, the Burgers vector b, and the constant A, which is related to the dislocation stress field, are determined by the material type, the number of dislocations N_D of the 2024-T3 Al alloy depends on the shear stress, N_D = (τ₁ − k)a′/(πA) [32].

The large shear stress τ₁ enables dislocations to break through the obstacles of grain boundaries, and a large number of dislocations gradually slip toward the surface of the specimen. The dislocation slip induces intrusion and extrusion zones on the surface of the specimen, as shown in Figure 5c. The size of the intrusion and extrusion zones is determined by the plastic displacement γ₁ = πN_D b a′/2. An increase in N_D results in a large γ₁. Consequently, the size of the intrusion d_In increases in the direction of the maximum shear stress τ₁. For specimens subjected to a given cyclic stress, the valley depth d′ is positively correlated with d_In (d_In = d′ − d). This implies that the stress concentration generated by the extrusion zone and valley induces cracks on the surface of the 2024-T3 Al alloy specimen, as shown in Figure 5d. The fatigue failure mechanism revealed in Figure 4 is consistent with the failure characteristics exhibited by the fracture surface.

Tanaka et al. pointed out that dislocation slip induces crack initiation and propagation accompanied by energy release [33]. Moreover, fatigue failure primarily resulted from the growth of microstructural defects under cyclic loading [34]. The energy release rate, denoted as G, quantifies the energy dissipated per unit area of the new crack surface formed. Moreover, G is a pivotal factor in understanding crack propagation dynamics [35]. The rate of energy release is intricately linked to the stress intensity factor range ΔK at the crack tip, which reflects the localized stress state influencing crack growth. This relationship between the G and ΔK is expressed as follows:

G = \frac{{(Δ K)}^{2}}{E^{'}}

(4)

where E′ represents the effective modulus of elasticity. The precise calculation of G facilitates the estimation of how quickly a crack may propagate through the Al alloy, thus directly affecting the fatigue life.

Paris’ law provides the relationship between the crack propagation rate and the cyclic stress [36]. The equation is as follows:

\frac{d a}{d N} = C {(Δ K)}^{m}

(5)

where N denotes the number of stress cycles the specimen is subjected to. C and m are material constants. Equation (4) can be rewritten as follows:

Δ K = \sqrt{G E^{'}}

(6)

Substituting Equation (6) into Equation (5), the crack growth rate equation is defined as follows:

\frac{d a}{d N} = C {(G E^{'})}^{m / 2}

(7)

This equation demonstrates the response of the Al alloy to cyclic loading, where C and m indicate the sensitivity of crack growth rate to the cyclic stresses.
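The equivalence of Equations (5) and (7) can be checked numerically. The sketch below computes G from ΔK with Equation (4) and then evaluates the crack growth rate in both forms. The values of C, m, E′, and ΔK are placeholders chosen for illustration; they are not the fitted parameters of this study.

```python
import math


def energy_release_rate(delta_k, e_eff):
    """Equation (4): G = delta_K**2 / E'."""
    return delta_k ** 2 / e_eff


def growth_rate_from_delta_k(c, m, delta_k):
    """Equation (5): da/dN = C * delta_K**m."""
    return c * delta_k ** m


def growth_rate_from_g(c, m, g, e_eff):
    """Equation (7): da/dN = C * (G * E')**(m / 2)."""
    return c * (g * e_eff) ** (m / 2.0)


# Placeholder values (assumptions, not fitted parameters of this study).
C_PARIS, M_PARIS = 1.0e-10, 3.0
E_EFF = 73_000.0   # MPa, assumed effective modulus
DELTA_K = 10.0     # MPa*sqrt(m)

g = energy_release_rate(DELTA_K, E_EFF)
rate_k = growth_rate_from_delta_k(C_PARIS, M_PARIS, DELTA_K)
rate_g = growth_rate_from_g(C_PARIS, M_PARIS, g, E_EFF)
print(f"G = {g:.6e}, da/dN (Eq. 5) = {rate_k:.6e}, da/dN (Eq. 7) = {rate_g:.6e}")
```

Both forms return the same rate up to floating-point rounding, since Equation (7) is only a change of variable from ΔK to G.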

As shown in Figure 3 and Figure 4, the fatigue failure of the 2024-T3 Al alloy is attributed to the formation and propagation of cracks. Thus, the fatigue failure process can be modeled dynamically by defining a state vector x(t) and an input vector u(t). They are defined, respectively, as follows:

x (t) = \begin{bmatrix} a (t) \\ G (t) \end{bmatrix}

(8)

and

u (t) = σ (t)

(9)

where a(t) represents the dynamic crack length, G(t) denotes the dynamic energy release rate, and σ(t) is the dynamic stress amplitude. Thus, Equation (7) can be expressed as follows:

\dot{a} (t) = C {(G (t) E^{'})}^{m / 2}

(10)

According to Equations (2) and (6), Ġ(t) is expressed as follows:

\dot{G} (t) = \frac{Y σ (t)}{E^{'} \sqrt{π a (t)}}

(11)

where Y reflects the influence of the crack’s shape and loading mode on its propagation, ȧ(t) represents the rate of change of the crack length, and Ġ(t) denotes the rate of change of the energy release rate. This state–space model exhibits the temporal evolution of crack size and energy release, which is essential for simulating the fatigue behavior of the Al alloy.

Notably, the crack length has a critical size during the propagation process, called the critical crack length a_c. The parameter a_c is the maximum allowable crack length before the specimen enters an unstable fracture state. When the crack reaches a_c, it is in its critical state, and any further growth could result in rapid propagation and failure of the specimen. In addition, when the stress intensity factor ΔK of the Al alloy reaches its fracture toughness ΔK_IC, the specimen also enters unstable fracture. Using the relationship between the stress intensity factor and crack length (Equation (2)), the relationship between a_c and ΔK_IC is expressed as follows:

a_{c} = {(\frac{Δ K_{I C}}{Y σ \sqrt{π}})}^{2}

(12)

Li et al. indicated that the initial crack length a₀ is proportional to the surface roughness Ra. The quantitative relationship between Ra and a₀ is as follows [37]:

a_{0} = 2.97 Ra

(13)

Based on the analysis of dynamic characteristics of the fatigue crack growth, the fatigue life N_f of the Al alloy is as follows:

N_{f} = \int_{a_{0}}^{a_{c}} \frac{1}{C {(\sqrt{G E^{'}})}^{m}} d a

(14)

Substituting Equations (12) and (13) into Equation (14), the fatigue life calculation equation for the 2024-T3 Al alloy is obtained as follows:

N_{f} = \frac{2}{C {(Y^{2} σ^{2} π)}^{m / 2} (2 - m)} [{(\frac{Δ K_{I C}}{Y σ \sqrt{π}})}^{2 (1 - m / 2)} - {(2.97 Ra)}^{1 - m / 2}]

(15)
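The closed-form life can be cross-checked against a direct numerical integration of Equation (14). In the sketch below, the closed form is written with (2 − m) in the denominator, which is what integrating a^(−m/2) from a₀ to a_c gives and which keeps N_f positive for m > 2. The values of C, m, Y, ΔK_IC, and σ are placeholders assumed for illustration; only Ra = 0.8 µm and the factor 2.97 come from the text.

```python
import math


def critical_crack_length(k_ic, y, sigma):
    """Equation (12): a_c = (K_IC / (Y * sigma * sqrt(pi)))**2."""
    return (k_ic / (y * sigma * math.sqrt(math.pi))) ** 2


def initial_crack_length(ra):
    """Equation (13): a_0 = 2.97 * Ra (same length unit as Ra)."""
    return 2.97 * ra


def fatigue_life_closed_form(c, m, y, sigma, a0, ac):
    """Closed-form integral of Equation (14), valid for m != 2."""
    prefactor = 2.0 / ((2.0 - m) * c * (y ** 2 * sigma ** 2 * math.pi) ** (m / 2.0))
    return prefactor * (ac ** (1.0 - m / 2.0) - a0 ** (1.0 - m / 2.0))


def fatigue_life_numeric(c, m, y, sigma, a0, ac, steps=20_000):
    """Midpoint-rule integration of Equation (14) on a log-spaced grid."""
    log_a0 = math.log(a0)
    h = (math.log(ac) - log_a0) / steps
    total = 0.0
    for i in range(steps):
        a = math.exp(log_a0 + (i + 0.5) * h)
        delta_k = y * sigma * math.sqrt(math.pi * a)
        total += a * h / (c * delta_k ** m)  # da = a * d(ln a)
    return total


# Placeholder parameters (assumptions, not the fitted values of this study).
C_PARIS, M_PARIS, Y_SHAPE = 1.0e-11, 3.0, 1.12
K_IC = 30.0     # MPa*sqrt(m), assumed
SIGMA = 200.0   # MPa
RA = 0.8e-6     # m, surface roughness reported in Section 2.1

a0 = initial_crack_length(RA)
ac = critical_crack_length(K_IC, Y_SHAPE, SIGMA)
n_closed = fatigue_life_closed_form(C_PARIS, M_PARIS, Y_SHAPE, SIGMA, a0, ac)
n_numeric = fatigue_life_numeric(C_PARIS, M_PARIS, Y_SHAPE, SIGMA, a0, ac)
print(f"a0 = {a0:.3e} m, ac = {ac:.3e} m")
print(f"N_f closed form = {n_closed:.4e}, numerical = {n_numeric:.4e}")
```

The two estimates agree closely, so the closed form can be used directly once C, m, Y, and ΔK_IC have been determined.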

3. Methodology: Integrating Machine Learning with Physical Model

3.1. Machine Learning Model

3.1.1. Support Vector Machine

Support vector machine (SVM) aims to identify a hyperplane that minimizes the distance of all specimen points from this plane. SVM is particularly advantageous in nonlinear mapping and high-dimensional pattern recognition [38]. Notably, the output of SVM is a continuous value. Moreover, the ε-insensitive loss function is integrated into SVM to facilitate regression analysis, termed Support Vector Regression (SVR) [39]. The analysis of the fatigue failure of the 2024-T3 Al alloy shows that cracking is a process of continuous change accompanied by changes in energy. For fatigue life prediction, SVM is therefore an advantageous tool for solving equation parameters. In linear regression analysis, the goal is to derive a regression model f(x) = ω^T · x + b such that each training specimen (x_j, y_j) is as close as possible to f(x). The maximum allowable error between the specimen and the model f(x) is designated as ε, so loss is computed only when |f(x_j) − y_j| > ε. The ε-insensitive loss function is defined as follows:

l_{ε} = \begin{cases} 0, & | f (x_{j}) - y_{j} | \leq ε \\ | f (x_{j}) - y_{j} | - ε, & \text{otherwise} \end{cases}

(16)
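A minimal sketch of the ε-insensitive loss of Equation (16) is given below. The prediction, target, and ε values are arbitrary numbers chosen so that the arithmetic is easy to follow; they are not fatigue data.

```python
def epsilon_insensitive_loss(prediction, target, epsilon):
    """Equation (16): zero inside the epsilon tube, linear outside it."""
    residual = abs(prediction - target)
    if residual <= epsilon:
        return 0.0
    return residual - epsilon


# Arbitrary illustrative values.
EPSILON = 0.5
inside = epsilon_insensitive_loss(prediction=3.0, target=3.25, epsilon=EPSILON)
outside = epsilon_insensitive_loss(prediction=3.0, target=4.0, epsilon=EPSILON)
print(inside, outside)  # 0.0 0.5
```

Samples whose residual lies inside the ε tube contribute nothing to the objective, which is why only a subset of the training samples ends up as support vectors.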

Based on the maximum margin optimization principle of SVM, the optimization objective for regression analysis can be formulated as follows:

\min \frac{1}{2} ∥ ω ∥^{2} + C_{i} \sum_{j = 1}^{n} l_{ε} (f (x_{j}) - y_{j})

(17)

where C_i is the regularization parameter. To solve this optimization problem, non-negative slack variables ξ_j and ξ̂_j are introduced, leading to the following reformulated objective:

\min \frac{1}{2} ∥ ω ∥^{2} + C_{i} \sum_{j = 1}^{n} (ξ_{j} + {\hat{ξ}}_{j})

(18)

Equation (18) is subject to the following constraints:

\{\begin{array}{l} ω^{T} \cdot x_{j} + b - y_{j} \leq ε + ξ_{j} \\ y_{j} - ω^{T} \cdot x_{j} - b \leq ε + {\hat{ξ}}_{j} \\ ξ_{j} \geq 0, {\hat{ξ}}_{j} \geq 0, j = 1, 2, \dots, n \end{array}

(19)

In order to convert this constrained optimization problem into an unconstrained one, the Lagrangian function is introduced to Equation (18). The goal function is expressed as follows:

\begin{matrix} L = \frac{1}{2} ∥ ω ∥^{2} + C_{i} \sum_{j = 1}^{n} (ξ_{j} + {\hat{ξ}}_{j}) - \sum_{j = 1}^{n} (u_{j} ξ_{j} + {\hat{u}}_{j} {\hat{ξ}}_{j}) \\ + \sum_{j = 1}^{n} α_{j} (ω^{T} \cdot x_{j} + b - y_{j} - ε - ξ_{j}) + \sum_{j = 1}^{n} {\hat{α}}_{j} (y_{j} - ω^{T} \cdot x_{j} - b - ε - {\hat{ξ}}_{j}) \end{matrix}

(20)

where u_j ≥ 0, û_j ≥ 0, α_j ≥ 0, and α̂_j ≥ 0 are Lagrange multipliers. By setting the partial derivatives of L with respect to ω, b, ξ_j, and ξ̂_j to zero, the following equations can be derived.

\sum_{j = 1}^{n} ({\hat{α}}_{j} - α_{j}) x_{j} = ω

(21)

\sum_{j = 1}^{n} ({\hat{α}}_{j} - α_{j}) = 0

(22)

α_{j} + u_{j} = C_{i}

(23)

{\hat{α}}_{j} + {\hat{u}}_{j} = C_{i}

(24)

Substituting Equations (21)–(24) into Equation (20), the dual problem of SVR can be obtained. The expression is as follows:

\max \sum_{j = 1}^{n} y_{j} ({\hat{α}}_{j} - α_{j}) - ε \sum_{j = 1}^{n} ({\hat{α}}_{j} + α_{j}) - \frac{1}{2} \sum_{j, k = 1}^{n} ({\hat{α}}_{j} - α_{j}) ({\hat{α}}_{k} - α_{k}) x_{j}^{T} \cdot x_{k}

(25)

Equation (25) is maximized subject to the following constraints, and its solution satisfies the Karush-Kuhn-Tucker conditions:

\sum_{j = 1}^{n} ({\hat{α}}_{j} - α_{j}) = 0

(26)

0 \leq {\hat{α}}_{j}, α_{j} \leq C_{i}

(27)

Inserting the solution into the linear regression model, the prediction function is as follows:

f (x) = \sum_{j = 1}^{n} ({\hat{α}}_{j} - α_{j}) x_{j}^{T} x + y_{i} + ε - \sum_{k = 1}^{n} ({\hat{α}}_{k} - α_{k}) x_{k}^{T} x_{i}

(28)

where the last three terms form the bias b, evaluated at any training sample (x_i, y_i) with 0 < α_i < C_i. For nonlinear regression analysis, by incorporating the kernel function k(x_j, x), the prediction function is modified to the following:

f (x) = \sum_{j = 1}^{n} ({\hat{α}}_{j} - α_{j}) k (x_{j}, x) + y_{i} + ε - \sum_{k = 1}^{n} ({\hat{α}}_{k} - α_{k}) k (x_{k}, x_{i})

(29)

In this study, the radial basis function (RBF) kernel is utilized due to its effectiveness in extending features to infinite dimensions and its ease of parameter tuning.
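The kernelized prediction function can be sketched as follows. The RBF kernel is written in its common form k(x, z) = exp(−γ‖x − z‖²), where the width parameter γ is an assumption because the text does not state the parameterization used. The support vectors, dual coefficients (α̂_j − α_j), and bias below are invented numbers that only illustrate the evaluation; they are not a trained model.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """f(x) = sum_j (alpha_hat_j - alpha_j) * k(x_j, x) + b."""
    kernel_sum = sum(coef * rbf_kernel(sv, x, gamma)
                     for sv, coef in zip(support_vectors, dual_coefs))
    return kernel_sum + bias


# Invented values for illustration (inputs: scaled stress level; output: log10 life).
SUPPORT_VECTORS = [[0.0], [0.5], [1.0]]
DUAL_COEFS = [0.8, -0.1, -0.7]   # each entry is (alpha_hat_j - alpha_j)
BIAS = 5.2
GAMMA = 2.0

for level in (0.0, 0.5, 1.0):
    y_hat = svr_predict([level], SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA)
    print(f"x = {level:.1f} -> f(x) = {y_hat:.4f}")
```

Far from every support vector the kernel terms vanish and the prediction falls back to the bias, which is one reason the kernel width needs careful tuning when the training set is small.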

3.1.2. Random Forest

Random Forest (RF) is an ensemble learning method that enhances prediction accuracy and robustness by constructing multiple decision trees and aggregating their results [40]. RF exhibits significant advantages in handling high-dimensional data and nonlinear problems. For predicting fatigue life, RF is employed in regression analysis.

In regression analysis, the objective is to build a model f(x) that approximates the training samples (x_l, y_l) as closely as possible. RF introduces randomness by generating multiple decision trees, which decreases overfitting and improves generalization. Each decision tree is built from a random sample set (obtained via bootstrapping from the training data) and a random subset of features. Each decision tree in the forest is trained using the mean squared error (MSE) as the loss function, defined as follows:

M S E = \frac{1}{n} \sum_{l = 1}^{n} {(y_{l} - {\hat{y}}_{l})}^{2}

(30)

where y_l represents the actual value and ŷ_l denotes the predicted value. The RF prediction is determined by averaging the predictions from all the trees. The optimization objective of RF can be expressed as follows:

\min \frac{1}{n} \sum_{l = 1}^{n} {(y_{l} - \frac{1}{T} \sum_{t = 1}^{T} f_{t} (x_{l}))}^{2}

(31)

where T is the number of decision trees, and f_t(x_l) is the prediction of the t-th tree for sample x_l.
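The bootstrap-and-average idea behind Equations (30) and (31) can be sketched with depth-1 trees (stumps) on a single feature. This is a simplified stand-in rather than the RF configuration of this study: the stress and log-life values are synthetic, and random feature subsetting is omitted because there is only one input.

```python
import random


def mse(y_true, y_pred):
    """Equation (30): mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(xs, ys):
    """Depth-1 regression tree on one feature, chosen by minimum squared error."""
    mean_all = sum(ys) / len(ys)
    best_sse, best_rule = float("inf"), (None, mean_all, mean_all)
    # Excluding the largest value keeps both sides of every split non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        if sse < best_sse:
            best_sse, best_rule = sse, (threshold, mean_l, mean_r)
    return best_rule


def predict_stump(rule, x):
    threshold, mean_l, mean_r = rule
    return mean_l if threshold is None or x <= threshold else mean_r


def fit_forest(xs, ys, n_trees, seed=0):
    """Fit each stump on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Inner term of Equation (31): average of the T tree predictions."""
    return sum(predict_stump(rule, x) for rule in forest) / len(forest)


# Synthetic data for illustration only (not the measurements of this study).
stress = [200, 220, 240, 260, 310, 350, 370, 390]
log_life = [6.0, 5.8, 5.6, 5.4, 5.0, 4.7, 4.6, 4.5]

forest = fit_forest(stress, log_life, n_trees=25)
fitted = [predict_forest(forest, s) for s in stress]
train_mse = mse(log_life, fitted)
print(f"training MSE = {train_mse:.4f}")
```

Averaging over bootstrap resamples smooths the step-like prediction of any single stump, which is the variance-reduction effect that RF relies on.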

To avoid overfitting, RF applies several regularization techniques. These include limiting the tree depth, setting a threshold for node splits, and specifying a minimum number of samples for leaf nodes. In order to convert this constrained optimization problem into an unconstrained one, the Lagrangian function is introduced to Equation (31). The goal function is expressed as follows:

L = \frac{1}{2} ∥ θ_{1} ∥^{2} + C_{j} \sum_{l = 1}^{n} {(y_{l} - {\hat{y}}_{l})}^{2}

(32)

where θ₁ represents the model parameters, C_j is the regularization coefficient, and ŷ_l is the predicted value for sample x_l. To minimize this function, the partial derivative with respect to θ₁ is set to zero:

\frac{\partial L}{\partial θ_{1}} = 0

(33)

The prediction process of RF is as follows: Firstly, each decision tree independently predicts the output for a given input sample. Secondly, the final prediction is obtained by averaging the results from all the trees. In practice, cross-validation techniques are employed to evaluate the performance of the RF model, and grid search is used to optimize hyperparameters such as the number of trees, maximum depth, and minimum samples required for a split. The optimization goal in each tree node is to maximize the information gain or minimize the mean squared error, expressed as follows:

\max \sum_{i = 1}^{n} I (G_{i})

(34)

where I(G_i) represents the information gain at node G_i.

In this study, RF was selected due to its robust performance in handling high-dimensional and nonlinear data, along with its high prediction accuracy and stability. By appropriately tuning parameters and optimizing the model, RF can effectively predict fatigue life and provide reliable regression analysis results.

3.1.3. Extreme Gradient Boosting

Extreme Gradient Boosting (XGBoost) is an advanced ensemble learning technique that employs gradient boosting algorithms to enhance prediction accuracy and robustness [41]. XGBoost is highly effective in handling high-dimensional data and complex nonlinear relationships. For predicting fatigue life, XGBoost is employed in regression analysis.

In regression analysis, the objective is to build a model f(x) that approximates the training samples (x_p, y_p) as closely as possible. XGBoost constructs an ensemble of decision trees, where each tree is sequentially added to correct the errors made by the previous trees. The loss function used in XGBoost for regression is typically the mean squared error (MSE). The optimization objective of XGBoost combines the loss function with a regularization term to avoid overfitting, expressed as follows:

\min \sum_{p = 1}^{n} l (y_{p}, {\hat{y}}_{p}) + \sum_{k = 1}^{K} Ω (f_{k})

(35)

where l is the loss function and Ω(f_k) = γT + (1/2)λ‖w‖² is the regularization term for the k-th tree. T is the number of leaves in the tree, w is the vector of leaf weights, and γ and λ are regularization parameters.

The training process of XGBoost involves the following steps. The model is initialized with a base prediction, which is typically set as the mean of the target values. In each iteration, the gradient and hessian of the loss function with respect to the current predictions are computed. A decision tree is then fitted to the negative gradients (also known as residuals), with the Hessian being used to weight the gradients. The model is updated by adding the predictions from the newly fitted tree. Regularization techniques are applied to control the complexity of the model, ensuring better generalization and avoiding overfitting.

The objective function in each iteration is given by the following:

L^{(t)} = \sum_{p = 1}^{n} [g_{p} f_{t} (x_{p}) + \frac{1}{2} h_{p} f_{t} {(x_{p})}^{2}] + Ω (f_{t})

(36)

where g_p and h_p are the gradient and Hessian of the loss function with respect to the current prediction for sample x_p, respectively. In order to convert this constrained optimization problem into an unconstrained one, the Lagrangian function is introduced to Equation (36). The goal function is expressed as follows:

L = \sum_{p = 1}^{n} l (y_{p}, {\hat{y}}_{p}) + \sum_{k = 1}^{K} Ω (f_{k}) + \sum_{j = 1}^{m} λ_{j} (\sum_{i = 1}^{n} w_{i j} - θ_{j})

(37)

where λ_j are the Lagrange multipliers, w_ij are the weights, and θ_j are the constraints. By taking the partial derivatives with respect to the model parameters and setting them to zero, the optimal solution can be derived as follows:

\frac{\partial L}{\partial θ_{j}} = 0

(38)

In practice, XGBoost uses additional techniques such as shrinkage (learning rate), column subsampling, and early stopping to improve model performance and avoid overfitting. The prediction function of XGBoost can be expressed as follows:

\hat{y} = \sum_{k = 1}^{K} f_{k} (x)

(39)

where K is the total number of trees. f_k is the prediction from the k-th tree. Advantages of XGBoost include its scalability, efficiency, and ability to handle missing values.
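The training steps and the second-order objective of Equation (36) can be illustrated with a small pure-Python boosting loop over depth-1 trees. This is a sketch of the general technique, not the XGBoost library and not the configuration of this study. It assumes the loss ½(y − ŷ)², for which the gradient is ŷ − y and the Hessian is 1, and it uses the standard results from the XGBoost formulation that the optimal leaf weight is −G/(H + λ) and that a leaf contributes G²/(H + λ) to the split gain, where G and H are the sums of gradients and Hessians in the leaf. The data are synthetic.

```python
def grad_hess_squared_error(y_true, y_pred):
    """g = y_pred - y_true and h = 1 for the loss 0.5 * (y_true - y_pred)**2."""
    grads = [p - t for t, p in zip(y_true, y_pred)]
    hess = [1.0] * len(y_true)
    return grads, hess


def leaf_weight(g_sum, h_sum, lam):
    """Optimal leaf weight that minimizes the second-order objective."""
    return -g_sum / (h_sum + lam)


def leaf_score(g_sum, h_sum, lam):
    return g_sum ** 2 / (h_sum + lam)


def fit_boosted_stump(xs, grads, hess, lam, gamma):
    """Choose the split with the largest regularized gain; keep one leaf if none helps."""
    g_tot, h_tot = sum(grads), sum(hess)
    root_weight = leaf_weight(g_tot, h_tot, lam)
    best_gain, best = 0.0, (None, root_weight, root_weight)
    for threshold in sorted(set(xs))[:-1]:
        g_l = sum(g for x, g in zip(xs, grads) if x <= threshold)
        h_l = sum(h for x, h in zip(xs, hess) if x <= threshold)
        g_r, h_r = g_tot - g_l, h_tot - h_l
        gain = 0.5 * (leaf_score(g_l, h_l, lam) + leaf_score(g_r, h_r, lam)
                      - leaf_score(g_tot, h_tot, lam)) - gamma
        if gain > best_gain:
            best_gain = gain
            best = (threshold, leaf_weight(g_l, h_l, lam), leaf_weight(g_r, h_r, lam))
    return best


def predict_stump(stump, x):
    threshold, w_left, w_right = stump
    return w_left if threshold is None or x <= threshold else w_right


def boost(xs, ys, n_rounds, learning_rate, lam, gamma):
    base = sum(ys) / len(ys)          # base prediction: mean of the targets
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_rounds):
        grads, hess = grad_hess_squared_error(ys, preds)
        stump = fit_boosted_stump(xs, grads, hess, lam, gamma)
        trees.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, trees


def predict(base, trees, learning_rate, x):
    """Base score plus the shrunken sum of tree outputs (compare Equation (39))."""
    return base + learning_rate * sum(predict_stump(t, x) for t in trees)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic data for illustration only (not the measurements of this study).
stress = [200, 220, 240, 260, 310, 350, 370, 390]
log_life = [6.0, 5.8, 5.6, 5.4, 5.0, 4.7, 4.6, 4.5]
LEARNING_RATE, LAMBDA, GAMMA = 0.3, 1.0, 0.0

base, trees = boost(stress, log_life, n_rounds=50,
                    learning_rate=LEARNING_RATE, lam=LAMBDA, gamma=GAMMA)
initial_mse = mse(log_life, [base] * len(log_life))
final_mse = mse(log_life, [predict(base, trees, LEARNING_RATE, s) for s in stress])
print(f"MSE before boosting = {initial_mse:.4f}, after = {final_mse:.4f}")
```

In this sketch the base score and the learning rate are written explicitly, whereas Equation (39) absorbs them into the tree outputs f_k. Larger λ shrinks the leaf weights and larger γ suppresses low-gain splits, which is how the regularization term Ω limits model complexity.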

In this study, XGBoost was selected due to its superior performance in handling high-dimensional and nonlinear data, along with its high prediction accuracy and robustness. By appropriately tuning parameters and optimizing the model, XGBoost can effectively predict fatigue life and provide reliable regression analysis results.

3.2. Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a swarm intelligence-based optimization algorithm that mimics the foraging behavior of birds to search for the optimal solution [42]. PSO demonstrates significant advantages in optimizing complex high-dimensional functions and multimodal problems [43]. In this study, PSO is employed to optimize the parameters of machine learning models to enhance prediction accuracy and robustness.

A swarm of particles moves through the search space to find the optimal solution. This is the central idea of the PSO algorithm. Each particle represents a candidate solution with attributes of position and velocity. The movement of particles is influenced by both their own experience and the experience of the swarm. The objective of PSO is to minimize or maximize an objective function, which, in this study, is the prediction error of the machine learning model. The position and velocity update formulas for particles are as follows:

v_{i} (t + 1) = ω v_{i} (t) + c_{1} r_{1} (p_{i} - x_{i} (t)) + c_{2} r_{2} (g - x_{i} (t))

(40)

x_{i} (t + 1) = x_{i} (t) + v_{i} (t + 1)

(41)

where v_i(t) is the velocity of particle i at time t, x_i(t) is its position, ω is the inertia weight, c₁ and c₂ are acceleration coefficients, r₁ and r₂ are random numbers between 0 and 1, p_i is the personal best position of particle i, and g is the global best position of the swarm. The inertia weight ω determines the extent to which the particle retains its previous velocity. A larger inertia weight facilitates global exploration, while a smaller inertia weight aids in local exploitation. The acceleration coefficients c₁ and c₂ are known as cognitive and social parameters, respectively, balancing the influence of individual and swarm experiences.

The PSO optimization process involves the following steps. (1) Initialization: randomly initialize the positions and velocities of particles in the search space and evaluate each particle’s fitness value. (2) Update personal best position: if a particle’s current fitness value is better than that of its historical best position, update its personal best position p_i. (3) Update global best position: if a particle’s current fitness value is better than that of the current global best position, update the global best position g. (4) Update velocity and position: update each particle’s velocity and position according to Equations (40) and (41). (5) Iteration: repeat steps 2–4 until the maximum number of iterations is reached or the fitness value converges.
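The five steps above can be sketched in a few lines of standard-library Python. The objective used here is a simple quadratic bowl that stands in for the cross-validated prediction error of a model; its two coordinates loosely mimic a learning rate and a tree depth, and the optimum at (0.1, 6.0), the bounds, and the swarm settings are assumptions made for illustration only.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside box `bounds` with Equations (40) and (41)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: random positions and (small) random velocities.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.1 * rng.uniform(-(hi - lo), hi - lo) for lo, hi in bounds]
                  for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_best_val.__getitem__)
    global_best = personal_best[best_idx][:]
    global_best_val = personal_best_val[best_idx]

    for _ in range(n_iterations):          # Step 5: iterate
        for i in range(n_particles):
            for d in range(dim):           # Step 4: velocity and position update
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                lo, hi = bounds[d]
                new_pos = positions[i][d] + velocities[i][d]
                positions[i][d] = min(max(new_pos, lo), hi)  # keep inside the box
            value = objective(positions[i])
            if value < personal_best_val[i]:      # Step 2: personal best
                personal_best[i], personal_best_val[i] = positions[i][:], value
                if value < global_best_val:       # Step 3: global best
                    global_best, global_best_val = positions[i][:], value
    return global_best, global_best_val


def surrogate_error(params):
    """Stand-in for a validation error; minimum at (0.1, 6.0) by construction."""
    learning_rate, depth = params
    return (learning_rate - 0.1) ** 2 + (depth - 6.0) ** 2


best_params, best_val = pso_minimize(surrogate_error,
                                     bounds=[(0.01, 0.5), (2.0, 10.0)])
print(f"best parameters = {best_params}, objective = {best_val:.3e}")
```

In a real hyperparameter search, `surrogate_error` would be replaced by a function that trains the model with the candidate parameters and returns its cross-validated error; integer-valued parameters such as tree depth would also need rounding, which this sketch does not address.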

In machine learning model optimization, PSO is used to search for the optimal combination of parameters, such as learning rate, regularization parameters, and maximum depth, to minimize the prediction error. The optimization objective can be expressed as follows:

\min \sum_{i = 1}^{n} l (y_{i}, {\hat{y}}_{i}) + Ω (θ_{k})

(42)

where l is the loss function, ŷ_i is the predicted value, y_i is the actual value, Ω(θ_k) is the regularization term with model parameters θ_k, and λ is the regularization parameter contained in Ω.

During the optimization process, the fitness value of each particle is given by the objective function, here the prediction error of the machine learning model. Optimizing the hyperparameters with PSO can improve model performance and generalization ability. In practice, the parameter ranges are first set by grid search and random search, and PSO combined with cross-validation then searches within these ranges and evaluates model performance.

In this study, PSO was chosen to optimize the parameters of machine learning models due to its excellent performance in high-dimensional and multimodal optimization problems. The PSO-optimized models can more accurately predict fatigue life and provide reliable regression analysis results.

3.3. Model Evaluation Criteria

To comprehensively assess the performance of the fatigue life prediction model across different strategies, multiple metrics are employed: the coefficient of determination (R²), root mean square error (RMSE), and mean absolute percentage error (MAPE). R² values approaching 1 indicate close agreement between the predicted and observed values, signifying greater model accuracy. RMSE quantifies the deviation between predicted and observed values, with low RMSE values reflecting good predictive performance. Because the errors are squared, RMSE is sensitive to large deviations and can be dominated by outliers. MAPE is the average percentage deviation between predicted and observed values. By normalizing the error of each prediction by the observed value, MAPE allows errors to be compared across specimens with very different fatigue lives, with low MAPE values indicating high prediction precision. The formulas for calculating R², RMSE, and MAPE are presented in Equations (43)–(45), respectively.

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(43)

\mathrm{RMSE} = \sqrt{\frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{n}}

(44)

\mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - {\hat{y}}_{i}}{y_{i}} \right| \times 100\%

(45)

where y_i represents the observed values, ŷ_i represents the predicted values, ȳ is the mean of the observed values, and n is the number of samples.
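Equations (43)–(45) translate directly into code. The following standard-library sketch is given only to make the definitions concrete. The function names and the three-point sample data are illustrative assumptions and are not taken from the fatigue data set.

```python
import math

def r_squared(y, y_hat):
    """Eq. (43): 1 minus residual sum of squares over total sum of squares."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def rmse(y, y_hat):
    """Eq. (44): square root of the mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def mape(y, y_hat):
    """Eq. (45): mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)

# Illustrative values only (not measured fatigue lives).
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

scores = (r_squared(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

Note that MAPE is undefined when an observed value is zero, which does not occur for fatigue lives.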

3.4. Model Integration Strategy

This study integrates physical models with machine learning techniques to develop a fatigue life prediction model for the 2024-T3 Al alloy. The strategy for fatigue life prediction is illustrated in Figure 6. The fatigue life data for the Al alloy are first obtained under eight cyclic stress levels. The experimental data are used to train the machine learning models for fatigue life prediction. The PSO algorithm is then combined with the machine learning model that has the best predictive performance to enhance accuracy. The predicted output of this model is used to determine the key parameters of the physical model derived in Equation (14), namely C, m, and ΔK_IC, which are material constants. These parameters are adjusted through an optimization algorithm, ultimately yielding the physical model.

Figure 6. Strategy diagram of fatigue life prediction for Al alloy.

The implementation of the fatigue life prediction models in this study was carried out using various Python-based libraries. Scikit-learn was employed for the development of traditional machine learning models such as SVM and RF. The XGBoost model was implemented using the XGBoost library. PSO was realized through the Pyswarm library. The computing environment consisted of an Intel Core i7-9700K processor (8 cores, 3.6 GHz), 32 GB of RAM, and an Ubuntu 20.04 operating system. The Python version used was 3.9.

4. Results and Discussion

4.1. Fatigue Life Prediction Results and Discussion

The fatigue life of the 2024-T3 Al alloy was predicted using the SVM, XGBoost, and RF models. Figure 7 shows the performance of the machine learning models in predicting fatigue life. The actual fatigue life values from the test set are compared with the predicted values from each model in order to evaluate the prediction accuracy of each model and its ability to handle complex nonlinear data. In Figure 7, the horizontal axis represents the index of the test set specimens, while the vertical axis represents fatigue life. The blue dots represent the actual fatigue life values measured experimentally, and the yellow lines represent the predicted values. The overlap between the blue dots and the yellow lines reflects the prediction accuracy of each model: the greater the overlap, the more accurate the prediction. Conversely, points with little overlap or a noticeable deviation indicate a prediction error.

Figure 7. Comparison of actual and predicted values, (a) RF, (b) SVM, and (c) XGBoost.

Figure 7a shows the prediction accuracy of the RF model. This model captures some basic patterns in the data. However, significant deviations are present in the high-cycle fatigue region, particularly at certain extreme points. The results indicate that the RF model may lack sufficient generalization ability when dealing with complex nonlinear data, which makes it difficult to capture finer variations. RF is an ensemble learning method based on multiple decision trees, and noisy data or extreme values limit its generalization and resistance to noise. SVM can construct complex decision boundaries in a high-dimensional feature space. However, for fatigue life prediction, the SVM model produces more non-overlapping points, as shown in Figure 7b, and its prediction accuracy is also poor. This may be attributed to overfitting or underfitting when high-dimensional complex data are handled. The prediction accuracy of SVM decreases especially for a large number of features and nonlinear data distributions. Compared with RF and SVM, most actual and predicted points overlap in the fatigue life prediction of the XGBoost model, as shown in Figure 7c. The results show that the XGBoost model predicts the fatigue life with high accuracy. XGBoost employs an ensemble of decision trees based on gradient boosting, which provides strong nonlinear fitting capability, handles noisy data efficiently, and supports feature importance evaluation. The generalization ability of the XGBoost model is reflected not only in its fit to the training set but also in its robust prediction performance on the test set.

Figure 8 shows the quantitative relationship between the predicted and actual fatigue life values of the 2024-T3 Al alloy. Error bands are used to visually display the prediction deviations of each model. The horizontal axis represents the experimentally measured fatigue life values, while the vertical axis shows the predicted results from each model. Each panel contains three sets of lines: the ideal line, on which predicted and actual values are equal, the ±1.25 error band, and the ±1.5 error band. Ideally, all data points should fall close to the ideal line. Data points that fall within the error bands indicate high prediction accuracy at those points, whereas points outside the error bands indicate large prediction errors. Figure 8a shows that most predicted values of the RF model fall within the ±1.5 error band, which indicates that the error of the RF model is comparatively large. Notably, the deviations are larger at extreme values. Figure 8b shows that most predicted points of the SVM model also fall within the ±1.5 error band. The limitations of parameter tuning and the nonlinear feature distribution in high-dimensional space decrease its accuracy, and the results behave similarly to those of RF. For the XGBoost model, all data are concentrated within the ±1.25 error band, as shown in Figure 8c, which indicates that XGBoost has high prediction accuracy. The advantage of XGBoost is its ability to handle high-dimensional nonlinear data while balancing accuracy and generalization. This makes it a powerful tool for complex engineering tasks.
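The share of predictions inside an error band can also be computed rather than read off the plot. The sketch below assumes that the ±1.25 and ±1.5 bands are multiplicative scatter bands, meaning a point lies inside the band when the predicted life is between the actual life divided by the factor and the actual life multiplied by the factor. This interpretation, the function name, and the sample values are assumptions made for illustration.

```python
def fraction_within_band(actual, predicted, factor):
    """Share of points with actual / factor <= predicted <= actual * factor."""
    inside = sum(1 for a, p in zip(actual, predicted)
                 if a / factor <= p <= a * factor)
    return inside / len(actual)

# Illustrative fatigue lives in cycles (not data from this study).
actual = [1.0e5, 2.0e5, 5.0e5, 8.0e5]
predicted = [1.1e5, 2.6e5, 4.2e5, 8.5e5]

share_125 = fraction_within_band(actual, predicted, 1.25)
share_150 = fraction_within_band(actual, predicted, 1.5)
```

With these sample values, the second point has a ratio of 1.3 and therefore falls outside the 1.25 band but inside the 1.5 band.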

Figure 8. Fatigue life predictions vs experimental results, (a) RF, (b) SVM, and (c) XGBoost.

To further quantify the performance of the models, the R² and MAPE metrics are calculated. The performance indicators in Table 4 provide a quantitative evaluation of each model’s ability to predict fatigue life. The R² and MAPE of the XGBoost model are 0.93 and 16.34%, respectively. The highest R² indicates that the model gives the best fit between the predicted and actual fatigue life of the 2024-T3 Al alloy, and its MAPE of 16.34% is the lowest among the three models. The lowest error shows the robustness and reliability of XGBoost in capturing complex patterns and nonlinear relationships in the data. The R² of the RF model is 0.91, which is 2.15% lower (in relative terms) than that of XGBoost. In addition, the MAPE of the RF model is 22.34%, which is 6.00 percentage points higher than that of XGBoost. The results indicate that RF explains 91% of the variance and that its average prediction error is noticeably larger. The SVM model performs the worst, with an R² of 0.88 and a MAPE of 26.77%. The R² of 0.88 shows that SVM explains only 88% of the data variance, which is 5.38% lower than that of XGBoost and 3.30% lower than that of RF. In addition, the MAPE of SVM is the highest among the models, 10.43 percentage points higher than that of XGBoost and 4.43 percentage points higher than that of RF, which indicates large prediction errors. These differences show that XGBoost is better suited to complex fatigue life prediction tasks.
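The comparisons in the preceding paragraph mix two kinds of differences: the R² gaps are relative differences, while the MAPE gaps are differences in percentage points. The short check below reproduces them from the R² and MAPE values quoted in the text. The helper name `relative_drop` is introduced here only for illustration.

```python
r2 = {"XGBoost": 0.93, "RF": 0.91, "SVM": 0.88}
mape_pct = {"XGBoost": 16.34, "RF": 22.34, "SVM": 26.77}

def relative_drop(reference, value):
    """Relative decrease of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference

# Relative R² differences (percent of the reference R²).
rf_vs_xgb = relative_drop(r2["XGBoost"], r2["RF"])
svm_vs_xgb = relative_drop(r2["XGBoost"], r2["SVM"])
svm_vs_rf = relative_drop(r2["RF"], r2["SVM"])

# MAPE differences in percentage points.
mape_gap_rf = mape_pct["RF"] - mape_pct["XGBoost"]
mape_gap_svm = mape_pct["SVM"] - mape_pct["XGBoost"]
mape_gap_svm_rf = mape_pct["SVM"] - mape_pct["RF"]
```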

Table 4. Comparison of prediction accuracy for different models.

XGBoost exhibits clear superiority over SVM and RF in terms of prediction accuracy and stability. To further enhance its performance, the particle swarm optimization (PSO) algorithm is used to optimize XGBoost’s hyperparameters. The goal is to find better parameter configurations that improve the model’s generalization ability and prediction accuracy. The tuned hyperparameters are the number of estimators, learning rate, maximum tree depth, and minimum child weight. A lower fitness value indicates better model performance. To mitigate the risk of overfitting to the validation set during hyperparameter optimization, k-fold cross-validation (k = 5) was employed to compute the fitness value in each iteration. For each fitness evaluation, the training data were split into five subsets: four were used for model training and the remaining one for validation, and the process was repeated five times so that each subset served as the validation set once. For each particle in a PSO generation, the fitness value was computed on each of the five folds. The average across the five folds was used as the generation’s average fitness, and the best fitness value among the folds was recorded as the minimum fitness for that generation.
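The cross-validated fitness described above can be sketched as follows. This is a standard-library illustration of the k-fold averaging only. The interleaved, unshuffled fold assignment, the function names, and the `fit`, `predict`, and `error` callables are assumptions, and a trivial stand-in model replaces XGBoost so that the example runs without external libraries.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for f_idx, f in enumerate(folds) if f_idx != i for j in f]
        yield train, folds[i]

def cv_fitness(params, X, y, fit, predict, error, k=5):
    """Mean validation error over k folds: the quantity PSO would minimize."""
    scores = []
    for train, val in k_fold_indices(len(y), k):
        model = fit([X[j] for j in train], [y[j] for j in train], params)
        preds = [predict(model, X[j]) for j in val]
        scores.append(error([y[j] for j in val], preds))
    return sum(scores) / k

# Stand-in model (hypothetical): predicts a scaled mean of the training targets.
def fit(X_train, y_train, params):
    return params["scale"] * sum(y_train) / len(y_train)

def predict(model, x):
    return model

def mean_abs_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

X = list(range(10))
y = [2.0] * 10
fitness_good = cv_fitness({"scale": 1.0}, X, y, fit, predict, mean_abs_error)
fitness_bad = cv_fitness({"scale": 0.5}, X, y, fit, predict, mean_abs_error)
```

In the setting of this study, `params` would hold one particle’s candidate values for the number of estimators, learning rate, maximum tree depth, and minimum child weight, and `fit` would train an XGBoost regressor with them.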

Figure 9 exhibits the fitness curve during the PSO’s optimization of XGBoost’s hyperparameters. The curves show how the model’s performance evolves over the generations (iterations). PSO randomly initializes a group of particles to explore the hyperparameter space, searching for the optimal parameter configuration to improve the model’s performance. The horizontal axis represents the number of iterations, while the vertical axis shows the fitness values. The minimum fitness (blue curve) and average fitness (orange curve) reveal the changes in local and global solutions during each generation.

Figure 9. Fitness evolution of XGBoost hyperparameter tuning using Particle Swarm Optimization.

In the early stages of optimization (the first five iterations), the average fitness value rapidly drops from about 165 to around 140. This indicates that PSO extensively explored the hyperparameter space in the initial stage. The steep decline suggests that PSO quickly identified a set of relatively good hyperparameter combinations, which significantly improved the fitness of the XGBoost model. After the fifth iteration, the decline in fitness slows, and the values begin to stabilize. Notably, the minimum fitness remains around 135, suggesting that PSO is approaching the global optimal solution. In the later iterations, the average fitness shows slight fluctuations but generally remains around 140. These fluctuations reflect the fine-tuning process of PSO in the local search space to further optimize the hyperparameters and get as close as possible to the global optimum. The convergence trend of the fitness curve indicates that PSO effectively completes the optimization of XGBoost’s hyperparameters after about 40 iterations.

The best hyperparameter configuration obtained through PSO optimization is as follows. The number of estimators is set to 1235. This large number of trees improves the model’s ability to capture complex data patterns. The learning rate is tuned to 0.0905. This relatively low learning rate helps prevent the model from overshooting the optimal solution during training and ensures stable convergence. The maximum tree depth is set to 3. This relatively shallow depth helps control the model’s complexity and avoids overfitting while still maintaining sufficient performance through the ensemble of many trees. Finally, the minimum child weight is set to 1, which sets the minimum sum of instance weights required in each child node and thus keeps node splits from relying on too few samples, improving the model’s robustness.

After PSO is used to optimize the hyperparameters of the XGBoost model, the PSO-XGBoost model is obtained. The PSO-XGBoost model is then trained for fatigue life prediction. Figure 10 compares the predicted and experimental results for the fatigue life of the 2024-T3 Al alloy. The points are plotted against the ideal 45-degree line, which represents perfect agreement between predicted and actual values. Compared with the XGBoost model, the points for the PSO-XGBoost model align much more closely with the ideal line and lie within the ±1.25 and ±1.5 error bands. This indicates a marked improvement in prediction accuracy. The reduced scatter and tighter clustering of points near the ideal line suggest that the PSO-XGBoost model has decreased the prediction errors and achieved better generalization. Figure 11 further supports this conclusion by comparing the predicted and actual fatigue life on a logarithmic scale. The PSO-XGBoost model shows a greater overlap between predicted and actual values across the entire test set. This overlap confirms the model’s improved capacity to capture the nonlinear relationships in the data, especially for specimens with extreme or highly variable fatigue life. The orange lines representing the predicted values closely follow the blue dots of the actual values, indicating a strong correlation and high prediction accuracy.

Figure 10. Fatigue life predictions vs. experimental results for PSO-XGBoost.

Figure 11. Comparison of actual and predicted values for PSO-XGBoost.

The performance of the model is analyzed quantitatively using two key metrics. The R² and MAPE are listed in Table 5. For the XGBoost model, the R² value is 0.93 and the MAPE is 16.34%, reflecting good performance in predicting fatigue life. After PSO optimization, the PSO-XGBoost model achieves an R² of 0.96, an increase of 0.03 (about 3.2% in relative terms) over XGBoost. The MAPE drops to 11.89%, a reduction of 4.45 percentage points in the average prediction error. This improvement demonstrates that PSO’s global hyperparameter search enhances the model’s predictive accuracy by tuning parameters such as the number of estimators, learning rate, and tree depth.

Table 5. The results of XGBoost and PSO-XGBoost.

As shown in Figures 10 and 11, as well as Tables 4 and 5, the PSO-XGBoost model outperforms the XGBoost, RF, and SVM models in all reported evaluation metrics. Its predicted values closely match the experimental results, which highlights its accuracy and robustness.

4.2. Physical Model Parameter Optimization

The PSO-XGBoost model is identified as the optimal choice for predicting the fatigue life of the 2024-T3 Al alloy. Then, the parameters C, m, and ΔK_IC in the physical model are determined using the predictive values of the model. The PSO-XGBoost model is first employed to predict the fatigue life of the 2024-T3 Al alloy, and the S-N curve is subsequently plotted based on the predicted values, as shown in Figure 12.

Figure 12. Fatigue life prediction using PSO-XGBoost model.

The L-BFGS-B (limited-memory Broyden–Fletcher–Goldfarb–Shanno with box constraints) algorithm is employed to optimize the parameters of the physical model. L-BFGS-B is a numerical method suited to large-scale optimization problems [44]. It uses limited memory and gradient information to approximate the Hessian matrix, which makes the parameter optimization efficient. The algorithm also enforces bound constraints on the parameters and handles high-dimensional problems effectively. Applying the L-BFGS-B algorithm to Equation (15) yields the optimal values of the parameters C, m, and ΔK_IC, which are 1.44 × 10⁻¹⁰, 2.60, and 36.5 MPa·m^1/2, respectively. For the 2024-T3 Al alloy, according to the literature and experimental data, the value of parameter C is typically on the order of 10⁻¹⁰ to 10⁻⁹. The parameter m, which describes the sensitivity of the crack growth rate to changes in the stress intensity factor range, generally falls between 2 and 4 [45]. The parameter ΔK_IC of the 2024-T3 Al alloy generally ranges between 35 MPa·m^1/2 and 40 MPa·m^1/2 [46]. All three fitted values lie within these reported ranges, so the parameter values obtained in this study are physically reasonable. This agreement supports the parameter-solving method proposed in this study for the physical model of fatigue life prediction of the 2024-T3 Al alloy. In addition, the fatigue life prediction equation for the 2024-T3 aero Al alloy considering different surface roughness is ultimately obtained.
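A bound-constrained fit of this kind can be sketched with `scipy.optimize.minimize` and `method="L-BFGS-B"`. The example below does not use Equation (15). As a stand-in it integrates a simple Paris-type law, da/dN = C(ΔK)^m, from an assumed initial crack length to the critical length at which the maximum stress intensity factor reaches ΔK_IC. The geometry factor, initial crack length, stress ratio, and stress levels are hypothetical, and the target lives are synthetic values generated from the parameter values reported above, so the sketch illustrates only the mechanics of the fit.

```python
import numpy as np
from scipy.optimize import minimize

Y = 1.12      # assumed geometry factor (illustrative)
A0 = 20e-6    # assumed initial crack length in m (illustrative)
R = 0.1       # assumed stress ratio (illustrative)

def log10_life(theta, s_max):
    """log10 of the cycles needed to grow a crack from A0 to the critical length."""
    log_c, m, k_ic = theta
    s_range = (1.0 - R) * s_max
    a_c = (k_ic / (Y * s_max)) ** 2 / np.pi     # K_max reaches k_ic at a_c
    e = 1.0 - m / 2.0                           # nonzero because m > 2
    integral = (a_c ** e - A0 ** e) / (e * (Y * s_range * np.sqrt(np.pi)) ** m)
    return np.log10(integral) - log_c

# Eight hypothetical stress levels in MPa and synthetic target lives.
s_max = np.linspace(150.0, 320.0, 8)
theta_ref = np.array([np.log10(1.44e-10), 2.60, 36.5])
target = log10_life(theta_ref, s_max)

def loss(theta):
    """Mean squared error between modeled and target log10 lives."""
    return float(np.mean((log10_life(theta, s_max) - target) ** 2))

# Box constraints on log10(C), m, and the fracture toughness; the lower
# bound on m is kept slightly above 2 to avoid the singular case m = 2.
bounds = [(-10.0, -9.0), (2.05, 4.0), (35.0, 40.0)]
x0 = np.array([-9.5, 3.0, 38.0])
res = minimize(loss, x0, method="L-BFGS-B", bounds=bounds)
```

Fitting log10(C) instead of C keeps the three parameters on comparable scales, which generally helps a gradient-based method. Because C and m are strongly correlated in such a model, the fitted values may differ from the reference values even when the loss is small.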

5. Conclusions

(1): A physical model was established using the energy method of fracture mechanics. Based on the fatigue fracture characteristics of the 2024-T3 Al alloy, the failure mechanism under the coupling effect of dislocation slip and surface roughness was revealed. Then, the fatigue life prediction equation was established by considering the energy changes during the fatigue crack initiation and propagation. The parameters of the equation include material constants and fracture toughness.
(2): The combination of PSO and XGBoost improved the prediction accuracy of the fatigue life of the 2024-T3 Al alloy. A comparison of the accuracy of RF, SVM, and XGBoost in fatigue life prediction showed that XGBoost has the highest R² and the lowest MAPE. Thus, the XGBoost model was selected to predict the fatigue life. Subsequently, the PSO algorithm was employed to optimize the hyperparameters of the XGBoost model, resulting in improved prediction accuracy.
(3): A physical equation for predicting the fatigue life of the 2024-T3 Al alloy was proposed. Using the fatigue life predictions from the PSO-XGBoost model, the key parameters of the physical fatigue life prediction model were determined. The values of the parameters align with existing experimental data for the 2024-T3 Al alloy. This implies that the physical model of fatigue life proposed in this study is reasonable.

This study established a physical model for fatigue life based on the physical relationship between surface roughness and initial crack length. Considering factors such as the shape and surface defects of structural components, an equivalent physical relationship between the factors and the initial crack length is established. Then, using the fatigue life prediction method proposed in this study, the fatigue life of metallic structural components in the equipment manufacturing industry can be accurately predicted.

Author Contributions

Z.L.: Writing—original draft, conceptualization; H.Y.: methodology, software, supervision; C.Z.: conceptualization, supervision; W.D.: data curation, investigation; C.G.: formal analysis, funding acquisition; Q.L.: supervision, validation; J.Z.: data curation, project administration. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (Grant No. 52404124), China Postdoctoral Science Foundation (2023TQ0145 and 2023M731481), Scientific Study Project for Institutes of Higher Learning, Liaoning Province (JYTQN2023195), and Liaoning Provincial Department of Science and Technology Doctoral Scientific Startup Fund (2023-BS-204).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature

XGBoost	Extreme Gradient Boosting	N_f	Fatigue Life
SVM	Support Vector Machine	G	Energy Release Rate
RF	Random Forest	R²	Coefficient of Determination
PSO	Particle Swarm Optimization	RMSE	Root Mean Square Error
SEM	Scanning Electron Microscope	MAPE	Mean Absolute Percentage Error
Ra	Surface Roughness	σ_max	Maximum Cyclic Stress
E	Elastic Modulus	a	Crack Length
σ_b	Tensile Strength	ΔK	Stress Intensity Factor
σ_S	Yield Strength	Y	Shape Factor
ΔK_IC	Stress Intensity Factor Range	k_t	Stress Concentration Coefficient

References

1. Zhong, X.-C.; Xie, R.-K.; Qin, S.-H.; Zhang, K.-S. A process-data-driven BP neural network model for predicting interval-valued fatigue life of metals. Eng. Fract. Mech. 2022, 276, 108918.
2. Wang, Y.; Zhu, Z.; Sha, A.; Hao, W. Low cycle fatigue life prediction of titanium alloy using genetic algorithm-optimized BP artificial neural network. Int. J. Fatigue 2023, 172, 107609.
3. Song, H.; Liu, J.; Zhang, H.; Du, J. Multi-source data driven fatigue failure analysis and life prediction of pre-corroded aluminum–lithium alloy 2050-T8. Eng. Fract. Mech. 2023, 292, 109626.
4. Fan, J.-L.; Zhu, G.; Zhu, M.-L.; Xuan, F.-Z. A data-physics integrated approach to life prediction in very high cycle fatigue regime. Int. J. Fatigue 2023, 176, 107917.
5. Zhou, C.; Wang, H.; Hou, S.; Han, Y. A hybrid physics-based and data-driven method for gear contact fatigue life prediction. Int. J. Fatigue 2023, 175, 107763.
6. Nowell, D.; Nowell, P. A machine learning approach to the prediction of fretting fatigue life. Tribol. Int. 2020, 141, 105913.
7. Lian, Z.; Li, M.; Lu, W. Fatigue life prediction of aluminum alloy via knowledge-based machine learning. Int. J. Fatigue 2022, 157, 106716.
8. Cui, H.; Han, Q. Fatigue Damage Mechanism and Fatigue Life Prediction of Metallic Materials. Metals 2023, 13, 1752.
9. Kashyzadeh, K.R.; Ghorbani, S. New neural network-based algorithm for predicting fatigue life of aluminum alloys in terms of machining parameters. Eng. Fail. Anal. 2023, 146, 107128.
10. Chen, H.; Yao, S.; Yang, Y.; Li, Y.; Xu, S.; Zhang, R. Fatigue Life Prediction of Aluminum Alloys Based on Surface and Internal Defects. J. Mater. Eng. Perform. 2023, 32, 8687–8699.
11. Chabouk, E.; Shariati, M.; Kadkhodayan, M.; Masoudi Nejad, R. Fatigue assessment of 2024-T351 aluminum alloy under uniaxial cyclic loading. J. Mater. Eng. Perform. 2021, 30, 2864–2875.
12. Cauthen, C.; Anderson, K.; Avery, D.; Baker, A.; Williamson, C.; Daniewicz, S.; Jordan, J.B. Fatigue crack nucleation and microstructurally small crack growth mechanisms in high strength aluminum alloys. Int. J. Fatigue 2020, 140, 105790.
13. Wisner, B.; Kontsos, A. Investigation of particle fracture during fatigue of aluminum 2024. Int. J. Fatigue 2018, 111, 33–43.
14. Zhan, Z.; Hu, W.; Meng, Q. Data-driven fatigue life prediction in additive manufactured titanium alloy: A damage mechanics based machine learning framework. Eng. Fract. Mech. 2021, 252, 107850.
15. Sai, N.J.; Rathore, P.; Chauhan, A. Machine learning-based predictions of fatigue life for multi-principal element alloys. Scr. Mater. 2023, 226, 115214.
16. He, L.; Wang, Z.; Akebono, H.; Sugeta, A. Machine learning-based predictions of fatigue life and fatigue limit for steels. J. Mater. Sci. Technol. 2021, 90, 9–19.
17. Zhang, X.-C.; Gong, J.-G.; Xuan, F.-Z. A deep learning based life prediction method for components under creep, fatigue and creep-fatigue conditions. Int. J. Fatigue 2021, 148, 106236.
18. Zhang, J.; Zhu, J.; Guo, W.; Guo, W. A machine learning-based approach to predict the fatigue life of three-dimensional cracked specimens. Int. J. Fatigue 2022, 159, 106808.
19. Pałczyński, K.; Skibicki, D.; Pejkowski, Ł.; Andrysiak, T. Application of machine learning methods in multiaxial fatigue life prediction. Fatigue Fract. Eng. Mater. Struct. 2023, 46, 416–432.
20. Zhan, Z.; Li, H. A novel approach based on the elastoplastic fatigue damage and machine learning models for life prediction of aerospace alloy parts fabricated by additive manufacturing. Int. J. Fatigue 2021, 145, 106089.
21. Raja, A.; Chukka, S.T.; Jayaganthan, R. Prediction of fatigue crack growth behaviour in ultrafine grained al 2014 alloy using machine learning. Metals 2020, 10, 1349.
22. Wang, L.; Zhu, S.-P.; Luo, C.; Liao, D.; Wang, Q. Physics-guided machine learning frameworks for fatigue life prediction of AM materials. Int. J. Fatigue 2023, 172, 107658.
23. Hu, M.; Tan, Q.; Knibbe, R.; Xu, M.; Jiang, B.; Wang, S.; Li, X.; Zhang, M. Recent applications of machine learning in alloy design: A review. Mater. Sci. Eng. R Rep. 2023, 155, 100746.
24. Li, H.; Zhang, J.; Hu, L.; Su, K. Notch fatigue life prediction of micro-shot peened 25CrMo4 alloy steel: A comparison between fracture mechanics and machine learning methods. Eng. Fract. Mech. 2023, 277, 108992.
25. Wang, H.; Li, B.; Xuan, F.-Z. Fatigue-life prediction of additively manufactured metals by continuous damage mechanics (CDM)-informed machine learning with sensitive features. Int. J. Fatigue 2022, 164, 107147.
26. Wang, H.; Li, B.; Gong, J.; Xuan, F.-Z. Machine learning-based fatigue life prediction of metal materials: Perspectives of physics-informed and data-driven hybrid methods. Eng. Fract. Mech. 2023, 284, 109242.
27. Dai, W.; Liu, Z.; Li, C.; He, D.; Jia, D.; Zhang, Y.; Tan, Z. Fatigue life of micro-arc oxidation coated AA2024-T3 and AA7075-T6 alloys. Int. J. Fatigue 2019, 124, 493–502.
28. Fu, R.; Ling, C.; Zheng, L.; Zhong, Z.; Hong, Y. Continuum damage mechanics-based fatigue life prediction of L-PBF Ti-6Al-4V. Int. J. Mech. Sci. 2024, 273, 109233.
29. Dai, W.; Zhang, C.; Guo, C.; Li, Z.; Yue, H.; Li, Q.; Zhang, J.; Shang, Z. Effect of grit blasting on fatigue behavior of 2024-T3 aero Al alloy. J. Mater. Res. Technol. 2024, 32, 519–529.
30. Suraratchai, M.; Limido, J.; Mabru, C.; Chieragatti, R. Modelling the influence of machined surface roughness on the fatigue life of aluminium alloy. Int. J. Fatigue 2008, 30, 2119–2126.
31. Pegues, J.; Roach, M.; Williamson, R.S.; Shamsaei, N. Surface roughness effects on the fatigue strength of additively manufactured Ti-6Al-4V. Int. J. Fatigue 2018, 116, 543–552.
32. Li, C.; Dai, W.; Zhang, H.; Liu, Y.; Zhang, Y. Effect of initial forging temperature on mechanical properties and fatigue behavior of EA4T steel. Eng. Fract. Mech. 2020, 238, 107287.
33. Tanaka, K.; Mura, T. A dislocation model for fatigue crack initiation. J. Appl. Mech. Mar. 1981, 48, 97–103.
34. Lavogiez, C.; Dureau, C.; Nadot, Y.; Villechaise, P.; Hémery, S. Crack initiation mechanisms in Ti-6Al-4V subjected to cold dwell-fatigue, low-cycle fatigue and high-cycle fatigue loadings. Acta Mater. 2023, 244, 118560.
35. Verma, R.; Kumar, P.; Jayaganthan, R.; Pathak, H. Extended finite element simulation on Tensile, fracture toughness and fatigue crack growth behaviour of additively manufactured Ti6Al4V alloy. Theor. Appl. Fract. Mech. 2022, 117, 103163.
36. Carpinteri, A.; Montagnoli, F. Scaling and fractality in subcritical fatigue crack growth: Crack-size effects on Paris′ law and fatigue threshold. Fatigue Fract. Eng. Mater. Struct. 2020, 43, 788–801.
37. Wang, J.; Zhang, Y.; Sun, Q.; Liu, S.; Shi, B.; Lu, H. Giga-fatigue life prediction of FV520B-I with surface roughness. Mater. Des. 2016, 89, 1028–1034.
38. Moghaddam, T.B.; Soltani, M.; Shahraki, H.S.; Shamshirband, S.; Noor, N.B.M.; Karim, M.R. The use of SVM-FFA in estimating fatigue life of polyethylene terephthalate modified asphalt mixtures. Measurement 2016, 90, 526–533.
39. Li, A.; Baig, S.; Liu, J.; Shao, S.; Shamsaei, N. Defect criticality analysis on fatigue life of L-PBF 17-4 PH stainless steel via machine learning. Int. J. Fatigue 2022, 163, 107018.
40. Gan, L.; Wu, H.; Zhong, Z. Fatigue life prediction considering mean stress effect based on random forests and kernel extreme learning machine. Int. J. Fatigue 2022, 158, 106761.
41. Gao, T.; Zhan, Z.; Hu, W.; Meng, Q. A novel damage mechanics and XGBoost based approach for HCF life prediction of cast magnesium alloy considering internal defect characteristics. Int. J. Fatigue 2024, 182, 108220.
42. Liu, X.; Zhao, X.; Shangguan, W.B. Fatigue life prediction of natural rubber components using an artificial neural network. Fatigue Fract. Eng. Mater. Struct. 2022, 45, 1678–1689.
43. Wang, Q.; Yao, G.; Kong, G.; Wei, L.; Yu, X.; Jianchuan, Z.; Luo, L. A data-driven model for predicting fatigue performance of high-strength steel wires based on optimized XGBOOST. In Engineering Failure Analysis; Elsevier: Amsterdam, The Netherlands, 2024; p. 108710.
44. Morales, J.L.; Nocedal, J. Remark on “Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization”. ACM Trans. Math. Softw. (TOMS) 2011, 38, 1–4.
45. Gairola, S.; Verma, R.; Jayaganthan, R. Study on fatigue and fracture behavior of Al 2024 alloy through XFEM and stress-life approach. Procedia Struct. Integr. 2023, 46, 182–188.
46. Wang, S.; Li, N.; Chi, X. Dynamic Fractural Toughness of 2024-T3 Aluminum Alloy. J. Netshape Form. Eng. 2017, 9, 72–78.

Figure 1. Geometry of the fatigue specimens.

Figure 2. Fatigue test machine in this study.

Figure 3. Fatigue life under the eight stress levels.

Figure 4. Fatigue fracture of the 2024-T3 Al alloy.

Figure 5. Schematic diagram of fatigue failure mechanism of the 2024-T3 Al alloy based on dislocation slip.

Figure 6. Strategy diagram of fatigue life prediction for Al alloy.

Figure 7. Comparison of actual and predicted values: (a) RF, (b) SVM, and (c) XGBoost.

Figure 8. Fatigue life predictions vs. experimental results: (a) RF, (b) SVM, and (c) XGBoost.

Figure 9. Fitness evolution of XGBoost hyperparameter tuning using particle swarm optimization.

Figure 10. Fatigue life predictions vs. experimental results for PSO-XGBoost.

Figure 11. Comparison of actual and predicted values for PSO-XGBoost.

Figure 12. Fatigue life prediction using PSO-XGBoost model.

Table 1. Chemical composition of 2024-T3 Al alloy (wt.%).

Cu	Si	Fe	Mn	Mg	Zn	Cr	Ti	Other	Al
3.8–4.9	0.5	0.5	0.3–0.9	1.2–1.8	0.25	0.1	0.15	0.15	Balance

Table 2. Mechanical properties of 2024-T3 Al alloy.

Elastic Modulus E (GPa)	Tensile Strength σ_b (MPa)	Yield Strength σ_S (MPa)	Elongation δ (%)
74.0	466	333	22.8

Table 3. Statistical analysis of fatigue life.

Minimum (cycles)	Maximum (cycles)	Mean (cycles)	Median (cycles)	Standard Deviation (cycles)
15,161.00	827,501.00	141,953.15	76,693.00	174,688.49

Table 4. Comparison of prediction accuracy for different models.

ML Model	R²	MAPE [%]
RF	0.91	22.34
SVM	0.88	26.77
XGBoost	0.93	16.34

Table 5. The results of XGBoost and PSO-XGBoost.

ML Model	R²	MAPE [%]
XGBoost	0.93	16.34
PSO-XGBoost	0.96	11.89

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
