1. Introduction
Steel is among the most common engineering materials because it combines strength, plasticity, availability, and cost-effectiveness [1]. It is used in a wide array of structural and mechanical applications, including bridges, automobiles, pipelines, aircraft, and pressure vessels [2]. Even steels with high static strength gradually accumulate damage under cyclic loading and eventually fail through fatigue [3]. Fatigue is the process by which a material under cyclic or fluctuating loading fractures at a load level below its tensile strength [4]. The outcome is in most instances a sudden and catastrophic failure without noticeable prior deformation or warning signs, which makes fatigue an important factor in engineering design and maintenance [5].
The fatigue response of steel under cyclic loading is affected by a broad variety of parameters, including chemistry, microstructure, loading conditions, shape, fabrication processes, and heat treatment parameters [6]. These parameters interact in a complex manner to determine how cracks form and grow. With so many influencing parameters, accurately predicting fatigue life remains an ongoing challenge in engineering [7]. Traditional empirical methods such as stress versus number of cycles (S-N) curves yield valuable but limited information, especially when complex material histories and processing conditions are involved [8]. More accurate predictions require more nuanced, data-driven methods that capture the interrelationships among the parameters that influence fatigue behavior [9].
One of the most influential factors affecting steel’s fatigue performance is its thermal and mechanical treatment history, specifically heat treatment and deformation processes such as rolling and forging [10]. Parameters including total heat treatment time (THT) and reduction ratio (RedRatio) play a significant role in determining the final microstructure, grain structure, and residual stress state of the steel, and hence affect how and when cracking develops during subsequent cyclic loading [11]. For instance, faulty heat treatment can produce coarse grains and residual tensile stresses that reduce crack initiation life [12]. Similarly, the amount of plastic deformation induced during rolling or forging, measured by the reduction ratio, can lengthen or shorten fatigue life depending on its level and homogeneity [13].
Experimental research and industrial observation have repeatedly indicated that changes in process conditions result in significant changes in fatigue life, even for specimens of the same material grade [14]. However, quantifying each input parameter’s relative influence on fatigue behavior is still challenging [15]. Multicollinearity among variables, noise in material property measurements, and complex interdependencies among parameters add to this challenge [16]. This emphasizes the need for statistically rigorous analysis approaches that can evaluate each causative factor’s relative importance in an interpretable and transparent manner [17].
Advances in computational modeling over the past few years have made it possible to create more intricate fatigue life prediction models that use a larger variety of input variables and incorporate practical data more flexibly. While such predictions can be very beneficial, the main shortcoming that remains is the limited visibility of the path from inputs to outputs [18]. That is, even if predictions of fatigue life are accurate, it is usually unknown which variables drive them and to what degree [19]. This lack of interpretability can limit usefulness in practical decision-making, such as finding optimal manufacturing settings or assessing quality in an industrial setting [20,21].
In this context, an accurate measure of input significance is no longer merely desirable but imperative [22]. Understanding the main factors that influence the fatigue behavior of materials allows engineers and scientists to focus on improving those parameters, tightening process control, and reducing variability in performance [23]. Furthermore, once each parameter’s contribution is understood, irrelevant variables can be removed from the prediction model, which simplifies calculations, reduces computational load, and improves resistance to overfitting and noise [24]. These insights prove particularly valuable when testing and data gathering are restricted and each input variable carries a cost in measurement effort or equipment [25,26,27,28].
The purpose of this study is to explore the relative significance of the variables that influence the fatigue life of steel components, specifically processing parameters. By investigating attributes including RedRatio, THT, and others, this work aims to identify the variables that play the most significant roles in fatigue behavior and to quantify their influence. The research relies on detailed data from steel fatigue experiments conducted under controlled conditions to investigate the factors that govern the performance of the material under repeated loading.
To meet this goal, the research employs advanced analytical methods to determine the individual and collective effects of multiple input variables on fatigue performance. This includes measuring each variable’s variance and contribution, identifying leading factors, and evaluating how changes in major parameters affect the system response. Through this feature-based understanding and quantification, the study contributes to the understanding of fatigue behavior in steel and offers suggestions for improving both performance and processing. Ultimately, it seeks to link observation-based data with practical engineering applications to guide design and manufacturing decisions for steel components with improved fatigue resistance.
1.1. Related Work
Huang et al. [5] examine the fatigue properties of steel parts manufactured by Wire-Arc Directed Energy Deposition (DED), also referred to as Wire-Arc Additive Manufacturing (WAAM), a high-efficiency metal 3D printing technique that is increasingly being evaluated for structural applications because of its cost-effectiveness and scalability. Despite these benefits, data on the structural properties of WAAM-manufactured parts are lacking, particularly on their fatigue properties, which are crucial for ensuring long-term durability in practical engineering applications. To fill this gap, an extensive series of 75 uniaxial high-cycle fatigue tests was carried out on as-built (original rough surface) and machined (smoothed surface) WAAM steel coupons. Tests were conducted over various stress ranges and stress ratios (R = 0.1 to 0.4). Numerical modeling was also performed to explore the stress concentrations induced by the as-built surface. The analysis used both S-N (stress-life) and CLD (constant life diagram) approaches. The results indicated that as-built surface undulations significantly deteriorate fatigue performance, reducing the endurance limit and fatigue life by about 35% and 60%, respectively, compared to machined specimens. Changes in stress ratio, however, did not significantly affect fatigue strength. As-built WAAM specimens were found to behave like conventional steel welds, while machined WAAM specimens behaved similarly to structural steel S355. Finally, preliminary S-N curves for both local and nominal stresses were proposed to serve as a baseline reference for future structural design using WAAM steel.
Zhai et al. [29] focus on Structural Health Monitoring (SHM) of civil structures, specifically the identification of fatigue cracks in steel-girder bridges. Traditional visual inspections require considerable time and manual labor. Although computer vision and machine learning-based approaches offer a more efficient alternative, they are hindered by the lack of high-quality image data from real damaged structures, particularly those suffering from fatigue and failing in service. To counter this data scarcity, the authors propose combining synthetic data augmentation with real-world image databases. First, 3D graphical models of steel surfaces are produced with randomly generated textures representing simulated fatigue cracks. These are rendered as synthetic images under varying lighting and camera settings that closely resemble actual inspection conditions. A fully convolutional network (FCN) is then trained in two configurations: one with real images only and the other with real and synthetic data combined. The experiments show that enriching the dataset with synthetic images enhances crack detection performance. In particular, Intersection over Union (IoU) increases from 35% to 40%, and accuracy increases from 49% to 62% through the inclusion of synthetic data. This demonstrates that synthetic data can help address the shortage of data in SHM applications and facilitate the development of better and more scalable machine learning-based crack detection for steel infrastructure.
Sousa et al. [30] study the fatigue of bonded single lap joints (SLJs) by jointly investigating three main factors: joint geometry (adhesive thickness and overlap length), substrate material (GFRP, steel, and CFRP), and adhesive type (epoxy and methacrylate). Their controlled cyclic fatigue experiments clarify how each variable affects joint lifetime under cyclic loading. Although the methacrylate adhesive had lower static strength, it showed the longest fatigue life, up to 20 times longer under the same load ratio, even when the results were normalized to the ultimate static failure load. Substrate flexibility had the largest effect: GFRP substrates caused a drastic drop in fatigue life (more than 10 times) in comparison with steel. Notably, the fatigue life of dissimilar-material joints improved due to stress redistribution enabled by the flexibility of the adhesive, which finite element analysis confirmed. In CFRP joints with an epoxy adhesive, fatigue performance was reduced by interfacial failure. Regarding geometry, thicker adhesive layers and longer overlaps led to lower fatigue life, which is contrary to the general assumption described in several parametric studies. According to those parametric studies, the influence of overlap length changes as the other parameters vary. The results are presented as a design and materials selection guide for increasing the fatigue resistance of structural adhesive joints.
In contrast, the present study advances the field by combining ensemble gradient boosting (Histogram and Categorical Gradient Boosting) with novel metaheuristic optimizers (Prairie Dog Optimization and Wild Geese Algorithm) to enhance model accuracy and generalization. Furthermore, Shapley Additive Explanations (SHAP) are employed to quantify feature importance and improve model interpretability—addressing one of the major limitations of prior ML-based fatigue models. To the best of the authors’ knowledge, this is the first study integrating such hybrid optimization–boosting frameworks for fatigue life prediction of steel while providing an explainable analysis of influential factors such as reduction ratio and heat treatment time.
1.2. Study Objective and Novelty
This study presents a new approach to understanding steel fatigue by shifting from traditional surface-based or material-based evaluations to an examination based on process features. In contrast to previous research that largely dealt with surface conditions, bond types, or material interfaces, the focus here is on how the manufacturing inputs interact to affect fatigue performance and, more specifically, which inputs have the greatest bearing on endurance over time. What distinguishes this work is that it combines hybrid modeling approaches with a strong framework for interpretation. Histogram Gradient Boosting (HGB) and Categorical Gradient Boosting (CAT) models are combined with two nature-inspired optimization algorithms, Prairie Dog Optimization (PDO) and the Wild Geese Algorithm (WGA), to form four hybrid models. These models are trained not only to forecast fatigue behavior accurately but also to explain the influence of each input parameter. This dual capability turns traditional black-box modeling into an explanatory process. To determine objectively which input properties most strongly influence fatigue life, feature importance is computed via SHAP values. This makes it possible to rank input parameters by their influence on the model output. Rather than treating all variables alike, the method singles out the attributes that require close attention during design and manufacturing, namely those that drive variation in the system and have the most significant impact on fatigue outcomes. The core novelty of this work is that it merges predictive modeling with clear interpretation. The findings improve prediction precision and also show engineers and decision-makers where effort is required.
This minimizes unnecessary experimentation, increases structural reliability, and results in more effective design approaches. In practice, it fills an essential gap between data-based prediction and engineering intuition. By providing a transparent and interpretable route toward understanding steel’s fatigue behavior, it allows manufacturers and scientists to go beyond trial and error. By making the contribution of each input variable explicit, it paves the way for more intelligent and resilient part design in structural applications and ultimately enhances safety, lifespan, and performance in real-world structures.
3. Results and Discussion
In this work, steel fatigue behavior was predicted from a feature-rich data set comprising thermal treatment parameters, chemical constituents, and geometric properties. The selected features include heat treatment variables, chemical constituents, and form parameters such as RedRatio and specimen diameters. The target variable was the number of cycles to failure. Two learning algorithms, HGB and CAT, were used to model the fatigue response. They were combined with two nature-inspired optimizers, PDO and WGA, to produce four hybrid prediction models: HGPD, HGGW, CAPD, and CAGW. Each hybrid model sought to enhance learning accuracy and generalizability by tuning parameters with the optimizer. The performance of these models was examined with a range of statistical measures, including MAE, RMSE, R², VAF, SI, SMAPE, and U95. These metrics provided a fair and transparent evaluation of each model’s efficacy in predicting complex fatigue behavior.
Figure 4 illustrates the RMSE optimization trajectories for the four predictive models (CAPD, CAGW, HGPD, and HGGW) over 200 algorithmic iterations. Each curve represents a single deterministic optimization process rather than multiple independent runs. During each optimization, a population of 30 individuals was initialized, and the process converged after 200 iterations. To ensure stability, three independent optimization rounds were executed, and the trajectory with the best performance was selected for presentation. Among the models, the HGGW framework achieved the lowest RMSE (14.02), demonstrating superior convergence and predictive stability.
To ensure a fair and efficient optimization process, all hyperparameters were tuned using a random search strategy with 200 iterations. The following search spaces were employed for the hybrid models: epochs ∈ [10, 100], learning rate ∈ [0.0001, 0.01], number of membership functions (num_mf) ∈ [2, 10], alpha (regularization coefficient) ∈ [1 × 10⁻⁸, 0.1], and length scale ∈ [1, 50]. The selected parameter ranges were determined through preliminary experiments to balance computational efficiency and coverage of plausible values reported in previous studies. The optimal values obtained for each model are summarized in Table 3. Each optimization process was initialized with a population size of 30 and executed for a maximum of 200 iterations. The exploration–exploitation coefficient (β) was fixed at 0.6, and the cognitive/social coefficients (c₁, c₂) were set to 1.5 each to maintain balanced information sharing among candidate solutions. Random numbers were drawn from a uniform distribution on [0, 1], and the stopping criterion was defined as a change in RMSE of less than 10⁻⁶ across ten consecutive iterations. These settings ensured robust convergence while maintaining computational efficiency.
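The population size, iteration cap, and stopping rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the PDO or WGA update rules used in the study: candidates are drawn by plain uniform sampling, `objective` stands in for a validation-RMSE evaluation, and the function names are hypothetical. The stopping rule is read here as ten successive iteration-to-iteration changes in the best RMSE, each below the tolerance, which is one plausible interpretation of the criterion.

```python
import random


def has_converged(rmse_history, tol=1e-6, patience=10):
    """True when each of the last `patience` changes in best RMSE is below `tol`."""
    if len(rmse_history) <= patience:
        return False
    recent = rmse_history[-(patience + 1):]
    return all(abs(recent[i + 1] - recent[i]) < tol for i in range(patience))


def population_search(objective, bounds, pop_size=30, max_iter=200, seed=0):
    """Uniform-sampling stand-in for a metaheuristic (assumption, not PDO/WGA)."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    history = []  # best objective value after each iteration
    for _ in range(max_iter):
        for _ in range(pop_size):
            # Draw one candidate uniformly inside the search bounds.
            x = [rng.uniform(lo, hi) for lo, hi in bounds]
            f = objective(x)
            if f < best_f:
                best_x, best_f = x, f
        history.append(best_f)
        if has_converged(history):
            break
    return best_x, best_f, history


if __name__ == "__main__":
    # Toy objective with a known minimum at learning_rate = 0.005 (hypothetical).
    toy = lambda x: (x[0] - 0.005) ** 2
    x, f, hist = population_search(toy, bounds=[(0.0001, 0.01)])
    print(x, f, len(hist))
```

With a real optimizer, only the candidate-generation step inside the loop would change; the bookkeeping of the best RMSE per iteration and the convergence check would stay the same.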
Table 4 presents a detailed comparative assessment of six prediction models, CAT, CAGW, CAPD, HGB, HGGW, and HGPD, across the training, validation, and test phases of steel fatigue prediction. The evaluation metrics are R² (the coefficient of determination), RMSE, MAE, VAF, SI, U95, and SMAPE, which collectively give a complete overview of each model’s accuracy, reliability, and generalizability. Among all models, HGGW performs best throughout all phases. It obtains the best R² (validation 0.998 and test 0.997), the minimum RMSE (validation 16.34 and test 15.41), and the lowest MAE and SI, demonstrating the least prediction error and maximum robustness. These outcomes show that HGGW is best suited to predicting steel’s fatigue behavior consistently and accurately. On the other hand, HGB is the worst performer. It gives poor results on both validation and test data, with the lowest R² values (0.887 and 0.837) and the highest RMSE and U95, which indicate less stability and more uncertainty. The practical value of these findings lies in the precise modeling of steel behavior under cyclic loading. With HGGW identified as the best model, engineers can make better predictions of service life and design safer, more resilient steel structures. This is particularly important in aerospace, transportation, and civil engineering, where early-onset fatigue failure can have dire consequences. Choosing the right model therefore contributes directly to safety, economy, and material efficiency.
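For readers who want to reproduce this kind of comparison, the sketch below computes several of the listed metrics from measured and predicted fatigue lives. The formulas are common textbook definitions and are assumptions here, since this section does not restate them: VAF is taken as the percentage of target variance not left in the residuals, SI as RMSE divided by the mean measured value, and SMAPE as the mean of 2|e|/(|y| + |ŷ|) in percent. U95 is omitted because its definition varies between studies. The function name is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, VAF (%), SI, and SMAPE (%) under the stated definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_err = sum(errors) / n

    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    var_true = ss_tot / n
    rmse = math.sqrt(ss_res / n)

    smape = 100.0 / n * sum(
        2.0 * abs(e) / (abs(t) + abs(p))
        for e, t, p in zip(errors, y_true, y_pred)
    )
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "MAE": sum(abs(e) for e in errors) / n,
        "VAF": 100.0 * (1.0 - var_err / var_true),
        "SI": rmse / mean_true,
        "SMAPE": smape,
    }


if __name__ == "__main__":
    measured = [100.0, 200.0, 300.0]    # illustrative values, not study data
    predicted = [110.0, 190.0, 310.0]
    print(regression_metrics(measured, predicted))
```

The inputs are assumed to be non-constant and strictly positive, as is the case for cycles to failure; otherwise R², SI, and SMAPE are undefined.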
Among all hybrid configurations, the HGGW model (Histogram Gradient Boosting optimized by the Wild Geese Algorithm) exhibited the highest predictive accuracy with an R² value of 0.998. This superior performance can be attributed to the effective interaction between the gradient boosting framework and the adaptive search dynamics of the WGA optimizer. The WGA employs a dynamic group-based search mechanism that balances global exploration and local exploitation more efficiently than PDO. This allows WGA to escape local minima and identify more optimal hyperparameter combinations for the HGB model, particularly for parameters such as learning rate, maximum depth, and number of leaf nodes. Moreover, the inherent regularization and histogram-based discretization in the HGB algorithm complement WGA’s adaptive convergence behavior, reducing overfitting and improving generalization on unseen fatigue data. The resulting synergy enhances the model’s robustness and stability across different random data splits. Overall, the HGGW model’s outstanding performance arises from the combination of WGA’s efficient parameter search and HGB’s structured ensemble learning, yielding a well-balanced model that captures nonlinear relationships while maintaining predictive smoothness and resistance to noise.
The six scatter plots in Figure 5 compare measured and predicted values for the six models. Each model is evaluated using R² and RMSE on the training, validation, and testing datasets. Among all models, HGGW produces the most accurate and stable predictions, with the highest R² values, close to 0.998, and the lowest RMSE (e.g., 14.02 for training and 15.41 for testing), indicating high generalizability. CAGW also performs strongly (up to R² = 0.958 and RMSE ≈ 42), with data points clustering closely around the line Y = X. CAPD and HGPD show moderate performance, with R² values near 0.94 and RMSE between 45 and 52. HGB and CAT, however, perform worse, with lower R² values (as low as 0.837 and 0.903) and higher RMSE (up to 74.50 and 62.47), indicating greater divergence from the measured values. The shaded region between Y = 0.8X and Y = 1.2X marks the acceptable prediction zone. Overall, the results indicate that HGGW outperforms the other models and generalizes best for this task.
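The acceptable prediction zone between Y = 0.8X and Y = 1.2X can also be summarized as a single number: the share of predictions that fall inside the band. The helper below is a minimal sketch with a hypothetical name, and it assumes strictly positive measured values, as is the case for cycles to failure.

```python
def fraction_within_band(measured, predicted, lower=0.8, upper=1.2):
    """Share of points with lower * X <= Y <= upper * X (X measured, Y predicted)."""
    inside = sum(
        1 for x, y in zip(measured, predicted) if lower * x <= y <= upper * x
    )
    return inside / len(measured)


if __name__ == "__main__":
    measured = [100.0, 200.0, 300.0, 400.0]    # illustrative values only
    predicted = [95.0, 250.0, 290.0, 330.0]
    # 250 lies above 1.2 * 200 = 240, so three of four points are inside the band.
    print(fraction_within_band(measured, predicted))  # 0.75
```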
Figure 6 shows the distribution of errors of the six modeling approaches in predicting steel’s fatigue life during the training, validation, and testing phases. The error of the HGB model is highly volatile, reaching up to ±60%, which points to a lack of stability, high sensitivity to changes in inputs, and poor generalization to new data points. The same is true of the CAT model, which exhibits high error variation during training, exposing its shortcomings in modeling noisy and nonlinear fatigue data. In contrast, the HGGW, CAGW, and CAPD models exhibit superior stability and accuracy. HGGW performs particularly well, with errors confined to a small ±4% margin in all phases, indicating high prediction accuracy and strong generalizability. CAGW, with errors below ±30%, shows better stability and reliability than the baselines. CAPD demonstrates similar performance, with less fluctuation during the test phase. HGPD, however, is more limited, with test errors that exceed 80%, which suggests overfitting or an inability to adapt to new inputs. These outcomes reveal the potential of hybrid and adaptive modeling methods: hybrids such as HGGW, CAGW, and CAPD markedly improve performance and generalization for complex fatigue cases. Such improvements are especially relevant in real-world scenarios where safety margins are low and the risk of failure is high, making these models reliable tools for monitoring the fatigue life of metal structures.
Figure 7 compares the models by means of their Regression Error Characteristic (REC) curves. An REC curve plots the fraction of predictions whose error falls within a given tolerance and thus gives a graphical picture of model accuracy and reliability. The main quantitative summary of each curve is the area under the curve (AUC): the larger the area, the better the model. HGGW has the highest AUC value of 0.985, indicating its better capability to predict fatigue trends with fewer errors. The hybrid models combine gradient boosting with metaheuristic optimization to capture the intricate and nonlinear nature of metal fatigue. In practice, precise prediction of fatigue is paramount for guaranteeing structural reliability and safety in industries including aerospace, automotive, and civil engineering. Inaccurate predictions can lead to disastrous failures and costly downtime. For this reason, models with a high AUC, such as HGGW and HGPD, provide valuable tools for preventive maintenance and lifespan maximization. The results demonstrate the practical utility of hybrid intelligent models in predictive maintenance aimed at maximizing the lifetime, safety, and cost-effectiveness of engineered systems.
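An REC curve and its area can be computed directly from the absolute errors. The sketch below uses absolute error as the deviation measure and normalizes the area by the tolerance range so that it lies in [0, 1]; both choices are assumptions, since the normalization behind the AUC values reported here is not stated in this section. The function names and the `max_tol` parameter are hypothetical.

```python
def rec_curve(y_true, y_pred, max_tol, n_points=101):
    """Return (tolerances, accuracy), where accuracy[i] is the share of
    samples whose absolute error is <= tolerances[i]."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(abs_err)
    tolerances = [max_tol * i / (n_points - 1) for i in range(n_points)]
    accuracy = [sum(1 for e in abs_err if e <= tol) / n for tol in tolerances]
    return tolerances, accuracy


def normalized_auc(tolerances, accuracy):
    """Trapezoidal area under the REC curve divided by the tolerance range."""
    area = sum(
        (tolerances[i + 1] - tolerances[i]) * (accuracy[i + 1] + accuracy[i]) / 2.0
        for i in range(len(tolerances) - 1)
    )
    return area / (tolerances[-1] - tolerances[0])


if __name__ == "__main__":
    measured = [100.0, 200.0, 300.0, 400.0]        # illustrative values only
    model_a = [102.0, 198.0, 305.0, 396.0]         # small errors
    model_b = [130.0, 160.0, 350.0, 340.0]         # larger errors
    for name, pred in (("A", model_a), ("B", model_b)):
        tol, acc = rec_curve(measured, pred, max_tol=100.0)
        print(name, round(normalized_auc(tol, acc), 3))
```

Because the curve rises faster for a model with smaller errors, that model encloses a larger area, which is why a higher AUC indicates better performance. The same `max_tol` must be used for all models being compared.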
Figure 8 presents the relative influence of individual input parameters on steel fatigue life as computed using the Cosine Amplitude Method (CAM). CAM quantifies the linear association between each input vector and the target variable by measuring their cosine similarity. This approach highlights global proportional trends between process variables and fatigue behavior. Among all inputs, the number of thermal cycles (NT) exhibits the highest CAM coefficient (0.503), signifying that repeated heat-treatment cycles exert the strongest linear correlation with fatigue performance. This finding agrees with metallurgical evidence that cyclic heating and cooling promote microstructural instability, leading to crack initiation and fatigue degradation. Secondary contributors include Chromium content (Cr, 0.233) and cooling time (Ct, 0.130), both of which influence grain refinement and residual stress formation during processing. In contrast, parameters such as reduction ratio (RedRatio) and Molybdenum content (Mo) show minimal direct linear correlation with fatigue life, indicating weaker overall trends in the dataset.
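As a minimal sketch of the CAM calculation, the cosine similarity between an input column x and the target y is r = Σ xₖyₖ / √(Σ xₖ² · Σ yₖ²). The code below implements this expression and ranks inputs by it. The function names and the toy data are hypothetical, and the vectors are assumed to be non-zero. Note that the data are not mean-centered, so for non-negative inputs the coefficient lies between 0 and 1.

```python
import math


def cosine_amplitude(x, y):
    """Cosine similarity between an input vector x and the target vector y."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def rank_by_cam(features, target):
    """Sort feature names by CAM coefficient, strongest first."""
    scores = {name: cosine_amplitude(vals, target) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy columns only; these are not values from the study's dataset.
    features = {"feature_a": [1.0, 2.0, 3.0], "feature_b": [3.0, 2.0, 1.0]}
    target = [2.0, 4.0, 6.0]
    for name, score in rank_by_cam(features, target):
        print(name, round(score, 3))
```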
From an engineering perspective, identifying such strongly correlated variables is essential for prioritizing process control. In industrial sectors where steel components endure repeated mechanical and thermal loading—such as aerospace landing gear, automotive shafts, and turbine blades—focusing on parameters like NT and Cr can significantly improve fatigue resistance, extend service life, and enhance operational safety.
Figure 9 illustrates feature importance derived from the Shapley Additive Explanations (SHAP) analysis, which provides a model-based interpretation of how each variable influences the predictions of the hybrid boosting models. Unlike CAM, SHAP evaluates each feature’s marginal contribution to the predicted fatigue life by averaging its effect across all possible combinations of features. This enables the capture of nonlinear interactions and synergistic effects among thermal, chemical, and geometric parameters. According to the SHAP analysis, Chromium (Cr) exhibits the highest mean absolute SHAP value, confirming it as the most influential variable in predicting fatigue performance. This aligns with Chromium’s known role in enhancing hardenability, corrosion resistance, and crack-growth retardation in steels. Nickel (Ni) and TT1 (first-stage tempering temperature) follow closely, reflecting their contribution to toughness and microstructural stability. Moderate importance is also observed for Carbon (C), Manganese (Mn), and Molybdenum (Mo), which jointly affect solid-solution strengthening. Conversely, NT, THT, and specimen-geometry variables (dB, dC) exhibit lower SHAP magnitudes, implying limited incremental predictive power once other interacting variables are accounted for.
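The idea of averaging a feature's marginal contribution over all feature combinations can be made concrete with a brute-force Shapley calculation. The sketch below is an illustration under simplifying assumptions, not the procedure of the SHAP library: "absent" features are replaced by a single baseline vector rather than averaged over background data, and the cost grows exponentially with the number of features, so it is only practical for a handful of inputs. The toy model and function names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their actual value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def mean_abs_importance(model, samples, baseline):
    """Global importance: mean absolute Shapley value per feature."""
    totals = [0.0] * len(baseline)
    for x in samples:
        for i, p in enumerate(shapley_values(model, x, baseline)):
            totals[i] += abs(p)
    return [t / len(samples) for t in totals]


if __name__ == "__main__":
    # Toy model with one additive term and one interaction term.
    model = lambda z: 2.0 * z[0] + z[1] * z[2]
    phi = shapley_values(model, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
    print(phi)  # the interaction term is shared equally between features 1 and 2
```

The values always sum to the difference between the prediction and the baseline prediction, which is the additivity property that makes per-feature contributions directly comparable.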
The differences between CAM and SHAP rankings arise from their underlying principles: CAM measures global linear correlations, whereas SHAP quantifies nonlinear, model-learned contributions. Therefore, while NT appears most prominent in CAM due to its strong direct correlation with fatigue life, SHAP identifies Cr and Ni as dominant when nonlinear metallurgical interactions are considered. Together, these complementary analyses provide a comprehensive understanding of feature influence—from simple linear trends to complex coupled effects—guiding the optimization of processing parameters for improved fatigue durability.
SHAP was selected for this study because of its model-agnostic and additive properties, which make it especially suitable for tree-based ensemble learners such as HGB and CAT. Its ability to quantify each feature’s marginal contribution with exact additivity ensures consistent, physically interpretable explanations that bridge data-driven predictions with metallurgical understanding.
To further ensure reliability, potential multicollinearity among metallurgical variables (e.g., Ni, Cr, Mo) was analyzed during pre-processing, and redundant inputs were minimized through normalization and correlation screening. Consequently, the variations between CAM and SHAP rankings stem solely from methodological differences rather than dataset inconsistencies. While both approaches offer valuable insights, SHAP provides the more comprehensive interpretive framework for this study, as it captures nonlinear dependencies and feature interactions intrinsic to the hybrid boosting models. CAM serves as a complementary tool to visualize global linear trends and to cross-validate feature relevance.
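A correlation screen of the kind mentioned above can be sketched as follows. The threshold of 0.9, the function names, and the toy columns are assumptions for illustration; the study does not state the cut-off it used. The code flags feature pairs whose absolute Pearson correlation meets the threshold, leaving the decision about which member of a pair to drop to the analyst.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlated_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


if __name__ == "__main__":
    # Toy columns only; these are not compositions from the study's dataset.
    features = {
        "Cr": [1.0, 2.0, 3.0, 4.0],
        "Ni": [2.0, 4.0, 6.0, 8.1],
        "Mo": [4.0, 1.0, 3.0, 2.0],
    }
    for a, b, r in correlated_pairs(features):
        print(a, b, round(r, 3))
```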
To benchmark the performance of the proposed hybrid models, Table 5 compares the results of this study with several recently published fatigue life prediction models employing different machine learning frameworks. Zahran et al. [39] utilized a Gradient Boosting Regressor (GBR) and obtained an R² of 0.9332, while He et al. [40] implemented a Convolutional Neural Network (CNN) with R² = 0.972. Zhang et al. [41] applied a Deep Neural Network (DNN) model with relatively lower accuracy (R² = 0.893), and Guo et al. [42] adopted a Regression Tree (RT) yielding R² = 0.9484. Huang et al. [43] combined CNN and LSTM architectures, achieving R² = 0.9719. In contrast, the present HGGW model (Histogram Gradient Boosting + Wild Geese Algorithm) achieved an R² of 0.995, demonstrating higher predictive accuracy than the previously reported models. The improvement can be attributed to the synergistic optimization of hyperparameters via WGA and the strong nonlinear learning capability of Histogram Gradient Boosting, which together enhance generalization and convergence stability. This comparison highlights that the proposed hybrid optimization–boosting strategy represents a notable advancement in data-driven fatigue life prediction methods.
4. Discussion
This study employed a feature-based approach to enhance both the accuracy and interpretability of fatigue life prediction in steel. By combining advanced ensemble learners with nature-inspired optimization algorithms, the research successfully captured the nonlinear relationships among metallurgical, thermal, and geometric factors affecting fatigue behavior. Among the proposed hybrid models, the HGGW model demonstrated the highest performance with minimal error and superior generalization, validating the effectiveness of coupling gradient boosting with metaheuristic optimization. A key strength of this work lies not only in its predictive performance but also in its interpretability. The application of SHAP (Shapley Additive Explanations) analysis allowed a detailed evaluation of each feature’s contribution to fatigue life, revealing both expected and novel patterns. Results indicated that parameters such as Chromium (Cr), Nickel (Ni), and tempering temperature (TT1) exert the most significant influence, confirming their metallurgical roles in strengthening mechanisms, residual stress control, and microstructural refinement. At the same time, the Cosine Amplitude Method (CAM) analysis highlighted the number of heat-treatment cycles (NT) as the most linearly correlated factor. The integration of both SHAP and CAM therefore provided complementary insights—one capturing nonlinear model-driven interactions and the other linear global trends.
From an engineering perspective, these findings have practical implications for material design and process optimization. Understanding how variables such as heat-treatment cycles, alloy composition, and reduction ratio interact enables engineers to fine-tune manufacturing parameters to maximize fatigue resistance. Such data-driven interpretability transforms predictive models from “black boxes” into tools for informed decision-making in industrial applications such as aerospace, transportation, and civil infrastructure. This research also contributes methodologically by demonstrating how hybrid optimization techniques—specifically PDO and the WGA—can effectively tune model hyperparameters to achieve superior results without manual calibration. This approach strengthens model robustness and adaptability for complex materials problems where parameter interactions are highly nonlinear.
However, the study also has certain limitations. The dataset employed was sourced from an existing repository, meaning it may not fully capture temporal or environmental variations present in real-world conditions. Moreover, the computational cost associated with metaheuristic optimization increases with data complexity, which could limit scalability for larger industrial datasets. Future work should therefore explore real-time adaptive learning frameworks capable of integrating live sensor data from structural health monitoring systems and develop physically informed features that combine experimental metallurgy with machine learning insights. The present analysis focused on low-cycle fatigue (LCF) data with fatigue lives below approximately 10⁴ cycles. Although the proposed hybrid models exhibited strong predictive performance within this regime, high-cycle and very-high-cycle fatigue (HCF/VHCF) conditions often present greater variability and scatter. Future studies should extend the model framework to include broader fatigue life ranges and multiple loading regimes to fully assess its generalization capability and robustness.
While the current study focuses on steel fatigue under typical laboratory conditions, the proposed hybrid models can potentially be extended to more extreme scenarios, including ultra-high-cycle fatigue (UHCF) and corrosive environments, as well as to other engineering materials such as aluminum alloys or composites. Future research will explore transfer learning approaches that leverage models pretrained on steel datasets for rapid adaptation to new materials and fatigue regimes. This approach can facilitate model generalization and reduce the experimental effort required to develop reliable predictive models across diverse material systems.