Article

Machine Learning-Based Prediction of Young’s Modulus in Ti-Alloys

1 Department of Mathematics and Natural Sciences, International University of Science and Technology in Kuwait, Ardiya 92400, Kuwait
2 Energy and Building Research Centre, Kuwait Institute for Scientific Research, P.O. Box 24885, Safat 13109, Kuwait
3 School of Engineering, The University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand
* Author to whom correspondence should be addressed.
Metals 2026, 16(2), 233; https://doi.org/10.3390/met16020233
Submission received: 27 January 2026 / Revised: 15 February 2026 / Accepted: 17 February 2026 / Published: 19 February 2026
(This article belongs to the Special Issue Machine Learning Models in Metals (2nd Edition))

Abstract

This study explores the use of machine learning to predict the experimental Young’s modulus of titanium alloys based on their mechanical and microstructural properties. Several regression models were developed and compared, including Random Forest, XGBoost, CatBoost, Multi-Layer Perceptron, and a Stacking Regressor. Among these, Random Forest, XGBoost, and CatBoost achieved the most accurate results, with R² values above 0.85. To improve interpretability, SHapley Additive exPlanations were applied to examine which input features most strongly influenced the predictions. The results showed that yield strength, hardness, and the molybdenum equivalent parameter (moe) were among the most influential descriptors. While yield strength and hardness were positively associated with the predicted values, higher moe values corresponded to lower predicted Young’s modulus. This study focuses on the prediction of Young’s modulus, a comparatively less explored elastic property in Ti-alloy machine learning studies, and combines systematic model comparison with SHAP-based interpretability to provide physically consistent insights into feature–property relationships.

1. Introduction

For decades, titanium (Ti) and its alloys have played a crucial role in industry. Their unique properties have attracted the attention of both the scientific community and industry, which have worked to design and manufacture new types of Ti alloys [1]. Ti alloys are widely used in many engineering sectors, such as biomedical [2], marine [3], structural [4], and aerospace [5], due to their low density, high strength, corrosion resistance, biocompatibility, and relatively low Young’s modulus (YM) [6].
Depending on the temperature, Ti exists in two allotropic structures: at low temperatures it crystallizes in the hexagonal close-packed (HCP) Ti-α phase, while at high temperatures it crystallizes in the body-centered cubic (BCC) Ti-β phase [7]. The crystal structure influences properties such as ultimate tensile strength (UTS), yield strength (YS), elongation [8], hardness, and Young’s modulus (YM) [9]. In Ti alloys, the addition of alloying elements plays a crucial role in changing the microstructure, which is important when designing an alloy for a specific application. For instance, Ti-6Al-4V, the most widely used Ti alloy in industry, has a balanced mixture of the Ti-α and Ti-β phases, creating a lamellar (α + β) microstructure [10]. YM (a measure of elastic stiffness) is one of the most important properties of any metal. In the biomedical field, for example, researchers are still trying to achieve a Ti implant with a low YM close to that of bone; a stiff Ti implant reduces the mechanical stress carried by the surrounding bone, which can result in bone resorption [11]. Commercially pure Ti (CP-Ti) has a YM of 116 GPa [12], while Ti-6Al-4V has a YM of 110 GPa [13]. On the other hand, Ti-xNb-xZr β alloys have YM values ranging between 61 and 75 GPa [14], and the Ti-39Nb alloy has a YM of 39 GPa [15].
In recent years, machine learning (ML) has played a strong role in predicting properties of different classes of materials, including polymers [16], ceramics [17], metals [18], and composites [19]. Since 1995, the mechanical properties of metal matrix composites have been investigated using artificial neural networks [20], followed by a series of investigations with the same method [21,22,23]. More recently, advanced regression models such as Random Forest (RF), XGBoost, CatBoost, and neural networks have been applied to problems such as predicting low-modulus Ti alloys [24], the sintered density of bronze [25] and Cu-Al alloys [26], the ultimate tensile strength of steels [27], and the ML-driven design and optimization of NiTi-based shape memory alloys with enhanced elastocaloric performance [28]. Accordingly, this study aims to predict the YM of Ti alloys using ML regression models and to identify the most influential governing features through model interpretability analysis.
Although the machine learning techniques employed in this work are well established, the contribution of the present study lies in its focused investigation of Young’s modulus, which has received comparatively less attention than strength-related properties in previous Ti-alloy machine learning studies. In addition, the systematic comparison of multiple models combined with SHAP-based analysis enables the prediction results to be examined in relation to physically meaningful descriptors, such as phase constitution and mechanical properties. This approach supports a more transparent interpretation of data-driven Young’s modulus prediction using a consolidated experimental dataset.

2. Materials and Methods

In this section, general information about the database used in the study is presented, followed by data analysis based on the key mechanical and microstructural properties of Ti alloys. Several machine learning regression models, including ensemble and neural network approaches, were applied to predict the YM of Ti-alloys. In addition, feature importance was evaluated using SHAP (SHapley Additive exPlanations) to provide a more interpretable understanding of the predictions. The overall procedure followed in this study is illustrated in Figure 1.

2.1. Database Description

An experimental dataset containing mechanical properties of 282 distinct multicomponent Ti-based alloys, together with their associated microstructural and phase-related descriptors, was used in this study. The dataset includes mechanical indicators such as yield strength and hardness, compositional and phase-related variables reflecting alloy chemistry and phase constitution, and selected microstructural features reported in the original sources. A detailed description of the full dataset and data collection procedure can be found in the related reference [29].

2.2. Data Processing and Analysis

Before starting the prediction and regression analysis, the database was examined in detail. As can be seen in [29], the table contains many mechanical properties and microstructural features of Ti-alloys. Some of the reported properties were not directly relevant to the objective of predicting YM. To better understand the relationships among the variables, a correlation analysis was performed between the numerical features and the target variable. Based on this analysis, features showing very weak association with YM or strong overlap with other variables were excluded from the modeling stage to avoid redundancy and maintain a clearer input structure. The numerical properties considered in this study, along with their corresponding descriptions, are summarized in Table 1. A correlation heatmap of the remaining independent variables is presented in Figure 2, while the distribution curves of the numerical properties are shown in Figure 3.
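The correlation-based screening described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the column names `x1`, `x2`, `x3`, the toy values, and the two thresholds (`min_target_corr`, `max_pair_corr`) are all assumptions, since the source does not report the cut-offs that were used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_features(columns, target, min_target_corr=0.3, max_pair_corr=0.95):
    """Keep features that correlate with the target and do not duplicate a kept one."""
    # Visit candidates from strongest to weakest association with the target.
    candidates = sorted(
        (name for name in columns if name != target),
        key=lambda name: abs(pearson(columns[name], columns[target])),
        reverse=True,
    )
    kept = []
    for name in candidates:
        if abs(pearson(columns[name], columns[target])) < min_target_corr:
            continue  # very weak association with the target
        if any(abs(pearson(columns[name], columns[k])) > max_pair_corr for k in kept):
            continue  # strong overlap with a feature that is already kept
        kept.append(name)
    return kept

# Hypothetical toy data: x1 and x2 are nearly redundant, x3 is close to noise.
data = {
    "YM": [110.0, 95.0, 80.0, 65.0, 60.0],
    "x1": [900.0, 820.0, 700.0, 560.0, 520.0],
    "x2": [980.0, 900.0, 770.0, 640.0, 600.0],
    "x3": [12.0, 9.0, 14.0, 10.0, 13.0],
}
selected = select_features(data, "YM")
print(selected)
```

With these toy values only one of the two redundant columns survives and the weakly correlated column is dropped. The actual feature set retained in the study is the one listed in Table 1.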
For each numerical property, the count (number of records available in the data), mean, standard deviation, minimum, median (50th percentile), and maximum were determined and are shown in Table 2. The descriptive statistics reflect the actual number of available records for each feature, as some experimental values were not reported in the source dataset.
Categorical properties and their descriptions can be found in Table 3. The count plots (Figure 4) show how often each category appears in the dataset. The parameter “moe” denotes the molybdenum equivalent, a compositional index commonly used to quantify β-phase stability in titanium alloys, while “moe_class” represents its categorical classification. In the moe_class feature, most samples belong to the ‘meta’ and ‘near’ groups, while the others are less common. For the Ph1 (primary phase), ‘Ti-beta’ is the most frequent by far. In Ph2, ‘Ti-alpha-dp’ and ‘Ti-omega’ appear the most, but some categories have only a few samples. The Ph3 feature has very few entries overall, with ‘Ti-omega’ being the most common. Finally, for the Condition feature, most samples were either solution treated and water quenched (ST + WQ) or used in their as-cast form. These plots show that the dataset is not evenly distributed across categories, which may affect model performance.
Before starting the regression analysis, missing values were handled. Numerical features were completed using the median value of each variable, while categorical features were filled using the most frequent category. To verify that the results were not strongly dependent on this choice, an additional test using mean imputation for numerical variables was performed. The overall predictive performance remained at a similar level, with only minor changes in the relative ranking of the tree-based models.
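The imputation rule described above (median for numerical features, most frequent category for categorical ones) can be illustrated with a short standard-library sketch. The example values are invented for illustration, and the helper names `impute_numeric` and `impute_categorical` are hypothetical; the source does not show its own implementation.

```python
from collections import Counter
from statistics import median

def impute_numeric(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def impute_categorical(values):
    """Replace missing entries (None) with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]

# Illustrative columns with gaps (not taken from the dataset).
hv = [283.0, None, 310.0, 134.0, None]
ph2 = ["Ti-omega", None, "Ti-alpha-dp", "Ti-omega"]

hv_filled = impute_numeric(hv)
ph2_filled = impute_categorical(ph2)
print(hv_filled)
print(ph2_filled)
```

A common precaution, which the source does not discuss, is to compute the fill values on the training split only and reuse them for the test split, so that no information from the test data leaks into preprocessing.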
Since most machine learning algorithms operate only on numerical inputs, categorical features must be converted into numerical form before being passed to the model. For this purpose, one-hot encoding was applied in Python (version 3.12.2). One-hot encoding represents each category of a categorical variable as a separate binary (0/1) column and has been used in particular with RF [30].
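As a minimal sketch of what one-hot encoding does to a categorical column, the following plain-Python function expands a list of labels into one binary column per category. The function name `one_hot` and the sample labels are assumptions for illustration; the source does not state which library routine was used.

```python
def one_hot(values, prefix):
    """Expand a categorical column into one 0/1 column per distinct category."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }

# Illustrative processing conditions, using labels that appear in Table 3.
condition = ["ST-WQ", "As-cast", "ST-WQ", "PM"]
encoded = one_hot(condition, "condition")

for name, column in encoded.items():
    print(name, column)
```

Each row has exactly one column set to 1, so the encoding introduces no artificial ordering between categories.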

2.3. Machine Learning Models

In this study, five regression models were applied to predict the YM of Ti-alloys based on their mechanical and microstructural properties. These models included RF, XGBoost, CatBoost, a Multi-Layer Perceptron (MLP), and a Stacking Regressor ensemble.
RF and XGBoost were selected due to their strong performance in modeling nonlinear relationships in tabular datasets. CatBoost was included because of its ability to effectively handle categorical features without extensive preprocessing. MLP was employed to evaluate the performance of a neural network–based approach on the same dataset. In addition, a stacking regressor was implemented by combining the four base models, with linear regression used as the final estimator.

2.4. Model Training and Evaluation Metrics

The dataset was randomly divided into training (80%) and testing (20%) subsets using a fixed random seed (random_state = 42) to ensure reproducibility. Data preprocessing included median imputation for numerical variables, one-hot encoding for categorical features, and feature scaling for neural network compatibility. Unless otherwise specified, models were trained using default hyperparameter settings. To examine the stability of the results, a 5-fold cross-validation was additionally performed, and the overall performance trends were consistent with those obtained from the hold-out split.
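The 80/20 hold-out split and the 5-fold cross-validation can be sketched with index bookkeeping alone. The code below is an illustrative standard-library version with a hypothetical sample count of 100; it uses the same seed value (42) but will not reproduce the exact partition produced by the library routine used in the study, which the source does not name.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle the sample indices once and cut off the test share."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(n_samples, k=5, seed=42):
    """Yield (train, validation) index lists; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

# Hypothetical dataset size, chosen only for illustration.
train_idx, test_idx = train_test_split_indices(100)
folds = list(k_fold_indices(100))
print(len(train_idx), len(test_idx), [len(val) for _, val in folds])
```

Fixing the seed makes the partition repeatable, which is the purpose of `random_state = 42` in the text.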
To evaluate the predictive performance of each ML algorithm, the statistical metrics mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and R-squared (R²) score were calculated. These metrics’ formulas are given by Equations (1)–(4), respectively, where $y_i$ is the target value, $\bar{y}$ is the mean, $\hat{y}_i$ is the corresponding model estimate, $\hat{f}(x_i)$ is the prediction for the $i$th observation and $n$ is the size of the validation set.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{1}$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{f}(x_i)\right)^2 \tag{2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{3}$$

$$R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}} \tag{4}$$

where $\mathrm{RSS} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$ is the residual sum of squares and $\mathrm{TSS} = \sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2$ the total sum of squares [31,32], respectively. These values help us understand how close the model’s predictions are to the actual values and how well the model explains the overall variation in the data.
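For readers who want to check the definitions numerically, the four metrics can be written directly from their formulas. The function name `regression_metrics` and the three-point toy example are illustrative assumptions, not part of the original study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R^2 from their textbook definitions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    y_mean = sum(y_true) / n
    rss = sum(r * r for r in residuals)            # residual sum of squares
    tss = sum((t - y_mean) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - rss / tss
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Toy example: residuals are 1, 0, and -2.
metrics = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(metrics)
```

In this toy case RSS = 5 and TSS = 8, so R² = 1 − 5/8 = 0.375, while MAE = 1 and MSE = 5/3. Note that RMSE is by construction the square root of MSE.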

3. Results and Model Evaluation

3.1. Performance Comparison of ML Models

Table 4 presents the performance scores of all five models: RF, XGBoost, CatBoost, MLP, and the Stacking Regressor. Among them, the RF model gave the most accurate predictions, with the lowest error values and the highest R² score. XGBoost and CatBoost demonstrated comparable predictive accuracy, with XGBoost showing slightly better overall performance based on R², MSE, and RMSE, while CatBoost had the slightly lower MAE.
Based on the results in Table 4, RF achieved the highest predictive accuracy and was therefore selected for further analysis.
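The values reported in Table 4 can be cross-checked for internal consistency, since RMSE should equal the square root of MSE for every model. The short script below copies the numbers from Table 4 and ranks the models by R²; the variable names are illustrative.

```python
import math

# Values copied from Table 4.
table4 = {
    "Random Forest":      {"R2": 0.8608, "MAE": 6.5038, "MSE": 67.6474,  "RMSE": 8.2248},
    "XGBoost":            {"R2": 0.8528, "MAE": 6.6638, "MSE": 71.5531,  "RMSE": 8.4589},
    "CatBoost":           {"R2": 0.8523, "MAE": 6.5337, "MSE": 71.7996,  "RMSE": 8.4735},
    "MLP":                {"R2": 0.5127, "MAE": 9.4196, "MSE": 236.8281, "RMSE": 15.3892},
    "Stacking Regressor": {"R2": 0.7471, "MAE": 7.9212, "MSE": 122.8807, "RMSE": 11.0852},
}

# Gap between sqrt(MSE) and the reported RMSE; should be at rounding level.
rmse_gap = {name: abs(math.sqrt(row["MSE"]) - row["RMSE"]) for name, row in table4.items()}

# Rank models from highest to lowest R^2, and from lowest to highest MAE.
ranking_r2 = sorted(table4, key=lambda name: table4[name]["R2"], reverse=True)
ranking_mae = sorted(table4, key=lambda name: table4[name]["MAE"])

print(ranking_r2)
print(ranking_mae)
print(max(rmse_gap.values()))
```

The ranking by R² places RF first, followed by XGBoost and CatBoost, whereas ranking by MAE swaps the order of XGBoost and CatBoost. This is why the two boosting models are best described as comparable.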

3.2. Random Forest Prediction Results

Figure 5 presents a comparison between the experimental YM values and the corresponding predictions obtained using the Random Forest model. The predicted and experimental values show close agreement across the tested samples. The comparative results for the first 10 samples are listed in Table 5.
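The differences listed in Table 5 follow the convention actual minus predicted, which can be verified with a few lines of Python. The numbers are copied from Table 5; the resulting mean absolute error applies to these ten samples only and is not the test-set MAE reported in Table 4.

```python
# Values copied from Table 5 (first 10 test samples).
actual = [90, 80.36, 84, 56, 92, 53.7, 75, 66, 124, 80.36]
predicted = [90.4557, 80.8105, 85.8223, 55.7392, 96.8641,
             62.5383, 84.8045, 68.7397, 121.6558, 68.4279]
listed_diff = [-0.4557, -0.4505, -1.8223, 0.2608, -4.8641,
               -8.8383, -9.8045, -2.7397, 2.3442, 11.9321]

# Recompute the difference column as actual - predicted.
diff = [round(a - p, 4) for a, p in zip(actual, predicted)]

# Largest disagreement between recomputed and listed differences.
max_mismatch = max(abs(d - l) for d, l in zip(diff, listed_diff))

# Mean absolute error over these ten samples only.
mae_first10 = sum(abs(d) for d in diff) / len(diff)

print(max_mismatch)
print(round(mae_first10, 4))
```

Negative differences therefore indicate over-prediction by the model, and positive differences indicate under-prediction.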

3.3. SHAP-Based Feature Importance Analysis

SHAP (SHapley Additive exPlanations) was employed to analyze the contribution of individual input features to the predicted YM.
Among the models tested, SHAP was applied to CatBoost because it is a tree-based algorithm compatible with SHAP’s TreeExplainer, which enables efficient and reliable interpretation of feature contributions. Although RF achieved the highest predictive performance, CatBoost was selected for interpretability due to its stable SHAP behavior and comparable feature importance patterns.
Figure 6 presents the SHAP summary plot, where each point shows how much a feature influenced the prediction for one sample. The color indicates the feature value: red for high and blue for low. Points on the right side of the vertical line indicate a positive impact on the prediction (increasing the predicted YM), while points on the left indicate a negative impact. YS was the most influential feature, followed by the molybdenum equivalent parameter (moe), hardness (HV), and descriptors such as F3 (the non-linear elastic behavior flag in Table 1) and P1. Although the moe feature appears among the top three most important predictors, its SHAP values are consistently negative. This indicates that moe has a strong but inverse effect on YM; higher moe-related values are associated with lower YM predictions.
Figure 7 displays the top 10 features ranked by their mean absolute SHAP value. These results confirm that both mechanical properties and microstructural features play an important role in predicting YM in Ti-alloys.
The strong influence of YS and HV on YM is consistent with their known relationship to atomic bonding strength and phase constitution. In contrast, the negative contribution of the moe feature may be related to the elastic behavior of β-phase-stabilized Ti alloys.
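To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values for a deliberately simple toy model by averaging each feature's marginal contribution over all feature orderings. Everything in it is an assumption for illustration: the linear `toy_model`, its coefficients, and the baseline and sample values are invented, and this brute-force procedure is not the TreeExplainer algorithm applied to CatBoost in the study.

```python
from itertools import permutations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)       # start with every feature at its baseline value
        previous = predict(current)
        for i in order:
            current[i] = x[i]          # switch feature i to the sample's value
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    n_orders = factorial(n)
    return [p / n_orders for p in phi]

def toy_model(features):
    """Hypothetical linear model; coefficients are made up, not fitted to any data."""
    ys, hv, moe = features
    return 60.0 + 0.02 * ys + 0.03 * hv - 0.8 * moe

baseline = [700.0, 300.0, 10.0]   # hypothetical reference values for YS, HV, moe
sample = [900.0, 350.0, 20.0]     # hypothetical alloy with above-baseline values

phi = shapley_values(toy_model, sample, baseline)
print(phi)
print(toy_model(sample) - toy_model(baseline))
```

In this toy case the first two contributions are positive and the third is negative, and the three contributions sum to the difference between the sample prediction and the baseline prediction. That additivity is the property that allows the summary plot to be read as a per-feature decomposition of each prediction.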

4. Discussion

The results indicate that machine learning models can be used to model relationships between mechanical, microstructural, and phase-related features affecting the YM of multicomponent Ti-alloys. RF provided the highest predictive accuracy, followed by XGBoost and CatBoost with similar performance, while the Stacking Regressor and the MLP showed lower accuracy. The performance trends observed for the tree-based models may be related to their suitability for structured experimental datasets.
The SHAP-based feature importance analysis was used to examine the contribution of individual input features to the model predictions. Yield strength and Vickers hardness were found to be among the most influential positive contributors to YM. This observation reflects indirect associations rather than causal relationships, as Young’s modulus is fundamentally governed by atomic bonding and crystal structure. The prominence of these mechanical properties in the model may therefore be related to their correlation with alloy chemistry, phase constitution, and other descriptors that influence elastic behavior.
The molybdenum equivalent parameter (moe), which represents β-phase stability based on alloy composition, was found to contribute negatively to the predicted YM. This behavior is consistent with the elastic response of β-stabilized and metastable β Ti alloys, which typically exhibit lower elastic moduli than α or α + β alloys. The SHAP results therefore suggest that variations in phase stability captured by moe are reflected in the predicted elastic behavior.
Young’s modulus in titanium alloys is often interpreted in terms of bonding characteristics and phase constitution, and in some cases simple empirical relations are used to provide approximate estimates. However, such approaches typically require detailed information about phase fractions or elastic constants of individual phases, which are not consistently available in compiled experimental datasets like the one used here. The present study does not aim to replace physically based interpretations, but rather to explore how experimentally reported descriptors can be utilized within a data-driven framework. The focus is placed on comparing modeling approaches and examining their behavior, rather than on proposing a new empirical formulation.
The combined use of predictive modeling and SHAP-based analysis suggests that the proposed framework is suitable for studying relationships between input features and YM. Given that the present work relies on a compiled experimental dataset, data-driven approaches may be considered as supportive tools for the preliminary analysis of Ti-alloys with different elastic properties.
It should also be noted that Young’s modulus is primarily governed by atomic bonding and crystal structure and is generally less sensitive to microstructural variation than strength-related properties. The variability observed in the experimental YM values considered in this study arises mainly from differences in alloy chemistry, phase constitution, and measurement conditions reported across literature sources. As the dataset was compiled from multiple experimental studies, variations in testing methods and reporting practices are unavoidable. Missing or incomplete entries were handled using a consistent imputation strategy, as described in Section 2, while acknowledging that residual noise and uncertainty remain. These aspects should be considered when interpreting the reported prediction accuracy and R² values, which reflect both model performance and the inherent variability of the underlying experimental data.

5. Conclusions

In this study, machine learning models were applied to predict the Young’s modulus of multicomponent titanium alloys using mechanical, microstructural, and phase-related descriptors. The results show that data-driven approaches can capture relationships between these features and elastic behavior, with tree-based ensemble models providing reliable predictive performance. SHAP analysis indicated that yield strength, hardness, and phase-stability-related descriptors contribute meaningfully to the predictions, reflecting known structure–property trends in titanium alloys.
While the dataset was compiled from multiple literature sources and therefore contains inherent variability, the findings demonstrate that machine learning can serve as a useful complementary tool for analyzing experimentally reported alloy data. Future work may focus on expanding the dataset and incorporating additional physically informed descriptors to further improve robustness and generalization.

Author Contributions

Conceptualization, S.D., Y.A. and L.B.; methodology, S.D., Y.A. and L.B.; data curation and analysis, S.D., Y.A. and L.B.; visualization, S.D., Y.A. and L.B.; writing—review and editing, S.D., Y.A. and L.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All data related to this study are available from the author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ML: Machine Learning
YM: Young’s modulus
YS: Yield strength
UTS: Ultimate tensile strength
HV: Vickers hardness
DAR: Deformation at rupture
moe: Molybdenum equivalent parameter
RF: Random Forest
MLP: Multi-Layer Perceptron
SHAP: SHapley Additive exPlanations

References

  1. Pushp, P.; Dasharath, S.M.; Arati, C. Classification and applications of titanium and its alloys. Mater. Today Proc. 2022, 54, 537–542.
  2. Gummadi, J.; Alanka, S. A review on titanium and titanium alloys with other metals for biomedical applications prepared by powder metallurgy techniques. Mater. Today Proc. 2023, in press.
  3. Yan, S.; Song, G.-L.; Li, Z.; Wang, H.; Zheng, D.; Cao, F.; Horynova, M.; Dargusch, M.S.; Zhou, L. A state-of-the-art review on passivation and biofouling of Ti and its alloys in marine environments. J. Mater. Sci. Technol. 2018, 34, 421–435.
  4. Kang, L.; Yang, C. A Review on High-Strength Titanium Alloys: Microstructure, Strengthening, and Properties. Adv. Eng. Mater. 2019, 21, 1801359.
  5. Williams, J.; Boyer, R. Opportunities and Issues in the Application of Titanium Alloys for Aerospace Components. Metals 2020, 10, 705.
  6. Alshammari, Y.; Yang, F.; Bolzoni, L. Fabrication and characterisation of low-cost powder metallurgy Ti-xCu-2.5Al alloys produced for biomedical applications. J. Mech. Behav. Biomed. Mater. 2022, 126, 105022.
  7. Akbarpour, M.R.; Mirabad, H.M.; Hemmati, A.; Kim, H.S. Processing and microstructure of Ti-Cu binary alloys: A comprehensive review. Prog. Mater. Sci. 2022, 127, 100933.
  8. Alshammari, Y.; Jia, M.; Yang, F.; Bolzoni, L. The effect of α+ β forging on the mechanical properties and microstructure of binary titanium alloys produced via a cost-effective powder metallurgy route. Mater. Sci. Eng. A 2019, 769, 138496.
  9. Cardoso, G.C.; Buzalaf, M.A.; Correa, D.R.; Grandini, C.R. Effect of Thermomechanical Treatments on Microstructure, Phase Composition, Vickers Microhardness, and Young’s Modulus of Ti-xNb-5Mo Alloys for Biomedical Applications. Metals 2022, 12, 788.
  10. Semenova, I.; Polyakov, A.; Gareev, A.; Makarov, V.; Kazakov, I.; Pesin, M. Machinability Features of Ti-6Al-4V Alloy with Ultrafine-Grained Structure. Metals 2023, 13, 1721.
  11. Klinge, L.; Kluy, L.; Spiegel, C.; Siemers, C.; Groche, P.; Coraça-Huber, D. Nanostructured Ti-13Nb-13Zr alloy for implant application—Material scientific, technological, and biological aspects. Front. Bioeng. Biotechnol. 2023, 11, 1255947.
  12. Karre, R.; Niranjan, M.K.; Dey, S.R. First principles theoretical investigations of low Young’s modulus beta Ti–Nb and Ti–Nb–Zr alloys compositions for biomedical applications. Mater. Sci. Eng. C 2015, 50, 52–58.
  13. He, Z.; He, H.; Lou, J.; Li, Y.; Li, D.; Chen, Y.; Liu, S. Fabrication, Structure and Mechanical and Ultrasonic Properties of Medical Ti6Al4V Alloys Part I: Microstructure and Mechanical Properties of Ti6Al4V Alloys Suitable for Ultrasonic Scalpel. Materials 2020, 13, 478.
  14. Ozan, S.; Lin, J.; Zhang, Y.; Li, Y.; Wen, C. Cold rolling deformation and annealing behavior of a β-type Ti–34Nb–25Zr titanium alloy for biomedical applications. J. Mater. Res. Technol. 2020, 9, 2308–2318.
  15. Meng, Q.; Zhang, J.; Huo, Y.; Sui, Y.; Zhang, J.; Guo, S.; Zhao, X. Design of low modulus β-type titanium alloys by tuning shear modulus C44. J. Alloys Compd. 2018, 745, 579–585.
  16. Doan Tran, H.; Kim, C.; Chen, L.; Chandrasekaran, A.; Batra, R.; Venkatram, S.; Kamal, D.; Lightstone, J.P.; Gurnani, R.; Shetty, P.; et al. Machine-learning predictions of polymer properties with Polymer Genome. J. Appl. Phys. 2020, 128, 171104.
  17. Han, T.; Huang, J.; Sant, G.; Neithalath, N.; Kumar, A. Predicting mechanical properties of ultrahigh temperature ceramics using machine learning. J. Am. Ceram. Soc. 2022, 105, 6851–6863.
  18. Ghetiya, N.D.; Patel, K.M. Prediction of Tensile Strength in Friction Stir Welded Aluminium Alloy Using Artificial Neural Network. Procedia Technol. 2014, 14, 274–281.
  19. Liu, J.; Zhang, Y.; Zhang, Y.; Kitipornchai, S.; Yang, J. Machine learning assisted prediction of mechanical properties of graphene/aluminium nanocomposite based on molecular dynamics simulation. Mater. Des. 2022, 213, 110334.
  20. Kibrete, F.; Trzepieciński, T.; Gebremedhen, H.S.; Woldemichael, D.E. Artificial Intelligence in Predicting Mechanical Properties of Composite Materials. J. Compos. Sci. 2023, 7, 364.
  21. Lee, J.A.; Almond, D.P.; Harris, B. The use of neural networks for the prediction of fatigue lives of composite materials. Compos. Part A Appl. Sci. Manuf. 1999, 30, 1159–1169.
  22. Altinkok, N.; Koker, R. Neural network approach to prediction of bending strength and hardening behaviour of particulate reinforced (Al–Si–Mg)-aluminium matrix composites. Mater. Des. 2004, 25, 595–602.
  23. Koker, R.; Altinkok, N.; Demir, A. Neural network based prediction of mechanical properties of particulate reinforced metal matrix composites using various training algorithms. Mater. Des. 2007, 28, 616–627.
  24. Marković, G.; Manojlović, V.; Ružić, J.; Sokić, M. Predicting Low-Modulus Biocompatible Titanium Alloys Using Machine Learning. Materials 2023, 16, 6355.
  25. Kamal, T.; Gouthama; Upadhyaya, A. Machine Learning Based Sintered Density Prediction of Bronze Processed by Powder Metallurgy Route. Met. Mater. Int. 2023, 29, 1761–1774.
  26. Deng, Z.; Yin, H.; Jiang, X.; Zhang, C.; Zhang, K.; Zhang, T.; Xu, B.; Zheng, Q.; Qu, X. Machine leaning aided study of sintered density in Cu-Al alloy. Comput. Mater. Sci. 2018, 155, 48–54.
  27. Dinibutun, S.; Alshammari, Y.; Parol, J.; Bolzoni, L. Machine learning approaches for predicting ultimate tensile strength in 9% Cr steels, International Conference on Civil and Environmental Engineering for Resilient. Smart Sustain. Solut. 2025, 48, 501–509.
  28. Gao, Y.; Hu, Y.; Zhao, X.; Liu, Y.; Huang, H.; Su, Y. Machine-Learning-Driven Design of High-Elastocaloric NiTi-Based Shape Memory Alloys. Metals 2024, 14, 1193.
  29. Salvador, C.A.F.; Maia, E.L.; Costa, F.H.; Escobar, J.D.; Oliveira, J.P. A compilation of experimental data on the mechanical properties and microstructural features of Ti-alloys. Sci. Data 2022, 9, 188.
  30. Hussein, A.; Falcarin, P.; Sadiq, A. Enhancement performance of random forest algorithm via one hot encoding for IoT IDS. Period. Eng. Nat. Sci. PEN 2021, 9, 579–591.
  31. James, G.; Witten, D.; Hastie, T.; Tibshirani, R.; Taylor, J. An Introduction to Statistical Learning, with Applications in Python; Springer: Cham, Switzerland, 2023.
  32. Bolzoni, L.; Carson, J.K.; Yang, F. Combinatorial structural-analytical models for the prediction of the mechanical behaviour of isotropic porous pure metals. Acta Mater. 2021, 207, 116664.
Figure 1. Workflow for predicting YM of Ti-alloys using machine learning methods and SHAP analysis.
Figure 2. Correlation heatmap of independent variables.
Figure 3. Distribution plots of numerical variables. Histograms (red bars) show the frequency of observations, and the red curves represent kernel density estimates illustrating the underlying distribution patterns.
Figure 4. Count plots of categorical properties. Different colors are used to distinguish categories visually.
Figure 5. Actual and predicted values of YM.
Figure 6. SHAP summary plot showing the impact of each feature on the CatBoost model’s prediction of YM. Each point represents a sample; red indicates high feature values, and blue indicates low. Points to the right of the vertical zero line increase the predicted value, while those to the left decrease it.
Figure 7. SHAP bar plot displaying the top 10 most important features based on their average absolute SHAP values. The plot highlights which features contributed the most to the CatBoost model’s predictions.
Table 1. Numerical properties and their descriptions.

| Property/Numerical Variable | Description |
|---|---|
| YM | Experimental Young’s modulus |
| YM_err | The error associated with the Young’s modulus |
| YS | Experimental yield stress |
| UTS | Ultimate tensile strength or maximum compression strength (in compression tests) |
| UTS_err | The error associated with the ultimate tensile strength |
| DAR | Deformation at the rupture point or maximum reported strain (negative values for compression tests) |
| HV | Experimental Vickers hardness |
| HV_err | The error associated with the hardness |
| F3 | Flag equal to 1 when the material shows non-linear elastic behavior |
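Table 2. Statistics of numerical properties.

| Parameter | Count | Mean | Std | Min | 50% | Max |
|---|---|---|---|---|---|---|
| YM | 240 | 81.61 | 21.8 | 45 | 78 | 157 |
| YM_err | 128 | 3.37 | 2.78 | 0.3 | 3 | 20 |
| YS | 274 | 722.53 | 344.83 | 130 | 662.5 | 1880 |
| UTS | 193 | 888.4 | 417.32 | 360 | 762 | 2400 |
| UTS_err | 59 | 28.34 | 24.3 | 3 | 20 | 108 |
| DAR | 260 | 8.83 | 25.5 | −50 | 12.35 | 74 |
| HV | 152 | 299.94 | 94.73 | 134 | 283 | 560 |
| HV_err | 119 | 7.4 | 5.04 | 1 | 6 | 29 |
| F3 | 286 | 0.22 | 0.41 | 0 | 0 | 1 |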
Table 3. Categorical properties and their descriptions.

| Property/Categorical Variable | Description |
|---|---|
| moe_class | A classification based on the β-phase stability. Possible values are: “rich”, “near”, “meta”, “stable” or “other”. |
| ph1 | Identification of the predominant phase (matrix). |
| ph2 | Identification of the secondary phase. |
| ph3 | Identification of the tertiary phase. |
| condition | Processing conditions to which the material was subjected: ST-WQ (solution treated and water quenched), ST-AC (ST and air-cooled), ST-FC (ST and furnace cooled), PM (powder-metallurgy), As-cast. |
Table 4. Performance comparison of machine learning models in predicting the YM of Ti-alloys. The evaluation metrics include MAE, MSE, RMSE, and R². The RF model achieved the highest accuracy among all models tested.

| Model | R² | MAE | MSE | RMSE |
|---|---|---|---|---|
| Random Forest | 0.8608 | 6.5038 | 67.6474 | 8.2248 |
| XGBoost | 0.8528 | 6.6638 | 71.5531 | 8.4589 |
| CatBoost | 0.8523 | 6.5337 | 71.7996 | 8.4735 |
| MLP | 0.5127 | 9.4196 | 236.8281 | 15.3892 |
| Stacking Regressor | 0.7471 | 7.9212 | 122.8807 | 11.0852 |
Table 5. Actual and predicted values of YM for the first 10 tested samples.

| Sample | Actual Value | Predicted Value | Difference |
|---|---|---|---|
| 1 | 90 | 90.4557 | −0.4557 |
| 2 | 80.36 | 80.8105 | −0.4505 |
| 3 | 84 | 85.8223 | −1.8223 |
| 4 | 56 | 55.7392 | 0.2608 |
| 5 | 92 | 96.8641 | −4.8641 |
| 6 | 53.7 | 62.5383 | −8.8383 |
| 7 | 75 | 84.8045 | −9.8045 |
| 8 | 66 | 68.7397 | −2.7397 |
| 9 | 124 | 121.6558 | 2.3442 |
| 10 | 80.36 | 68.4279 | 11.9321 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Dinibutun, S.; Alshammari, Y.; Bolzoni, L. Machine Learning-Based Prediction of Young’s Modulus in Ti-Alloys. Metals 2026, 16, 233. https://doi.org/10.3390/met16020233

AMA Style

Dinibutun S, Alshammari Y, Bolzoni L. Machine Learning-Based Prediction of Young’s Modulus in Ti-Alloys. Metals. 2026; 16(2):233. https://doi.org/10.3390/met16020233

Chicago/Turabian Style

Dinibutun, Seza, Yousef Alshammari, and Leandro Bolzoni. 2026. "Machine Learning-Based Prediction of Young’s Modulus in Ti-Alloys" Metals 16, no. 2: 233. https://doi.org/10.3390/met16020233

APA Style

Dinibutun, S., Alshammari, Y., & Bolzoni, L. (2026). Machine Learning-Based Prediction of Young’s Modulus in Ti-Alloys. Metals, 16(2), 233. https://doi.org/10.3390/met16020233
