Machine Learning Assisted Prediction of Microstructures and Young’s Modulus of Biomedical Multi-Component β -Ti Alloys

: Recently, the development of β -titanium (Ti) alloys with a low Young’s modulus as human implants has been the trend of research in biomedical materials. However, designing β -titanium alloys by conventional experimental methods is too costly and inefﬁcient. Therefore, it is necessary to propose a method that can efﬁciently and reliably predict the microstructures and the mechanical properties of biomedical titanium alloys. In this study, a machine learning prediction method is proposed to accelerate the design of biomedical multi-component β -Ti alloys with low moduli. Prediction models of microstructures and Young’s moduli were built at ﬁrst. The performances of the models were improved by introducing new experimental data. With the help of the models, a Ti– 13Nb–12Ta–10Zr–4Sn (wt.%) alloy with a single β -phase microstructure and Young’s modulus of 69.91 GPa is successfully developed. This approach could also be used to design other advanced materials.


Introduction
Titanium (Ti) alloys are widely used in biomedical applications, such as bone-fixation devices, owing to their excellent biocompatibility, corrosion resistance, low density, and superior overall mechanical properties [1,2].The Ti-6Al-4V (wt.%) alloy has long been used as a human implant material.However, the Young's modulus of Ti-6Al-4V (110 GPa) is much higher than the modulus of human bone (<35 GPa).The differences in Young's modulus between alloys and bones could lead to the stress shielding effect and degradation of bones [3].Moreover, aluminum and vanadium elements are not ideal for human implantations, because they are cytotoxic and could cause a range of physiological disorders with long-term exposure to the physiological environment [4,5].In the last several years, some researchers found that β-Ti alloys have high strength, excellent biocompatibility, and low Young's modulus, in comparison to α-Ti (e.g., pure Ti) or α + β-Ti (e.g., Ti-6Al-4V), which could have significant potential in biomedical applications [6,7].However, it has been generally accepted that it is difficult to achieve an excellent synergy between low modulus and phase stability in β-Ti alloys.Moreover, the complex mechanism of the influence of alloy elements on Young's modulus is also a challenge when designing new alloys.Therefore, a large number of studies have attempted to improve the alloys.Most of these studies are based on experimental methods [8][9][10].Moreover, some researchers have tried to design alloys by empirical calculation methods, such as the molybdenum equivalence (Mo eq ) [11][12][13] and d-electron theory [14,15].Computational material methods (including CALPHAD [16,17], the density functional theory (DFT) [18,19], and molecular dynamics (MDs) [20,21]) are also used.However, these methods are inefficient or require high computational resources.In this case, it necessary to find a new approach that can accurately and efficiently predict the Young's modulus of alloys.
Recently, with the development of artificial intelligence, machine learning has been widely used in various fields.In materials science, machine learning is used to establish influence factors (features)-property relationship to solve different challenges, such as multiple property designs [22][23][24][25][26][27][28].The properties of metallic materials could be modeled using machine learning, which accelerates the design and development of novel materials.Honysz [29] modeled artificial neural networks and successfully predicted the chemical composition of ferritic stainless steels for corresponding mechanical properties.Churyumov et al. [30] modeled an artificial neural network to analyze how alloying element concentrations influenced the behavior of hot deformation.Machine learning could also be used for β-Ti alloys, for which the moduli are close to the moduli of human bones.For instance, relationships between the compositions and Young's moduli of the β-Ti alloys were built in the studies of Yang et al. [31] and Wu et al. [32].Low-modulus β-Ti alloys that have a Young's modulus less than 50 GPa were designed with the help of their machine learning models.In their studies, however, the compositions of the alloys were regarded as the only features.Other important features (such as the solution temperature, solution time, and mixing enthalpy) were not considered in their study.Furthermore, some researchers used machine learning to find features that could describe the Young's moduli of β-Ti alloys more exhaustively.Xiong et al. [33] discovered four critical features that strongly affected the energetic stability and Young's modulus in the binary Ti-Zr system.However, their results are not suitable for multi-component β-Ti alloy systems.
To solve the problems mentioned above, a machine learning method is proposed in the present study, to predict the microstructures and Young's moduli of the biomedical multi-component β-Ti alloys.Furthermore, new features embedded with the knowledge of the domain materials of alloys are established.A classification model is built to predict whether the alloy is composed of single β-phase or not, while a regression model is built to predict the Young's modulus.These two models are modified by introducing new experimental data.Finally, the accuracies of the models reach a high level, which could be of vital use for the design of biomedical multi-component β-Ti alloys.

Approach
A machine learning prediction method was proposed to predict the microstructures and Young's moduli of the biomedical multi-component β-Ti alloys, as shown in Figure 1.The method was composed of four main steps: (1) data collection; (2) feature selection; (3) model building; and (4) experiments.
(1) Collecting biomedical Ti alloys data reported in the relevant literature as machine learning databases were used for the training and testing of models.(2) It is difficult to interpret the prediction results only considering the chemical compositions as features for machine learning.The interpretability and robustness of the model could be improved by embedding the material domain knowledge into the machine learning.Some features were suggested, including heat-treatment process parameters, and macroscopic as well as microscopic properties.Following the feature selection method, the most relevant features were retained.(3) A classification model and a regression model were built.The classification model was utilized to predict the microstructures, and the regression model was utilized to predict Young's modulus.It is worth noting that there were many hyperparameters to optimize the modeling process.The Bayesian optimization algorithm was used to automatically find the hyperparameters.The optimized hyperparameters were then subjected to machine learning training.(4) Once the models were built, the microstructures and Young's modulus of the β-Ti alloys could be predicted in the virtual space of the compositions and heat-treatment process.To validate the performances of the models, the multi-component Ti alloys were prepared.Subsequently, the observation of the microstructures and mechanical property test were performed.Finally, the experimental results were fed back into the machine learning databases for the next iteration and design.
process parameters, and macroscopic as well as microscopic properties.Following the feature selection method, the most relevant features were retained.(3) A classification model and a regression model were built.The classification model was utilized to predict the microstructures, and the regression model was utilized to predict Young's modulus.It is worth noting that there were many hyperparameters to optimize the modeling process.The Bayesian optimization algorithm was used to automatically find the hyperparameters.The optimized hyperparameters were then subjected to machine learning training.(4) Once the models were built, the microstructures and Young's modulus of the β-Ti alloys could be predicted in the virtual space of the compositions and heat-treatment process.To validate the performances of the models, the multi-component Ti alloys were prepared.Subsequently, the observation of the microstructures and mechanical property test were performed.Finally, the experimental results were fed back into the machine learning databases for the next iteration and design.

Figure 1.
A machine learning prediction method to predict the microstructures and Young's moduli of the biomedical multi-component β-Ti alloys.

Datasets
Two datasets, the microstructure and Young's modulus datasets (see Supplementary Materials), were collected from the relevant literature reports and used to train the classification and regression models, respectively.All the data were carefully selected.The records with errors and data with low reliability were not included.Finally, we collected 342 different alloy compositions and 235 Young's moduli with different β-Ti alloy compositions.In the Young's modulus dataset, all the alloys were single β-phase alloys prepared by the quenching process, with only minor amounts of α″ and ω phases permitted.Each dataset was randomly divided into training and testing datasets in the ratio of 7:3.The training dataset was used to train the model, and the testing dataset was used to test the model's performance.
Embedding the physical and chemical properties of materials as features into the machine learning model could provide a priori knowledge to improve the model's performance and interpret the predictions [34][35][36].Thus, we examined some physical and chemical property parameters that potentially influenced the microstructures and Young's modulus of Ti alloys.The features included alloy compositions, solution temperature, solution time, molybdenum equivalence (   ), the average valence electron concentration ( ̅̅̅̅̅̅ ), the average density ( ̅ ), the average bulk modulus ( ̅ ) and the average bulk modulus (  ̅̅̅̅ ), as seen in Table 1.
Table 1.A total of 33 features used to build the machine learning models.

Features Formula Range Description
Figure 1.A machine learning prediction method to predict the microstructures and Young's moduli of the biomedical multi-component β-Ti alloys.

Datasets
Two datasets, the microstructure and Young's modulus datasets, were collected from the relevant literature reports and used to train the classification and regression models, respectively.All the data were carefully selected.The records with errors and data with low reliability were not included.Finally, we collected 342 different alloy compositions and 235 Young's moduli with different β-Ti alloy compositions.In the Young's modulus dataset, all the alloys were single β-phase alloys prepared by the quenching process, with only minor amounts of α" and ω phases permitted.Each dataset was randomly divided into training and testing datasets in the ratio of 7:3.The training dataset was used to train the model, and the testing dataset was used to test the model's performance.
Embedding the physical and chemical properties of materials as features into the machine learning model could provide a priori knowledge to improve the model's performance and interpret the predictions [34][35][36].Thus, we examined some physical and chemical property parameters that potentially influenced the microstructures and Young's modulus of Ti alloys.The features included alloy compositions, solution temperature, solution time, molybdenum equivalence (Mo eq ), the average valence electron concentration (VEC), the average density (D), the average bulk modulus (E) and the average bulk modulus (B k ), as seen in Table 1.
Valence electrons [39] δr Difference of atomic radii [37] ∆χ Pauling electronegativity [37] ∆H Approximate interatomic bonding force [41] Note: Machine learning algorithms calculate the Euclidean distance between data points.Each data point is different in magnitude, unit, and range.Hence, all the features are normalized, so that they reach the same magnitude, and are scaled to the interval (0,1), according to formula (1): where X is the normalized processed dataset, X is the initial dataset, X max is the maximum value of the initial dataset and X min is the minimum value of the initial dataset.

Feature Selection
The feature selection method is an essential part of the "data preprocessing" process in machine learning.The performance of the model can be adversely impacted by irrelevant and redundant features.Using feature selection to remove these features, the prediction capability and robustness of the model can be improved.Currently, the most popular feature reduction methods are filter, wrapper and embedded methods [42].The wrapper method finds the optimal subset of candidate features at the cost of computation.In comparison to the other two methods, the wrapper method is highly accurate and efficient.In the present study, the forward sequential feature selection (FSS), one of the wrapper methods, was used to maximize the prediction accuracy.The implementation details are as follows: (1) Create an empty subset of features.
(2) Randomly insert a new feature into the previous subset of features.The newly inserted feature can be kept as part of the subset of features, if it improves the performance of the model.( 3) Repeat (2) until no more features are available to be inserted into the subset of features.(4) Keep repeating processes (1)-( 3) until no more optimal subsets of the features are found.
With the FSS method, the most optimal feature subset could be selected to maximize the performance of the model.

Model Building 2.4.1. Classification Models
To predict the microstructures, seven classification models were used: light gradientboosting machine classification (LGBMC), extreme gradient-boosting classification (XGBC), gradient-boosting decision-tree classification (GBDTC), random forest classification (RFC), support vector machine based on linear (SVC.l) and Gaussian kernels (SVC.r), and K-Nearest Neighbors (KNN).Accuracy (accuracy) is used as a metric to assess the performance of the classification model.The closer the value is to 1, the better the ability of the model to identify the β-phase and non-β-phase, see formula (2).The ROC curve is also used to evaluate the generalization ability of the classification model.The area value under the ROC curve, or AUC, is calculated to indicate the performance of the classification model.
where True Positive and True Negative indicate actual β-phase alloy, predicted β-phase alloy and non-β-phase alloy, respectively; False Positive and False Negative indicate actual not β-phase alloy, predicted β-Ti alloy and non-β-Ti alloy, respectively.

Regression Models
To predict the Young's modulus, seven regression models, light gradient-boosting machine regression (LGBMR), extreme gradient-boosting regression (XGBR), gradientboosting decision-tree regression (GBDTR), random forest regression (RFR), support vector machine based on linear (SVR.l) and Gaussian kernels (SVR.r), and ElasticNet (ENET) were used.The root-mean-square error (RMSE) and R-squared coefficients (R 2 ) were chosen for the evaluation metrics.The RMSE was used to measure the average difference between the predicted and true values of the model, as shown in formula (3).The R 2 coefficient reflects the extent to which the variation in Young's modulus can be explained by the features, as calculated in formula (4).Its value ranges from 0 to 1.As R 2 approaches 1, the better the fit of the regression.
where y denotes the experimentally measured Young's modulus, ŷ denotes the predicted Young's modulus and N denotes the size of the dataset.

K-Fold Cross Validation
Cross validation was used to evaluate the machine learning performance.The dataset was randomly divided into k subsets.The k-1 of these subsets was used as the training of the model and the remaining subsets were used as the testing of the model.This resulted in k-division cases.Finally, the performance of the model was evaluated by averaging the evaluation results of the k-testing cases.Generally, there was no definite value of k [43].In this work, the value of k was taken as 5.

Hyperparameter Optimization
Machine learning algorithms have many hyperparameters to be optimized in the modeling process.The grid search is the most widely used hyperparameter search algorithm.The optimal value was determined by finding all the points in the search range.However, grid search is resource intensive, especially when many hyperparameters have to be tuned.Thus, in the current work, Bayesian optimization was used to accelerate the hyperparameter tuning process.In comparison to grid search, Bayesian optimization uses a Gaussian process that takes into account a priori parameter information.This process continuously updates the optimization, significantly reducing the number of computational iterations.

Modeling Method
In this study, the computational methods described above were implemented using the Python3 language.Scikit-learn [44], xgboost created by Chen [45] and the lightgbm library created by Microsoft [46] were used for machine learning modeling.The Bayesian optimization was conducted using the hyperopt library [47] in Python3 language.All libraries are used under open-source licenses.

Experiment Method
Ti-Mo-Nb-Ta-Zr-Sn (wt.%) alloys were prepared by arc melting in a vacuum melting furnace under a high-purity Ar atmosphere using raw materials of high purity Ti, Mo, Ta, Zr and Sn.The melting process was performed at least five times to ensure the homogeneity of the ingots.After arc melting, the ingots were sealed in quartz tubes under Ar atmosphere and heat treated at 1473 K for 30 h and then water quenched.The samples were etched under an etching solution consisting of HF:HNO 3 :H 2 O = 1:4:16 after mechanical polishing.The microstructures of the alloys were obtained by optical microscopy (OM, Leica, Wetzlar, Germany).The phase compositions of the samples were analyzed using X-ray diffraction (XRD, D8, Bruker Co., Billerica, MA, USA), at an operating 40 KV and 40 mA, using Cu Kα radiation.The chemical compositions of the alloys were determined by energy dispersive spectroscopy (SEM-EDS, SU70 Hitachi, Tokyo, Japan) and all spectra were recorded at 20 kV accelerating voltage.The Young's moduli of the alloys were measured by a nanoindentation instrument (CSM-NHT, Baar, Switzerland) equipped with a Berkovich indenter.
To reduce errors, we obtained the average of five measurements with a loading force of 20 mN, a loading rate of 1.0 mN/s and a dwell time of 5 s.

Features Correlation Analysis
The Pearson product-moment correlation coefficient (PCC) [48] was used to observe the degree of correlation between the features.The p-value was used to measure the degree of correlation between two features.The closer the absolute value was to 1, the stronger the correlation.A positive value indicated a positive correlation and a negative value indicated a negative correlation.From Figure 2a, it can be observed that the absolute values of the p-values between some features are close to 1 in the microstructures dataset.For example, Ti (mass %) was strongly negatively correlated with Z e f f (the average effective nuclei charges) and ∆S mix (the entropy of mixing), and Ti (mass %) and S h (specific heat) were strongly positively correlated.Similar results were also obtained from the Young's modulus dataset, as shown in Figure 2b.Hence, feature selection is required to reduce redundant features.The number of features and the optimal combination of features can be determined.

Features Correlation Analysis
The Pearson product-moment correlation coefficient (PCC) [48] was used to observe the degree of correlation between the features.The p-value was used to measure the degree of correlation between two features.The closer the absolute value was to 1, the stronger the correlation.A positive value indicated a positive correlation and a negative value indicated a negative correlation.From Figure 2a, it can be observed that the absolute values of the p-values between some features are close to 1 in the microstructures dataset.For example, Ti (mass %) was strongly negatively correlated with   ̅̅̅̅̅̅ (the average effective nuclei charges) and ∆  (the entropy of mixing), and Ti (mass %) and  ℎ ̅̅̅ (specific heat) were strongly positively correlated.Similar results were also obtained from the Young's modulus dataset, as shown in Figure 2b.Hence, feature selection is required to reduce redundant features.The number of features and the optimal combination of features can be determined.

Forward Sequential Feature Selection
The results are shown in Figure 3.The microstructures and Young's modulus datasets were processed using the forward sequential feature selection (FSS) method.
From Figure 3a, it can be observed that only a few features are initially considered for modeling.It appears that the model is underfitting during the training process due to only a few features supplying a limited amount of information.As the number of features increases, the performance of the model gradually improves and reaches a peak.However, too many features degrade the model's performance.The reason is that redundant features lead to overfitting.From Figure 3a, it can be observed that the highest prediction accuracy is achieved when the number of features is increased to 7 for the classification model.From Figure 3b, it can be observed that the error reaches a minimum when the number of features is 12 for the regression model.As a result, we finally determined the combination of features with the best cross-validation scores as the features for model training, as shown in Table 2.

Forward Sequential Feature Selection
The results are shown in Figure 3.The microstructures and Young's modulus datasets were processed using the forward sequential feature selection (FSS) method.From Figure 3a, it can be observed that only a few features are initially considered for modeling.It appears that the model is underfitting during the training process due to only a few features supplying a limited amount of information.As the number of features increases, the performance of the model gradually improves and reaches a peak.However, too many features degrade the model's performance.The reason is that redundant features lead to overfitting.From Figure 3a, it can be observed that the highest prediction accuracy is achieved when the number of features is increased to 7 for the classification model.From Figure 3b, it can be observed that the error reaches a minimum when the number of features is 12 for the regression model.As a result, we finally determined the combination

The Predicting Model of the Microstructures
Figure 4a shows the prediction performance of different classification models by fivefold cross validation.Among all the models, the GBDTC model had the highest prediction accuracy with a cross-validation score of 0.92, and the SVC.l and KNN models had the worst prediction accuracies of 0.83 and 0.86 only.Hence, the GBDTC model was used to predict whether it was a β-phase or not.
After training the GBDTC model on the training dataset, the ROC curve on the testing dataset was obtained, with an AUC value of 0.96, as shown in Figure 4b.It can be concluded that the performance of the GBDTC model for prediction accuracy and generalization ability distinguishes well between β-phase alloys and non-β-phase alloys.Figure 4a shows the prediction performance of different classification models by fivefold cross validation.Among all the models, the GBDTC model had the highest prediction accuracy with a cross-validation score of 0.92, and the SVC.l and KNN models had the worst prediction accuracies of 0.83 and 0.86 only.Hence, the GBDTC model was used to predict whether it was a β-phase or not.

The Predicting Model of the Microstructures
Figure 4a shows the prediction performance of different classification models by fivefold cross validation.Among all the models, the GBDTC model had the highest prediction accuracy with a cross-validation score of 0.92, and the SVC.l and KNN models had the worst prediction accuracies of 0.83 and 0.86 only.Hence, the GBDTC model was used to predict whether it was a β-phase or not.
After training the GBDTC model on the training dataset, the ROC curve on the testing dataset was obtained, with an AUC value of 0.96, as shown in Figure 4b.It can be concluded that the performance of the GBDTC model for prediction accuracy and generalization ability distinguishes well between β-phase alloys and non-β-phase alloys.After training the GBDTC model on the training dataset, the ROC curve on the testing dataset was obtained, with an AUC value of 0.96, as shown in Figure 4b.It can be concluded that the performance of the GBDTC model for prediction accuracy and generalization ability distinguishes well between β-phase alloys and non-β-phase alloys.

The Predicting Model of Young's Modulus
Figure 5a shows the prediction performance of the regression models by five-fold cross validation.The tree-based models had a small cross-validation error, whereas the RFR model had the smallest cross-validation root-mean-square error of 6.67 GPa.The SVR.l and ENET linear models had larger errors because they had difficulty in handling high-dimensional data.In this study, we chose the RFR model to predict the Young's modulus of the alloys.

The Predicting Model of Young's Modulus
Figure 5a shows the prediction performance of the regression models by five-fold cross validation.The tree-based models had a small cross-validation error, whereas the RFR model had the smallest cross-validation root-mean-square error of 6.67 GPa.The SVR.l and ENET linear models had larger errors because they had difficulty in handling high-dimensional data.In this study, we chose the RFR model to predict the Young's modulus of the alloys.

Feature Importance
Feature importance is the relative importance score of each feature in making predictions about the model, which is critical for the interpretation of the model.Following the training of the model on the training dataset, the importance of the features was assessed.Figure 6 shows the feature importance calculated by the GBDTC and RFR models, respectively.As shown in Figure 6a, it can be observed that the importance of the   feature in the GBDTC model is the highest, at 46%.Additionally, it is much higher than other features and contributes the most to the prediction.From Figure 6b, it can be observed that the features that can be found to influence the RFR model are   ̅̅̅̅ ;   ̅̅̅ ;   ;  ̅ and Bonding, whose importance is 16%, 15%, 13%, 11%, and 11%, respectively.  ̅̅̅̅ and  ̅ are the average bulk modulus and the average Young's modulus of the alloy, respectively, which are useful for the model to estimate the overall Young's modulus.Moreover,   contributes to the prediction of the model.Since it is a key parameter, it used to describe the stability and phase transformation of the solid solution phase, which, in turn, affects the Young's modulus of the alloys [49][50][51].On the other hand, it has been reported that a parameter (Bonding) [41] is proposed to represent the interatomic  It is clear that the data points on both the training and testing datasets are better and that the data points are well distributed in a diagonal shape.The deviation of the predicted values from the experimental values is not significant.

Feature Importance
Feature importance is the relative importance score of each feature in making predictions about the model, which is critical for the interpretation of the model.Following the training of the model on the training dataset, the importance of the features was assessed.Figure 6 shows the feature importance calculated by the GBDTC and RFR models, respectively.As shown in Figure 6a, it can be observed that the importance of the Mo eq feature in the GBDTC model is the highest, at 46%.Additionally, it is much higher than other features and contributes the most to the prediction.From Figure 6b, it can be observed that the features that can be found to influence the RFR model are B k ; K c ; ∆H mix ; E and Bonding, whose importance is 16%, 15%, 13%, 11%, and 11%, respectively.B k and E are the average bulk modulus and the average Young's modulus of the alloy, respectively, which are useful for the model to estimate the overall Young's modulus.Moreover, ∆H mix contributes to the prediction of the model.Since it is a key parameter, it used to describe the stability and phase transformation of the solid solution phase, which, in turn, affects the Young's modulus of the alloys [49][50][51].On the other hand, it has been reported that a parameter (Bonding) [41] is proposed to represent the interatomic bonding force of the alloy, which is in good agreement with the Young's modulus of the alloy.This is in accordance with the predicting results of our models.bonding force of the alloy, which is in good agreement with the Young's modulus of the alloy.This is in accordance with the predicting results of our models.

Experimental Validation
To validate the prediction accuracy and generalization ability of the models, the compositions of the alloys were chosen in a multi-component system.The predicted chemical composition space of the alloy was Ti-xMo-yNb-zTa-uZr-vSn, where (0 ≤ x, y, z, u, v < 20).With a step size of 1, the predicted compositions had 3200000 possibilities.The Ti-Nb-Ta-Zr system was selected as the master alloy because the system has been extensively studied [52].The microstructure and Young's modulus of the alloys of the Ti-Nb-Ta-Zr-Mo/Sn system were predicted.
In the first iteration of the experiments, alloy compositions were randomly selected from the predicted space for experiments, which were predicted to have a single β-phase and the Young's modulus predicted to be <60 GPa.Two alloys, Ti-19Nb-16Ta-7Zr-4Mo (I-1, wt.%) and Ti-19Nb-5Ta-1Zr-9Sn (I-2, wt.%), were predicted to be single β-phase, and their predicted Young's moduli were 56.15 GPa and 52.80 GPa, respectively.Then, a series of experimental characterizations and measurements were conducted to verify the predictions.
The compositions of the alloys were measured by EDS, see Table 3.As shown in Figure 7, the microstructures of both I-1 and I-2 alloys were the single β-phase observed by optical microscopy.The absence of other phase precipitations was confirmed by the XRD patterns of I-1 and I-2 alloys (Figure 8), where only the β diffraction peaks could be indexed.This is consistent with the predictions of the model.After performing the nanoindentation test, the Young's modulus of the I-1 and I-2 alloys are shown in Figure 9a.The experimental Young's modulus values of I-1 and I-2 alloys are 87.09± 2.30 GPa and 80.58 ± 1.42 GPa, respectively.Figure 9b shows the errors between the experimentally measured results and the predictions, which are 30.94GPa and 27.78 GPa, respectively.It can be observed that there is a large difference between them.This is ascribed to the lack of data samples of multi-component β-Ti alloys in the machine learning databases, which could lead to major errors in the model's predictions.Thus, the experimental data from I-1 and I-2 alloys were fed back to the machine learning databases for training.

Model Validation 3.4.1. Experimental Validation
To validate the prediction accuracy and generalization ability of the models, the compositions of the alloys were chosen in a multi-component system.The predicted chemical composition space of the alloy was Ti-xMo-yNb-zTa-uZr-vSn, where (0 ≤ x, y, z, u, v < 20).With a step size of 1, the predicted compositions had 3200000 possibilities.The Ti-Nb-Ta-Zr system was selected as the master alloy because the system has been extensively studied [52].The microstructure and Young's modulus of the alloys of the Ti-Nb-Ta-Zr-Mo/Sn system were predicted.
In the first iteration of the experiments, alloy compositions were randomly selected from the predicted space for experiments, which were predicted to have a single β-phase and the Young's modulus predicted to be <60 GPa.Two alloys, Ti-19Nb-16Ta-7Zr-4Mo (I-1, wt.%) and Ti-19Nb-5Ta-1Zr-9Sn (I-2, wt.%), were predicted to be single β-phase, and their predicted Young's moduli were 56.15 GPa and 52.80 GPa, respectively.Then, a series of experimental characterizations and measurements were conducted to verify the predictions.
The compositions of the alloys were measured by EDS, see Table 3.As shown in Figure 7, the microstructures of both I-1 and I-2 alloys were the single β-phase observed by optical microscopy.The absence of other phase precipitations was confirmed by the XRD patterns of I-1 and I-2 alloys (Figure 8), where only the β diffraction peaks could be indexed.This is consistent with the predictions of the model.After performing the nanoindentation test, the Young's modulus of the I-1 and I-2 alloys are shown in Figure 9a.The experimental Young's modulus values of I-1 and I-2 alloys are 87.09± 2.30 GPa and 80.58 ± 1.42 GPa, respectively.Figure 9b shows the errors between the experimentally measured results and the predictions, which are 30.94GPa and 27.78 GPa, respectively.It can be observed that there is a large difference between them.This is ascribed to the lack of data samples of multi-component β-Ti alloys in the machine learning databases, which could lead to major errors in the model's predictions.Thus, the experimental data from I-1 and I-2 alloys were fed back to the machine learning databases for training.In the second iteration of the experiment, two alloys, Ti-18Nb-9Ta-11Zr-6Mo (II-1, wt.%) and Ti-13Nb-12Ta-10Zr-4Sn (II-2, wt.%), were randomly selected, which were in the same system as the first iteration of the experimental alloys.The II-1 and II-2 alloys were predicted to be single β-phase, and their predicted Young's moduli were 82.39 GPa and 66.82 GPa, respectively.In Figures 7c and d, it is shown that the microstructures of both II-1 and II-2 alloys are single β-phase.The XRD pattern confirms that alloys II-1 and II-2 only have β-phases.The experimentally measured Young's moduli for them are 86.53 ± 2.86 GPa and 69.91 ± 2.79 GPa, respectively.The prediction errors of the model are only 4.14 GPa and 3.09 GPa.It can be found that, after the feedback of the experimental data from the previous round, the prediction errors are substantially reduced.The predictions were more accurate and reliable.

Latest Reference Validation
As shown in Table 4, the 9 β-Ti alloy compositions from the latest reported literature were collected to validate the models.These alloys were not included in the machine learning databases.For #1 to #2 alloys, the Young's modulus increased with the addition of Nb; #3 to #4, Young's modulus increased with the addition of Mo, which was in agreement with the predictions.For alloys #5 to #9, it is worth noting that the Fe was not included in the machine learning databases.Nevertheless, their Young's moduli were predicted with an error less than 1.5 GPa.The results show that the models exhibit good robustness and prediction accuracies, providing guiding suggestions for the subsequent optimization of alloy properties.In the second iteration of the experiment, two alloys, Ti-18Nb-9Ta-11Zr-6Mo (II-1, wt.%) and Ti-13Nb-12Ta-10Zr-4Sn (II-2, wt.%), were randomly selected, which were in the same system as the first iteration of the experimental alloys.The II-1 and II-2 alloys were predicted to be single β-phase, and their predicted Young's moduli were 82.39 GPa and 66.82 GPa, respectively.In Figure 7c,d, it is shown that the microstructures of both II-1 and II-2 alloys are single β-phase.The XRD pattern confirms that alloys II-1 and II-2 only have β-phases.The experimentally measured Young's moduli for them are 86.53 ± 2.86 GPa and 69.91 ± 2.79 GPa, respectively.The prediction errors of the model are only 4.14 GPa and 3.09 GPa.It can be found that, after the feedback of the experimental data from the previous round, the prediction errors are substantially reduced.The predictions were more accurate and reliable.

Latest Reference Validation
As shown in Table 4, the 9 β-Ti alloy compositions from the latest reported literature were collected to validate the models.These alloys were not included in the machine learning databases.For #1 to #2 alloys, the Young's modulus increased with the addition of Nb; #3 to #4, Young's modulus increased with the addition of Mo, which was in agreement with the predictions.For alloys #5 to #9, it is worth noting that the Fe was not included in the machine learning databases.Nevertheless, their Young's moduli were predicted with an error less than 1.5 GPa.The results show that the models exhibit good robustness and prediction accuracies, providing guiding suggestions for the subsequent optimization of alloy properties.The relationship between Young's modulus and alloying elements is quite complex, especially for multi-component alloys.In this study, the machine learning model proposed could predict the Young's modulus in a specific composition space.Its predictions are effective and reliable.However, the prediction model of Young's modulus is a tree ensemble model, which cannot directly derive a simple function or relation.For the purpose of exploring the dependencies of Young's modulus on the concentrations of the alloying elements, we predicted Young's modulus for the alloys of Ti-12Nb-xTa-yZr, Ti-18Nb-xTa-yZr, Ti-24Nb-xTa-yZr and Ti-30Nb-xTa-yZr systems, where 4 < x, y < 16 in steps of 1, as shown in Figure 10.It can be observed in Figure 10a,b that the model predicts a high Young's modulus when the Nb + Zr + Ta content is less 30 wt.%.The reason was because the precipitation of the hard brittle phase due to the low content of the β stabilizers, which lead to a high Young's modulus.Some researchers have studied the effects of alloying elements on Young's modulus in the Ti-Nb-Ta-Zr system.Sakaguchi et al. [55] discovered that the increasing Ta content lead to increasing Young's modulus in the Ti-30N-XTa-5Zr system when the Ta content increased from 10% to 15%.Additionally, the effect of the Zr element on Young's modulus was not significant in the Ti-30N-10Ta-XZr system when the Zr content increased from 5% to 10%.Dai et al. [56] found that the Young's modulus of the Ti-30Nb-5Ta-XZr system decreased when the Zr content increased from 6% to 9%, and the Young's modulus increased when the Zr content increased from 9% to 15%.Their results are consistent with our predicted results presented in Figure 10d.Additionally, it is worth noting that the lowest predicted Young's modulus value was 48.58 GPa, presented in Figure 10c.This indicates that the possibility of having a Young's modulus in the Ti-Nb-Ta-Zr system, which could provide guidance for the subsequent research work.
The relationship between Young's modulus and alloying elements is quite complex, especially for multi-component alloys.In this study, the machine learning model proposed could predict the Young's modulus in a specific composition space.Its predictions are effective and reliable.However, the prediction model of Young's modulus is a tree ensemble model, which cannot directly derive a simple function or relation.For the purpose of exploring the dependencies of Young's modulus on the concentrations of the alloying elements, we predicted Young's modulus for the alloys of Ti-12Nb-xTa-yZr, Ti-18Nb-xTa-yZr, Ti-24Nb-xTa-yZr and Ti-30Nb-xTa-yZr systems, where 4 < x, y < 16 in steps of 1, as shown in Figure 10.It can be observed in Figure 10a,b that the model predicts a high Young's modulus when the Nb + Zr + Ta content is less 30 wt.%.The reason was because the precipitation of the hard brittle phase due to the low content of the β stabilizers, which lead to a high Young's modulus.Some researchers have studied the effects of alloying elements on Young's modulus in the Ti-Nb-Ta-Zr system.Sakaguchi et al. [55] discovered that the increasing Ta content lead to increasing Young's modulus in the Ti-30N-XTa-5Zr system when the Ta content increased from 10% to 15%.Additionally, the effect of the Zr element on Young's modulus was not significant in the Ti-30N-10Ta-XZr system when the Zr content increased from 5% to 10%.Dai et al. [56] found that the Young's modulus of the Ti-30Nb-5Ta-XZr system decreased when the Zr content increased from 6% to 9%, and the Young's modulus increased when the Zr content increased from 9% to 15%.Their results are consistent with our predicted results presented in Figure 10d.Additionally, it is worth noting that the lowest predicted Young's modulus value was 48.58 GPa, presented in Figure 10c.This indicates that the possibility of having a low Young's modulus in the Ti-Nb-Ta-Zr system, which could provide guidance for the subsequent research work.

Conclusions
In this study, the problems faced by experimental and computational methods for biomedical multicomponent β-Ti alloys were firstly discussed.It was very difficult to test all the possible alloy compositions using traditional methods.Machine learning has become a popular method for designing materials with target properties in the recent years.Due to the limited data collected in materials science, machine learning has some limitations.
Hence, a machine learning prediction method was proposed that could efficiently and reliably predict the microstructures and Young's moduli of biomedical multi-component β-Ti alloys.The properties affecting the microstructures and Young's moduli were embedded in the modeling.Firstly, the GBDTC model was the best model to identify microstructures, and the RFR model was the best model for predicting Young's modulus.Finally, the experimental results showed that the models had excellent robustness, low prediction errors and the prediction results were in good agreement with the experiments.
With this approach, the Ti-13Nb-12Ta-10Zr-4Sn (wt.%) alloy was prepared with a single β-phase microstructure and a Young's modulus of 69.91 GPa.The relationship between the composition-microstructure process and parameters-properties of the alloys had good predictive capabilities in the new β systems, which can assist in the design and optimization of novel Ti alloys.

Figure 2 .
Figure 2. The heat map describes the correlation between the features of the datasets.The darker the color, the stronger the positive correlation between the features and characteristics; the lighter the color, the stronger the negative correlation: (a) the microstructures dataset and (b) the Young's modulus dataset.

Figure 2 .
Figure 2. The heat map describes the correlation between the features of the datasets.The darker the color, the stronger the positive correlation between the features and characteristics; the lighter the color, the stronger the negative correlation: (a) the microstructures dataset and (b) the Young's modulus dataset.

Figure 3 .
Figure 3. FSS feature selection results: (a) the microstructures dataset and (b) the Young's modulus dataset.

Figure 3 .
Figure 3. FSS feature selection results: (a) the microstructures dataset and (b) the Young's modulus dataset.

Figure 4 .
Figure 4. Evaluation of the classification model's performances: (a) cross-validation results under different classification models and (b) ROC curve distribution of the GBDTC model on the testing dataset.

Figure 4 .
Figure 4. Evaluation of the classification model's performances: (a) cross-validation results under different classification models and (b) ROC curve distribution of the GBDTC model on the testing dataset.

Figure 5 .
Figure 5. Evaluation of the regression model's performances: (a) cross-validation results under different regression models and (b) RFR model prediction on the training dataset and testing dataset.

Figure
Figure5bshows the prediction fits of the RFR model on the training and testing datasets.The blue points are the training dataset, and the RMSE and  2 coefficients on the training set are 2.7 GPa and 0.96, respectively.The orange points are the testing dataset, and the RMSE and  2 coefficients on the testing dataset are 6.21 GPa and 0.78, respectively.It is clear that the data points on both the training and testing datasets are better and that the data points are well distributed in a diagonal shape.The deviation of the predicted values from the experimental values is not significant.

Figure 5 .
Figure 5. Evaluation of the regression model's performances: (a) cross-validation results under different regression models and (b) RFR model prediction on the training dataset and testing dataset.

Figure
Figure5bshows the prediction fits of the RFR model on the training and testing datasets.The blue points are the training dataset, and the RMSE and R 2 coefficients on the training set are 2.7 GPa and 0.96, respectively.The orange points are the testing dataset, and the RMSE and R 2 coefficients on the testing dataset are 6.21 GPa and 0.78, respectively.It is clear that the data points on both the training and testing datasets are better and that the data points are well distributed in a diagonal shape.The deviation of the predicted values from the experimental values is not significant.

Figure 6 .
Figure 6.Feature importance analysis, which is used to observe the percentage of importance of features of the predicted target: (a) classification model and (b) regression model.

Figure 6 .
Figure 6.Feature importance analysis, which is used to observe the percentage of importance of features of the predicted target: (a) classification model and (b) regression model.

Figure 9 .
Figure 9. Comparing experimental and predicted results of Young's modulus of Ti-Mo-Nb-Ta-Zr-Sn alloys.(a) Young's modulus of the alloys was measured by the nanoindentation test, and (b) prediction errors of machine learning.

Figure 9 .
Figure 9. Comparing experimental and predicted results of Young's modulus of Ti-Mo-Nb-Ta-Zr-Sn alloys.(a) Young's modulus of the alloys was measured by the nanoindentation test, and (b) prediction errors of machine learning.

Table 1 .
A total of 33 features used to build the machine learning models.

Table 2 .
Metals 2022, 12, 796 8 of 16 of features with the best cross-validation scores as the features for model training, as shown in Metals 2022, 11, x FOR PEER REVIEW 8 of 16

Table 2 .
The optimal combinations of features were obtained after feature selection, which produces the best performance of the model.

Table 2 .
The optimal combinations of features were obtained after feature selection, which produces the best performance of the model.
3.2.The Establishment of the Predicting Models3.2.1.The Predicting Model of the Microstructures

Table 2 .
The optimal combinations of features were obtained after feature selection, which produces the best performance of the model.

Table 3 .
The compositions of alloys (wt.%) were measured by EDS and all spectra were recorded at an accelerating voltage of 20 kV.

Table 3 .
The compositions of alloys (wt.%) were measured by EDS and all spectra were recorded at an accelerating voltage of 20 kV.

Table 4 .
Alloy compositions and experimental Young's modulus were collected from the latest reported literature to estimate the prediction performance of the models.

Table 4 .
Alloy compositions and experimental Young's modulus were collected from the latest reported literature to estimate the prediction performance of the models.