The Prediction of the Undercooling Degree of As-Cast Irons and Aluminum Alloys via Machine Learning

As-cast irons and aluminum alloys are used in various industrial fields and their phase and microstructure properties are strongly affected by the undercooling degree. However, existing studies of the undercooling degree are mostly limited to qualitative analyses. In this paper, a quantitative analysis of the undercooling degree is performed by collecting experimental data and employing machine learning. Nine machine learning models, including Random Forest (RF), eXtreme Gradient Boosting (XGBOOST), Ridge Regression (RIDGE) and Gradient Boosting Regressor (GBDT), are used to predict the undercooling degree via six features, which include the cooling rate (CR), mean atomic covalent radius (MAR) and mismatch (MM). Four effective machine learning models are then selected for further analysis and cross-validation. Finally, the optimal machine learning model for the dataset is selected and the best combination of features is found by comparing the prediction accuracy of all possible feature combinations. It is found that the RF model with the CR and MAR features has the optimal performance for predicting the undercooling degree.


Introduction
In terms of production volumes and application scales, iron and aluminum are two of the most widely utilized metals in the world. They have found applications in various industries such as mechanical engineering and shipping [1,2]. Due to its relatively low cost and wide processing adaptability, casting is one of the main methods of iron and aluminum material preparation. The casting process is always accompanied by nucleation, which plays an important role in metal solidification. The undercooling degree strongly affects nucleation and additionally controls the phase composition, microstructure, properties and quality of as-cast materials [3][4][5].
There are many factors affecting the undercooling degree, such as the metal nature, the cooling rate, the mismatch magnitude and the interfacial energy between the molten metal and the nucleated solid phase [6][7][8][9]. Generally, the undercooling degree increases with the cooling rate, which consequently increases both the nucleation and growth rates. Due to the limitation of the heat transfer process, rapid solidification technology, which is widely used in industry, can only prepare alloys with extremely small dimensions. With the outstanding advances in deep undercooling technology, many metals and alloys have achieved relatively large undercooling degrees, greatly exceeding the critical undercooling degree for the homogeneous nucleation of liquid metals [10]. Deep undercooling techniques primarily include melt immersion flotation [11], containerless suspension treatment [12] and free fall methods [13]. These methods can maintain a liquid metal state hundreds of degrees Celsius below the liquidus line and then suddenly achieve a rapidly solidified microstructure via nucleation. The development of deep undercooling technology via rapid solidification can contribute to grain refinement, the elimination of segregation, the expansion of the solid solution limit and the formation of metastable phases. Thus, material properties are improved. Battersby et al. [14] used a melt encasement (fluxing) technique to achieve high undercooling and systematically studied the velocity-undercooling relationship in samples of pure Ge and Ge doped with 0.01 at% Fe at undercoolings up to 300 K. Jian et al. [15] studied the effect of undercooling on the crystal-liquid interface energy and the growth mode of undercooled semiconductors. Li et al. [16] utilized containerless electromagnetic levitation processing to obtain an undercooling of 420 K in elemental semiconductor silicon using a two-step heating method. Li et al. [17] investigated Fe alloy melts containing 7.5, 15, 22.5 and 30 at% Ni and found that the undercooling degree had a strong influence on the structure evolution, especially grain refinement and recrystallization. Previously conducted investigations have mainly focused on the mechanism and qualitative analysis of deep undercooling. A model that could accurately predict the undercooling degree, and thus increase the experimental and industrial cost-effectiveness as well as improve the processing accuracy, is required but has not yet been established.
With the rapid development of material informatics, machine learning (ML) has emerged as a new method to quantitatively predict material parameters based on a specific dataset. Various useful predictions have been performed by ML to obtain a quantitative analysis. Agrawal et al. [18] established ML models to predict the fatigue strength of steel, quantitatively analyze its relationship with the composition and processing parameters and eventually develop steels with a high fatigue strength. By employing support vector regression, sequenced minimum optimization regression and a multi-layer perceptron algorithm, Jiang et al. [19] proposed a model that accounted for the chemical composition, dendrite crystal parameters and measured temperature to predict the interface mismatch. Its accuracy was verified by the empirical formula and validated by the experimental results. Based on a database of density functional theory calculations, an ML model was developed by Meredig et al. [20] to predict the thermodynamic stability of arbitrary components without any additional inputs. Javed et al. [21] proposed a lattice constant prediction model based on the support vector machine. This model could optimize the lattice constants of perovskites with predetermined structures. The model was proven to be more efficient, faster and robust than the models based on the artificial neural network. In addition, ML has proven effective in material data mining collected from experiments or simulations as well as in the accurate prediction of material behavior. However, as different ML algorithms result in a variation of the prediction accuracy, the ML algorithm that should be employed for a specific material still requires further discussion.
In this paper, nine popular ML algorithms are considered to build a model for undercooling degree prediction by mining the data from previously conducted experiments. In Figure 1, the workflow diagram of this paper is presented. First, data samples are collected and filtered based on the literature survey. Following standardization, the data are divided into training and testing sets according to a certain ratio. Subsequently, after nine ML models are used to mine the data samples, four ML models are chosen based on their superior performance. Next, to achieve an accurate prediction of the material undercooling degree, the optimal model is obtained by comparing its evaluation indexes with those of the remaining models. Finally, the influence of different feature combinations on the prediction accuracy is investigated using the selected optimal model. Compared with previous qualitative understandings, we establish a model that can accurately predict the undercooling degree and thus increase the experimental and industrial cost-effectiveness as well as improve the processing accuracy. A quantitative analysis of the undercooling degree for the sake of accurate industrial and experimental control is of great interest.

Data Collection and Features Selection
In this paper, based on the conducted literature survey on undercooling [6][7][8][9][22], 63 undercooling data samples are collected and screened with different substrate phases under nucleation phases such as β-Sn, BCC-Fe and FCC-Al (Table A1). The data are then divided into 50 training samples and 13 testing samples (the former for model construction and the latter for model validation). As different features affect the target property to different extents, the performance of the ML model heavily relies on the feature selection. Therefore, a reasonable selection of features is very important for establishing the model.
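As a minimal sketch, the 50/13 split described above can be reproduced with scikit-learn's `train_test_split`; the feature matrix and target below are synthetic placeholders, not the collected undercooling data.

```python
# Hypothetical 50/13 train-test split of 63 samples with 6 features.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((63, 6))      # placeholder for the 6 candidate features
y = rng.random(63) * 300     # placeholder undercooling degrees (K)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=13, random_state=0)  # 50 training / 13 testing samples
print(len(X_train), len(X_test))
```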
When selecting the features, the substrate phase and the nucleation phase are initially selected as two feature parameters for the establishment of the model. As substrate phases and nucleation phases are non-numerical data that cannot enter the calculation directly, the datasets have to be One-Hot encoded prior to establishing the model. In other words, the data are converted into zeroes and ones, i.e., an existing value is represented as a one while a non-existent value is represented as a zero. Following One-Hot encoding, the dimensionality of the features increases. Because the current data volume is a small sample, such an increase in feature dimensionality is disadvantageous to the establishment of the ML model. During model validation, the ML model did not perform well when choosing these two features. Hence, other features are considered.
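A hypothetical sketch of One-Hot encoding the categorical phase labels; the point is that the feature dimensionality grows with the number of distinct categories, which is problematic for a small dataset.

```python
# One column is created per distinct phase label (toy labels below).
import numpy as np
from sklearn.preprocessing import OneHotEncoder

phases = np.array([["BCC-Fe"], ["FCC-Al"], ["beta-Sn"], ["BCC-Fe"]])
encoded = OneHotEncoder().fit_transform(phases).toarray()
print(encoded.shape)  # (4, 3): 4 samples, one column per distinct phase
```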
According to the literature survey, six feature variables that affect the undercooling are selected from a total of nine candidate features (such as the substrate phase, nucleation phase, mismatch and lattice number): the cooling rate (CR), the mean covalent atomic radius (MAR) [23], the number of lattices (NL), the mismatch (MM) [6], the mean Mendeleev number (MMN) and the nucleation and substrate plane (NSP). Here, the predicted target value is the undercooling. The MAR is the average of the mean atomic covalent radii of the nucleation and substrate phases, which expresses the properties of the nucleation and substrate phases. The NL is the product of the substrate and nucleation phase lattice constants; for a close-packed hexagonal structure, the NL is the value of a/c. The MMN is the mean Mendeleev number of the nucleation and substrate phases, which indicates the chemical properties of the nucleus phase and the base phase. The NSP is the crystallographic representation of the mismatch between the nucleation and the substrate phase, which reveals the effect of different orientations of the crystallographic plane on the undercooling. For example, if the selected crystallographic surface of the nucleation phase is 111 and the selected crystallographic surface of the base phase is 100, then the NSP value is equal to 11,100. Furthermore, if the selected crystallographic surface of the nucleation phase is 100 and the crystallographic surface of the base phase is 110, then the NSP value is equal to −1,100,110. Here, the negative sign indicates that a 1 is present, the first digit indicates how many different types of numbers are present and the following digit indicates the position of those numbers. An information summary of the dataset for a simple statistical analysis is presented in Table 1.
When predicting material undercooling, if a large difference in magnitude between features is encountered, the ML model is adversely affected. Therefore, the feature data have to be normalized, i.e., feature scaling has to be conducted. In this paper, z-score normalization is employed, in which the data are transformed to have a mean of zero and a variance of one, as shown in Equation (1). This not only eliminates the impact of inconsistent data magnitudes on ML but also ensures that the data maintain their original distribution. It is worth noting that this treatment causes the original data to lose their physical meaning; however, it is beneficial for the establishment of ML models.
z_i = (x_i − x̄)/σ (1)

where x_i is the original data, x̄ is the mean of the original data and σ is the standard deviation of the original data.
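Equation (1) can be sketched in code and checked against scikit-learn's `StandardScaler`, assuming the paper's z-score uses the population standard deviation (toy values below):

```python
# z-score standardization: zero mean, unit variance per feature column.
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.array([[10.0], [20.0], [30.0], [40.0]])   # toy feature column
z_manual = (x - x.mean()) / x.std()              # Equation (1)
z_scaler = StandardScaler().fit_transform(x)
assert np.allclose(z_manual, z_scaler)           # same result either way
print(z_scaler.ravel())
```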

Correlation Analysis and Machine Learning
In the ML algorithm, a low correlation between features should be ensured. In this paper, the Pearson correlation coefficient r is used to quantify the correlation between features [24], as shown in Equation (2). Here, the original (unscaled) data are used for the correlation analysis. The correlation coefficient ranges between −1 and 1; the closer its absolute value is to 1, the stronger the correlation between the two variables, while a coefficient of 0 indicates no correlation. In this paper, when the absolute value of the coefficient between two features is greater than 0.95 [25], their correlation is considered high and one of the two features should be removed.
r = Σ(X_i − X̄)(Y_i − Ȳ) / √[Σ(X_i − X̄)² · Σ(Y_i − Ȳ)²] (2)

where X_i and Y_i are the values of the two undercooling degree factors and X̄ and Ȳ are the average values of these two factors. For these datasets, nine ML models are used for the undercooling prediction: the RF model [26], the gradient boosting regressor (GBDT) [27], TREE [28], XGBOOST [29], RIDGE [30], Bayesian Ridge (BR) [31], k-nearest-neighbor (KNN) [32], the least absolute shrinkage and selection operator (LASSO) [33] and the support vector machine (SVM (kernel = linear)) [34]. The variations in the predicted results between different ML models are compared to select the models most suitable for the datasets for further analysis. The top-ranked ML models are selected and validated using k-fold (with k = 5) cross-validation [35]. This method randomly divides the input datasets into k groups of equal size; in each round, k − 1 groups are used for ML model training while the remaining group serves as the testing data. When evaluating different ML models, certain metrics are employed to indicate their strengths and weaknesses, and model adjustment based on feedback from these metrics is a key step in model evaluation. In this paper, the mean absolute error (MAE) is employed, which directly reflects the actual prediction error. The root mean squared error (RMSE) is also employed, which measures the deviation between the predicted and the actual values and penalizes large errors more heavily. The coefficient of determination (R²) is used as a generalized performance evaluation parameter characterizing the goodness of fit of the ML model, as shown in Equations (3)-(5). The optimal combination of features is considered after selecting the best model by comparing the evaluation metrics. In this paper, the scikit-learn package [36] is used to process the datasets and establish the ML models.
MAE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i| (3)

RMSE = √[(1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²] (4)

R² = 1 − Σ(y_i − ŷ_i)² / Σ(y_i − ȳ)² (5)

where y_i are the actual values, ŷ_i are the predicted values, ȳ is the mean of the actual values and n is the number of samples.
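The formulas above can be sketched in code and cross-checked against NumPy and scikit-learn; the numbers are toy values, not the paper's data.

```python
# Equations (2)-(5) written out and verified against library functions.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Equation (2): Pearson correlation coefficient r.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.1, 3.9, 6.2, 7.8])
r = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sqrt(
    np.sum((X - X.mean()) ** 2) * np.sum((Y - Y.mean()) ** 2))
assert np.isclose(r, np.corrcoef(X, Y)[0, 1])

# Equations (3)-(5): MAE, RMSE and R² on toy undercooling values.
y_true = np.array([120.0, 80.0, 200.0, 150.0])
y_pred = np.array([110.0, 95.0, 190.0, 155.0])
mae = np.mean(np.abs(y_true - y_pred))                                 # Eq. (3)
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))                        # Eq. (4)
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum(
    (y_true - y_true.mean()) ** 2)                                     # Eq. (5)

assert np.isclose(mae, mean_absolute_error(y_true, y_pred))
assert np.isclose(rmse, np.sqrt(mean_squared_error(y_true, y_pred)))
assert np.isclose(r2, r2_score(y_true, y_pred))
```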

Correlation Analysis and Algorithm Selection
All of the features as well as the predicted values are briefly described in Table 1. In Figure 2, the correlation values between the features are shown. The color between the CR and the target feature undercooling (UR) is yellow, indicating that the CR had a high influence on the target feature. The remaining features demonstrated low mutual correlations and were therefore retained. Different ML models have different predictive capabilities. Due to the complexity of the datasets and material properties, researchers usually do not specify in advance which ML algorithm is the most suitable. In addition, as the predicted values for specific attributes heavily depend on the ML algorithm selection, it is necessary to evaluate the performance and output of the chosen algorithm to assess the degree of uncertainty arising from its choice. An R² comparison of the nine ML models is presented in Figure 3. It can be seen that three of the top four ML models were ensemble models. The RF model showed the best performance with a value of 0.831. There was a minor difference between the other three ML models.
RIDGE was the next model performing relatively well, while the worst model was the SVM (kernel = linear) with a value of 0.34. In summary, the prediction results varied significantly between ML models; thus, it was necessary to select ML models carefully. In order to further analyze the models, the top four ML models were selected. In addition, the RIDGE model was substituted in place of the TREE model due to the latter's overfitting problem. To summarize, the RF, XGBOOST, GBDT and RIDGE models were used to analyze the material undercooling. The results showed that when ML models were employed to predict the material properties, various ML models performed inconsistently on the same datasets, which is in accordance with [37][38][39].
In Figure 3, the results of a single training are displayed. In Figure 4, a comparison of R² values under the four different ML algorithms trained 10 times is shown. It displays that each algorithm fluctuated across training runs. In the second model training, the four algorithms showed relatively low R² values compared with the other runs; more specifically, the R² value of the RIDGE algorithm was equal to 0.672. This was due to the random selection of training and testing datasets: when the selected training dataset was representative, the R² value increased, and vice versa. This is further explained below. In addition, the prediction results of the four ML models were almost equal and the R² of the remaining training results was close to 0.8. This was a relatively good result, especially considering the R² value of the GBDT algorithm, which reached a maximum of 0.971 at the 9th training. This indicated that the selection of this training dataset was very representative. In conclusion, when studying the predictive ability of an ML model, the dataset selection should be considered because different training and testing sets lead to model differences. By considering the dataset selection, model differences were reduced and the generalization ability over the datasets was improved.
To further illustrate the performance capability of the different algorithms on the same datasets, the strengths and weaknesses of each model were evaluated from the MAE and the RMSE. In Table 2, the average training and testing set evaluation results of the different ML models according to Figure 4 are listed; each model was tuned up to 10 iterations. The R² values of the four ML models were similar, all being above 0.85. The RF model had the highest R² value of 0.902 and its RMSE was also the lowest among the four ML models. However, its MAE was not the lowest of the four algorithms. This indicated that the gap between the predicted and the true values of the remaining models was greater, leading to a greater RMSE, and that the gap between the predicted and the true values across the 10 RF training runs was smaller. On the training data, the performance of the GBDT and XGBOOST was better than the RF and their R² was close to 1, so there might have been overfitting, making their evaluation on the testing data lower than the RF. In summary, the R², MAE and RMSE values of the RIDGE model were relatively poor among the four ML models, which also indicated that the ensemble algorithms had an improved prediction ability when dealing with these datasets.
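The model comparison above can be sketched with scikit-learn on one split. The RF, GBDT and RIDGE models come from scikit-learn; XGBOOST requires the external xgboost package and is omitted here, and the data are synthetic placeholders, not the paper's dataset.

```python
# Fit several regressors on the same split and compare test-set R².
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((63, 6))
y = 200 * X[:, 0] + 50 * X[:, 1] + rng.normal(0, 5, 63)  # toy target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=13, random_state=0)
models = {
    "RF": RandomForestRegressor(random_state=0),
    "GBDT": GradientBoostingRegressor(random_state=0),
    "RIDGE": Ridge(),
}
# .score() on a regressor returns the test-set R².
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
print(scores)
```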

Cross-Validation
Different algorithms produced different prediction results because the statistical features of the dataset were evaluated from sub-datasets. These sub-datasets may not be representative of the entire dataset, which leads to sampling uncertainty. In this paper, a five-fold cross-validation was employed, which randomly selects both the training and the testing set. However, unlike the previous selection method, this selection ensures that every sample serves in both the training set and the testing set. In order to reduce the sampling uncertainty, the various training models were iterated several times to broaden the distribution of the validation subsets for a given set of parameters. In Table 3, the R² results for the cross-validation of the four ML models are listed. The data were divided into five pieces, i.e., fold 1 to fold 5; each fold was then successively used as validation data while the others were used as training data. As the selected data are different each time, the R² of the different ML models fluctuated with training. The cross-validation R² values of the four algorithms fluctuated widely, from 0.28 to 0.98. Among them, the R² of the RF in the five-fold cross-validation fluctuated from 0.61 to 0.98. Its fluctuation range was relatively small, which indicated that the RF algorithm was more stable in coping with the datasets. The mean R² value over two five-fold cross-validations of the four ML algorithms is shown in Figure 5. Through cross-validation, it could be concluded that the RF algorithm had the best prediction results for the datasets, which verified the previous conclusion and further showed that the generalization ability of the RF was relatively strong. The RF had the highest R² value of the four cross-validated models (0.865). This was a better result compared with the previously obtained mean R² value of 0.848.
This was because the dataset selection was improved and could better represent the entire dataset features. To summarize, the cross-validation results indicated that different algorithms responded to different attributes of the datasets with varying stability. Furthermore, the RF had a relatively high stability with respect to the dataset selection. In order to more intuitively illustrate the advantages and disadvantages of the ML models, the RF and XGBOOST were selected from the four ML models for further analysis and their prediction results were compared. In Figure 6, the prediction results of R² and the MAE under the RF and XGBOOST are shown. Interestingly, many data points were perfectly organized along the diagonal for the training data (Figure 6a) in the XGBOOST model. This meant that, for these data, XGBOOST fitted much better (almost 100%) than the RF. However, an ML model with a 100% fit for a large fraction of the data may be caused purely by overlearning, which was confirmed by the testing data (Figure 6b). In addition, we noted that most data fitted well in the XGBOOST training set but others did not. The reason was that XGBOOST might have misjudged them: after checking the raw data corresponding to the deviating points, several pieces of data were very similar, with several similar features. In conclusion, the prediction results of the two models were not significantly different when compared with the actual values, while the results of XGBOOST were somewhat worse. This indicated that the RF had a stronger performance capability.
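The five-fold procedure can be sketched as follows: every sample appears in the validation fold exactly once, and one R² is reported per fold. The data are synthetic placeholders for the paper's dataset.

```python
# Five-fold cross-validation of a Random Forest regressor.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.random((63, 6))
y = 200 * X[:, 0] + 50 * X[:, 1] + rng.normal(0, 5, 63)  # toy target

cv = KFold(n_splits=5, shuffle=True, random_state=0)
r2_per_fold = cross_val_score(RandomForestRegressor(random_state=0),
                              X, y, cv=cv, scoring="r2")
print(r2_per_fold, r2_per_fold.mean())
```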

Combination of Features
The selection of different features had a significant effect on the ML model. Above, the performance capability of different ML models on the same datasets was discussed. In order to further examine the influence of different features on establishing the ML model, the RF model with the optimal performance was used to analyze feature combinations, with the purpose of identifying the feature with the greatest influence on the model establishment. In Figure 7, the importance of the six different features is shown. The ranking was obtained from the RF based on the Gini coefficient approach. It can be seen that the CR had the greatest influence on the undercooling, which is consistent with materials science knowledge. This was followed by the MAR and the MM, while the least influential was the NSP.
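The importance ranking above can be sketched via the RF's `feature_importances_` attribute (impurity-based, i.e. Gini-style, importance in scikit-learn). The feature names follow the paper's abbreviations, but the data are synthetic placeholders in which the first two features dominate by construction.

```python
# Rank features by the Random Forest's impurity-based importance.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
names = ["CR", "MAR", "NL", "MM", "MMN", "NSP"]
X = rng.random((63, 6))
y = 200 * X[:, 0] + 50 * X[:, 1] + rng.normal(0, 5, 63)  # CR, MAR dominate

rf = RandomForestRegressor(random_state=0).fit(X, y)
ranking = sorted(zip(names, rf.feature_importances_),
                 key=lambda t: t[1], reverse=True)
print(ranking)  # for this toy target, CR should rank first
```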

In Figure 8, the degree to which the selection of features affected the RF model is shown. It demonstrates that 2-6 combinations of features, ranked from high to low importance (according to Figure 4), were used to predict the undercooling by taking the average value over 100 training sessions. With an increase in the number of features, R² demonstrated a slight decline while the MAE increased. In other words, the selection of the top two features could improve the representation of the dataset features, and additional features did not significantly improve the accuracy of the model. To summarize, the selection of features had a significant impact on the establishment of the model. However, this was based on the selection of beneficial features; inferior features only increased the workload while the improvement of the model was smaller, which decreased the predictive power of the model. This, in turn, showed that the number of features should not be selected randomly but rather appropriately. The above presented feature combinations were only sequentially superimposed combinations of the feature importance ranking.
It was therefore not possible to discern how the remaining feature combinations affected the prediction results, so a further analysis was required. Figure 9a shows the R² > 0 results of 100 runs under 2-6 feature combinations in the RF algorithm. Across the combinations of 2-6 major factors, the optimal R² value for each feature combination fluctuated between 0.7 and 0.85, with the highest R² of 0.82 obtained for the two-feature combination of the CR and the MAR. This agrees with the importance ranking provided in Figure 7. It could be seen that the R² of the testing set decreased as the number of features increased, which might produce overfitting and a loss of the generalization ability of the model. The results indicate that feature selection should favor quality over quantity. Figure 9b shows the MAE results of 100 runs under 2-6 feature combinations in the RF algorithm. Point I indicates that, under the combination of the CR and MAR features, the MAE was 8.813. Point II indicates that the MAE under the five-feature combination of the MM, CR, MAR, NSP and MMN was 8.397, lower than that of point I; however, its R² was only 0.792.
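The exhaustive search over all 2-6 feature subsets can be sketched with `itertools.combinations`. Again this is an assumed reconstruction on synthetic data, using a single fixed split for brevity where the paper scores each subset over 100 runs.

```python
# Sketch of the exhaustive feature-combination search behind Figure 9:
# score every 2-6 feature subset and keep the best R^2 per subset size.
# Synthetic stand-in data; single split instead of the paper's 100 runs.
from itertools import combinations
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(2)
features = ["CR", "MAR", "MM", "MMN", "NLs", "NSP"]
X = rng.random((63, 6))
y = 50 * X[:, 0] + 20 * X[:, 1] + rng.normal(0, 2, 63)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)

best = {}                                     # subset size -> (best subset, R^2)
for k in range(2, 7):
    for cols in combinations(range(6), k):    # all C(6, k) subsets
        rf = RandomForestRegressor(n_estimators=50, random_state=0)
        pred = rf.fit(Xtr[:, cols], ytr).predict(Xte[:, cols])
        r2 = r2_score(yte, pred)
        subset = tuple(features[i] for i in cols)
        if k not in best or r2 > best[k][1]:
            best[k] = (subset, r2)
for k, (subset, r2) in sorted(best.items()):
    print(f"{k} features: {subset}, R2={r2:.3f}")
```

With six features, the full search covers only C(6,2)+...+C(6,6) = 57 subsets, so exhaustive enumeration is cheap here; for many more features, a greedy or importance-guided search would be needed instead.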
The CR and MAR features performed favorably in terms of R² values; they therefore had a significant influence on the undercooling and can be considered key factors. Based on the RF model, the CR and the MAR were the most important factors for the undercooling degree, relating to the processing conditions and to the properties of the nucleation and substrate phases, respectively. Clearly, the higher the CR, the larger the undercooling degree, because the molten metal cannot nucleate in time and thus remains liquid far below the solidification temperature during a rapid temperature change. The MAR describes the atomic information of both the nucleation and the substrate phases; it expresses the lattice distortion in a solid solution and, consequently, the electron mean free path [23], which is expected to affect nucleation and thus the undercooling degree.
In summary, after comparing the evaluation indexes of the different ML models, the RF showed the best performance in predicting the material undercooling. In the correlation analysis of the features, the CR had the greatest influence on the prediction results. The importance of the CR and the MAR was also demonstrated in the feature ranking produced by the RF model, which further supports the reliability of this feature selection. The results obtained in this study can serve as a useful reference for identifying key undercooling factors.

Conclusions
This study developed ML models for the prediction of the undercooling degree of as-cast irons and aluminum alloys. Here, 63 datasets with six features were collected from the experimental results and standardized. Nine ML algorithms were then used to mine the datasets, and four of the nine models were selected for a detailed analysis. It was found that differences in algorithm, features and data have a significant influence on the performance of the ML models. After comparing the evaluation indexes, the RF model was considered the optimal model for the accurate prediction of the undercooling degree, with a corresponding R² value of 0.85 and an MAE of 8.43. The various factors affected the undercooling degree differently, with their importance in the following sequence: cooling rate (CR), mean covalent atomic radius (MAR), mismatch (MM), mean Mendeleev number (MMN), number of lattices (NLs) and the nucleation and substrate plane (NSP). Two key features, the cooling rate (CR) and the mean covalent atomic radius (MAR), were selected as the optimal combination after comparing all possible combinations of the six features and were sufficient to build the ML model for the prediction of the undercooling degree. The ML model based on the RF algorithm could accurately predict the undercooling degree for as-cast iron materials and aluminum alloys, which has potential applications in both industrial and experimental areas.
Author Contributions: Y.C. conceived the total investigation; L.W. analyzed the data and wrote the manuscript; S.W., C.Y. and N.Z. helped with data collection and discussed the results; Z.Z. designed the whole work and revised the manuscript; K.Z. supervised the whole work. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: All data, models or code generated or used during the study are available from the corresponding author by request.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A
The original data used in this study.