Artificial Neural Networks-Based Prediction of Hardness of Low-Alloy Steels Using Specific Jominy Distance

Successful prediction of the relevant mechanical properties of steels is of great importance to materials engineering. The aim of this research is to investigate the possibility of reducing the complexity of artificial neural networks (ANNs)-based prediction of total hardness of hypoeutectoid, low-alloy steels based on chemical composition, by introducing the specific Jominy distance as a new input variable. For prediction of total hardness after continuous cooling of steel (output variable), ANNs were developed for different combinations of inputs. Input variables for the first configuration of ANNs were the main alloying elements (C, Si, Mn, Cr, Mo, Ni), the austenitizing temperature, the austenitizing time, and the cooling time to 500 °C, while in the second configuration the alloying elements were substituted by the specific Jominy distance. Comparing the results of total hardness prediction, it can be seen that the ANN using the specific Jominy distance as input variable (r unseen = 0.873, RMSE unseen = 67, MAPE = 14.8%) is almost as successful as the ANN using the main alloying elements (r unseen = 0.940, RMSE unseen = 46, MAPE = 10.7%). The research results indicate that the prediction of total hardness of steel can be successfully performed based on only four input variables: the austenitizing temperature, the austenitizing time, the cooling time to 500 °C, and the specific Jominy distance.


Introduction
The mechanical behavior of steels determines their usefulness in a variety of applications. Different loads that materials experience in their application make it necessary to identify the limiting values that can be withstood without failure or permanent deformation [1]. Knowledge of the mechanical behavior of steels during manufacturing processes (such as heat treatment) is also necessary since it directly influences the mechanical properties of steel components.
Quenching is a common heat treatment process usually implemented to produce steel components with reliable service properties. The most common use of quenching is the hardening of steel. Although quenching is a vital part of the production of highly loaded steel components and load-carrying machine elements, it is also one of the major causes of rejected components and production losses due to uncontrollable distortion, residual stress and cracking of steel components [2]. Therefore, the heat treatment industry needs computer modeling of the quenching process to control and optimize the process parameters, with the purpose of achieving the desired distribution of microstructure and properties, avoiding cracking and reducing distortion of final parts.
Mechanical properties of steel are directly related to its mechanical behavior during heat treatment. Therefore, successful prediction of the relevant mechanical properties is the first step in predicting the mechanical behavior of steel during and after heat treatment, such as resistance to fracture and distortions [3][4][5].
When quenching is considered, mechanical properties of steel mostly depend on microstructure constituents and temperature evolution during the treatment. Variation of temperature at any point in the component is the major driving force for phase transformations. Upon cooling, the thermodynamic stability of the parent phase is altered, which results in the decomposition of austenite into transformation products. Transformation rate depends on the temperature and the cooling rate. Consequently, by changing the cooling rate, a wide range of mechanical properties can be obtained in steel parts.
However, there are other factors such as alloying elements, grain size refinement, internal stresses, microstructure heterogeneity and crystal imperfections [6][7][8][9] which also significantly affect the resulting mechanical properties that should be considered in prediction of mechanical properties of quenched parts.
Prediction of mechanical properties is usually based on semi-empirical methods. In [10], equations for predicting the hardness of microstructure constituents after the isothermal decomposition of austenite are proposed. The developed model can be a very good basis for predicting hardness during continuous cooling of steel using Scheil's additivity rule. One of the most common semi-empirical methods for predicting hardness during continuous cooling is based on continuous cooling transformation (CCT) diagrams [11]. Additionally, the distribution of hardness in quenched steel can be predicted using the Jominy test results by transferring the results from the Jominy curve to the hardness values at different points of the steel specimen [4,12].
Artificial neural networks (ANNs) are empirical methods very useful in modeling, estimating, predicting and process control in different science and technology fields. The neural network concept has proved to be powerful and versatile for predicting materials' properties based on chemical composition and/or other variables, particularly in cases when some of the influences are unknown, as well as for solving many complex phenomena for which physical models do not exist. For example, in [13] the authors show how a hybrid strategy, which combines decision trees and ANNs, can be used for accurate and reliable prediction of ore crushing plate lifetimes. In [14], region convolutional neural networks are used for automatic detection of steel surface defects in product quality control. In [15], the authors show the application of neural networks to a cyclic elastoplastic material as well as to a more complex thermo-viscoplastic steel solidification model. In [16][17][18], ANNs address the problem of extracting the Jominy hardness profiles of steels directly from the chemical composition. In [19,20], ANN is applied for predicting the microstructure composition of quenched steel based on chemical composition, the austenitizing temperature, the austenitizing time, the cooling time from 800 to 500 °C, the austenite grain size and the total hardness of quenched steel, while in [21], ANN is applied for predicting as-welded HAZ hardness based on the cooling time from 800 to 500 °C and the chemical composition of steel. In [22][23][24][25][26], ANNs are successfully applied for predicting mechanical properties of steels and steel microconstituents.
Due to the high accuracy of the prediction results, as compared with results of mathematical modeling and regression analysis, ANNs have also continued to develop in recent years. Among the neural networks used, the feed-forward neural network with the backpropagation learning algorithm (BPNN) is commonly used [27,28].
Hardness, which is considered to be one of the fundamental mechanical properties of a material, is also of great importance for applications. The resulting hardness of steels depends mainly on their chemical composition; however, the hardenability of steels and the heat treatment parameters, such as temperature, time and cooling rate, which can control the microstructure and crystal grain size, are also related to the hardness.
Hardness is often used as a basis for prediction of various elastic and plastic stress-strain as well as fatigue properties of steel [12,[29][30][31]. Hardness prediction is very important in the heat treatment of steel. Based on hardness, it is possible to predict other mechanical properties of steel after quenching and tempering, such as tensile strength, yield strength, elongation and contraction [32]. Consequently, the ability to predict hardness and its distribution in heat-treated steel parts with improved accuracy opens the possibility of also predicting advanced material properties better.
The aim of this research is to investigate the possibility of reducing the complexity of artificial neural networks-based prediction of total hardness, HV tot, of hypoeutectoid, low-alloy steels, based on detailed chemical composition, by introducing the specific Jominy distance, E d, as a new input variable, and to provide another approach to prediction when the detailed chemical composition is unknown.
For that purpose, ANNs were developed for different combinations of inputs. Input variables for the first configuration of ANNs were the main alloying elements (C, Si, Mn, Cr, Mo, Ni), the austenitizing temperature, the austenitizing time, and the cooling time to 500 °C, while in the second configuration the alloying elements were substituted by the specific Jominy distance. In total, 423 datasets of 24 hypoeutectoid, low-alloy steels were used to develop and test the ANNs.

Materials
The hypoeutectoid, low-alloy steels with which the present study deals are steels with less than ~0.8 wt. % of carbon, containing alloying elements, including carbon, up to a total content of about 5.0 wt. %. Low-alloy steels with suitable alloy compositions have greater hardenability than structural carbon steels and, thus, can provide high strength and good toughness in heat-treated thicker sections. Carbon is the main hardening element in all steels except the austenitic precipitation hardening steels, maraging steels and interstitial-free steels. The strengthening effect of carbon in steels consists of solid solution strengthening and carbide dispersion strengthening [29].

Input Variables and Data
The hardness of steels for case hardening based on carburizing and of steels for quenching and tempering depends mainly on the volume fractions of steel microconstituents, i.e., martensite, bainite and the ferrite-pearlite mixture. Higher cooling rates during cooling of steel from the austenitizing temperature result in a higher volume fraction of hard martensite, while lower cooling rates result in a higher volume fraction of the soft ferrite-pearlite mixture. Therefore, the volume fractions of those microconstituents, and the hardness, depend mainly on the cooling rate of steel. The cooling rate during cooling from the austenitizing temperature is adequately defined by the cooling time from the austenitizing temperature to 500 °C. This is further confirmed by the fact that prediction of the as-quenched hardness of steel based on the cooling time from 800 °C to 500 °C is well known in the literature and in practice, where the as-quenched hardness at different workpiece points is estimated by conversion of the cooling time to the hardness. This conversion is provided by the relationship between the cooling time and the distance from the quenched end of the Jominy test specimen [4,12], making the cooling time to 500 °C, t 500, a good candidate for one of the main input variables in an ANN for prediction of hardness.

An equally important factor influencing the steel hardness is the hardenability of steel. In general, steels with lower diffusion rates of carbon and other alloying elements have higher hardenability. Different alloying elements suppress, to a greater or lesser degree, the diffusional pearlitic and bainitic transformations in steel, which means that different alloying elements can influence the volume fractions of microconstituents. Hardenability of steel can be involved in the prediction model by taking into account the chemical composition.
Additionally, hardenability of steel can be involved in the prediction model by taking into account the specific Jominy distance, E d, which depends on the chemical composition of steel and corresponds to the Jominy distance at which 50% of the microstructure is martensite (Figure 1) [4,12,34]. The distance E d can be determined/estimated from the Jominy curve based on the hardness of steel with 50% martensite in the microstructure, HRC 50%M, which is a function of c 0, the mass fraction of carbon in the steel [35].
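The estimation of E d described above can be sketched numerically: given a measured Jominy curve and the hardness corresponding to 50% martensite, HRC 50%M (here supplied as a plain input value rather than computed from the carbon-dependent expression of [35]), E d follows from linear interpolation. The curve values below are invented for illustration only.

```python
import numpy as np

def specific_jominy_distance(distances_mm, hardness_hrc, hrc_50_martensite):
    """Estimate the specific Jominy distance E_d: the distance from the
    quenched end at which the Jominy hardness curve falls to HRC_50%M,
    the hardness corresponding to 50% martensite in the microstructure."""
    d = np.asarray(distances_mm, dtype=float)
    h = np.asarray(hardness_hrc, dtype=float)
    # Jominy hardness decreases with distance, but np.interp requires an
    # increasing x-axis, so interpolate on the reversed curve.
    return float(np.interp(hrc_50_martensite, h[::-1], d[::-1]))

# Illustrative (invented) Jominy curve for a medium-carbon low-alloy steel
d_mm = [1.5, 3, 5, 7, 9, 11, 13, 15, 20, 25, 30, 40]
hrc  = [57, 56, 55, 53, 50, 46, 42, 38, 33, 30, 28, 26]
E_d = specific_jominy_distance(d_mm, hrc, hrc_50_martensite=45)
```

With these illustrative values, the curve crosses 45 HRC between 11 mm and 13 mm, so E_d lands between those two distances.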
Since the specific Jominy distance is related to hardenability, it can be assumed that artificial neural networks-based prediction of hardness using the specific Jominy distance instead of chemical composition can be applied in cases when the detailed chemical composition is unknown, and as an additional model for prediction of hardness of heat-treated steels.
Other factors influencing the steel hardness after cooling from the austenitizing temperature are the austenitizing temperature, T a, and the austenitizing time, t a, which influence the austenite grain size. A higher austenitizing temperature and a longer austenitizing time result in austenite grain growth and increased solubility of carbon and other alloying elements in austenite, which influence the kinetics of austenite decomposition and the hardness of steel.
It is worth noting that, besides the chemical composition, the mechanical properties of the studied steels significantly depend on the crystal grain size. With an increase in the heating temperature or holding time in the austenite range, the grains begin to grow intensively. However, since the majority of the used CCT diagrams do not provide information about the austenite grain size, and for the sake of practicality, this parameter was not selected as an input variable in this research.
Based on the previous discussion and arguments, prediction of total hardness, HV tot, by artificial neural networks was designed to be based on 10 input variables: the main alloying elements, the heat treatment parameters and the specific Jominy distance, as shown in detail in Table 2. The data were acquired from the results of experiments obtained from the literature [33]. For each steel listed in Table 1, between 8 and 13 datasets were collected which, in addition to the main alloying elements, contain information on heat treatment parameters such as the austenitizing temperature, T a, the austenitizing time, t a, the cooling time to 500 °C, t 500, the specific Jominy distance, E d, and the total hardness after continuous cooling, HV tot. In total, 423 datasets were collected. Table 3 shows a sample dataset for steel 42CrMo4 (data No. 19 in Table 1). All datasets are provided in the supplementary material accompanying the paper (Table S1).

Development of Artificial Neural Networks for Prediction of Total Hardness after Continuous Cooling, HV tot
Artificial neural networks are flexible, nonlinear computational models that can be successfully used in function approximation problems in various areas of research, and thus also for prediction/estimation of material properties based on heat treatment parameters. Artificial neural networks are inspired by biological neural networks. They consist of highly or fully connected artificial neurons that are divided into three layers: the input layer (one neuron for each input variable), the hidden layers (one or more neurons in one or more hidden layers) and the output layer (one neuron for each output variable), which are connected by weights. Considering the features of the investigated problem, a two-layered artificial neural network (Figure 2) was used in this research. For the given problem, i.e., prediction of total hardness after continuous cooling, HV tot, several two-layer multilayer perceptrons (MLPs) with a hyperbolic tangent transfer function in the hidden layer and a linear transfer function in the output layer were developed. According to [36], this kind of artificial neural network can approximate any arbitrary function well.

The common algorithm for supervised learning with an MLP, as is the case in this research, is the backpropagation algorithm. The main goal of artificial neural network development is to adjust the weights so that the error function, in this case the mean square error, MSE, is minimal, i.e., so that the output values (predicted hardness values, HV tot,pred) are close to the target values (experimental hardness values, HV tot). Once the initial values of the weights are set, the input signals move layer by layer from the input to the output layer. This is called the forward phase. The forward phase is followed by backpropagation, i.e., the backward phase, in which the error signals, obtained by comparison of target and output values, move layer by layer, only now from the output layer to the input layer.
In the backward phase, weights in the ANN are adjusted to minimize error function, as previously explained. This error correction learning stops when a certain criterion is met.
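As an illustration only (not the trained networks from this research), the forward phase of such a two-layer MLP, with a hyperbolic tangent transfer function in the hidden layer and a linear transfer function in the output layer, can be sketched with random placeholder weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions follow configuration No. 2: 4 inputs (T_a, t_a, t_500, E_d),
# H hidden neurons, 1 output (HV_tot). The weights here are random
# placeholders; in training they would be adjusted by backpropagation.
I, H, O = 4, 5, 1
W1, b1 = rng.normal(size=(H, I)), np.zeros(H)   # input -> hidden
W2, b2 = rng.normal(size=(O, H)), np.zeros(O)   # hidden -> output

def forward(x):
    """Forward phase: tanh in the hidden layer, linear output layer."""
    hidden = np.tanh(W1 @ x + b1)
    return W2 @ hidden + b2

x = np.array([850.0, 30.0, 100.0, 12.0])  # illustrative, unscaled inputs
y = forward(x)
```

In practice the inputs would be scaled before entering the network, and the error signal y minus HV tot would drive the weight updates of the backward phase.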
In this research, artificial neural networks were developed for different combinations of input variables, while every MLP had only one output variable: total hardness after continuous cooling, HV tot. Input variables for the first configuration of artificial neural networks were the main alloying elements, the austenitizing temperature, T a, the austenitizing time, t a, and the cooling time to 500 °C, t 500, while in the second configuration the alloying elements were substituted by the specific Jominy distance, E d (as given in Table 4). In total, 423 datasets of 24 steels were used to develop and test the artificial neural networks. For this research, MATLAB R2020b (MathWorks Inc., Natick, MA, USA) was used. Robustness of ANNs can be ensured by preventing overlearning and overfitting of the ANNs and, most importantly, by checking the ANNs' performance on unseen data (data that were not used for ANN development). Out of the 423 datasets, about 10% of the data (41 datasets) were set aside as new, "unseen" data, which were used for an unbiased evaluation of the performance of each artificial neural network. The "unseen" data were randomly chosen, but in a way that represents the whole "population". Since artificial neural networks learn by example, they should never be used to extrapolate data.
Overlearning is another caveat of the learning-by-example principle and must be addressed properly to improve generalization, i.e., the performance of the network on new, "unseen" data. This was taken into account by combining early stopping, as a method for improving generalization, with the "growth method" for determining the number of neurons in the hidden layer. Early stopping means that weights are adapted for the training dataset while the error function (in this case the mean square error, MSE) is calculated for the validation dataset. When the value of MSE on the validation dataset reaches a minimum (hopefully a global minimum), and then increases for a predefined number of epochs, training, i.e., learning of the ANN, is stopped.
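A minimal sketch of the early-stopping rule described above, assuming a recorded history of validation-set MSE values and a patience of 6 epochs (the actual predefined number of epochs used in MATLAB is not stated in the text):

```python
def early_stopping_epoch(val_mse_history, patience=6):
    """Return (stop_epoch, best_epoch): training stops once validation MSE
    has failed to improve on its minimum for `patience` epochs; the weights
    from best_epoch are the ones kept."""
    best, best_epoch = float("inf"), 0
    for epoch, mse in enumerate(val_mse_history):
        if mse < best:
            best, best_epoch = mse, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch  # stop; keep weights from best_epoch
    return len(val_mse_history) - 1, best_epoch

# Validation error falls, then rises: training stops 6 epochs past the minimum
mses = [9, 7, 5, 4, 3.5, 3.4, 3.6, 3.8, 4.0, 4.2, 4.5, 5.0, 5.5]
stop, best = early_stopping_epoch(mses, patience=6)
```

Here the minimum occurs at epoch 5 (MSE = 3.4), so training stops at epoch 11 and the epoch-5 weights are retained.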
To prevent overfitting, the maximum number of neurons in the hidden layer, H, for which a particular design was trained, is determined depending on the number of available training examples, N train, the number of input variables, I, and the number of output variables, O. Requiring that the number of training examples exceeds the number of weights gives H = (N train − O)/(I + O + 1), rounded down. Networks were trained using the Levenberg-Marquardt algorithm with early stopping, for different combinations of input variables as listed in Table 4, and hidden layer sizes from one neuron to H neurons ("growth method"). For each architecture, 10 networks were trained with random initial weights and data divisions.
Initial weights set a starting point for training of the ANN, and if this is done randomly, 10 times, the odds that the starting point is well determined are increased. Furthermore, the Levenberg-Marquardt algorithm usually requires data division into training, validation and testing datasets. Random data division is important because, even with the data randomly divided into training, validation and testing datasets, it can happen that these three groups are not sufficiently representative of the whole population; if the data are properly chosen, it can be assumed that the entire population is represented. A common ratio for this data division is 70/15/15, respectively, so it was also used in this research. According to the mentioned data division, if N is the total number of datasets used for development of ANNs, the number of training examples is N train = 0.7 N = 0.7 × 382 ≈ 267. N train was constant for all networks, regardless of the number of input variables and architecture. The output variable was always only HV tot, which gives the number of outputs O = 1.
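The 70/15/15 division can be sketched as follows; the seed and the use of NumPy are illustrative assumptions, not the MATLAB routine actually used in the research:

```python
import numpy as np

def split_indices(n, ratios=(0.70, 0.15, 0.15), seed=0):
    """Random 70/15/15 division of n development datasets into
    training/validation/testing index sets, as used with the
    Levenberg-Marquardt training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# 423 datasets total, 41 held out as "unseen" -> 382 for development,
# of which 70% (about 267) serve as training examples
train, val, test = split_indices(423 - 41)
```

Each of the 10 repetitions per architecture would call such a split with a different random state, giving a different starting point and data division every time.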
Concerning the number of unknowns, i.e., weights in a fully connected ANN, N w = (I + 1)·H + (H + 1)·O. For the "worst" case scenario, when the ANN has the maximum number of inputs (9) and the maximum hidden layer size, H = 24, the number of unknown weights (network hyperparameters) is N w = 265. The number of degrees of freedom, N dof, of a network is the difference between the number of training examples, N train, and the number of weights, N w: N dof = N train − N w. The number of degrees of freedom of a network should always be greater than 0, and the above-mentioned "worst" case scenario yields N dof = 267 − 265 = 2. These are extreme scenarios/ANN architectures, which were investigated to determine the most suitable hidden layer size, but also to check what is obtained with overfit networks. Since the number of weights should be much smaller than the number of training examples, i.e., N w << N train, or if we presume that the number of training examples, N train, should be 4-5 times greater than the number of unknown weights, N w, we obtain that for 9 inputs the maximum number of neurons in the hidden layer, H, is, when rounded, 5 or 6.
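The weight and degrees-of-freedom counts above follow directly from the fully connected two-layer architecture and can be checked numerically:

```python
def n_weights(I, H, O=1):
    """Number of weights (including biases) in a fully connected
    two-layer MLP with I inputs, H hidden neurons and O outputs."""
    return (I + 1) * H + (H + 1) * O

def n_dof(n_train, I, H, O=1):
    """Degrees of freedom: training examples minus adjustable weights."""
    return n_train - n_weights(I, H, O)

# "Worst" case from the paper: 9 inputs, H = 24 hidden neurons,
# N_train = 267 -> N_w = 265 and N_dof = 2
assert n_weights(9, 24) == 265
assert n_dof(267, 9, 24) == 2

# Requiring N_train to be at least 4.5x N_w instead caps H near 5:
H_capped = max(H for H in range(1, 25) if 4.5 * n_weights(9, H) <= 267)
```

With the 4.5-times rule, 4.5 · N w ≤ 267 gives H ≤ 5, consistent with the "5 or 6" stated above for the 4-5 times criterion.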
Out of all architectures (hidden layer sizes H) and for both configurations of input variables, the best artificial neural networks were selected based on the values of the coefficients of correlation for the whole dataset and the test dataset, r and r test respectively, and the values of the root mean square error, RMSE and RMSE test, again for the whole dataset and the test dataset, respectively. The root mean square error, RMSE, is the square root of the MSE, and a good measure of model accuracy, since it gives the prediction errors of different models in the same unit as the variable that is to be predicted. The goal is for a network to provide maximal r and r test, and minimal RMSE and RMSE test. If several networks had similar results, the one with the smaller hidden layer size was chosen. Results of the chosen architectures for both configurations are given in Table 5. In accordance with the previously explained criteria, it can be concluded that the selected networks are not overfit.
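The two selection statistics, the coefficient of correlation r and the RMSE, can be computed as follows; the hardness values below are invented for illustration:

```python
import numpy as np

def rmse(targets, outputs):
    """Root mean square error: prediction error in the units of HV."""
    t, y = np.asarray(targets, float), np.asarray(outputs, float)
    return float(np.sqrt(np.mean((t - y) ** 2)))

def corr(targets, outputs):
    """Pearson coefficient of correlation between targets and outputs."""
    return float(np.corrcoef(targets, outputs)[0, 1])

hv_exp  = [210, 320, 450, 560, 610]   # illustrative experimental HV_tot
hv_pred = [220, 300, 460, 555, 600]   # illustrative predicted HV_tot
r_val = corr(hv_exp, hv_pred)
rmse_val = rmse(hv_exp, hv_pred)
```

A selected network maximizes r (and r test) while minimizing RMSE (and RMSE test); because RMSE is expressed in HV units, it is directly comparable across candidate architectures.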

Results
To provide a measure comparable with results published in the relevant literature, the performance of the selected artificial neural networks was evaluated using the mean absolute percentage error, MAPE, calculated using Equation (6): MAPE = (100%/n) · Σ|e i/t i|, where e i are the prediction errors, t i are the network target values, i.e., the experimental values of HV tot, and n is the number of datasets. The MAPE value is usually interpreted as a forecasting (or prediction) goodness indicator, and usually, if it is under 10%, it can be said to indicate highly accurate forecasting. In some cases, it is taken as an indicator of the generalization capability of the model. However, MAPE should not be used as the sole indicator of the goodness of the model.

To determine the generalization capability of the model, in this case the ANNs, it is useful to calculate the statistics/indicators that were used for selection of the ANN also for unseen data, i.e., data that were not used in the development and selection of the ANN (training, validation, test data). To obtain a completely unbiased evaluation of artificial neural network performance, for each chosen network, besides MAPE unseen, the coefficient of correlation r unseen between targets and outputs (i.e., experimental and predicted values) and the root mean square error RMSE unseen were calculated for the same set of new, "unseen" data. Results are given in Table 6. It can be seen from Tables 5 and 6 that r unseen and RMSE unseen are comparable to or even better than the values obtained for the dataset used for development and selection of the ANN designs. MAPE values were also calculated for the development dataset (13% for configuration No. 1 and 15.7% for configuration No. 2) and, when compared to MAPE unseen (Table 6), it can be seen that those values do not differ significantly. Based on all three indicators, it can be concluded that the ANN designs are robust, and that the generalization capability of the ANNs, as well as the expected forecasting/prediction, are promising.
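Equation (6) can be sketched directly; the HV tot values below are invented for illustration:

```python
import numpy as np

def mape(targets, outputs):
    """Mean absolute percentage error, Equation (6): errors e_i = t_i - y_i
    taken relative to the experimental target values t_i, in percent."""
    t, y = np.asarray(targets, float), np.asarray(outputs, float)
    return float(100.0 * np.mean(np.abs((t - y) / t)))

hv_exp  = [200.0, 400.0, 500.0]   # illustrative experimental HV_tot
hv_pred = [220.0, 380.0, 500.0]   # illustrative predicted HV_tot
m = mape(hv_exp, hv_pred)         # relative errors 10%, 5%, 0% -> mean 5%
```

Values under 10%, by the common interpretation quoted above, would indicate highly accurate forecasting, although MAPE alone is not a sufficient indicator of model quality.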
Scatter diagrams in Figures 3 and 4 provide information on the relation between experimental and predicted values of total hardness after continuous cooling, HV tot, obtained by the selected artificial neural networks for the development and "unseen" data. Results for the development data are included not as an absolute performance indicator, but to confirm the consistency of the performance and of the results obtained for the development and "unseen" datasets.

From both diagrams in Figure 3, a grouping of datapoints related to certain constant values of predicted total hardness can be observed. Two major values/levels, in addition to a few less pronounced ones, are observable in Figure 3a, which is related to configuration No. 1, where the main alloying elements were used as inputs. In Figure 3b, related to configuration No. 2 developed using the specific Jominy distance, a single such value/level is notable. These "levels", prevalently present at lower values of hardness, are related to the ferrite-pearlite microstructure, which is achieved with low to very low cooling rates, i.e., very high cooling times to 500 °C [8,11,32]. After a certain, relatively high value of cooling time, its additional increase no longer results in notable differences in the hardness of the resulting microstructure, as it becomes prevalently of the ferrite type, an effect which the developed artificial neural networks correctly capture.
To further evaluate the predictive accuracy of the artificial neural networks for prediction of total hardness after continuous cooling, HV tot, deviations of predicted values from their experimentally obtained counterparts were used as relevant indicators. Deviations up to ±5%, ±(5…10)%, ±(10…15)% and ±(15…20)% were used for evaluation of predictions of HV tot and are shown in Figure 5 for the "unseen" dataset.
Both configurations of artificial neural networks show similar performance. The ANN that uses the main alloying elements as input variables (configuration No. 1) is somewhat more successful than the network using the specific Jominy distance (configuration No. 2) as input variable. However, for both configurations the same share (34%) of predicted HV tot values deviates up to ±5% from the experiment-based counterparts. Configuration No. 1 is more successful in the deviation range ±(5…10)%, while configuration No. 2 is more successful in the deviation range ±(10…15)%. When comparing deviations up to ±20%, about 87% and 73% of the predicted data fall within that range when predicted with configuration No. 1 and configuration No. 2, respectively.
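The deviation bands of Figure 5 can be reproduced from targets and predictions as follows (illustrative values only):

```python
import numpy as np

def deviation_shares(targets, outputs, edges=(5, 10, 15, 20)):
    """Share of predictions (in %) whose relative deviation from the
    experimental value falls into the bands 0-5, 5-10, 10-15 and 15-20%."""
    t, y = np.asarray(targets, float), np.asarray(outputs, float)
    dev = 100.0 * np.abs((y - t) / t)
    bounds = (0,) + tuple(edges)
    return [100.0 * np.mean((dev > lo) & (dev <= hi))
            for lo, hi in zip(bounds[:-1], bounds[1:])]

hv_exp  = [200, 300, 400, 500]   # illustrative experimental HV_tot
hv_pred = [208, 324, 352, 590]   # relative deviations: 4, 8, 12, 18 %
shares = deviation_shares(hv_exp, hv_pred)
```

Summing the shares of all bands up to a given edge gives the cumulative figures quoted above (e.g., the roughly 87% and 73% of predictions within ±20% for the two configurations).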

Discussion
An analysis of previous research whose results are available in the literature shows that most ANNs for hardness prediction of steels after continuous cooling are based on chemical composition, heat treatment parameters and cooling time as input variables. Due to the large number of input variables, and the fact that in certain cases the detailed chemical composition of steel is not available, in this paper it is proposed for the first time to replace the chemical composition with the specific Jominy distance, a new input variable, which corresponds to the Jominy distance at which 50% of the microstructure is martensite.
In addition to the specific Jominy distance, the other input variables used to predict total hardness after continuous cooling of hypoeutectoid, low-alloy steels for case hardening based on carburizing and steels for quenching and tempering are the austenitizing temperature, the austenitizing time, and the cooling time to 500 °C. The austenitizing temperature and austenitizing time influence the prior austenite grain size and the solubility of carbon and other alloying elements in austenite, and thus the microstructure and mechanical properties of steel after continuous cooling. With an increase in the heating temperature or holding time in the austenite range, the austenite grains begin to grow intensively, which leads to a coarse-grained structure of steel (ferrite-pearlite, bainite, martensite), characterized by lower mechanical properties.
The main driving force of phase transformations is the change in thermodynamic stability caused by temperature change. With undercooling of steel below the critical temperature, the thermodynamic stability of the primary microstructure is disrupted, resulting in austenite decomposition into ferrite, pearlite, bainite and martensite. The volume fractions and hardness of these microconstituents, and thus the total hardness of steel after continuous cooling, depend mainly on the cooling rate, which can be adequately replaced by the cooling time.
Comparing the results of total hardness prediction after continuous cooling, it can be seen that the ANN using the main alloying elements as input variables (Table 4, configuration No. 1) is somewhat more successful than the ANN using the specific Jominy distance as input variable (configuration No. 2). It is also important to notice (Figure 5) that in the deviation range up to ±5% the ANN using the specific Jominy distance has equal prediction ability. Therefore, the research results indicate that the prediction of total hardness of steel can be successfully performed based on only four input variables: the austenitizing temperature, the austenitizing time, the cooling time to 500 °C and the specific Jominy distance.
From the aspect of materials and input variables, further research could be directed toward dividing the investigated steels into two groups: 1. steels for case hardening based on carburizing, and 2. steels for quenching and tempering. Also, with the aim of achieving even better results, further research could be directed toward involving in the prediction model the maximal achievable hardness of steel and/or ranges of average values of individual chemical elements pertaining to typical compositions of steels. The maximal achievable hardness is available from the Jominy curve of the investigated steels.

Conclusions
In this paper, artificial neural network-based prediction of total hardness of hypoeutectoid, low-alloy steels using the specific Jominy distance, E d, has been proposed. The main goal of this research was to check whether the chemical composition (C, Si, Mn, Cr, Mo, Ni (in wt. %) as input variables) can be substituted with the specific Jominy distance, E d, providing another approach to prediction when the detailed chemical composition is unknown, and simpler models that predict total hardness sufficiently accurately.
The following conclusions can be reached:

1. The ANN using the specific Jominy distance as input variable is almost as successful in predicting total hardness after continuous cooling as the ANN using the main alloying elements as input variables.

2. The prediction results indicate that the ANN designs are robust, and that the generalization capability of the ANNs, as well as the expected forecasting/prediction, are promising.

3. The prediction of total hardness of steel can be successfully performed based on only four input variables: the austenitizing temperature, the austenitizing time, the cooling time to 500 °C, and the specific Jominy distance.