Application of an Artificial Neural Network to Develop Fracture Toughness Predictor of Ferritic Steels Based on Tensile Test Results

Abstract: Analyzing the structural integrity of ferritic steel structures subjected to large temperature variations requires knowledge of the fracture toughness (KJc) of ferritic steels in the ductile-to-brittle transition region. Consequently, predicting KJc from minimal testing has long been of interest. In this study, a Windows-ready KJc predictor based on tensile properties (specifically, the yield stress σYSRT and tensile strength σBRT at room temperature (RT) and σYS at the KJc prediction temperature) was developed by applying an artificial neural network (ANN) to 531 KJc data points. If the σYS temperature dependence can be adequately described using the Zerilli–Armstrong σYS master curve (MC), the data necessary for KJc prediction reduce to σYSRT and σBRT. The developed KJc predictor successfully predicted KJc under arbitrary conditions. Compared with the existing ASTM E1921 KJc MC, the developed KJc predictor was especially effective in cases where σB/σYS of the material was larger than that of RPV steel.

Since Ritchie and Knott introduced the idea of using a critical stress and a critical distance to predict the temperature dependence of fracture toughness [4], researchers who explicitly or implicitly applied this idea have obtained results demonstrating a strong correlation between the temperature dependence of fracture toughness and that of the yield stress (σYS) [5,6]. Wallin observed that the increase in fracture toughness with increasing temperature is not sensitive to steel alloying, heat treatment, or irradiation [7]. This observation led to the concept of a universal curve shape that applies to all ferritic steels, i.e., the difference between materials is reflected only by a temperature shift. This concept is now known as the master curve (MC) method, as described in ASTM E1921 of the American Society for Testing and Materials [8]. The existence of a KJc MC was physically supported by Kirk et al. based on dislocation mechanics considerations [9,10]. They argued that the temperature dependence of KJc is related to the temperature dependence of the strain energy density (SED).

Selection of Machine Learning Model
Machine learning models are used in many fields, such as search engines, image classification, and voice recognition, and various methods have been proposed according to the application. In this study, a tool was developed to predict the fracture toughness KJc of a material under arbitrary conditions, such as specimen size and temperature, without performing a fracture toughness test; this is treated as a regression problem. Various machine learning algorithms are available for regression. In this study, a multilayer perceptron (MLP), which is classified as an artificial neural network (ANN) capable of expressing complex nonlinear relationships, was used. The regression model was constructed using the MLPRegressor of the scikit-learn library for the general-purpose programming language Python [34]. Figure 1 shows a schematic diagram of the MLP network. The MLP is a hierarchical network comprising an input layer, a hidden layer, and an output layer; each unit of the hidden layer is fully connected to the input and output layers [34,35].
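As a concrete illustration of this choice, the kind of model referred to here can be instantiated directly from scikit-learn. The settings below are illustrative placeholders rather than the hyperparameters actually adopted in this study (those are listed later in Table 5); they simply show the type of regressor being described.

```python
# Minimal sketch: an MLP regression model from scikit-learn's neural_network module.
# Layer sizes and regularization strength here are placeholders, not the study's values.
from sklearn.neural_network import MLPRegressor

model = MLPRegressor(
    hidden_layer_sizes=(20, 20),  # two hidden layers of 20 nodes each (illustrative)
    activation="relu",            # ReLU activation, as described in the next subsection
    solver="adam",                # Adam optimizer, as described in the next subsection
    alpha=1e-3,                   # strength of the L2 regularization term
)
```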

Overview of Multilayer Perceptron in an Artificial Neural Network

Figure 1. Schematic diagram of the MLP network, showing the input layer, hidden layer, and output layer.

In Figure 1, only one hidden layer is schematically shown; however, in general, multiple hidden layers are used to enhance the expressiveness of the model. The unit in the hidden layer (hereinafter referred to as the activation unit aj, j = 1~k) is calculated using Equation (1), where the n input values are Xi and the output value is f(X).
Here, w^h_{j,i} is the connection weight, X0 is a constant called the bias, and φ in Equation (1) is a function called the activation function. For the activation function, a function with differentiable nonlinearity was selected to enhance the expressiveness of the model. In this study, the rectified linear unit (ReLU) function φ(z) = max(0, z) was used to compute the activation units aj of the hidden layer. The total number k of aj (the number of nodes in the hidden layer) and the number of hidden layers are parameters that were adjusted according to the learning accuracy. The output value f(X) is obtained via Equation (2).
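For clarity, the forward pass that Equations (1) and (2) describe can be written compactly as follows; this is a sketch in standard MLP notation, and the exact placement of the bias terms in the original equations is assumed rather than reproduced.

\[
a_j = \varphi\!\left(\sum_{i=0}^{n} w^{h}_{j,i}\, X_i\right), \qquad j = 1,\ldots,k, \qquad \varphi(z) = \max(0, z)
\]
\[
f(X) = \sum_{j=1}^{k} w^{o}_{j}\, a_j
\]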
Here, w^o_j denotes the connection weight between the activation unit aj and the output. In Equations (1) and (2), the connection weights w^h_{j,i} and w^o_j are unknown constants that are determined from combinations of known input and output values. Denoting the known teaching data (true values) by Y, to distinguish them from the values f(X) predicted from the input values Xi via Equation (2), the connection weights are updated using the loss function E of Equation (3).
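In the notation above, the loss of Equation (3) presumably takes the usual form of a squared-error term plus an L2 penalty; constant prefactors (for example, scikit-learn scales both terms by the number of samples) are omitted here as an assumption.

\[
E = \sum_{\text{data}} \bigl(Y - f(X)\bigr)^{2} + \alpha\, \lVert w \rVert_{2}^{2}
\]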
Here, the first term in Equation (3) is the sum of the squared residuals between the teaching data Y and the output values f(X), and the second term is a regularization term using the L2 norm to suppress overfitting; α is a parameter that is adjusted according to the learning accuracy. Overfitting is a problem in which the model fits the training data too closely and cannot generalize effectively to unknown data. Several effective optimization algorithms have been developed to avoid falling into a locally optimal solution when updating the connection weights. In this study, adaptive moment estimation (Adam) [36] was used. The connection weight w is updated using Equations (4)-(9).
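Equations (4)-(9) presumably follow the standard Adam update rule of [36]; a sketch in the usual notation, where g^(t) denotes the gradient of E with respect to the weight at timestep t, is:

\[
g^{(t)} = \nabla_{w} E\bigl(w^{(t-1)}\bigr), \qquad
m^{(t)} = \beta_1 m^{(t-1)} + (1-\beta_1)\, g^{(t)}, \qquad
v^{(t)} = \beta_2 v^{(t-1)} + (1-\beta_2)\, \bigl(g^{(t)}\bigr)^{2}
\]
\[
\hat{m}^{(t)} = \frac{m^{(t)}}{1-\beta_1^{\,t}}, \qquad
\hat{v}^{(t)} = \frac{v^{(t)}}{1-\beta_2^{\,t}}, \qquad
w^{(t)} = w^{(t-1)} - \eta\, \frac{\hat{m}^{(t)}}{\sqrt{\hat{v}^{(t)}} + \epsilon}
\]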
The recommended values were used for the adjustment parameters η, β1, β2, and ε [36]. The error backpropagation method, which calculates the gradient of the loss function by propagating backward from the output layer, was used to update the connection weights; this method is known to be less computationally expensive than updating the weights in the forward direction [37].

Goodness Evaluation of the Constructed Learning Model
The goodness of the constructed machine learning model is evaluated using the coefficient of determination R² in Equation (10), where n is the number of teaching data points, Yi is the true objective value, f(X) is the predicted objective value, and Ȳ is the average of the true objective values.
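In this notation, the coefficient of determination (presumably the content of Equation (10)) is the standard expression

\[
R^{2} = 1 - \frac{\sum_{i=1}^{n} \bigl(Y_i - f(X_i)\bigr)^{2}}{\sum_{i=1}^{n} \bigl(Y_i - \bar{Y}\bigr)^{2}}
\]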
The coefficient of determination indicates the goodness of fit of the regression model and is an evaluation index for assessing how well the predicted and true values match; R² = 1 when the predicted values coincide exactly with the true values. There is no clear standard for the coefficient of determination, but a model can be considered acceptable if R² is approximately 0.5 or more.

Dataset
For machine learning, the fracture toughness test data of 531 ferritic steel specimens in the DBTT region, obtained by the authors or taken from previous studies, were used. Table 1 presents the chemical compositions of the test specimen materials considered in the teaching data [1-12]. In the specimen designation nT used in this study, n expresses the specimen thickness in multiples of 25 mm. The data are fundamentally extracted from previous work [30,33] but differ slightly in the following respects: (1) invalid data with KJc > KJc(limit) were excluded; (2) KJc data were limited to cases obtained with standard specimens of thickness-to-width ratio B/W = 0.5; (3) when no σYS data were available at the fracture toughness test temperature, σYS was obtained using the modified Z-A σYS temperature-dependent MC of Equation (11) [9], where T is the temperature (°C), C1 = 1033 MPa, C3 = 0.00698 1/K, C4 = 0.000415 1/K, and ε̇ = 0.0004 1/s. The three Miura heats (heat Nos. 1, 4, and 5) were a further exception, for which linear interpolation of the raw data was used because the fracture toughness and tensile test temperatures differed.

The objective variable was KJc. Assuming a direct relationship between the temperature dependence of the SED and that of KJc, the σB temperature dependence was the first candidate explanatory parameter. However, considering that (i) the temperature dependence of σB/σYS is small, (ii) ferritic steel has a σYS temperature-dependent MC such as the Z-A MC, and (iii) σB/σYS at RT is usually readily available, σB and σYS at RT, σYS at the KJc test temperature, and the specimen width W were selected as the explanatory variables. To optimize the connection weights, 371 points, i.e., 70% of the 531 points in the known dataset, were used as the training data. The data were divided using "train_test_split" of Python's scikit-learn library. If the orders of magnitude of the input and output values to be learned differ significantly, the influence of variables with small values may not be fully considered in learning. Therefore, in this study, the input values W, σYS, σYSRT, and σBRT and the output value KJc were standardized, as shown in Equation (12).
Here, with reference to ASTM E1921, W was normalized by the width of a 1T specimen (50 mm), and the yield stress and tensile strength were normalized by 550 MPa, the average of the 275 to 825 MPa yield stress range allowed by the standard; KJc was normalized by the fracture toughness value of 100 MPa·m^1/2 at the reference temperature. Table 5 presents the hyperparameters used for the machine learning model in this study. Using the data in Tables 2-4 and the parameters in Table 5 (i.e., the currently fixed model), the coefficient of determination R² of the developed KJc predictor was 0.61 for the training data and 0.53 for the test data. Table 6 presents the explanatory variables used for predicting the fracture toughness KJc.

The input data (W, σYS, σYSRT, σBRT) for the developed KJc predictor and the output window after its execution (the coefficient of determination R² and the predicted KJc) are shown in Figure 2. Figure 3 compares the KJc of the ASTM E1921 MC with the KJc predicted by the predictor; the horizontal axis is T, the vertical axis is KJc(1T), and both the test data and the predicted KJc are converted to 1T thickness. The KJc of the ASTM E1921 MC is plotted as a solid black line, the KJc test data are plotted as open black symbols, and the predictions for the conditions listed in Table 6 are plotted as open red symbols. In Figure 3a, for RPV steel, both the KJc of the ASTM E1921 MC and the KJc predicted by this model agree with the test results. However, in Figure 3b, for SCM440, although the KJc of the ASTM E1921 MC differs significantly from the test results at high temperatures, the KJc values predicted by this model agree with the test results.
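To summarize the workflow described above, the following is a minimal sketch rather than the authors' actual script: it assumes a hypothetical CSV file and column names, applies a normalization in the spirit of Equation (12) using the reference values quoted above, performs the 70% training split with train_test_split, and reports R² for the training and test data. The hidden-layer sizes and regularization strength are illustrative placeholders, not the values of Table 5.

```python
# Sketch of the training/evaluation pipeline; file name, column names, and
# hyperparameter values are assumptions for illustration only.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

data = pd.read_csv("kjc_dataset.csv")  # hypothetical file holding the 531 data points

# Normalization in the spirit of Equation (12): W by the 1T width (50 mm),
# stresses by 550 MPa, and K_Jc by 100 MPa*m^0.5.
X = data[["W", "sigma_YS", "sigma_YSRT", "sigma_BRT"]].copy()
X["W"] /= 50.0
X[["sigma_YS", "sigma_YSRT", "sigma_BRT"]] /= 550.0
y = data["K_Jc"] / 100.0

# 70%/30% split of the dataset, as in the text.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)

# MLP regressor with ReLU activation, the Adam optimizer, and L2 regularization;
# layer sizes and alpha are illustrative, not the hyperparameters of Table 5.
model = MLPRegressor(hidden_layer_sizes=(20, 20), activation="relu", solver="adam",
                     alpha=1e-3, max_iter=5000, random_state=0)
model.fit(X_train, y_train)

print("R2 (train):", r2_score(y_train, model.predict(X_train)))
print("R2 (test):", r2_score(y_test, model.predict(X_test)))
```

Note that MLPRegressor.score(X, y) returns the same coefficient of determination directly.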

Discussion
By applying the ANN, a KJc predictor for ferritic steels that requires only tensile properties (i.e., σYS at the temperature at which KJc is to be predicted, and the RT values σYSRT and σBRT) was derived. This method eliminates the need for time- and material-consuming fracture toughness tests. The tool for predicting KJc by considering the specimen size and material properties is based on 531 fracture toughness test data values obtained from five RPV steel heats and seven non-RPV steel heats. The specimen sizes ranged from 0.4T to 4T to learn the size effect, and the yield stress ranged from 328 to 775 MPa and the tensile strength from 519 to 832 MPa to learn the material properties. The data range used in the training is equal to the application limit of the predictor. The developed KJc predictor predicted the training data with R² = 0.61 and the test data with R² = 0.53.
To predict KJc at a specific temperature of interest, the user needs σYS at this temperature as well as σYSRT and σBRT at RT. If the material of interest is known to be well fitted by the Z-A σYS MC, the only quantities for which test data are necessary for KJc prediction are σYSRT and σBRT.
A considerable advantage of the proposed KJc predictor is that fracture toughness tests are not necessary to predict KJc. The key novel idea here is to use the tensile properties (such as σYS and σB) together with the specimen size W.
Although the developed KJc predictor currently outputs a single KJc value for each combination of explanatory variables, the fracture probability associated with the predicted KJc could also be evaluated by assuming a probability distribution (e.g., a Weibull distribution) for the learned data; this remains a future issue.
According to Tables 2-4, (σB/σYS)RT differs between the non-RPV and RPV steels. Accepting Kirk's view that KJc and the SED correspond, the ASTM E1921 MC may deviate for non-RPV steels; the developed KJc predictor has the advantage of taking this difference into account. On this point, compared with the existing ASTM E1921 KJc MC, the developed KJc predictor is expected to be especially effective in cases where σB/σYS of the material is larger than that of RPV steel.
Data Availability Statement: The predictors that were generated and analyzed during the current study are available from the corresponding author upon reasonable request.

Conclusions
In this study, a tool was developed that can predict KJc for an arbitrary specimen size W and material properties (σYSRT, σYS, σBRT) via an ANN applied to 531 fracture toughness test data values. Currently, the conditions applicable to the tool are material properties ranging from σYSRT = 328 to 775 MPa and σBRT = 519 to 832 MPa, specimen sizes ranging from 0.4T to 4T, and CT and SEB specimen types. By using the tool developed through the application of data-driven ideas, it is possible to predict the fracture toughness at a target temperature from the specimen size and the tensile test results at that temperature, without performing a fracture toughness test. In the future, it is planned to also predict the fracture probability associated with the predicted fracture toughness.

Acknowledgments: This work is part of the cooperative research between KOBELCO RESEARCH INSTITUTE, INC., and the University of Fukui. Support from both organizations is greatly appreciated.

Conflicts of Interest:
The authors declare no conflict of interest.

Nomenclature
B    test specimen thickness
J    J-integral
KJc    fracture toughness
T    temperature (°C)
T0    ASTM E1921 MC reference temperature (°C) for a 25 mm thick specimen with a fracture toughness of 100 MPa·m^1/2
W    specimen width
σYS, σB    yield stress (0.2% proof stress) and tensile strength
σ0ZA    yield stress at the temperature T (°C) described by the Zerilli equation (i.e., Equation (11))
R²    coefficient of determination
Xi    input value of the MLP
aj    activation unit of the MLP
n    number of input values
k    number of activation units
f(X)    output value of the MLP
w^h_{j,i}    connection weight between input value Xi and activation unit aj
φ    activation function
w^o_j    connection weight between activation unit aj and output value f(X)
Y    teaching data
E    loss function
α    regularization strength of the L2 norm term
w(t)    connection weight at timestep t in Adam
m(t)    exponential moving average of the gradient at timestep t in Adam
v(t)    exponential moving average of the squared gradient at timestep t in Adam
m̂(t)    bias-corrected first moment estimate at timestep t in Adam