Prediction of Cyclic Stress–Strain Property of Steels by Crystal Plasticity Simulations and Machine Learning

In this study, a method for the prediction of cyclic stress–strain properties of ferrite-pearlite steels was proposed. At first, synthetic microstructures were generated based on an anisotropic tessellation from the results of electron backscatter diffraction (EBSD) analyses. Low-cycle fatigue experiments under strain-controlled conditions were conducted in order to calibrate material property parameters for both an anisotropic crystal plasticity and an isotropic J2 model. Numerical finite element simulations were conducted using these synthetic microstructures and material properties based on experimental results, and cyclic stress-strain properties were calculated. Then, two-point correlations of synthetic microstructures were calculated to quantify the microstructures. The microstructure-property dataset was obtained by associating a two-point correlation and calculated cyclic stress-strain property. Machine learning, such as a linear regression model and neural network, was conducted using the dataset. Finally, cyclic stress-strain properties were predicted from the result of EBSD analysis using the obtained machine learning model and were compared with the results of the low-cycle fatigue experiments.


Introduction
It is important to consider fatigue problems when we design structural materials. It takes a considerable time and high cost to obtain enough information on the fatigue properties of new materials. Consequently, many researchers try to predict fatigue performance. As one way to predict material performance, machine learning, such as linear regression and neural network, has been attractive [1][2][3][4][5][6][7]. In these studies, prediction of performances, such as tensile strength and hardness, from chemical compositions and heat treatment conditions, was conducted. In many studies to predict material performance by machine learning, the performance is directly predicted from process conditions. However, in the case of complicated phenomena, such as fatigue, it is difficult to obtain a prediction model that can be applied to a wide range of materials. For example, a helpful prediction model for carbon steels is not always applicable to stainless steel [7].
Model-based machine learning [8] is one approach to predict complicated phenomenon. In model-based machine learning, one complicated phenomenon is divided into several elementary phenomena. This framework can be extended to more complex situations. In the case of material performance prediction, we can divide it into four stages: Process, structure, property, and performance [9]. Especially in the case of fatigue, there are various methods to connect process and structure such as phase filed method and Monte Carlo method. Moreover, fatigue life can be predicted from the cyclic stress-strain property by using the Tanaka-Mura model and finite element method (FEM) [10][11][12]. Therefore, we focused on the linkage between microstructure and cyclic stress-strain property in this study.
The microstructure-property data is not always sufficient for machine learning. Thus, a lot of finite element simulations are conducted instead of fatigue experiments to obtain the microstructure-property dataset. Also, it is necessary to quantify the microstructure for machine learning. Although there are various kinds of quantification methods, an appropriate method for fatigue prediction has not been established. We quantify microstructure by using one of the distribution functions: Two-point correlation [13]. The purpose of this study is to propose the method to predict cyclic stress-strain property from microstructure by combining finite element simulation, two-point correlation, and machine learning. Figure 1 shows the proposed framework. The framework can be divided into the following steps: (i) microstructure analysis with electron backscatter diffraction (EBSD) analysis, synthetic microstructure generation and two-point correlations, (ii) calibration of material parameters by strain-controlled low-cycle fatigue experiments, (iii) finite element simulations with the synthetic microstructure, and (iv) machine learning using two-point correlation and Ramberg-Osgood relationship. The rest of the paper is organized as follows. In Section 2, microstructure analysis and strain-controlled low-cycle fatigue experiments are presented. Finite element models are then created based on the results of the microstructure analysis, and parameters for simulation are identified from the results of fatigue experiments in Section 3. A set of finite element simulations is conducted, and stress-strain hysteresis loops are obtained. In Section 4, two-point correlations are calculated from finite element models, and the dimension of two-point correlation is reduced by principal component analysis (PCA). Then, machine learning is conducted with the obtained dataset in Section 5. The inputs are strain amplitude and principal components of two-point correlations, and the output is maximum stress (σ max ) of each hysteresis loop. Finally, σ max is predicted from the result of microstructure analysis by the obtained prediction model, and it is compared with the experimental result in Section 6. The validity of the proposed method for predicting cyclic stress-strain properties is discussed. method (FEM) [10][11][12]. Therefore, we focused on the linkage between microstructure and cyclic stress-strain property in this study. The microstructure-property data is not always sufficient for machine learning. Thus, a lot of finite element simulations are conducted instead of fatigue experiments to obtain the microstructure-property dataset. Also, it is necessary to quantify the microstructure for machine learning. Although there are various kinds of quantification methods, an appropriate method for fatigue prediction has not been established. We quantify microstructure by using one of the distribution functions: Two-point correlation [13]. The purpose of this study is to propose the method to predict cyclic stress-strain property from microstructure by combining finite element simulation, two-point correlation, and machine learning. Figure 1 shows the proposed framework. The framework can be divided into the following steps: (i) microstructure analysis with electron backscatter diffraction (EBSD) analysis, synthetic microstructure generation and two-point correlations, (ii) calibration of material parameters by strain-controlled low-cycle fatigue experiments, (iii) finite element simulations with the synthetic microstructure, and (iv) machine learning using two-point correlation and Ramberg-Osgood relationship. The rest of the paper is organized as follows. In Section 2, microstructure analysis and strain-controlled low-cycle fatigue experiments are presented. Finite element models are then created based on the results of the microstructure analysis, and parameters for simulation are identified from the results of fatigue experiments in Section 3. A set of finite element simulations is conducted, and stress-strain hysteresis loops are obtained. In Section 4, two-point correlations are calculated from finite element models, and the dimension of two-point correlation is reduced by principal component analysis (PCA). Then, machine learning is conducted with the obtained dataset in Section 5. The inputs are strain amplitude and principal components of two-point correlations, and the output is maximum stress (max) of each hysteresis loop. Finally, max is predicted from the result of microstructure analysis by the obtained prediction model, and it is compared with the experimental result in Section 6. The validity of the proposed method for predicting cyclic stressstrain properties is discussed.

Materials
Five types of low-carbon steel were used. Four were ferrite/pearlite dual-phase steels, and the other was pearlite single-phase steel. Table 1 shows the chemical composition of the steels. The chemical composition was measured on the rolled materials mainly by spark discharge optical emission spectrometry (SD-OES). The volume fraction of ferrite was calculated from the carbon contents (S25C: 71%, S35C: 60%, S45C: 40%, K1: 92%, Pearlite: 0%).

Materials
Five types of low-carbon steel were used. Four were ferrite/pearlite dual-phase steels, and the other was pearlite single-phase steel. Table 1 shows the chemical composition of the steels. The chemical composition was measured on the rolled materials mainly by spark discharge optical emission spectrometry (SD-OES). The volume fraction of ferrite was calculated from the carbon contents (S25C: 71%, S35C: 60%, S45C: 40%, K1: 92%, Pearlite: 0%).

Microstructure Analysis
Each sample was cut, mechanically polished by emery paper and alumina slurry, and the cross-section was finally polished with an argon ion cross-section polisher (SM-090010, JEOL, Tokyo, Japan). EBSD analysis was performed for the four steels (S25C, S35C, S45C, and K1) to characterize the grain morphology. The observation area was approximately 330 × 540 µm 2 and the step size was 2 µm. To identify the ferrite and pearlite phases, image quality (IQ) was used. The IQ value of pearlite tends to be smaller than that of ferrite because pearlite is the lath structure of ferrite and cementite. In this research, we assumed that grain with large IQ value is ferrite, and grain with small IQ value is pearlite. Based on this assumption, the threshold of IQ value between ferrite and pearlite was determined so that the area fraction of the IQ map matched the volume fraction calculated in Section 2.
After the phase identification, grains were fitted into ellipses that can be characterized by three parameters: the major axis a, the aspect ratio b/a (b is the minor axis) and the ellipse orientation angle θ with the horizontal direction. Phase identification and ellipse fitting were conducted using the software EDAX TSL OIM Analysis 7 (EDAX Inc., Mahwah, NJ, USA). Figure 2 shows the result of ellipse fitting for the ferrite grain of S25C steel. The black region represents the pearlite phase. The ellipse data were then fitted with probability distribution functions (PDF) by using MATLAB R2016a (Math Works Inc., Natick, MA, USA) distribution fitting toolbox. Since the major axis distribution exhibited a positive skewness, it was fitted with the log-normal distribution whose PDF is given by where µ and σ SD are the location and scale parameters respectively. The distribution of the aspect ratio was almost symmetrical. The normal distribution was used as a fitting function. The PDF of the normal distribution is given by Since the ellipse orientation did not follow a particular distribution, it was fitted with a nonparametric distribution based on normal kernel smoothing. The fitted cumulative distribution functions and experimental data for ferrite grain of S25C are displayed in Figure 3. The microstructures of the other three steels (S35C, S45C, and K1) were also analyzed by the same procedure. Finally, four sets of probability distributions for grain morphology were obtained.

Low-Cycle Fatigue Experiments
Strain-controlled low-cycle fatigue experiments were conducted to characterize the mechanical behavior of five materials under cyclic loading. Figure 4 shows the shape of specimens in the experiments. All experiments were conducted with a constant strain rate of 0.002 s −1 and four different strain amplitudes of 0.2%, 0.3%, 0.4%, and 0.5% on a 50 kN tension-compression testing machine (Servopulser 50 kN, Shimadzu, Kyoto, Japan). The frequency was set to a different value for each strain amplitude to keep the constant strain rate. The strain was measured using an extensometer with a gauge length of 8 mm (3442-008M-005M-ST, Epsilon Technology Corp., Jackson,

Low-Cycle Fatigue Experiments
Strain-controlled low-cycle fatigue experiments were conducted to characterize the mechanical behavior of five materials under cyclic loading. Figure 4 shows the shape of specimens in the experiments. All experiments were conducted with a constant strain rate of 0.002 s −1 and four different strain amplitudes of 0.2%, 0.3%, 0.4%, and 0.5% on a 50 kN tension-compression testing machine (Servopulser 50 kN, Shimadzu, Kyoto, Japan). The frequency was set to a different value for each strain amplitude to keep the constant strain rate. The strain was measured using an extensometer with a gauge length of 8 mm (3442-008M-005M-ST, Epsilon Technology Corp., Jackson,

Low-Cycle Fatigue Experiments
Strain-controlled low-cycle fatigue experiments were conducted to characterize the mechanical behavior of five materials under cyclic loading. Figure 4 shows the shape of specimens in the experiments. All experiments were conducted with a constant strain rate of 0.002 s −1 and four different strain amplitudes of 0.2%, 0.3%, 0.4%, and 0.5% on a 50 kN tension-compression testing machine (Servopulser 50 kN, Shimadzu, Kyoto, Japan). The frequency was set to a different value for each strain amplitude to keep the constant strain rate. The strain was measured using an extensometer with a gauge length of 8 mm (3442-008M-005M-ST, Epsilon Technology Corp., Jackson, WY, USA). The stabilized stress-strain hysteresis loops were extracted at half of fatigue life. The results are shown in Figure 5. In each figure, four hysteresis loops with different strain amplitudes were plotted. Also, the cyclic stress-strain (CSS) curve was calculated by the Ramberg-Osgood relationship [14] using the maximum stress of each hysteresis loop. The curve is referred to as the "cyclic flow curve" in the literature. The Ramberg-Osgood relationship is defined as follow where K and n are material constants, and E is Young's modulus. In this study, Young's modulus was 210 GPa. The obtained constants of Ramberg-Osgood relationship are shown in Table 2. These results were used to identify the material parameters for finite element simulations. WY, USA). The stabilized stress-strain hysteresis loops were extracted at half of fatigue life. The results are shown in Figure 5. In each figure, four hysteresis loops with different strain amplitudes were plotted. Also, the cyclic stress-strain (CSS) curve was calculated by the Ramberg-Osgood relationship [14] using the maximum stress of each hysteresis loop. The curve is referred to as the "cyclic flow curve" in the literature. The Ramberg-Osgood relationship is defined as follow where and are material constants, and is Young's modulus. In this study, Young's modulus was 210 GPa. The obtained constants of Ramberg-Osgood relationship are shown in Table 2. These results were used to identify the material parameters for finite element simulations.

Generation of Synthetic Microstructure
The generation of synthetic microstructures can be conducted based on various tessellation methods. In this study, microstructures were created by an anisotropic tessellation method [11] to reproduce the detailed grain morphology. In the method, ellipses were sampled until the total area occupied by the ellipse exceeds a given threshold: where a i and b i respectively correspond to the major and minor axis of the ellipse i randomly extracted from the probability distribution functions, λ is an efficiency factor set to 20%, and A is the total area. In the reference to [11], single-phase microstructures were created. In order to generate dual-phase microstructures, Equation (4) was rewritten to Equation (5).
where V is the volume fraction of each phase. After the ellipse sampling, the ellipses were positioned in the domain using a relaxed random sequential addition (RSA) algorithm [15]. The ellipses were allowed overlapping with the condition: where β is an overlap percentage proposed by St-Pierre et al. [16] and was set to 20%. The domain was set to 200 × 200 µm 2 . Figure 6a shows the example of the final configuration of the ellipse filling process by the relaxed RSA algorithm. The blue and red ellipse show ferrite and pearlite grains, respectively. Anisotropic tessellation was conducted for the obtained ellipses. The anisotropic metric function is as follow: where the anisotropic weight matrix W i is given by: where a i , b i , θ i are the major axis, minor maxis, and orientation of the ellipse i, respectively? Figure 6b shows the result of anisotropic tessellation from Figure 6a. The black grains are pearlite, and the colored grains are ferrite. After the tessellation, a finite element model was created based on the synthetic microstructure. The mesh size was 5 × 5 µm 2 (totally 40 × 40 = 1600 elements per microstructure). Figure 6c shows the finite element model derived from Figure

Constitutive Models
Two types of constitutive models were used to reproduce some basic phenomena observed experimentally: Isotropic hardening, kinematic hardening and cyclic hardening/softening. One is macroscopic 2 model and the other is a crystal plasticity (CP) model. The 2 model is based on the second invariant of the deviatoric stress 2 . The yield function is defined by where is the macroscopic stress tensor, is the kinematic back stress tensor and is the yield stress. Isotropic hardening is modeled as follow [17] where 0 is the yield stress at zero strain, ∞ is the maximum change in the yield stress, is the rate of hardening and ̅ is the equivalent plastic strain. On the other hand, kinematic back stress is defined as follow [18] where is the number of back stress , and and are material parameters. This model is already implemented in the FEM software Abaqus. In this study, the 2 model was used to analyze the plastic deformation of pearlite grains.
The 100% pearlite finite element models were created and used for parameter identification. The parameters of 2 model were calibrated by using the optimization module of Abaqus software

Constitutive Models
Two types of constitutive models were used to reproduce some basic phenomena observed experimentally: Isotropic hardening, kinematic hardening and cyclic hardening/softening. One is macroscopic J 2 model and the other is a crystal plasticity (CP) model. The J 2 model is based on the second invariant of the deviatoric stress J 2 . The yield function is defined by where σ is the macroscopic stress tensor, α is the kinematic back stress tensor and σ y is the yield stress. Isotropic hardening is modeled as follow [17] where σ 0 is the yield stress at zero strain, Q ∞ is the maximum change in the yield stress, b is the rate of hardening and ε pl is the equivalent plastic strain. On the other hand, kinematic back stress is defined as follow [18] .
where N is the number of back stress α k , and C k and γ k are material parameters. This model is already implemented in the FEM software Abaqus. In this study, the J 2 model was used to analyze the plastic deformation of pearlite grains. The 100% pearlite finite element models were created and used for parameter identification. The parameters of J 2 model were calibrated by using the optimization module of Abaqus software so that simulation results of hysteresis loops in five cycles reach experimental results for pearlite steel. Table 3 summarizes the parameters and Figure 7 shows the simulated hysteresis loops in five cycles compared with the experimental stable hysteresis loop for the pearlite steel. The Young's modulus and Poisson's ratio were set to E = 210 GPa and ν = 0.3, respectively. Table 3. Finite element analysis parameters of pearlite.

Elasticity
Isotropic Hardening Kinematic hardening (N = 5), C k (GPa) shows the simulated hysteresis loops in five cycles compared with the experimental stable hysteresis loop for K1 steel. The elastic coefficients were taken from the literature [26]. These materials exhibit rate-dependent plasticity behavior. However, the predictions for different strain rates are beyond the scope of this paper. The numerical models were loaded at the same strain rate (0.002 s −1 ) as the experiments.    The CP model considers plastic anisotropy for each grain. In this study, cubic elasticity following Hooke's law and characterized by three coefficients, C 1111 , C 1122 , and C 1212 was considered. The phenomenological constitutive law from the Damask user material subroutine for Abaqus [19] was used to analyze the plastic deformation in the ferrite grains. In the CP model, the crystal deformation gradient F is multiplicatively decomposed into elastic and plastic part, The deformation velocity gradient L is defined as By combining Equations (13) and (14), the velocity gradient can be expressed by Assuming that plastic deformation occurs due to only dislocation slip from the {110} 111 slip system, the plastic velocity gradient L p is expressed as follow [20] where . γ α , m α and n α are plastic share rate, slip direction vector, and normal to the slip plane of the slip system α, respectively. The N is the number of slip systems and was set to 12. Plastic shear rate is expressed by using resolved shear stress (RSS) and critical resolved shear stress (CRSS) [21][22][23][24] .
where τ α = S : (m α ⊗ n α ) and τ α c are the RSS and CRSS on the slip system α, respectively. The back stress χ α is incorporated in order to consider kinematic hardening effects. Work hardening is expressed by increasing the CRSS on each slip system, as proposed by Bronkhorst et al. [25]: with the hardening matrix h αβ defined by where α and β areslip systems, q αβ = 1 for coplanar slip systems while q αβ = 1.4 for non-coplanar systems. The h 0 , a and τ c s are the hardening coefficient, the hardening exponent, and the saturated CRSS, respectively. Finally, kinematic hardening is modeled following Frederik-Armstrong law [18] .

Parameter Identification
In order to identify crystal plasticity parameters for ferrite grains, a finite element model of K1 steel was created based on the probability distributions of the K1 steel and volume fraction of 92%. The detailed identification method is written in the literature [11]. Since the K1 steel is a ferrite-pearlite steel, material constants of pearlite in Table 3 were used for calculating the plastic deformation in pearlite grains. Table 4 summarizes the crystal plasticity parameters, and Figure 8 shows the simulated hysteresis loops in five cycles compared with the experimental stable hysteresis loop for K1 steel. The elastic coefficients were taken from the literature [26]. These materials exhibit rate-dependent plasticity behavior. However, the predictions for different strain rates are beyond the scope of this paper. The numerical models were loaded at the same strain rate (0.002 s −1 ) as the experiments.

Low-Cycle Fatigue Simulations
Low-cycle fatigue simulations were conducted by FEM. The distribution of major axis and aspect ratio, the volume fraction of each phase, and the crystal orientation were necessary to create finite element models by the method explained above. The probability distributions for S25C, S35C, S45C, and K1 steels obtained in Section 2 were used to create models. The volume fraction of the ferrite phase was changed under ten conditions (10%, 20%, 100%). In the crystal orientation, three Euler angles ( 1 , , 2 ) were randomly assigned. Under the same conditions of the probability distributions, volume fraction, and orientation, different microstructures are created due to the randomness of ellipse sampling and positioning. To increase the number of data, five models were created for each condition. Totally 200 models (four grain morphologies × ten volume fractions × five models) were created.
Periodic boundary conditions were applied in the x and y directions, and analysis was carried out under plane strain condition. Strain-controlled fatigue simulations for y-direction were conducted with the same condition as experiments (strain rate of 0.002 s −1 , strain amplitudes: 0.2%, 0.3%, 0.4%, and 0.5%). The number of cycles was set to five.
In total, 800 crystal plasticity finite element method (CPFEM) analyses (200 models × four strain amplitudes) were conducted in this study. Among these results, results of models that reproduce the actual materials for both microstructure and volume fraction about S25C, S35C, and S45C steels were compared with experimental results, as shown in Figure 9. Each point is maximum stress in the hysteresis loop at half-life for the experiment and five cycles for simulation. Each curve is the Ramberg-Osgood cyclic stress-strain curve calculated with four maximum stresses. Since the model reproducing the volume fraction of S25C steel (ferrite volume fraction is 71%) was not created, Figure 9a shows the comparison result between experimental results for S25C steel and simulated results for the model with the ferrite volume fraction of 70%. The discrepancy between the experiments and simulations increased with the carbon content. The main reason is that the crystal plasticity parameters were calibrated with the polycrystalline model with lower carbon content. In

Low-Cycle Fatigue Simulations
Low-cycle fatigue simulations were conducted by FEM. The distribution of major axis and aspect ratio, the volume fraction of each phase, and the crystal orientation were necessary to create finite element models by the method explained above. The probability distributions for S25C, S35C, S45C, and K1 steels obtained in Section 2 were used to create models. The volume fraction of the ferrite phase was changed under ten conditions (10%, 20%, 100%). In the crystal orientation, three Euler angles (ϕ 1 , φ, ϕ 2 ) were randomly assigned. Under the same conditions of the probability distributions, volume fraction, and orientation, different microstructures are created due to the randomness of ellipse sampling and positioning. To increase the number of data, five models were created for each condition. Totally 200 models (four grain morphologies × ten volume fractions × five models) were created.
Periodic boundary conditions were applied in the x and y directions, and analysis was carried out under plane strain condition. Strain-controlled fatigue simulations for y-direction were conducted with the same condition as experiments (strain rate of 0.002 s −1 , strain amplitudes: 0.2%, 0.3%, 0.4%, and 0.5%). The number of cycles was set to five.
In total, 800 crystal plasticity finite element method (CPFEM) analyses (200 models × four strain amplitudes) were conducted in this study. Among these results, results of models that reproduce the actual materials for both microstructure and volume fraction about S25C, S35C, and S45C steels were compared with experimental results, as shown in Figure 9. Each point is maximum stress in the hysteresis loop at half-life for the experiment and five cycles for simulation. Each curve is the Ramberg-Osgood cyclic stress-strain curve calculated with four maximum stresses. Since the model reproducing the volume fraction of S25C steel (ferrite volume fraction is 71%) was not created, Figure 9a shows the comparison result between experimental results for S25C steel and simulated results for the model with the ferrite volume fraction of 70%. The discrepancy between the experiments and simulations increased with the carbon content. The main reason is that the crystal plasticity parameters were calibrated with the polycrystalline model with lower carbon content. In order to realize a more accurate prediction, it is necessary to improve the microstructure reconstruction using a finer mesh and to examine a parameter calibration method in the future works. order to realize a more accurate prediction, it is necessary to improve the microstructure reconstruction using a finer mesh and to examine a parameter calibration method in the future works.

Microstructure Quantification by Two-Point Correlations
Two-point correlation is one of the correlation functions. In recent years, Kalidindi and co-workers have presented the mathematical framework for quantification of microstructure based on a two-point correlation [27][28][29][30]. In this study, the two-point correlation of discretized microstructure was used to quantify microstructure. At first, the discretized microstructure was characterized with microstructure function , where is a spatial location and is a local state. The microstructure function is the probability that the local state of location is . possess the following properties: where is the number of possible local states. One-point correlation is defined by where is the number of grid points. This one-point correlation corresponds to the volume fraction. In a similar manner, two-point correlation is defined by where is the vector between two points. Figure 10 shows

Microstructure Quantification by Two-Point Correlations
Two-point correlation is one of the correlation functions. In recent years, Kalidindi and co-workers have presented the mathematical framework for quantification of microstructure based on a two-point correlation [27][28][29][30]. In this study, the two-point correlation of discretized microstructure was used to quantify microstructure. At first, the discretized microstructure was characterized with microstructure function m n s , where s is a spatial location and n is a local state. The microstructure function m n s is the probability that the local state of location s is n. m n s possess the following properties: N n=1 m n s = 1, m n s ≥ 0 (21) where N is the number of possible local states. One-point correlation is defined by where S is the number of grid points. This one-point correlation corresponds to the volume fraction. In a similar manner, two-point correlation is defined by where t is the vector between two points. Figure 10 shows a simple example of a two-point correlation. Let n = 1 for the white cell and n = 0 for the gray cell. The microstructure functions are calculated like m 1 (1,0) = 1, m 1 (3,1) = 0 and m 0 (2,3) = 1. One-point correlation (volume fraction) is also calculated as f 1 = 9/16 and f 2 = 7/16. The red arrows mean t = (2, 2), blue arrows mean t = (0, 1) and green arrows mean t = (3, −1). In general, a periodic boundary condition was applied to microstructure to calculate the two-point correlation. The two-point correlations have the following properties: The vector t can take a value between −(x − 1) ≤ t x ≤ x − 1, −(y − 1) ≤ t y ≤ y − 1 for x × y domain. By using Equation (24), two-point correlation is defined in the following range: For dual-phase microstructure (n = 0, 1), four two-point correlations, f 00 t , f 11 t , f 01 t , f 10 t can be calculated. However, only N − 1 two-point correlations are independently defined for N phase microstructure by various interrelationships [27]. Therefore, only one two-point correlation f FF t . (ferrite-ferrite) was used to quantify ferrite-pearlite microstructures.
The two-point correlations were calculated using a fast Fourier transform (FFT) [27,29,30]. At first, a discrete Fourier transform (DFT) is conducted on Equation (23) where F means DFT. Now let s + t = z, then Finally, by periodic boundary condition, m n z = m n z+S , we can get the following equation where M n k = F (m n s ) and * is the complex conjugate. Two-point correlation is easily obtained from Equation (29)

Linear Regression Model
Linear regression is one way of regression analysis in which the relationship between one or more parameters is modeled by a linear regression equation. In general, the linear regression model can be expressed as where and are input and output value, is the number of inputs, is the coefficient and is the constant. We obtained the coefficients and constant using MATLAB R2016a 'regress' function. The algorithm of this function is written in the literature [31].
Generally in linear regression models, the more complex the model becomes, the more accurate the prediction becomes. However, if the model is too complex, the model becomes over-fitted, and the prediction for unknown data does not go well [32]. In this study, parameter selection from all 2 15 − 1 combination of 15 principal components was conducted so that the model became the most accurate. Moreover, cross validation was conducted to reduce the error depending on how to divide data into training and test set. In this study, 700 training data was divided into 70 groups of 10 data for the cross validation. We used RMSE (root mean squared error, Equation (31)) as the criteria for the model selection.
where is the number of data, is the experimental (or simulated by FEM) value and ̂ is the predicted value. The cross validation was conducted for all 2 15 − 1 combinations of principal components and the combination with the smallest RMSE was used as the most accurate principal components. Finally, by using these components, coefficients and constants were calculated with all 700 training data and the obtained model was tested for 100 test data. Details of the procedure can be found in reference [7].

Linear Regression Model
Linear regression is one way of regression analysis in which the relationship between one or more parameters is modeled by a linear regression equation. In general, the linear regression model can be expressed as where y and x i are input and output value, n is the number of inputs, a i is the coefficient and b is the constant. We obtained the coefficients and constant using MATLAB R2016a 'regress' function. The algorithm of this function is written in the literature [31]. Generally in linear regression models, the more complex the model becomes, the more accurate the prediction becomes. However, if the model is too complex, the model becomes over-fitted, and the prediction for unknown data does not go well [32]. In this study, parameter selection from all 2 15 − 1 combination of 15 principal components was conducted so that the model became the most accurate. Moreover, cross validation was conducted to reduce the error depending on how to divide data into training and test set. In this study, 700 training data was divided into 70 groups of 10 data for the cross validation. We used RMSE (root mean squared error, Equation (31)) as the criteria for the model selection.
where N is the number of data, y i is the experimental (or simulated by FEM) value andŷ i is the predicted value. The cross validation was conducted for all 2 15 − 1 combinations of principal components and the combination with the smallest RMSE was used as the most accurate principal components. Finally, by using these components, coefficients and constants were calculated with all 700 training data and the obtained model was tested for 100 test data. Details of the procedure can be found in reference [7].

Neural Network Model
An artificial neural network is one way of machine learning which is inspired by biological neural networks. It can approximate almost any function by adjusting the weights and biases of each layer and unit. The general feedforward neural network consists of three layers: Input, hidden, and output layer. In each unit in hidden layers, calculation, as shown in Equation (32), is conducted and the result h j is sent to the next layer.
where x i is the input value from ith unit in the former layer, I is the number of units in the former layer, w ij and b j are weight and bias, and ϕ is a differentiable non-linear function called "activation function." In the output unit, identity function ϕ(x) = x is generally used as the activation function. Thus, the output is calculated as where J is the number of the last hidden units. Backpropagation [33] is widely used to set weights and biases suitably in neural network training. Weights and biases are adjusted to minimize the sum of squared of error between dataset and predicted value as where y k and o k are dataset and predicted value, respectively. Various calculation procedures for the error reduction of backpropagation have been proposed, such as the steepest descent method and the Gauss-Newton method. Levenberg-Marquardt method [34,35] is the combination of the steepest descent method and the Gauss-Newton method and can finish training fast and accurately. As mentioned above, neural networks can approximate almost any function. In other words, it is important to be careful with over-fitting. The Bayesian framework is widely studied to control this problem [2,32,36]. In a neural network with Bayesian framework, weights and biases are adjusted to minimize the following function instead of E in Equation (34): where E D is the same as Equation (34). E w is the sum of squared of weights and it works to reduce the value of weights. Two parameters α and β are at first constant parameters corresponding to the variance of weights and data in the beginning, respectively. The α and β are updated during the training process. The parameter adjusting process is a substitute for validation data. Thus only training and test data are necessary for a neural network with Bayesian framework. In MATLAB R2016a Neural Network Toolbox, various algorithms for neural network training are prepared. In this study, we used the Levenberg-Marquardt method with a Bayesian framework from these prepared algorithms. Various activation functions have been used in neural networks. Sigmoid function (Equation (37)) was used commonly at the beginning of neural network study because it is simple and easy to differentiate.
After that, Glorot and Bengio [37] suggested that hyperbolic tangent (Equation (38)) is better and later, Glorot et al. [38] presented that Rectified Linear Unit (ReLU, Equation (39)) is better than hyperbolic tangent. LeCun et al. [39] reported in 2015 that ReLU is the best activation function. However, ReLU is a very simple function and expressing ability by itself is poor and is not necessarily the best activation function for the network with few hidden layers.
In this study, hyperbolic tangent and ReLU were compared. In general, input and output values are normalized before training to eliminate the deviation of data. In this study, both the input and output variables except strain amplitude were normalized so that the average becomes zero and the variance becomes one (standardization) as follows: where x i is the input/output value of ith data, x i is the normalized value and x and s are the average and standard deviation. The training of the neural network with multi hidden layers is called "deep learning." The improvement of computer performance and a vast amount of data obtained by the development of the Internet boost the research of deep neural networks. It is considered that prediction accuracy is improved with deep learning, but the general deep learning is conducted with more than thousands of data. Only 800 data is available in this study, and it is not appropriate to use deep learning in this case. Thus, the number of hidden layers was set to only 1 or 2 in this study. Moreover, the number of units in hidden layers may affect prediction accuracy. In this study, three conditions of hidden layers were compared.

•
One hidden layer with five units • One hidden layer with ten units • Two hidden layers with ten and five units Together with two activation functions, six neural networks were constructed and trained/tested with the same data. The network with the minimum RMSE value for test data was used as the best neural network model.

Microstructure-Property Dataset
In Section 4, 800 CPFEM analyses (200 models × 4 strain amplitudes) were conducted. The principal components of two-point correlations of 200 finite element models were calculated in Section 5. The 15 principal component scores and strain amplitude were set to input values. The calculated maximum stress σ max in five cycles was set to output value. We have 800 microstructure-property data in total. The data were sorted randomly and divided into two groups: 700 for training and 100 for testing. Although the dataset is recommended to divide three groups of training, validation, and testing [32], we used the machine learning algorithm that does not require a validation dataset. Figure 11 shows the result of parameter selection with the cross validation in the linear regression model. The horizontal axis is the number of principal components, and the vertical axis is the minimum RMSE value of the model of each number of parameters. It can be seen that RMSE reached a minimum when the model had ten principal components (1,2,6,7,8,9,11,13,14, and 15th principal component). Figure 12 shows the prediction result with these ten principal components. The horizontal axis is the maximum stress σ max calculated by CPFEM (dataset value) and the vertical axis is σ max predicted by regression formula. This figure means that the prediction accuracy is higher as the plot is closer to the straight line. The RMSE was 11.6 MPa for training data and 12.1 MPa for test data. regression model in Figure 12. Also, RMSE was 1.38 MPa for training data and 1.37 MPa for test data. It is clear both visually and quantitatively that the neural network model was more accurate than the linear regression model. Moreover, these obtained models were compared with CPFEM results. Two comparisons were conducted by changing only one parameter. First, only strain amplitude was changed with fixed grain morphology (K1) and ferrite volume fraction (50%). The result is shown in Figure 14a. Cyclic stress-strain curves as Ramberg-Osgood relationships are calculated and also plotted. Second, only the ferrite volume fraction was changed with fixed grain morphology (S25C) and strain amplitude (0.3%). The result is shown in Figure 14b. As shown in these figures, machine learning, especially the neural network model, could predict the fatigue property accurately. Figure 11. Result of parameter selection in linear regression model. Figure 11. Result of parameter selection in linear regression model. data. It is clear both visually and quantitatively that the neural network model was more accurate than the linear regression model. Moreover, these obtained models were compared with CPFEM results. Two comparisons were conducted by changing only one parameter. First, only strain amplitude was changed with fixed grain morphology (K1) and ferrite volume fraction (50%). The result is shown in Figure 14a. Cyclic stress-strain curves as Ramberg-Osgood relationships are calculated and also plotted. Second, only the ferrite volume fraction was changed with fixed grain morphology (S25C) and strain amplitude (0.3%). The result is shown in Figure 14b. As shown in these figures, machine learning, especially the neural network model, could predict the fatigue property accurately.   Table 5 shows the comparison result of six neural network conditions. RMSE for test data became minimum value when the activation function was hyperbolic tangent, and the number of hidden layers was two with ten and five units. The prediction result with this condition is shown in Figure 13. It can be visually confirmed that plots are closer to the straight line than those of the linear regression model in Figure 12. Also, RMSE was 1.38 MPa for training data and 1.37 MPa for test data. It is clear both visually and quantitatively that the neural network model was more accurate than the linear regression model.     Moreover, these obtained models were compared with CPFEM results. Two comparisons were conducted by changing only one parameter. First, only strain amplitude was changed with fixed grain morphology (K1) and ferrite volume fraction (50%). The result is shown in Figure 14a. Cyclic stress-strain curves as Ramberg-Osgood relationships are calculated and also plotted. Second, only the ferrite volume fraction was changed with fixed grain morphology (S25C) and strain amplitude (0.3%). The result is shown in Figure 14b. As shown in these figures, machine learning, especially the neural network model, could predict the fatigue property accurately.

Comparison of Simulation and Experimental Results
Experimental and simulated cyclic stress-strain curves were shown in Figure 9. As shown in the figure, the simulated results for S25C matched experimental results, but results for S35C and S45C did not match well. Since the mesh size was 5 µm in these simulations, the lower limit of grain size was 5 µm. However, the actual specimen, which was used for parameter identification, contains smaller grains than 5 µm. Thus, there is the possibility that the grain size in 2D modeling was overestimated. Since the parameters of the crystal plasticity model in ferrite grains were fitted to the K1 steel with the ferrite volume fraction of 92%, the prediction accuracy of the cyclic stress-strain curve decreased with increasing the divergence of volume fraction from K1. It seems that this was the reason that the simulated results for S25C matched experimental result but results for S35C and S45C did not.
There are two approaches to solve this problem. One is to use finer mesh, and the other is to modify the material parameters. The former method increases the calculation time and is not suitable for preparing a large dataset. The latter approach can provide a better fitting for S35C and S45C steels by adjusting the material property parameters. However, this approach may provide poor agreement with the experimental data of K1 and S25C steels. Therefore, identified parameters in this study were reasonable in case of analyzing various volume fractions with one material property.

Comparison of Machine Learning and Experimental Results
Finally, σ max was predicted from the result of microstructure analysis by the obtained prediction model, and it was compared with the experimental result. The square area of 200 × 200 µm 2 was trimmed from the result of EBSD analysis for S25C, S35C and S35C steels (e.g., Figure 2). These trimmed images were meshed by 5 µm and binarized to ferrite and pearlite. Then, two-point correlation f FF t was calculated and principal component scores were calculated using principal component coefficient obtained from 200 finite element models. These 15 principal component scores and strain amplitude (0.2%, 0.3%, 0.4%, 0.5%) were substituted for obtained linear regression/neural network model and σ max was predicted. Finally, the cyclic stress-strain curve as the Ramberg-Osgood relationship was calculated and compared with experimental results.
The results are shown in Figure 15. The red points are experimental σ max for each hysteresis loop and three curves are cyclic stress-strain curves. In the result for S25C (Figure 15a), there is no large difference between two prediction results and these results showed good agreement with the experimental result. On the other hand, in the result for S35C and S45C (Figure 15b,c), there is a difference between the prediction results of linear regression and the neural network model, and the experimental result lay between two prediction results. For S25C steel, there is a good agreement of experimental, simulated and predicted results. Therefore, the prediction method proposed in this study is effective for fatigue property prediction. As shown so far, cyclic stress-strain property could be predicted from microstructure. Recently, the Tanaka-Mura model has been modified and shown to be useful for the prediction of crack initiation in various metallic materials [40,41]. By combining the Tanaka-Mura model and CPFEM, as explained in our previous works [11,12], this prediction can be linked to fatigue life prediction.

Conclusions
In this study, the finite element method, two-point correlation and machine learning were combined to propose the new method to predict cyclic stress-strain property from the microstructure of ferrite-pearlite steel. Based on the experiment, simulation and prediction results and discussion presented in the preceding sections, the following conclusions were obtained.  The result of the finite element analysis showed good agreement with the experimental results. The results confirmed that the material parameters identified in this study were appropriate for fatigue analysis.  Cyclic stress-strain property of ferrite-pearlite steel could be predicted with high accuracy by combining two-point correlation and machine learning. Also, the prediction error of the neural network model was smaller than that of the linear regression model.  Cyclic stress-strain property predicted from the result of microstructure analysis by the model obtained by machine learning showed a good agreement with the experimental results. Thus, the prediction method proposed in this study was shown to be effective for fatigue property prediction.

Conclusions
In this study, the finite element method, two-point correlation and machine learning were combined to propose the new method to predict cyclic stress-strain property from the microstructure of ferrite-pearlite steel. Based on the experiment, simulation and prediction results and discussion presented in the preceding sections, the following conclusions were obtained.

•
The result of the finite element analysis showed good agreement with the experimental results. The results confirmed that the material parameters identified in this study were appropriate for fatigue analysis.

•
Cyclic stress-strain property of ferrite-pearlite steel could be predicted with high accuracy by combining two-point correlation and machine learning. Also, the prediction error of the neural network model was smaller than that of the linear regression model.

•
Cyclic stress-strain property predicted from the result of microstructure analysis by the model obtained by machine learning showed a good agreement with the experimental results. Thus, the prediction method proposed in this study was shown to be effective for fatigue property prediction.