Development of Closed-Form Equations for Estimating Mechanical Properties of Weld Metals according to Chemical Composition

Abstract: In this study, data analysis was performed using an artificial neural network (ANN) approach to investigate the effect of the chemical composition of welds on their mechanical properties (yield strength, tensile strength, and impact toughness). Based on the data collected from previously performed experiments, correlations between related variables and results were analyzed and predictive models were developed. Sufficient datasets were prepared using data augmentation techniques to solve problems caused by insufficient data and to make better predictions. Finally, closed-form equations were developed based on the predictive models to evaluate the mechanical properties according to the chemical composition.


Introduction
As welding is applied in almost all industrial fields, it is crucial in modern industries. Numerous studies have been conducted to improve the quality of weldments. The quality of a weld depends on its mechanical properties such as yield and tensile strengths, impact toughness, and hardness. These properties are determined by parameters such as chemical composition, microstructure, heat input, interpass temperature, and preheating temperature [1].
In terms of the microstructure, acicular ferrite (AF) is formed at a low heat input with a fast cooling rate, improving the low-temperature toughness. It grows in the form of laths and plates and forms an interlocking structure, which prevents crack propagation. As the AF fraction increases, the strength also increases. Grain boundary ferrite (GBF) with large grain sizes is frequently generated at a slow cooling rate, which is a condition of high heat input. Ferrite side plates (FSPs) consist only of the boundary between laths grown in the same direction at the austenite grain boundary. Both GBF and FSPs have a significant adverse effect on toughness owing to their low crack resistance [1,2]. In addition, martensite-austenite (M-A) constituents, which are frequently generated in high heat input welding, are microstructures that adversely affect the transition temperature as their fraction is increased [3].
In terms of welding conditions, the higher the interpass temperature, the slower the cooling rate and the lower the AF fraction. Therefore, the tensile and yield strengths decrease, and the transition temperature also tends to decrease. In the case of a large heat input, the AF fraction decreases, and the tensile and yield strengths decrease. However, in the case of an excessively low heat input, the impact toughness is adversely affected; therefore, an appropriate amount of heat should be applied [4-6].
Thus far, numerous studies have been conducted to determine the effect of chemical composition on mechanical properties. Shao et al. [7] studied the effect of chemical composition on the fracture toughness of bulk metallic glasses. Balaguru et al. [8] studied the effect of weld metal composition on the impact toughness properties of SMAW-welded ultrahigh hard armor steel joints. Glover et al. [4] studied the effect of cooling rate and chemical composition on the microstructure of weld joints of C-Mn and HSLA steels. Takashima and Minami [9] predicted the Charpy absorbed energy of steel for welded structures in the ductile-brittle transition temperature (DBTT) range. Jorge et al. [10] reviewed the relationship between the microstructure and the impact toughness of C-Mn and high-strength low-alloy steel weld metals based on the work of Evans and Bailey [1]. Khalaj and Poraliakbar [11,12] predicted the effects of chemical composition and heat treatment on the phase transformation of microalloy steel using ANN models to estimate the bainite fraction using the austenitization temperature as a parameter. Pak et al. [13] predicted the impact toughness change owing to the interlayer temperature and Ni and Mn concentrations using ANN models. Jung et al. [14] predicted the yield strength, ultimate tensile strength, and high-strength yield for various microstructures by means of soft magnetic wave linear regression and ANN-based algorithms. He et al. [15] devised a physical model to predict the yield stress of bainitic steel by dislocation strengthening and lath boundary strengthening based on the correlation between dislocation density, lath thickness, and yield stress.
Whereas numerous studies have been carried out on the effect of chemical composition on mechanical properties of weld metals, most of the studies were limited to specific conditions or chemical components. There are few studies that formulate experimental results for various chemical components so that their effects can be simply presented. Therefore, it is worthwhile to propose closed-form equations developed by a statistical method.
In this study, artificial neural networks (ANNs), a field of machine learning, were used to determine the effect of chemical composition on mechanical properties (yield strength, tensile strength, and impact toughness). Based on the collected data, correlations between the related variables and the results were analyzed, and a predictive model was developed. The experimental results were extracted from Evans and Bailey [1], and a sufficient dataset was prepared using a data augmentation technique to make better predictions. Finally, closed-form equations were developed based on the predictive models to evaluate the mechanical properties according to the chemical composition.

Data Collection
In this study, experimental data were extracted from Evans and Bailey [1] to understand changes in the mechanical properties with respect to the chemical composition. Mn, which is known to improve the strength significantly, was used as the base. Next, results of experiments on the yield strength, tensile strength, and impact toughness according to the increase in the content of each alloying element were analyzed. To evaluate the impact toughness, the Charpy V-notch (CVN) transition temperature at 100 J was applied, hereinafter referred to as the CVN temperature. Figure 1a,b shows the shape of the weld used in this study and the sampling method of the specimen. Shielded metal arc welding (SMAW) was applied as the welding technique. The base material and the electrode were welded by using the arc heat generated between a coated electrode and a metal. Herein, ISO 2560:2020, the welding standard for mild steel and low-alloy steel, was applied [16].
The welding electrode was manufactured using the standard technique for a 25% iron powder-coated electrode, and all components were kept constant, except for the components used in the investigation. The automatic spectrographic (ICP-AES) technique was applied for chemical composition analysis. All the core wires of the welding electrode had similar typical compositions of 0.07 C, 0.50 Mn, 0.008 Si, 0.006 S, 0.008 P, 0.02 Cr, 0.003 Mo, 0.03 Ni, 0.02 Cu, 0.0004 Ti, 0.0015 Al, 0.0005 Nb, 0.0005 V, 0.0002 B, 0.02 O, and 0.0025 N (wt%).
The weldment was performed in three beads per layer, as shown in Figure 1a. Two 20 mm-thick mild steel plates were attached to a backing strip, and a jig in a flat (downhand) position with a stringer bead was welded. The root gap size was designed to be 16 mm as shown in Figure 1b. Welding conditions were applied such that the current was 170 A, the voltage was 21 V, the interpass temperature was maintained at 200 °C, and the welding speed was adjusted so that the heat input was 1 kJ/mm. For the interpass temperature, the maximum interpass temperature suitable for welding short blocks in ISO 2560:2020 was applied.
The cross section of the panel after welding is shown in Figure 1b. The mechanical properties were tested under 'as-welded' conditions, and the specimens for tensile testing were heated at 250 °C for 14 h to remove diffusive hydrogen. Duplicate tests were conducted with ISO 6892 standard specimens using a gauge with a length of 500 mm and a diameter of 5 mm in a direction parallel to the welding direction.
In the case of metals, the fracture behavior varies from ductile to brittle as the temperature changes from high to low. This change in the fracture mode depends on the material's ability to absorb fracture energy. Evans and Bailey [1] conducted an impact toughness test using a CVN impact test, and the specimen was manufactured by machining an axis perpendicular to the welding direction and a notch perpendicular to the plate surface, as shown in Figure 1b. The CVN impact test followed the E23 test procedure of the American Society for Testing and Materials (ASTM). Figure 2 shows a typical CVN impact energy curve with respect to the temperature. As shown in Figure 2, the curve is divided into three regions: the upper shelf, lower shelf, and transition. The upper shelf is a ductile region, the lower shelf is a brittle region, and the transition region provides the reference temperature, the DBTT, for defining the failure mode. The Charpy impact toughness is generally evaluated based on the transition temperature at absorbed impact energies of 28 J and 100 J. In this study, we applied only the 100 J impact energy from a conservative point of view. The 100 J transition temperature data used in this study were obtained from the impact energy and temperature curves obtained through 36 impact tests for each composition condition.
Table 1 shows the composition of the weld metal with varying alloy and Mn content used in this study. Nominal Mn contents of 0.6%, 1.0%, 1.4%, and 1.8% were used, and for each Mn content, three to five varying contents of each element were tested. Table 2 shows typical compositions, except for the elements under investigation.

Data Augmentation
A sufficiently large dataset is required for accurately determining the relationship between the input and the output because the ANN is a data-driven approach. However, the amount of publicly available data on changes in mechanical properties with respect to weld chemical composition is small. Although experiments have been conducted in a large number of studies, it is difficult to create a single dataset owing to different welding methods and conditions. Although the data by Evans and Bailey [1] applied in this study are relatively diverse and provide many experimental results, the amount of data for individual experiments is still insufficient, which may lead to inaccurate predictions. Therefore, in this study, the amount of initial input data was increased by applying a data augmentation technique. Figure 3 shows an example of the data augmentation applied in this study. There are only four Mn test data points for the 0.04 C condition. Because the amount of data is small and the trends are different, training the ANN directly may cause an inaccurate fitting or overfitting. Therefore, the test data (asterisks in Figure 3) were first plotted for each condition, and then nonlinear regression was performed with a line that expressed the test data as closely as possible. Finally, ten additional data points (circles in Figure 3) were generated along the regression line. Although scatter exists owing to the nature of the experimental data, it is expected that a predictive model with better performance can be generated because the finely tuned data after direct fitting are used for model training.
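The augmentation step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the regression form is not specified in the text, so a second-order polynomial is assumed here, and the function name `augment_by_regression` and all numerical values are hypothetical placeholders rather than data from Evans and Bailey [1].

```python
import numpy as np

def augment_by_regression(x, y, n_new=10, degree=2):
    """Fit a polynomial to sparse test points and sample n_new points on the fitted curve.

    The polynomial form (degree=2) is an assumption made for this sketch.
    """
    coeffs = np.polyfit(x, y, deg=degree)             # least-squares regression
    x_new = np.linspace(np.min(x), np.max(x), n_new)  # evenly spaced within the tested range
    y_new = np.polyval(coeffs, x_new)                 # points lying on the regression curve
    return x_new, y_new

# Placeholder values for illustration only (NOT measured data).
mn_wt_pct = np.array([0.6, 1.0, 1.4, 1.8])           # four nominal Mn contents
cvn_temp_c = np.array([-40.0, -52.0, -55.0, -45.0])  # hypothetical 100 J transition temperatures

mn_aug, temp_aug = augment_by_regression(mn_wt_pct, cvn_temp_c)
```

Sampling only within the tested Mn range keeps the added points as interpolations of the fitted curve and avoids extrapolating beyond the experiments.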


ANN Model
As shown in Table 1, in this study, we intended to create ANN models that can predict mechanical properties of welds using two variables (chemical compositions). If only one output is obtained from two inputs, multivariate nonlinear regression can be applied because it can be expressed in three dimensions. However, it is difficult to use the existing statistical method to create a model that calculates three outputs (yield strength, tensile strength, and 100 J Charpy temperature) simultaneously. Therefore, it is required to apply the ANN, which is a machine-learning approach that can effectively express multivariate nonlinear systems. The ANN has no limit to the number of input or output variables. If an appropriate structure is used, multiple outputs can be simultaneously derived from multiple inputs. Figure 4 shows the structure of the ANN model used in this study. This neural network had one input layer, one hidden layer, and one output layer. In the figure, the circle represents the node constituting each layer, and the connecting line represents the weight and the bias representing the relationship between each node. Two input vectors (for example, the content of Mn and the content of C) are fed into the input layer and the output vectors (yield strength, tensile strength, and 100 J Charpy transition temperature) are fed into the output layer. Weights and biases between each node are calculated according to the predetermined ANN model structure. Because the number of hidden layer nodes determines the nonlinearity of the entire system, it should be optimized for the corresponding system. That is, too few nodes result in high computation speed but poor accuracy, and many nodes can lead to overfitting and slow computation.

In this study, the number of nodes in the hidden layer was determined using a case study, as shown in Figure 5. For each model, two to five hidden layer nodes were tested and compared in terms of prediction performance. It was observed that the results almost converged if the number of hidden layer nodes exceeded two. Therefore, in consideration of efficiency, three nodes were selected for the hidden layer. The prediction performance was evaluated using the Pearson correlation coefficient between the predicted and target values. That is, the closer it is to one, the better the prediction result.
Equation (1) presents the ANN model applied in this study in the matrix form. 'tanh' was used as the activation function for the nonlinearity of the model, and the backpropagation algorithm was applied for weight and bias optimization. For calculation efficiency, all values applied to the calculation were normalized between −1 and 1, as in Equation (2). Therefore, the final derived outputs should be denormalized by inverting Equation (2).

where X_N is the normalized value and X_R is the original value. X_max and X_min are the maximum and minimum values of X_R, respectively.
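A minimal sketch of a min-max mapping onto [−1, 1] and its inverse, consistent with the definitions of the normalized value, original value, maximum, and minimum given above. The exact linear form is an assumption (it is the mapping commonly used for this purpose), and the function names `normalize` and `denormalize` are illustrative.

```python
import numpy as np

def normalize(x_r, x_min, x_max):
    """Map original values x_r from [x_min, x_max] onto [-1, 1]."""
    x_r = np.asarray(x_r, dtype=float)
    return 2.0 * (x_r - x_min) / (x_max - x_min) - 1.0

def denormalize(x_n, x_min, x_max):
    """Inverse mapping: convert normalized values back to the original scale."""
    x_n = np.asarray(x_n, dtype=float)
    return (x_n + 1.0) * (x_max - x_min) / 2.0 + x_min
```

Under this assumption, the minimum of a variable maps to −1 and the maximum to 1, which matches the output range of the tanh activation.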
To evaluate the performance of the trained model, 15% of the original dataset was used as a test dataset. That is, 85% of the total dataset was used for model training and 15% was used for model evaluation. Moreover, of that 85%, a further 15% was separated and used for model validation to prevent overfitting [18]. In summary, the original dataset was divided into training, validation, and testing sets, as illustrated in Figure 6.
Figure 6. Data splitting ratio.
Figure 7a,b shows the model performances at the training and testing stages for the Mn-C model, respectively, and Figure 7c presents the performance after combining these two stages. Here, the x-axis represents the target value, and the y-axis represents the predicted value. R represents the Pearson correlation coefficient. Table 3 shows R values for all models. It can be seen that all models show high prediction accuracy.
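The splitting scheme and the performance metric described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the random shuffling, the rounding of subset sizes, and the names `split_indices` and `pearson_r` are not taken from the study.

```python
import numpy as np

def split_indices(n_samples, test_frac=0.15, val_frac=0.15, seed=0):
    """Hold out test_frac of all samples, then val_frac of the remainder for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)                     # shuffle sample indices
    n_test = int(round(test_frac * n_samples))           # 15% of the full dataset
    n_val = int(round(val_frac * (n_samples - n_test)))  # 15% of the remaining 85%
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test

def pearson_r(target, predicted):
    """Pearson correlation coefficient R between target and predicted values."""
    return float(np.corrcoef(target, predicted)[0, 1])
```

For 100 samples, this yields 15 test samples, 13 validation samples, and 72 training samples; an R value close to one indicates that predictions track the targets closely.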

Closed-Form Equations
The previous section discussed the ANN model development for predicting the mechanical properties according to changes in the content of Mn and other compositions. Developing an ANN model is the process of deriving the connection between each node, that is, weight and bias, as shown in Equation (1), through the 'learning' process based on the given data. Therefore, it is possible to use Equation (1) with the calculated weights and biases as a closed-form equation. We prepared the ANN models for each case derived in this study using Equation (3). The models for all eight cases, as summarized in Table 1, are presented in the matrix form. For example, n = 1 represents a model for Mn-C, and n = 2 represents Mn-Cr. To calculate the CVN temperature, yield strength, and tensile strength according to the Mn-C content, the Mn and C contents are entered into the P1 vector. Each value of the y1 vector is then calculated through a matrix operation. However, because these values are normalized, as mentioned in Section 3.1, the input values should be normalized first before applying Equation (3). The values required for normalization are listed in Table 4. The final calculated y1 must be denormalized again to convert it to the original scale.
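The evaluation procedure described above (normalize the inputs, apply the matrix operation, denormalize the outputs) can be sketched as follows. The sketch assumes a single hidden layer of three tanh nodes followed by a linear output layer and a min-max mapping onto [−1, 1]. The linear output layer, the function name `predict_properties`, and every numerical value below are assumptions for illustration only; they are not the coefficients of Equation (3) or the limits listed in Table 4.

```python
import numpy as np

def predict_properties(p_raw, W1, b1, W2, b2, in_min, in_max, out_min, out_max):
    """Evaluate a trained 2-input, 3-output tanh network as a closed-form expression."""
    p_raw = np.asarray(p_raw, dtype=float)
    p = 2.0 * (p_raw - in_min) / (in_max - in_min) - 1.0     # normalize inputs to [-1, 1]
    h = np.tanh(W1 @ p + b1)                                 # hidden layer (tanh activation)
    y = W2 @ h + b2                                          # output layer (assumed linear)
    return (y + 1.0) * (out_max - out_min) / 2.0 + out_min   # back to the original scale

# Hypothetical placeholder coefficients (NOT the trained values from the study).
W1 = np.array([[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]])  # hidden weights, shape (3, 2)
b1 = np.array([0.1, -0.2, 0.05])                       # hidden biases
W2 = np.array([[0.4, -0.1, 0.3],
               [0.7, 0.2, -0.5],
               [0.6, 0.3, -0.4]])                      # output weights, shape (3, 3)
b2 = np.array([0.0, 0.1, -0.1])                        # output biases

# Hypothetical normalization limits: inputs are (Mn, C) in wt%;
# outputs are (CVN temperature in deg C, yield strength in MPa, tensile strength in MPa).
in_min, in_max = np.array([0.6, 0.04]), np.array([1.8, 0.15])
out_min, out_max = np.array([-80.0, 350.0, 450.0]), np.array([0.0, 600.0, 700.0])

props = predict_properties([1.4, 0.07], W1, b1, W2, b2, in_min, in_max, out_min, out_max)
```

Once the weights, biases, and normalization limits of a given model are substituted, no training software is needed: the prediction reduces to two matrix-vector products and one tanh evaluation.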


Model Performance Estimation and Discussion
The results calculated using the ANN model (Equation (3)) were compared with the original test data to estimate the model performance. Figures 8-10 show the comparison between the calculated and experimental results using one of the eight models (Mn-Ni) in terms of 100 J Charpy temperature, yield strength, and tensile strength, respectively. The comparison results for the rest of the models for temperature are shown in Figures A1-A7 in Appendix A. In the case of yield and tensile strengths, we only included one case (Figures 9 and 10) because all other cases showed a nearly linear relationship and were almost consistent with the experimental results. In the figure, the lines represent the estimation (E), and the symbols represent the tested data (T).
It can be seen that the trend of the fitting results for the experimental results varies greatly depending on the type or content of the alloying elements. For example, in Figure 8a, the temperature difference is not large depending on the Mn content in 0.5 Ni, and the temperature does not increase even when the Mn content is increased. However, as the Ni content is increased, the temperature increases rapidly with the Mn content, particularly for 3.5 Ni (Figure 8d). As shown in the other figures in Appendix A, the trends are all different for the other elements. This indicates that the nonlinearity is large, depending on the type or content of the chemical component. In contrast, in terms of yield and tensile strengths, although the slopes are slightly different, they show an almost perfect linear relationship with the Mn content, regardless of the type or content of the component.
In most cases, the estimated and experimental values show good agreement. However, in some cases, for example, in the case of the CVN temperature of Mn-1.1Mo in Figure A4, an estimation error occurs. This seems to be because, in this case, the experimental results have greater nonlinearity than in the other cases. When the nonlinearity of the data used in the ANN model is high, the fitting accuracy can be improved by increasing the number of hidden layers. However, if the number of hidden layers is increased, there is a risk of overfitting. Therefore, the decision should be made by considering the overall data trend. In the case of the CVN temperature of Mn-1.1Mo, this error seems to be unavoidable because only this case has a large nonlinearity. It is expected that as the amount of experimental data is increased, the accuracy can be increased further. In contrast, for all cases of yield and tensile strengths, the prediction results show high accuracy, as compared with the test results, because the data used for the model training show a strong linearity.

Conclusions
In this study, existing experimental data were collected to investigate changes in the following mechanical properties: 100 J Charpy temperature, yield strength, and tensile strength, depending on the chemical composition. Trends were analyzed by applying an ANN to the data. Data augmentation was performed to solve the problem of insufficient data caused by the dependence of experimental results on specific conditions. Finally, closed-form equations were developed based on the coefficients derived from the ANN models to facilitate a prediction. Based on these results, the following conclusions were drawn:

• By increasing the amount of data through data augmentation, the performance of the ANN model improved. Inaccurate regression that may occur due to the insufficient number of experimental results was prevented in advance, which enabled efficient ANN model training. However, some cases of CVN temperature showed an estimation error owing to the large nonlinearity in the data used for the ANN training. Because each condition has a different tendency, accurate regression could not be made in this case with relatively large nonlinearity. For a better predictive model, securing more experimental results is essential. In contrast, the yield and tensile strengths showed high accuracy, as the data showed a linear relationship.

• The developed ANN models are presented in the form of vectors and matrices. Therefore, the three mechanical properties considered as targets in this study can be calculated by inputting the content of each component through a simple matrix operation.

• However, because each ANN model developed in this study only considered changes in the content of two elements, there is a limitation in that an accurate prediction cannot be performed if any element with a content different from that of the specimen used for the ANN model is included. That is, the results of this study can be mainly used to predict the relative increase or decrease according to the change in the content of two elements, including Mn.