Real-Time Detection of Weld Defects for Automated Welding Process Base on Deep Neural Network

Abstract: In the process of welding zinc-coated steel, zinc vapor causes serious porosity defects. Porosity is an important indicator of weld quality and degrades the durability and productivity of the weld. Therefore, this study proposes a deep neural network (DNN)-based non-destructive testing method that can detect and predict porosity defects in real-time from the welding voltage signal, without requiring additional devices in the gas metal arc welding (GMAW) process. To this end, a galvannealed hot-rolled high-strength steel sheet applied to automotive parts was used to measure process signals in real-time. Then, feature variables were extracted through preprocessing, and the correlation between the feature variables and weld porosity was analyzed. The proposed DNN-based framework outperformed the artificial neural network (ANN) model by 15% or more. Finally, an experiment was conducted using the developed porosity detection and prediction system to evaluate its field application.


Introduction
Recently, automobile manufacturers have increased the application of galvannealed hot-rolled high-strength steel sheets to improve crash stability and corrosion resistance. In particular, automobile chassis parts, such as cross-members and lower arms, which support the vehicle weight and transmit power, are mostly assembled using gas metal arc welding (GMAW). Hence, high-quality arc welding is necessary. Currently, one of the major materials used for automobile parts is 590 MPa grade hot-rolled high-strength steel, and the application rate of highly corrosion-resistant galvannealed steel sheets is increasing to prevent corrosion. However, porosity defects in welds, such as blowholes or pits, are a serious problem; they are caused by zinc vapor formed by the heat energy during the welding of zinc-coated steel. Specifically, this zinc vapor remains trapped in the weld, resulting in the formation of porosity and weld defects [1][2][3][4]. These defects cause serious strength degradation and changes in the mechanical characteristics of the weld, resulting in lower durability and productivity. Thus, when evaluating weld quality during the GMAW process, porosity defects are considered an important factor. Although ultrasonic inspection and radiographic inspection are applied to production lines as non-destructive testing methods, their application is limited because they are difficult to apply in real-time and too costly to apply to all automotive parts.
To overcome this problem, many studies have evaluated weld quality by analyzing welding process signals, including the current signal and the voltage signal [5][6][7][8][9][10][11][12]. Quinn et al. developed a welding quality evaluation method using the average welding current and arc voltage waveforms in the GMAW process [13]. Adolfsson et al. evaluated welding quality by calculating the arc and short-circuit times, short-circuit count, short-circuit peak current, and average welding current and arc voltage waveforms in the GMAW process and employed these quantities in the sequential probability ratio test (SPRT) algorithm [14]. Furthermore, Chu et al. proposed an efficient means of verifying welding quality through power spectrum and time-frequency spectrum analysis of welding current and arc voltage waveforms in the GMAW process [15]. Shin et al. studied the prediction of porosity in the appearance of beads in the GMAW process using a regression model and an ANN model [16]. Recently, research has been conducted to detect defects by combining manufacturing technology and artificial intelligence (AI), such as machine learning in non-destructive testing (NDT) [17][18][19]. However, these studies evaluated defects that were artificially produced on test sheets to test their algorithms. In addition, additional testing equipment, such as a vision sensor, charge-coupled device (CCD) camera, or spectrometer, was installed to evaluate the quality.
This paper presents a unique NDT method to detect and predict porosity defects in real-time for the GMAW process. A data acquisition device was used to measure the welding voltage signal generated during welding in real-time. Then, feature variables were extracted from the welding voltage signal by preprocessing the data at intervals of 0.1 s. The significant feature variables were selected by analyzing the correlation between the feature variables and porosity defects, and classification models based on an artificial neural network (ANN) and a deep neural network (DNN) were developed using them as input variables. In addition, the predictability of the two models was compared and evaluated using the test data and verified by applying the developed system to an actual manufacturing facility.

Material
In this study, galvannealed hot-rolled steel with an ultimate strength of 590 MPa and a thickness of 2.3 mm was used as the test material. Hot-rolled steel sheet is a steel plate processed to a suitable length and thickness by passing a slab between rolls at a high temperature of more than 800 °C. Its surface quality is worse than that of cold-rolled steel sheet, which undergoes additional processing; however, it is more economical than cold-rolled steel sheet and is therefore used for automobile chassis parts. The galvannealed coating thickness of the material is approximately 10 μm, and the mechanical properties and chemical composition are shown in Table 1. For the welding test, as shown in Figure 1, a torch angle of 45° and a lap joint configuration, which are frequently applied in automotive body sheet welding, were used. The test sheet was processed to a width of 150 mm and a length of 180 mm. The gap between workpieces was fixed at either 0.0 mm or 0.5 mm.

Equipment and Experimental Procedure
Experiments were conducted with short-circuit transfer using a constant-voltage direct current (DC) inverter-type welding machine (Fronius, Wels, Austria). The welding system is shown in Figure 2, and the workpiece in lap joint configuration was placed on the jig. A Hall sensor for sensing the welding current was installed on a cable between the power source and the jig, and the measured current signal was sent to the data acquisition (DAQ) device. The voltage signal between the power source and the jig was measured directly by the DAQ device. The synchronized current and voltage signals were sent to a computer over a local area network (LAN) for processing and analysis.
As presented in Table 2, welding experiments were performed at a welding speed of 600 mm/min and a wire feeding rate (WFR) of 3 mm/min in the overlapping part of the test sheet (Figure 1). In addition, the contact tip to workpiece distance (CTWD) was fixed at 15 mm, and the shielding gas consisted of a mixture of 90% Ar and 10% CO2. An ER70S-3 grade welding wire with a diameter of 1.2 mm was used. For a reliable analysis, two replicates were performed at each condition.

Relationship between Feature Variables and Porosity Using Welding Voltage Signal
In general, weld porosity is reduced in the presence of a gap between workpieces, because the gap acts as a discharge path through which the zinc vapor is released. In this study, the relationship between the porosity generated in the weld and the welding voltage signal was analyzed by comparing the cases with no gap and with a 0.5 mm gap. In addition, radiographic testing was conducted to identify the porosity locations and match them with the welding voltage signal [4]. Figures 3 and 4 illustrate the measurement results; in the presence of a gap, the porosity of the weld was significantly reduced.
On the other hand, in the area where porosity occurs (Figure 5b), irregularly shaped signals are generated repeatedly. In other words, the arc time and the short-circuit time were irregular, and the deviation between the peak value of the voltage signal and the instantaneous short-circuit voltage value was also relatively large.
Based on these results, the feature variables of the voltage signal appearing in the short-circuit mode over 0.1 s were selected as shown in Figure 6. The 12 selected feature variables and their nomenclature are listed in Table 3.

| Feature Variable | Description | Symbol |
|---|---|---|
| | Standard deviation of short-circuit peak voltage | s[V_p] |
| X4 | Average voltage during short-circuit time | |
| | Average short-circuit time | T_s |
| X7 | Average arc time | T_a |
| X8 | Number of short-circuit periods | |
| X9 | Standard deviation of short-circuit time | s[T_s] |
| X10 | Standard deviation of arc time | s[T_a] |
| X11 | Standard deviation of voltage during short-circuit time | |
| X12 | Standard deviation of voltage during arc time | s[V(T_a)] |

To analyze the correlation between the 12 feature variables and the porosity, the porosity ratios were calculated for 9 randomly selected regions where porosity occurred, by matching the bead appearance and X-ray image with the voltage signal in Figures 3 and 4 [4]. For a reliable analysis, the regions of Figure 4 were selected at the same positions as those of Figure 3; the porosity ratio in these regions was calculated as zero. In Figure 3, porosity regions 3, 4, 5, and 9 were divided into several sections so that porosity was calculated over similar weld bead lengths. Table 4 shows the measured total weld length, bead width, porosity area, and porosity ratio for the voltage signal sections in the nine porosity regions. Table 5 shows the feature variables and porosity in the 18 regions; the value of each feature variable was calculated as the average value over the region.
Table 5. Feature variables and porosity ratios in the porosity generation regions.

Correlation analysis was performed to test the effects of the 12 feature variables on the porosity ratio. The magnitude of a correlation is quantified by a correlation coefficient; in this study, the most widely used one, the Pearson correlation coefficient, was used. Its value lies between −1 and 1 and measures the linear correlation between variables. Table 6 shows the correlation analysis results. The correlation coefficients of feature variables X1, X2, X8, X9, and X12 with the porosity ratio are positive, at approximately 0.6 to 0.7. Although these values do not represent extremely strong positive correlations, they do show strong correlations. Furthermore, the correlation coefficients of X7 and X11, which have negative correlations, are −0.666 and −0.468, respectively. Thus, X7 shows a strong correlation, whereas X11 shows a moderately weak correlation. Meanwhile, X3, X4, X5, X6, and X10 show very weak correlations with the porosity ratio.
Table 6. Correlation matrix for the independent variables and porosity.

As a result of the correlation analysis, it was confirmed that the feature variables X1, X2, X7, X8, X9, and X12 (s[V(T_a)]), which have positive or negative correlation coefficients with a magnitude of approximately 0.6 or more, are closely related to porosity.
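The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pearson` and `select_features`, the 0.6 cut-off used as a default, and the toy numbers are all hypothetical and are not the data of Table 5 or Table 6.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def select_features(features, porosity, threshold=0.6):
    """Return names of features whose |r| with porosity is at least threshold.

    The threshold is an assumption for illustration; the text only states
    that the retained variables have coefficients of roughly 0.6 in magnitude.
    """
    return [name for name, values in features.items()
            if abs(pearson(values, porosity)) >= threshold]

# Toy values only (NOT measurements from the study).
features = {
    "feature_a": [0.1, 0.2, 0.3, 0.4],     # rises with porosity
    "feature_b": [1.0, -1.0, 1.0, -1.0],   # weakly related to porosity
}
porosity_ratio = [2.0, 4.0, 6.0, 8.0]
selected = select_features(features, porosity_ratio)
```

Taking the absolute value keeps strongly negative correlations, which matches the way X7 is retained despite its negative coefficient.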
In this study, the arc voltage signal was measured for 20 s at a sampling rate of 10 kHz. Data preprocessing for feature variable extraction is shown in Figure 7. That is, the feature variables were extracted by treating each 0.1 s section of the voltage signal as one window, with consecutive windows overlapping by 0.09 s (90%).
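The windowing arithmetic can be made concrete with a short sketch. The sampling rate, window length, and overlap are taken from the text; everything else is an assumption for illustration, in particular the idea of detecting short-circuit samples with a fixed voltage threshold (`threshold_v`, set arbitrarily to 10 V) and the function and key names. The study's actual preprocessing (Figure 7) may differ.

```python
import statistics

FS_HZ = 10_000      # sampling rate stated in the text
WINDOW_S = 0.1      # window length stated in the text
OVERLAP_S = 0.09    # overlap between consecutive windows (90%)

def window_starts(n_samples, fs=FS_HZ, window_s=WINDOW_S, overlap_s=OVERLAP_S):
    """Return (start indices, window length in samples) for overlapping windows."""
    window = round(window_s * fs)
    hop = round((window_s - overlap_s) * fs)
    if hop <= 0:
        raise ValueError("overlap must be shorter than the window")
    return list(range(0, n_samples - window + 1, hop)), window

def short_circuit_features(window, fs=FS_HZ, threshold_v=10.0):
    """Summarize short-circuit and arc periods inside one window.

    Assumption: a sample below threshold_v belongs to a short circuit.
    Runs cut off by the window edges are counted like complete runs.
    """
    runs = []  # each entry is [is_short_circuit, length_in_samples]
    for v in window:
        is_short = v < threshold_v
        if runs and runs[-1][0] == is_short:
            runs[-1][1] += 1
        else:
            runs.append([is_short, 1])
    short_times = [n / fs for is_short, n in runs if is_short]
    arc_times = [n / fs for is_short, n in runs if not is_short]
    return {
        "n_short": len(short_times),
        "mean_short_s": statistics.fmean(short_times) if short_times else 0.0,
        "mean_arc_s": statistics.fmean(arc_times) if arc_times else 0.0,
        "std_short_s": statistics.pstdev(short_times) if short_times else 0.0,
    }

# A 20 s record at 10 kHz gives 1000-sample windows that advance by 100 samples.
starts, window_len = window_starts(20 * FS_HZ)   # 1991 windows
```

With a 0.01 s hop, each additional second of signal contributes 100 feature vectors, which is why a 20 s weld already yields on the order of two thousand windows.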

Deep Neural Network (DNN)
An artificial neural network (ANN) works by finding the optimal weight (w) of each node in a perceptron structure, shown in Figure 8: the sum of the input values X multiplied by w is delivered to the activation function, and w is updated through learning using a backpropagation algorithm. A DNN is based on an ANN but is deeper, as it consists of two hidden layers between the input and output layers. Figure 9 shows the ANN and DNN structures.
A DNN performs learning through a backpropagation algorithm, which is the same as the learning method of an ANN. The backpropagation algorithm continuously updates the weights (w) to minimize the cost function by using gradient descent, as shown in Equation (1):

$$w \leftarrow w - \eta \frac{\partial C}{\partial w} \tag{1}$$

where η is the learning rate and C is the cost function. The cost function is defined as the square of the difference between the predicted value f(z) and the actual target value y, as shown in Equation (2):

$$C = \left(f(z) - y\right)^2 \tag{2}$$

where f(z), expressed in Equation (3), is the sigmoid function used as the activation function and z is the forward function value delivered by each layer:

$$f(z) = \frac{1}{1 + e^{-z}} \tag{3}$$

However, if the activation function is a sigmoid and the cost function defined by the sum of the squared errors is applied for gradient descent, the derivative f′(z) rapidly approaches zero. Thus, a slow learning problem occurs: a local minimum is reached and further learning is not performed, as shown in Figure 10.
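To make the slow learning problem tangible, the following sketch applies the gradient descent update to a single sigmoid neuron with a squared-error cost. The single-neuron setup with z = w·x + b, the bias b, the learning rate, and all numeric values are assumptions chosen for illustration and are not taken from the study.

```python
import math

def sigmoid(z):
    """Sigmoid activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, x, y):
    """Squared-error cost C = (f(z) - y)^2 with z = w*x + b."""
    return (sigmoid(w * x + b) - y) ** 2

def squared_error_grad(w, b, x, y):
    """Return (dC/dw, dC/db) for the squared-error cost."""
    a = sigmoid(w * x + b)
    # Chain rule: dC/dz = 2 (a - y) f'(z), with f'(z) = a (1 - a).
    dc_dz = 2.0 * (a - y) * a * (1.0 - a)
    return dc_dz * x, dc_dz

def train(w, b, x, y, eta=0.5, steps=100):
    """Plain gradient descent: w <- w - eta * dC/dw (and likewise for b)."""
    for _ in range(steps):
        gw, gb = squared_error_grad(w, b, x, y)
        w -= eta * gw
        b -= eta * gb
    return w, b

# Same input and target, two different starting points (illustrative values).
grad_saturated, _ = squared_error_grad(5.0, 5.0, 1.0, 0.0)   # output near 1
grad_moderate, _ = squared_error_grad(0.5, 0.5, 1.0, 0.0)    # output near 0.73
```

The saturated start has the larger error, yet its gradient is several orders of magnitude smaller because the factor f′(z) is close to zero, so the weight barely moves.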
To overcome this slow learning problem, the DNN uses a cost function of the cross-entropy type rather than the sum of the squared errors. The cross-entropy function is defined in Equation (4):

$$C = -\left[y \ln f(z) + (1 - y)\ln\left(1 - f(z)\right)\right] \tag{4}$$

The derivative of this function with respect to w can be expressed as shown in Equation (5):

$$\frac{\partial C}{\partial w} = \left(f(z) - y\right)\frac{\partial z}{\partial w} \tag{5}$$

It can be seen from Equation (5) that the derivative with respect to w is proportional to the difference between the predicted value and the actual target value, f(z) − y. Thus, when the error is large, the convergence speed increases, and when the error is small, the speed decreases to prevent divergence.
The activation function used above to explain the backpropagation algorithm was the sigmoid function. However, an ANN using a sigmoid function as the activation function determines w based on the gradient during the learning process. This gradient approaches zero as the number of hidden layers increases, resulting in the vanishing gradient phenomenon, in which the weights are rarely updated, as shown in Figure 11.
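Both effects can be checked numerically. The sketch below is an illustration under assumptions: it works with a single sigmoid output, takes gradients with respect to z rather than w (the factor ∂z/∂w is common to both cost functions), and models the depth effect by multiplying sigmoid derivatives while ignoring the weights. The function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(z, y):
    """C = -[y ln f(z) + (1 - y) ln(1 - f(z))]; use moderate z to avoid log(0)."""
    a = sigmoid(z)
    return -(y * math.log(a) + (1.0 - y) * math.log(1.0 - a))

def grad_cross_entropy(z, y):
    """dC/dz for the cross-entropy cost: f(z) - y."""
    return sigmoid(z) - y

def grad_squared_error(z, y):
    """dC/dz for C = (f(z) - y)^2: 2 (f(z) - y) f'(z)."""
    a = sigmoid(z)
    return 2.0 * (a - y) * a * (1.0 - a)

def numeric_grad(cost_fn, z, y, h=1e-6):
    """Central finite difference, used only to sanity-check the formula."""
    return (cost_fn(z + h, y) - cost_fn(z - h, y)) / (2.0 * h)

def chained_sigmoid_derivative(pre_activations):
    """Product of f'(z) over successive layers (weights ignored)."""
    product = 1.0
    for z in pre_activations:
        a = sigmoid(z)
        product *= a * (1.0 - a)
    return product

# Saturated output (z = 10) with target 0: large error in both cases.
ce_grad = grad_cross_entropy(10.0, 0.0)   # stays close to the error itself
se_grad = grad_squared_error(10.0, 0.0)   # shrinks because f'(z) is tiny
five_layers = chained_sigmoid_derivative([0.0] * 5)  # at most 0.25 per layer
```

Because f′(z) never exceeds 0.25, each additional sigmoid layer scales the backpropagated signal by at most that factor, which is the mechanism behind the vanishing gradient described above.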
To solve this problem, the DNN introduces a rectified linear unit (ReLU) function as a new activation function. The ReLU function is a simple function that outputs the input value if the input value is greater than zero and outputs zero if the input value is zero or lower. Because it alleviates the vanishing gradient problem, this activation function allows the hidden layers of a DNN to be made deeper and wider, so that classification and prediction can be performed more accurately. Table 7 compares the sigmoid function, a representative ANN activation function, with the ReLU function, which is frequently used in DNNs.
Table 7. Comparison of types of activation functions.

| Activation function | Equation |
|---|---|
| Sigmoid | $f(z) = \frac{1}{1 + e^{-z}}$ |
| ReLU | $f(z) = z$ for $z > 0$; $f(z) = 0$ for $z \le 0$ |

Furthermore, when the DNN solves a multi-class classification problem, it uses the softmax function in the output layer. The kth output of the softmax function is defined as the exponential function of the kth input divided by the sum of the exponential functions of all inputs, as shown in Equation (6):

$$\mathrm{softmax}(z)_k = \frac{e^{z_k}}{\sum_{i=1}^{n} e^{z_i}} \tag{6}$$

where z_k is the kth input to the output layer and n is the number of outputs. Each output of the softmax function is a real number between 0 and 1, and the outputs sum to 1, which is a critical property of the softmax function. This property allows the outputs to be interpreted as probabilities and used for classification.

Figure 11. Vanishing gradients.

Porosity Prediction Model based on Artificial Intelligence Techniques
The relationship between the welding signals and porosity in arc welding is difficult to express in a simple mathematical formula. To deal with this complex relationship, this study proposes an algorithm for predicting porosity defects in welds using an ANN and a DNN, which are currently widely used in machine learning. Figure 12 shows schematics of the ANN and DNN model structures employed in this study. For the ANN structure in Figure 12a, six feature variables, X1, X2, X7, X8, X9, and X12, were selected as input values for the input layer, and one hidden layer consisting of 24 nodes was selected.
As shown in Figure 13, the output value was set to 1 for the sections with porosity in the weld and 0 for the other sections. The DNN structure in Figure 9b has the same input and output variables and the same number of nodes as the ANN model, but the number of hidden layers was different.
Table 8 compares the hyperparameters, such as learning rates, epochs, optimizers, and activation functions, of the ANN and DNN models. The same learning rate and number of epochs were used for both models. For the activation function, the sigmoid function and the ReLU function were used for the ANN and DNN, respectively. In addition, the softmax function was set in the output layer so that the output values are classified as probabilities. For the optimization function, gradient descent and the Adam optimizer [20], proposed by Kingma and Ba, were used for the ANN and DNN, respectively.
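The two structures can be sketched as plain forward passes. This is a rough illustration and not the study's implementation: the weights are random and untrained, the layer sizes 6-24-2 and 6-24-24-2 assume two softmax outputs (porosity and no porosity) and 24 nodes in each hidden layer, and training with gradient descent or Adam is omitted entirely. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Outputs z when z > 0 and zero otherwise."""
    return z if z > 0 else 0.0

def softmax(values):
    """Softmax over a list; subtracting the maximum avoids overflow
    and does not change the result."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layers(sizes, seed=0):
    """Random (untrained) weights and zero biases for consecutive layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def forward(layers, x, activation):
    """Hidden layers use `activation`; the last layer uses softmax."""
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        x = softmax(z) if is_output else [activation(v) for v in z]
    return x

# Assumed layouts: six inputs, 24-node hidden layers, two output classes.
ann = init_layers([6, 24, 2])        # one hidden layer, sigmoid
dnn = init_layers([6, 24, 24, 2])    # two hidden layers, ReLU
sample = [0.1] * 6                   # placeholder feature vector, not real data
p_ann = forward(ann, sample, sigmoid)
p_dnn = forward(dnn, sample, relu)
```

Each result is a pair of class probabilities that sums to 1; a section would be labeled as porosity when the corresponding probability is the larger of the two.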

Prediction model evaluation for detecting porosity
The ANN and DNN models implemented to detect and predict porosity defects were evaluated. The indicators for evaluating the classification model are classified into "True" and "False", as shown in Figure 14, and can be arranged in a 2 × 2 matrix. The accuracy was evaluated using Equation (7).
For the ANN and DNN models, the raw data sets obtained during the two experimental runs described in the previous sections were used. For the training data, the raw data sets from Figures 3 and 4 were employed, excluding test data set 1 for the 8-10 s region and test data set 2 for the 12-14 s region in Figure 3a. In total, 2044 training data sets were obtained by randomly extracting data from the regions with porosity and from the regions with no porosity. Furthermore, 191 test data sets were obtained from the 2 s regions of test data sets 1 and 2.
The test data sets are shown in Figure 15. Test data set 1 for the 8-10 s section, shown in Figure 15a, contains five porosity defects of 0.7-2.4 mm, including two pits. Test data set 2 for the 12-14 s section, shown in Figure 15b, contains three porosity defects of 0.9-1.7 mm, including one pit.
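The 2 × 2 evaluation matrix and the accuracy metric mentioned above can be sketched as follows. The formula used is the standard definition, accuracy = (TP + TN) / (TP + TN + FP + FN), which is assumed here to correspond to Equation (7). The label convention (1 = porosity, 0 = no porosity) follows the text; the example labels and predictions are invented for illustration and are not the study's data.

```python
def confusion_counts(y_true, y_pred):
    """Count the four cells of a 2 x 2 confusion matrix for 0/1 labels."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1  # porosity present and predicted
        elif t == 0 and p == 0:
            tn += 1  # no porosity and none predicted
        elif t == 0 and p == 1:
            fp += 1  # false alarm
        else:
            fn += 1  # missed porosity
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Standard accuracy: correct predictions over all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples to evaluate")
    return (tp + tn) / total


# Made-up section labels (1 = porosity) and model predictions.
y_true = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 1, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, tn, fp, fn)
```

In this toy case one porosity section is missed and one false alarm is raised, so 8 of the 10 sections are classified correctly.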
Six feature variables extracted through preprocessing were set as the input values, and the output values were set as shown in Figure 9. Next, the ANN and DNN models were trained using the structural parameters listed in Table 7. The training results for data sets 1 and 2 are shown in Figure 16. It can be seen from Figure 16 that the loss function value of the ANN is lower than that of the DNN at an early stage of training. However, after approximately 5000 epochs, the loss function value of the ANN no longer decreases, whereas that of the DNN continues to decrease. Furthermore, the training accuracy of the ANN is approximately 81%, whereas that of the DNN is 99% or higher. These results confirm that the training performance of the DNN is better than that of the ANN. Table 9 outlines the training results of the ANN and DNN models for loss function and accuracy over 100,000 epochs.
Figure 17 presents the prediction results of the ANN and DNN models obtained using test data sets 1 and 2. Figure 17a shows the prediction results for test data set 1. Among the five porosity defects in test data set 1, the ANN model detected defects near defects 1 and 4, although with some errors. The DNN model detected all five porosity defects in test data set 1. The prediction results for test data set 2, shown in Figure 17b, indicate that among the three porosity defects, the ANN model detected only defect 1, while the DNN model detected all three, although some errors occurred just before defect 1. For test data set 2, the prediction accuracies of the ANN and DNN models are 79.0% and 89.5%, respectively. Thus, the DNN model shows approximately 10 percentage points higher prediction accuracy than the ANN model in this case.

Porosity Detection and Prediction Systems for Field Application
In this study, arc welding process signals were measured in real-time in the process shown in Figure 18. In addition, feature variables were extracted through preprocessing and applied to the DNN model to develop a system that can detect and predict porosity defects. Experiments were performed using the developed porosity detection and prediction system to verify its field application.
Test experiments with the system developed for field application were conducted under the conditions summarized in Table 11. The test sheet material and welding method were the same as those used in this study, and the test region and results are shown in Figure 19. The verification test showed that eight pits were generated in the welded part. However, porosity was also detected and predicted at points where no pits occurred, such as at the red point. This could be regarded as a prediction error, but it is most likely internal porosity that cannot be seen with the naked eye, as shown in Figure 10.
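The measure, preprocess, and classify loop described in this section can be sketched as a sliding-window detector. Everything specific in the sketch is an assumption made for illustration: the three features (mean, standard deviation, peak-to-peak range) stand in for the six feature variables of the study, `stub_model` is a placeholder for the trained DNN, and the window size, threshold, and voltage samples are invented.

```python
from collections import deque
import statistics


def extract_features(window):
    """Hypothetical stand-ins for the study's six feature variables."""
    values = list(window)
    return [
        statistics.fmean(values),    # mean voltage in the window
        statistics.pstdev(values),   # voltage fluctuation
        max(values) - min(values),   # peak-to-peak range
    ]


def stub_model(features):
    """Placeholder for the trained DNN: returns a porosity probability.

    Assumption: strong voltage fluctuation is treated as suspicious.
    """
    return 1.0 if features[1] > 1.0 else 0.0


def detect_stream(samples, window_size, model, threshold=0.5):
    """Classify each full window as samples arrive, one at a time."""
    window = deque(maxlen=window_size)
    flags = []
    for i, value in enumerate(samples):
        window.append(value)
        if len(window) == window_size:
            prob = model(extract_features(window))
            flags.append((i, prob >= threshold))
    return flags


# Made-up voltage trace: stable arc, then strong fluctuation.
samples = [24.0] * 10 + [24.0, 27.0, 21.0, 28.0, 20.0]
flags = detect_stream(samples, window_size=5, model=stub_model)
flagged = [i for i, is_porosity in flags if is_porosity]
```

Because each window is classified as soon as its last sample arrives, the detection delay in such a design is bounded by the window length; replacing `stub_model` with a trained classifier would leave the loop unchanged.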

Conclusions
This study developed a system for detecting and predicting porosity defects generated in the weld during GMAW of galvannealed high-strength steel sheet used for automobile parts. The primary achievements can be summarized as follows.
1) The welding current and arc voltage signals generated in the GMAW process were measured in real-time, and feature variables were extracted through preprocessing. In addition, the correlation between the feature variables and porosity was analyzed to select the feature variables considered to be defect signals.
2) An artificial intelligence technique suitable for nonlinear arc welding process prediction was used, and a model was developed to detect and predict porosity by comparing DNN and ANN models.
3) The predictive performance of the ANN and DNN models was evaluated. The results show that the predictive performance of the DNN model is 15.2% higher than that of the ANN model on average.
4) An experiment was performed using the developed system to evaluate it for field application. The results indicated that all the pits generated in the welded part were detected.
5) The system is easy to apply on site and has a low initial equipment cost because it is an NDT system that detects and predicts porosity defects in the weld using only arc welding process signals, without additional devices.