Development of Defect Detection AI Model for Wire + Arc Additive Manufacturing Using High Dynamic Range Images

Abstract: Wire + arc additive manufacturing (WAAM) utilizes a welding arc as a heat source and a metal wire as a feedstock. In recent years, WAAM has attracted significant attention in the manufacturing industry owing to its advantages: (1) high deposition rate, (2) low system setup cost, (3) wide diversity of wire materials, and (4) sustainability for constructing large-sized metal structures. However, owing to the complexity of arc welding in WAAM, more research efforts are required to improve its process repeatability and advance part qualification. This study proposes a methodology to detect defects of the arc welding process in WAAM using images acquired by a high dynamic range camera. The gathered images are preprocessed to emphasize features and used by an artificial intelligence model to classify normal and abnormal statuses of arc welding in WAAM. Owing to the shortage of image datasets for defects, transfer learning is adopted. In addition, to understand and check the basis of the model's feature learning, a gradient-weighted class activation mapping algorithm is applied to select a model that has the correct judgment criteria. Experimental results show that the detection accuracy for the metal transfer region-of-interest (RoI) reached 99%, whereas that for the weld-pool and bead RoI was 96%.


Introduction
Wire + arc additive manufacturing (WAAM) is a metal three-dimensional (3D) printing technique (see Figure 1). Unlike conventional 3D printing based on polylactic acid filament, WAAM uses a metal wire as the feedstock and an electric arc as the heat input. Because WAAM is performed using three-axis computer numerical control routers or industrial six-axis robots, parts with a high degree of geometric freedom can be fabricated [1]. In addition, WAAM operated by robot welding techniques can overcome the size restriction imposed by the bed size in other 3D printing technologies (e.g., the powder bed fusion process), thereby enabling the production of large components. Furthermore, it consumes less raw material and energy than other metal 3D printing processes that use powders and laser or electron beams. According to recent studies [2,3], the material flexibility of WAAM is expanding from typical low-carbon steel to high-performance alloys (e.g., titanium, nickel, and tungsten). Two different types of materials can be processed in one product via WAAM [4,5]. Owing to these benefits, WAAM has attracted significant interest from researchers and manufacturers.
Despite these benefits, WAAM presents several technical challenges and research issues that hinder its wide implementation in real industrial applications [6]. First, because WAAM is a recently developed technology, optimized processing guidance for various materials and shapes is not well defined. Second, owing to characteristics of arc welding in WAAM such as non-equilibrium thermal cycles, WAAM can cause defects (e.g., voids, cracks, microstructural inhomogeneity, and poor surface roughness), so that postprocessing (e.g., machining) must be performed. Such defects degrade mechanical strength and incur additional cost. Third, and most critically, a single failure during WAAM can affect the entire process, which can lead to discarding the partially built product.
To avoid production failure during WAAM, it is very important to monitor and control the welding process. Xia [7] noted that process monitoring and feedback control are a key success factor for additive manufacturing. Tang et al. [8] used an image acquisition technique for monitoring and control. However, direct monitoring of the weld-pool/bead is still missing, and the AI model used for image classification still has a simple structure. To overcome these limitations of monitoring and control in WAAM, this study provides a methodology to monitor the welding process with a focus on bead generation. The proposed methodology adopts a recently developed image recognition algorithm based on artificial intelligence (AI). Among the various kinds of AI algorithms, the convolutional neural network (CNN), which shows high performance in image classification, is applied to detect problems in metal transfer and weld-pool/bead shape. Owing to the complexity of the arc recognition AI model, it will be introduced in a subsequent paper.
Through the proposed methodology, this paper demonstrates that an AI algorithm such as a CNN can be applied to images captured in a harsh, high-temperature environment. Some helpful guidance for using CNNs to monitor arc welding is provided. The application of transfer learning to the CNN shows another possibility for improving fault detection accuracy. Moreover, the developed method can be extended to real-time control of welding during WAAM.

State-of-the-Art
To avoid defects in WAAM products, nondestructive evaluation methods (e.g., eddy current, ultrasonic, and industrial computerized tomography) have been utilized. However, they increase the total manufacturing cost and decrease the effectiveness of WAAM [9]. Furthermore, these methods can be performed only after a product has been additively manufactured, which means that the product cannot be fixed during WAAM in real time. In this context, real-time monitoring and control approaches are being considered as promising solutions to maintain process stability and part quality [10][11][12].
Regarding part quality, many research works emphasize the sensing and tracking of the weld seam [13]. Machine vision technologies have been adopted for arc seam tracking [14].
Because classifying the arc in WAAM is complicated, this topic will be handled in another paper by the authors. Jiao et al. [15] and Feng et al. [16] explain the application of deep learning to the weld pool. The metal transfer and bead shape should be monitored concurrently with the weld pool, but such concurrent monitoring is still lacking.
In the defect classification stage, which determines part quality, a reliable classifier must be designed to distinguish between different types of defects. Many researchers have investigated and discussed the development of different classification algorithms. Machine learning methods such as artificial neural networks (ANNs), support vector machines (SVMs), and fuzzy systems are the most widely used in welding defect recognition.
The main application of fuzzy theory in welding defect detection was reported in the late 1990s [17], where Liao [18] investigated a fuzzy expert system method for classifying defect types with better classification accuracy than the fuzzy k-nearest neighbor and multilayer perceptron. Furthermore, Baniukiewicz [19] investigated a new type of complex classifier comprising fuzzy systems and ANNs. However, a tradeoff exists between the accuracy and interpretability of fuzzy defect detection. The SVM and ANN are the most typically used methods for defect detection. El Ouafi et al. [20] simulated welding parameters (welding time, current, voltage, thickness, etc.) to establish an ANN-based method for evaluating welding quality. Zapata et al. [21] modified an ANN to improve the detection accuracy of individual and overall defect characteristics. Yuan et al. [22] investigated adaptive tissue and adaptive feedforward neural networks to identify the essential features of defects and effectively reduce identification errors. To achieve high accuracy and improve classification efficiency, Mu et al. [23] proposed an automatic classification algorithm that combines principal component analysis and the SVM to select the optimal dataset. Inspired by this, Chen et al. [24] applied a bee algorithm to extract defect features and used a hierarchical multiclass SVM to achieve a maximum accuracy of 95%. Su et al. [25] built an automatic defect identification system for solder joints by extracting the texture features of weld defects. Han et al. [26] combined the extreme learning machine (ELM) and M-estimation and proposed a new ME-ELM algorithm. This algorithm can effectively improve the anti-interference capability and robustness of the model, and it provides high accuracy in predicting welding defects. Typically, these shallow machine learning methods rely on a separate feature extraction process, which ultimately affects the prediction results.
However, in WAAM, unlike in regular welding, it is difficult to determine which features to extract. Therefore, to implement automatic feature learning and weld defect prediction, an efficient deep learning method must be designed.
However, these approaches have not been extensively investigated because of insufficient knowledge regarding machine vision and artificial intelligence (AI) in WAAM. To fabricate a satisfactory component, significant efforts have been expended in monitoring and controlling the welding process during WAAM. The process variables to be considered for monitoring and control in WAAM are the wire feed speed, torch movement speed, tool path, voltage, current, and layer height. Early research efforts focused on identifying the relationship between process variables and the fabricated component in terms of geometry [27], surface roughness [28], and mechanical properties [29]. The underlying physical mechanism for controlling WAAM has been revealed based on theory and experiment. For instance, Yang et al. [30] investigated the effects of interlayer cooling time on surface quality and component geometry. They used a thermal imaging technique to obtain the surface temperature field of the deposited components. In addition, the internal flux of the weld pool was investigated through high-energy synchrotron radiation experiments [31].
Xia et al. [7] summarized monitoring and controlling studies regarding WAAM using visual, spectral, acoustic, and thermal data. It was reported that combining the data from these sensors with AI techniques provided new opportunities for monitoring and controlling the welding of WAAM. Recently, many research groups have been actively performing investigations in order to develop deep learning models for arc welding monitoring and control.
As a new machine learning field, deep learning shows significant potential in defect detection: the continuous reduction in dimensions during feature learning avoids the effect of manual feature extraction on the identification results, thereby effectively improving the accuracy of defect detection. Deep learning models with digital image processing techniques are frequently adopted in various fields, and their usefulness has been proven by their outstanding performance in recognizing the environment and controlling a self-driving car [32]. Hence, they can be utilized for the anomaly detection of welding in WAAM. Xiong and Ding used a deep neural network to predict bead geometry with respect to different process variables [33,34]. However, the prediction of geometry was restricted to dimensional values such as the width, height, and toe angle of the bead. These values cannot represent the possible variety of bead defects. To extend the capability of anomaly detection, researchers have started to use image data [35,36] from various types of vision sensors.
Zhang et al. [35] proposed a new weld-pool data collection method that can achieve high model classification performance. They used a convolutional neural network (CNN) to classify weld seam quality. Wang et al. [36] predicted the trend of weld-pool features using Pred-NET, a CNN algorithm. Liu et al. [37] achieved high prediction accuracy on a relatively small dataset of weld defects based on the VGG-16 full CNN. Nevertheless, the sample size was relatively small in some areas, which affected the prediction results. Therefore, many researchers use transfer learning to overcome the problem of small samples and use deep CNN models trained on ImageNet as feature extractors to migrate to small datasets in other disciplines with favorable results [38]. It is noteworthy that these small datasets are completely different from ImageNet. Zhang et al. [39] obtained an identification accuracy of 97.041% after investigating medical imaging using a transfer learning method. Ren et al. [40] investigated automatic surface detection of deep convolutional activation feature models based on deep transfer learning. Compared with other methods, the accuracy of Ren's method improved by 0.66-25.5% in the detection of defects during classification. Pan et al. [41] used a transfer learning algorithm that modified the structure of the existing MobileNet to monitor welding defects in a small dataset. The image-based deep learning method contributed positively to feature learning and does not affect the prediction results. It indicated significant potential for classifying welding defects. Moreover, it has been shown that using transfer learning, features can be extracted from fewer data and new data.
Building on the previous works, we aim to develop a defect detection model for real-time process management using an AI model via transfer learning and digital image processing. The studies based on AI primarily focused on weld-pool features only. However, to develop the real-time control of WAAM, considering only weld-pool features is insufficient because WAAM is significantly affected by other factors, e.g., wire feeding and arc stability. In this study, we considered other factors such as metal transfer and weld-pool/bead shape to improve the accuracy of anomaly detection in WAAM. For real-time process monitoring, process data are collected in real time with a high dynamic range (HDR) camera. With the HDR imaging system, sharp, low-saturation images of the process can be obtained. Data collection at industrial sites is limited. As in this study, it is difficult to collect data, particularly abnormal data, in the basic research stage. Therefore, the AI model was developed through transfer learning from a pre-developed CNN model.

Experimental Setup
In this study, gas tungsten arc welding (GTAW)-based WAAM is investigated. Inconel 625 is used as the wire material and low-carbon steel is used as the substrate. Inconel 625 is a nickel-based superalloy with excellent corrosion resistance and exhibits high strength, toughness, and oxidation resistance at high temperatures (up to 980 °C) [42]. Accordingly, it is widely used in chemical, energy, aerospace, automotive, marine, oil, and gas industries.
The process starts with arc generation between the tungsten electrode and substrate, and the wire material and substrate are melted and welded by the heat input. Subsequently, a component can be additively manufactured via the layer-by-layer stacking mechanism, as shown in Figures 1 and 2.
As shown in Figure 2a, the system setup comprised a six-axis Fanuc ArcMate 100ib robot arm with a Fanuc R-J3iB controller, a Miller Dynasty 400 GTAW power source, and a generic wire feeder. The welding speed, welding current, and wire feed speed were controlled using the robot and power source controllers. To monitor the welding process in WAAM, an HDR camera is attached to the robot arm, and the metal transfer, arc formation, and weld-pool/bead shape are recorded (see Figure 2b). The HDR technique is useful for recording real-world scenes containing a wide range of light intensities, e.g., from direct sunlight to extremely shaded environments [43]. The HDR camera (model: WL2-H7ML-M35, WELDVIS) effectively manages the bright light source of the arc, allowing features to be extracted from clear images. Because the lightweight HDR camera is attached to the robot arm, images with a fixed field-of-view can be obtained. Accordingly, the images can be efficiently used for further analysis. Furthermore, images of metal transfer can be captured using this HDR camera.

Data Gathering and Preprocessing
Metal transfer, arc shape, and weld-pool/bead are important features that must be verified for process stability. For the real-time monitoring of these features, the images were obtained using an HDR camera at 50 frames per second. The monitored images are segmented into three regions of interest (RoIs): (1) metal transfer, (2) arc shape, and (3) weld-pool/bead, as shown in Figure 2b. In this study, we focus on two RoIs, i.e., metal transfer and weld-pool/bead. Because the arc features show characteristics that differ from those of the others, they require a more sophisticated method. This will be investigated in future work.
To obtain a uniform and regular weld bead, the wire must be smoothly melted and transferred to the weld pool with or without a droplet, as shown in Figure 3a. By contrast, Figure 3b shows an abnormal metal transfer, in which the droplet is transferred discontinuously into the weld pool. Consequently, abnormal weld beads (e.g., humping beads) are generated. To detect this problem, the image data of the metal transfer RoI are gathered and classified into normal and abnormal classes. The gathered images are difficult to use as raw images in the development of the AI model because they contain strong noise and the edge of the wire material in the image is vague owing to the light reflection of the arc. To obtain a clear metal transfer RoI, the contour line of the feed wire is extracted and labeled as normal or abnormal based on its shape, as shown in Figure 4a,b. The Canny edge detection algorithm [44], which has a lower error rate than other edge detection algorithms, is used for contour line extraction. The contoured images in two classes (normal and abnormal) were used for training and testing in the development of an AI model that can discriminate the failure of metal transfer.
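The idea of turning a noisy RoI into a contour image can be illustrated with a deliberately simplified stand-in. The sketch below is not the Canny detector used in this study: it only computes a Sobel gradient magnitude and applies a single threshold, omitting the Gaussian smoothing, non-maximum suppression, and hysteresis stages of Canny. The function names, the toy frame, and the threshold value are hypothetical and chosen only for illustration; in practice a library implementation of Canny would be used.

```python
from math import hypot

# Sobel kernels for horizontal and vertical intensity gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def gradient_magnitude(image):
    """Return the Sobel gradient magnitude of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers. Border pixels are
    left at 0.0 because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    magnitude = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += SOBEL_X[dr + 1][dc + 1] * pixel
                    gy += SOBEL_Y[dr + 1][dc + 1] * pixel
            magnitude[r][c] = hypot(gx, gy)
    return magnitude


def edge_map(image, threshold):
    """Binary contour image: 1 where the gradient magnitude exceeds threshold."""
    return [[1 if m > threshold else 0 for m in row]
            for row in gradient_magnitude(image)]


# Toy frame (hypothetical): dark background with a bright vertical "wire"
# occupying columns 3 and 4.
frame = [[255 if c in (3, 4) else 0 for c in range(8)] for _ in range(6)]
edges = edge_map(frame, threshold=500)
```

On the toy frame, the edge response is several pixels thick on each side of the wire, which is the kind of blur that the non-maximum suppression stage of the full Canny algorithm is meant to thin to a single-pixel contour.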
To detect the defects, the bead shape should be verified. The quality of the bead shape is affected by various process parameters (e.g., arc current, wire feed rate, and travel speed). If these process parameters are incorrectly set [45,46], defects such as humping can occur [47,48]. Nguyen et al. [49] limited the range of the torch traveling speed during the welding process when humping occurred. In addition, humping degrades the overall weld quality in a single- or multi-layer process. In this study, humping is defined and reproduced as an abnormal class. A well-processed weld should maintain a uniform bead height and width (see Figure 5b). By contrast, a defective weld shows an uneven or wavy bead shape (see Figure 5c).
The RoI of the weld-pool/bead based on the HDR image in this study is depicted as normal and abnormal in Figure 6a,b, respectively. Compared with the metal transfer RoI, the weld-pool/bead RoI is clearly shown without the interference of arc light. The main difference between normal and abnormal cases is shown by the shape of the weld bead; as such, additional preprocessing is not required. The defined RoI of the weld-pool/bead is wide in pixel terms, and the humping beads can appear at any location within it. Hence, reduced square areas are randomly extracted from the original weld-pool/bead RoI, and these extracted areas are labeled as "normal" or "abnormal" according to whether they come from a normal or abnormal weld-pool/bead RoI, as shown in Figure 6. Furthermore, this random extraction from the RoI also serves as data augmentation within each class.
Figure 6. (a) Normal bead shape and labeled data; (b) abnormal bead shape and labeled data.
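The random extraction of square areas from the wide weld-pool/bead RoI can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, the image is represented as a plain list of rows, and the side of the square is assumed to equal the RoI height (the text only says that a reduced square area is extracted).

```python
import random


def random_square_crops(roi, n_crops, label, rng=None):
    """Cut `n_crops` square patches at random horizontal offsets from a wide RoI.

    `roi` is a 2D image given as a list of equal-length rows whose width is
    at least its height. Each patch has a side equal to the RoI height
    (an assumption) and inherits `label` from the RoI it was cut from.
    """
    rng = rng or random.Random()
    side = len(roi)
    width = len(roi[0])
    if width < side:
        raise ValueError("RoI must be at least as wide as it is tall")
    samples = []
    for _ in range(n_crops):
        left = rng.randint(0, width - side)  # inclusive on both ends
        patch = [row[left:left + side] for row in roi]
        samples.append((patch, label))
    return samples


# Toy RoI (hypothetical): 4 rows by 20 columns, pixel value = column index.
toy_roi = [list(range(20)) for _ in range(4)]
crops = random_square_crops(toy_roi, n_crops=5, label="abnormal",
                            rng=random.Random(0))
```

Because every call draws new offsets, repeated extraction from the same RoI yields many distinct patches per class, which is how the cropping doubles as augmentation.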

CNN
To develop an AI model-based real-time anomaly detection algorithm, a CNN model is selected because CNNs have demonstrated strong performance in image-based classification. In this study, five CNN architectures (see Figure 7) are investigated as the real-time anomaly detection AI model for RoIs. Two of them (basic CNN) are designed by the authors, whereas the others adopt the VGG16 architecture [50] for transfer learning. The VGG16 architecture is selected because it has been widely used and recognized for its good performance in image classification. In particular, weights pretrained on a large dataset are already available for it. Therefore, the transfer learning approach with new data can be applied effectively. Transfer learning is useful when the available data are limited [51][52][53][54].
The two CNN models proposed by the authors (basic CNN for weld-pool/bead and basic CNN for metal transfer) comprise three convolutional and pooling layers, whereas VGG16 exhibits a more complex structure with 16 layers. Such a deep structure is referred to as a deep CNN. According to the initial weight setup among the layers, the VGG16 models can be classified into three types: (1) VGG16 with no pretrained weights (VGG16), (2) a transfer learning model with VGG16 based on pretrained weights (VGG16-PRETR), and (3) a transfer learning and fine-tuning model with VGG16 based on pretrained weights (VGG16-PRETR-FINETUNE). A diagram of the tested CNN structures is shown in Figure 7.
Regarding the CNN model with no pretrained weights (VGG16), the weights within all layers are randomly initialized at the beginning of training; subsequently, training is performed using the gathered data. The transfer-learned CNN model (VGG16-PRETR) uses the weights pretrained on the ImageNet data [35], and these weights are fixed, excluding those of the last fully connected layer. The weights of the last fully connected layer are randomly initialized at the beginning, and training with the image data fits these weights. In the transfer learning model with fine-tuning (VGG16-PRETR-FINETUNE), the weights from the first layer to the third layer are fixed to the pretrained ones, whereas the weights after the third layer and those of the fully connected layer are randomly initialized at the beginning of training; subsequently, training is performed with the image data. The pretrained weights used in VGG16-PRETR-FINETUNE are also those obtained from the ImageNet database.

Training and Testing of CNN
The acquisition of abnormal data is challenging because abnormalities rarely occur during processing. If an imbalance between normal and abnormal data exists, the model performance can deteriorate. Up-sampling and down-sampling can be applied to manage the imbalance. Up-sampling extracts more samples from the class with fewer data, whereas down-sampling extracts fewer samples from the class with more data. To balance the normal and abnormal data, down-sampling is applied to the image data of metal transfer and weld-pool/bead because of the shortage of abnormal data. The training data for metal transfer comprise 1162 images (581 each for normal and abnormal). The training data for weld-pool/bead comprise 2650 images (1325 each for normal and abnormal). The training data are segmented into five folds to perform a five-fold cross validation, as described in Figure 8, which helps to avoid overfitting or bias [55]. Among the five folds, one fold is used as the validation set, and the others are used as the training set. The validation fold is fully independent from the training folds and is used to evaluate the training status during training. After the training phase with one validation fold in the first split has completed, another fold is designated as the validation fold in the next split. This process is repeated until all the folds have been used for validation. During the training of the CNN models, the cross-entropy loss function is selected as the training measure.
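The down-sampling and five-fold splitting described above can be illustrated with a minimal sketch. It uses only the Python standard library; the file names, the size of the normal pool (900), and the helper names `downsample_balanced` and `five_fold_splits` are hypothetical and are not taken from the authors' implementation. The folds are formed by a simple shuffle, so class proportions within each fold are only approximately balanced.

```python
import random

def downsample_balanced(normal, abnormal, seed=0):
    """Randomly keep the same number of samples in both classes (down-sampling)."""
    rng = random.Random(seed)
    n = min(len(normal), len(abnormal))
    return rng.sample(normal, n), rng.sample(abnormal, n)

def five_fold_splits(samples, k=5, seed=0):
    """Yield (train, validation) lists; every sample is used for validation exactly once."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        validation = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, validation

# Hypothetical file names standing in for the metal transfer images.
normal = [f"normal_{i:04d}.png" for i in range(900)]      # assumed pool size
abnormal = [f"abnormal_{i:04d}.png" for i in range(581)]  # scarce class

normal_bal, abnormal_bal = downsample_balanced(normal, abnormal)
dataset = [(p, 0) for p in normal_bal] + [(p, 1) for p in abnormal_bal]
splits = list(five_fold_splits(dataset))

for i, (train, validation) in enumerate(splits, start=1):
    print(f"split {i}: {len(train)} training / {len(validation)} validation images")
```

With 581 images per class, the balanced set contains 1162 images, matching the metal transfer training set size stated above.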
The CNN models are trained on a 64-bit Windows 10 operating system, with 128 GB of memory and an NVIDIA Titan RTX GPU. The building, training, validation, and prediction of the CNN models are programmed and executed using the Keras [56] library and TensorFlow [57] backend engine.

Result and Discussion
The trained CNN models explained in Section 3.2.1 are evaluated using new test data from an additional experiment. The test data comprise 1294 images (647 each for normal and abnormal) for metal transfer and 2860 images (1430 each for normal and abnormal) for the weld-pool/bead.

Performance Evaluation Measure
The proposed CNN models are evaluated using a test dataset that is independent from the training data in each cross-validation fold. Accuracy, sensitivity, specificity, and receiver operating characteristic (ROC) curves are used as measures for the comprehensive evaluation of the CNN models. The accuracy, sensitivity, and specificity scores are calculated as follows (Table 1):
$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{sensitivity} = \frac{TP}{TP + FN}$$
$$\text{specificity} = \frac{TN}{TN + FP}$$
TP and TN denote the numbers of images correctly predicted as abnormal and normal, respectively. FP denotes the number of normal images incorrectly predicted as abnormal, and FN denotes the number of abnormal images incorrectly predicted as normal. The measure "accuracy" refers to the proportion of correctly classified data in the total data. The "sensitivity" is the proportion of abnormal data that are determined to be abnormal, and the "specificity" is the proportion of normal data that are determined to be normal. The ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system when its discrimination threshold is varied. The ROC curve is plotted as the true positive rate against the false positive rate at various thresholds (see Figure 9). Additionally, the area under the ROC curve (AUC) is calculated. A larger AUC represents better performance.
Additionally, to overcome the weakness of the deep learning model as a non-interpretable black box model, a visual explanation technique, gradient-weighted class activation mapping (Grad-CAM) [58], is used to validate the effectiveness of the models. In recent studies, several researchers have attempted to understand AI models through visual explanations using the VGG16 model and Grad-CAM [59][60][61][62]. To apply Grad-CAM, we use the following values for each model: the output of the last convolutional layer and the weight of the fully connected layer that follows it. Grad-CAM enables one to understand which features in each RoI are observed and learned by the models. It highlights the important areas used for classifying normal and abnormal images.
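As a small worked example of the accuracy, sensitivity, and specificity measures defined in this section, the following sketch computes them from predicted labels, with abnormal treated as the positive class (1) and normal as the negative class (0). The label lists and function names are hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) with abnormal = 1 (positive) and normal = 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_measures(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from the confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # abnormal images detected as abnormal
        "specificity": tn / (tn + fp),  # normal images recognized as normal
    }

# Hypothetical labels: four abnormal images followed by four normal images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

counts = confusion_counts(y_true, y_pred)
measures = classification_measures(*counts)
print(counts)    # (3, 2, 2, 1)
print(measures)  # accuracy 0.625, sensitivity 0.75, specificity 0.5
```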

Defect Detection Performance for Metal Transfer Abnormality
The performances of the CNN models using the original images (see Figure 3) and the images preprocessed into contoured images (see Figure 4) are shown in Table 2. Figure 9 shows the ROC curve of each CNN model. The y-axis represents the true positive rate (TPR), i.e., the sensitivity, which is the rate at which abnormal data are correctly predicted as abnormal. The x-axis represents the false positive rate (FPR = 1 - specificity), which is the rate at which normal data are incorrectly classified as abnormal. Therefore, the higher the sensitivity and specificity, the larger the AUC. The AUC summarizes the ROC curve as a single value. However, models with a high AUC should not be adopted unconditionally; even with a high AUC, other measures such as accuracy, sensitivity, and specificity should be considered simultaneously. An AUC closer to 1 signifies a better model, whereas an AUC of 0.5 corresponds to a random prediction model with a one-half probability. Therefore, a model with an AUC of 0.5 or less should not be used.
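To make the construction of the ROC curve and the AUC concrete, the following sketch sweeps the decision threshold over a set of predicted abnormality scores and integrates the resulting curve with the trapezoidal rule. The scores and function names are hypothetical; this is an illustration under those assumptions, not the evaluation code used in this study.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold step by step."""
    positives = sum(y_true)               # abnormal images
    negatives = len(y_true) - positives   # normal images
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical scores: predicted probability that each image is abnormal.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = area_under_curve(roc_points(y_true, scores))
print(round(auc, 3))

# A model that assigns the same score to every image has no ranking ability.
random_like_auc = area_under_curve(roc_points(y_true, [0.5] * 6))
print(random_like_auc)
```

In this toy example, eight of the nine abnormal/normal pairs are ranked correctly, so the AUC equals 8/9, whereas the constant-score model yields 0.5.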
According to Table 2 and Figure 9, VGG16-PRETR demonstrates the best performance for original image classification. Because the weights of VGG16-PRETR are obtained by pretraining on a large amount of image data from ImageNet, the model possesses a strong capability for classifying images, which enables abnormalities to be identified within the original image. Meanwhile, the fine-tuned model (VGG16-PRETR-FINETUNE) trained with the original images shows a deteriorated classification ability, which suggests that the performance of VGG16-PRETR-FINETUNE cannot be improved with insufficient data.
When the classification with preprocessed images is tested, VGG16-PRETR-FINETUNE demonstrates better performance than VGG16-PRETR. Previous studies explaining the effects and mechanisms of fine-tuning [38] have shown that deep CNN models can be more specialized for specific classification tasks when certain convolution blocks are fine-tuned. Fine-tuning obtains general feature information using the weights of the initial convolution blocks learned in advance. Subsequently, the weights of the remaining convolution blocks are retrained to obtain abstract feature information that fits the data provided. The experimental results show that the VGG16-PRETR-FINETUNE CNN model using fine-tuning can be beneficial for certain classification tasks with simple features in the images.
The AI model is known as the "black box" model. This implies that the manner in which the model is trained and the important feature information for classification are unknown. Using only the values of performance measures from the experiments is insufficient for determining the best model and for verifying whether the classification criteria are well trained. In this study, to verify the training validity of the CNN models, the authors applied Grad-CAM, which can visualize the learning of feature information in the images. The results of Grad-CAM are shown in Figure 10.
As shown in Figure 10, the image is separated into two sections. The original image and trained features from the original image are located on the left section, and the preprocessed image and their trained features on the right section. In each section, the left image represents the true normal case when the CNN model classifies normal data as normal. The right image shows a true abnormal case in which the CNN model classifies abnormal data as abnormal. The image on the top represents the trained data (original and preprocessed images), whereas the images below the first row in Figure 10 show the Grad-CAM images for each CNN model, such as VGG16, VGG16-PRETR, VGG16-PRETR-FINETUNE, and basic CNN for metal transfer. The bright red color in the Grad-CAM image indicates areas with the most prominent effect on model judgment for classification between normal and abnormal cases. By verifying the red area, the feature in the image trained by the CNN model can be visually interpreted.
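The red areas in such maps come from a weighted combination of the feature maps of the last convolutional layer. As a minimal sketch of the standard Grad-CAM combination step [58], the following code takes feature maps and the gradients of the class score with respect to them, averages each gradient map into one weight, and applies a ReLU to the weighted sum. In practice, both inputs would be extracted from the trained network by the deep learning framework; here they are hypothetical 2 x 2 toy values, and the function name is illustrative.

```python
def grad_cam_heatmap(feature_maps, gradients):
    """Combine K feature maps (each H x W) into one heatmap scaled to [0, 1].

    feature_maps[k][i][j]: activation of channel k at position (i, j)
    gradients[k][i][j]:    gradient of the class score w.r.t. that activation
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    # Channel weight: global average of the gradient over all positions.
    alphas = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    # Weighted sum of the feature maps followed by ReLU (keep positive evidence only).
    heatmap = [
        [max(0.0, sum(a * fm[i][j] for a, fm in zip(alphas, feature_maps)))
         for j in range(width)]
        for i in range(height)
    ]
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap

# Toy example: channel 0 supports the class, channel 1 opposes it.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]

heatmap = grad_cam_heatmap(feature_maps, gradients)
print(heatmap)  # only the top-left position remains highlighted
```

The position activated by the supporting channel receives the maximum value, which would be drawn in bright red, whereas the position activated by the opposing channel is suppressed by the ReLU.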
Regarding the training with the original images, the best-performing model, VGG16-PRETR, shows a red area spread across the entire Grad-CAM image. On the contrary, in the true abnormal case, the area emphasized in red is distributed over the abnormal rod shape and at the bottom of the image. The bottom area constitutes the background and is not associated with the metal. The learned features shown in red differ between the true normal and true abnormal cases; therefore, the classification performance of VGG16-PRETR is the best. The Grad-CAM images of the VGG16 model and the basic CNN for metal transfer show a wide range of red colors, signifying that these models do not successfully discriminate the targeted features. These models fail to learn the metal edge during feature training. VGG16-PRETR-FINETUNE manages to learn some of the edge parts of the metal rod, but the learned area is extremely narrow and local.
Considering the model building using the preprocessed images, Grad-CAM shows the edge features of the image from classification. This pattern can be observed in the Grad-CAM images of VGG16, VGG16-PRETR, and VGG16-PRETR-FINETUNE models. In the true normal case, the red area is centered on the metal edges. In the true abnormal case, a red area is formed on the molten circular metal part. The visual description of Grad-CAM provides an understanding regarding an AI model known as an uninterpretable "black box". This means that the preprocessed image contributed to the improvement in edge feature recognition by the AI model. Therefore, it is assumed that the preprocessed images appear more suitable for training CNN models in the metal transfer RoI.
As shown in Table 2 and Figure 9b, the AUC values of the CNN models trained with the preprocessed images are close to 1. However, the accuracies of the basic CNN for metal transfer, VGG16, and VGG16-PRETR are as low as 0.738, 0.731, and 0.638, respectively, despite the high AUC values. For these models, the sensitivity is high, whereas the specificity is low at 0.476, 0.464, and 0.276, respectively, implying that the classification criterion is biased toward abnormality detection. In this case, the abnormal data can be classified perfectly; however, many normal data tend to be classified as abnormal.
Lobo et al. [63] reported that the AUC can be high even for an inappropriate model that overestimates or underestimates all predictions. The ROC curve indicates how well the samples of the abnormal class can be separated from those of the normal class, whereas the prediction accuracy indicates the actual performance of the classifier at a given decision threshold. The AUC is the area under the ROC curve. Hence, even if all of the predicted probabilities are above 0.5, an AUC of approximately 1 can still be achieved if all of the abnormal cases have higher predicted probabilities than all of the normal cases. Based on Table 2, the VGG16-PRETR model using preprocessed images shows a low average accuracy of 0.638 and a high average AUC of 0.990. One of these models classified all 647 abnormal cases as abnormal. Conversely, in the case of normal images, 136 are matched with normal, and the remaining 513 are misclassified as abnormal. It appears that the model is overfitted toward abnormal predictions. The overfitting models (basic CNN for metal transfer, VGG16, and VGG16-PRETR) all show the general characteristics of low accuracy and high AUC. Figure 11 shows the distribution of the abnormal prediction rates for all individual data of the VGG16-PRETR model using preprocessed images. The probability of predicting the normal class as abnormal is widely distributed below 1, whereas the probability of predicting the abnormal class as abnormal is distributed close to 1. In this case, a decision threshold close to 1 would yield an error rate (probability of misclassification as abnormal) close to zero. It is noteworthy that because the AUC only measures the ranking of the probabilities, it does not reveal whether the probabilities are well calibrated (e.g., whether a systematic bias exists). Therefore, all model performance values (AUC, sensitivity, specificity, and accuracy) must be considered. The basic CNN for metal transfer, VGG16, and VGG16-PRETR models cannot be used when all model performance values are considered.
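The combination of a high AUC and a low accuracy can be reproduced with a small numerical sketch. The predicted probabilities below are hypothetical and merely imitate the pattern described for Figure 11: the normal scores are spread widely below 1, and the abnormal scores lie close to 1. The AUC is computed here as the fraction of correctly ranked abnormal/normal pairs, which is equivalent to the area under the ROC curve.

```python
def pairwise_auc(normal_scores, abnormal_scores):
    """Fraction of (abnormal, normal) pairs ranked correctly; ties count as one half."""
    wins = 0.0
    for a in abnormal_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal_scores) * len(normal_scores))

def accuracy_at(threshold, normal_scores, abnormal_scores):
    """Accuracy when scores >= threshold are classified as abnormal."""
    correct_normal = sum(s < threshold for s in normal_scores)
    correct_abnormal = sum(s >= threshold for s in abnormal_scores)
    return (correct_normal + correct_abnormal) / (len(normal_scores) + len(abnormal_scores))

# Hypothetical predicted probabilities of the abnormal class.
normal_scores = [0.30, 0.55, 0.70, 0.85]    # spread widely below 1
abnormal_scores = [0.97, 0.98, 0.99, 0.99]  # concentrated near 1

print(pairwise_auc(normal_scores, abnormal_scores))      # 1.0: perfect ranking
print(accuracy_at(0.5, normal_scores, abnormal_scores))  # 0.625: poor at the default threshold
print(accuracy_at(0.9, normal_scores, abnormal_scores))  # 1.0: threshold moved close to 1
```

The ranking is perfect, so the AUC is 1, yet three of the four normal samples are misclassified at a threshold of 0.5. This illustrates why the AUC alone does not reveal whether the probabilities are well calibrated.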
This result is confirmed via Grad-CAM as well. A comparison of the effectiveness among the CNN models using preprocessed images is shown in Figure 10. As shown, except for the high-performance model VGG16-PRETR-FINETUNE, the other three models (basic CNN for metal transfer, VGG16, and VGG16-PRETR) show a colored area spread across the entire image. This means that the feature information is not well specified. For example, VGG16-PRETR, which has the widest colored area, shows the lowest accuracy. The basic CNN for metal transfer shows the second highest accuracy among the CNN models with preprocessed images. However, in its Grad-CAM image, the red area is distributed in the lower part of the image, not on the edge of the metal. In the case of the basic CNN for metal transfer, which is not a deep CNN structure, it appears that feature learning does not perform well. In the case of VGG16-PRETR-FINETUNE, which has the best performance, the model appears to learn the straight and curved edge features as intended.
Based on the results shown in Table 2 and Figures 9b and 10, VGG16-PRETR-FINETUNE trained with preprocessed images can be identified as the high-performance model across all measures (AUC, sensitivity, specificity, accuracy, and feature information visualization).

Detection Performance of Weld-Pool/Bead
The performance of the tested CNN models for the weld-pool/bead RoI is shown in Table 3. Figure 12 shows the ROC curves for the classification performance of the CNN models. The highest AUC is 0.995, obtained by the basic CNN for the weld-pool/bead. The VGG16 model has the second highest AUC, 0.982, with the highest accuracy of 0.965. The VGG16-PRETR model, which has the lowest AUC of 0.869, also has the lowest accuracy of 0.794. Regarding the weld-pool/bead, each model demonstrates different strengths in each measure. Generally, the basic CNN for the weld-pool/bead and VGG16 demonstrate good performance when the AUC, sensitivity, specificity, and accuracy are considered simultaneously. Both models exhibit performance values that exceed 0.9 in each measure. Unlike for metal transfer, a simple structure and primitive training using less data appear to be sufficient for the weld-pool/bead RoI. This is because the CNN structure has been proven robust for image classification in other fields. This characteristic of the CNN appears to fit the weld-pool/bead RoI image well.
As explained in Section 4.2, Grad-CAM visualizes the feature information that is the basis for classification using the CNN model (see Figure 13). In Figure 13, the top image shows the data used for training and testing. The normal bead maintains a uniform height (a straight feature on the upper part). The abnormal bead does not maintain a uniform height and yields a ball-like shape (a curved feature). The feature information visualization for each model is highlighted in red in Figure 13.
Based on Figure 13, for VGG16, which obtained the highest accuracy, a red area appears on the top of the bead in the true normal case. In the true abnormal case, the red area appears on the curved edge of the bead. VGG16 appears to have classified the bead shape using the edge feature information. In the basic CNN for the weld-pool/bead, which has the highest AUC, a large red area appears at the bottom of the bead in the true normal case, and a small partial red area appears above the bead. The bead pattern is observed as the main feature information. In the true abnormal case, a red area appears outside the bead edge. The surrounding background, not the bead, appears to be the main characteristic information used for the classification. For VGG16-PRETR-FINETUNE, some areas appear similar to those of the basic CNN for the weld-pool/bead and VGG16. However, no consistent feature information is available in the two test images between the normal and abnormal cases. For VGG16-PRETR, the red area is randomly distributed. It appears that feature learning does not perform as intended. In fact, VGG16-PRETR has the lowest AUC, accuracy, and specificity. Based on the results of Table 3 and Figures 12 and 13, a high-performance model (VGG16) can be selected considering all measures (AUC, sensitivity, specificity, accuracy, and feature information visualization).
In Section 4.2, the effectiveness of fine-tuning in classifying specific images is described. Originally, the weights of VGG16 trained with the ImageNet data are designed to predict 1000 classes. Therefore, VGG16 contains pretrained weights to classify them. However, a sufficient amount of data is required for VGG16 to classify 1000 classes. Fine-tuning, which reuses some pretrained weights, is more effective when training with insufficient data because some features of the objects are already included in the model weights. In this study, VGG16, which has no pretrained weights, shows the best performance for the weld-pool/bead RoI, implying that the training data are sufficient for obtaining the correct weights for the classification of these WAAM images. A simple image such as a metal transfer RoI image is compatible with a small amount of data. However, a deep CNN structure requires a significant amount of data to learn complex images such as the weld-pool/bead RoI. The pretrained weights facilitate learning with insufficient data.
In addition, a deep CNN structure is useful in industrial processes such as WAAM. The basic CNN for the weld-pool/bead is designed by the authors and can cause overfitting; therefore, its performance can be restricted to only specific data. However, the deep structure of CNNs, such as VGG16, can overcome this drawback. The experimental results show that deep CNN model structures such as VGG16-PRETR-FINETUNE and VGG16 are high-performance models for the metal transfer and weld-pool/bead RoIs, respectively. This suggests that it is more advantageous to use the proven deep CNN structure than the basic CNN for the weld-pool/bead designed by the authors.

Conclusions
In this study, the authors developed a methodology to detect abnormalities in WAAM based on HDR camera images, adopting AI models based on the CNN. The tested CNN models are shown to be applicable for defect detection in WAAM. The image data gathered by the HDR camera are segmented into three areas: the RoIs for metal transfer, arc shape, and weld-pool/bead. The targeted areas in this study are the metal transfer and weld-pool/bead RoIs. To improve the performance of the AI model, preprocessing is applied to the images. In designing the CNN architecture, the authors define a simple CNN architecture and also adopt a predefined architecture from a well-known CNN model, VGG16. As VGG16 is not originally designed for defect detection, transfer learning is applied to VGG16 and then tested for its applicability. Through model training and testing, the effectiveness of the CNN model for defect detection in WAAM is confirmed.
To verify the CNN model training, Grad-CAM is performed. Grad-CAM indicates that the tested CNN model is well trained to identify the designated characteristic area in the image. Additionally, Grad-CAM enables model developers to understand the training process of CNNs such that a better model can be developed.
According to our experiments, the basic CNN model, which has a simple architecture, performed better than VGG16 with no pretraining. However, when the pretrained CNN models (VGG16-PRETR and VGG16-PRETR-FINETUNE) are applied through transfer learning, they performed better than the former models. This means that transfer learning using the CNN can improve the detection ability by using pretrained features from other objects. If sufficient data are available for training, then training all weights from the beginning will yield satisfactory performance. However, if data are insufficient, then transfer learning using a pretrained CNN model is advantageous. For the model evaluation, Grad-CAM visualizes the recognized features and enables model developers to understand the CNN model, which is regarded as a black box model. In addition, Grad-CAM can support multifaceted analysis in the selection of an AI model.
This study suggests that the AI model can be the basis of real-time monitoring for a welding process during WAAM. Defect detection can be combined with the control of process variables such as current, voltage, and feed rate. To increase the reliability of defect detection, an ensemble model that judges combinations with other process variables can be considered in the near future. For further research, the types of abnormality in WAAM must be diversified, and a generalized model must be developed to detect them. More diverse data (e.g., shape and sound) will be considered to develop more robust AI models in the near future.