Deep Learning-Based Classification of Weld Surface Defects

Abstract: To realize non-destructive intelligent identification of weld surface defects, an intelligent recognition method based on deep learning is proposed, which consists mainly of a convolutional neural network (CNN) and a random forest. First, high-level features are learned automatically by the CNN, and the random forest is trained on the extracted high-level features to predict the classification results. Second, weld surface defect images are collected and preprocessed by image enhancement and threshold segmentation, and a database of weld surface defects is established from the preprocessed images. Finally, comparative experiments are performed on the weld surface defect database. The results show that the accuracy of the method combining the CNN and the random forest reaches 0.9875, demonstrating that the method is effective and practical.


Introduction
As a traditional processing method, welding is widely used in aerospace, machine design, energy generation, shipbuilding, petrochemical engineering, and other industries. Welding is a complicated process that is often affected by process and environmental uncertainties, and it easily produces defects such as overlap, pore, spatter, slag inclusion, and incomplete fusion. Therefore, it is necessary to study the detection and identification of weld surface defects. Common non-destructive testing methods for welds include ultrasonic testing [1], ray detection [2], and eddy current testing [3]. However, these methods have limitations: (1) ray detection can easily have harmful side effects on the human body; (2) ultrasonic testing is sensitive to the location, orientation, and shape of the defect; (3) eddy current testing is only applicable to conductive metal materials or non-metallic materials in which eddy currents can be induced.
During the last decade, computer vision has become an important field of artificial intelligence and has been widely used in weld defect detection. It generally includes four steps: image acquisition, image preprocessing, feature extraction, and classification [4]. Feature extraction is one of the most critical steps, and many studies have been carried out on feature extraction and classification of weld defects. Sun et al. [5] isolated the defect features of thin-walled workpieces using background subtraction based on a Gaussian mixture model and constructed a decision tree to discern weld defects. Valavanis et al. [6] extracted texture and geometric features of different weld defects and used a support vector machine (SVM), k-nearest neighbor (KNN), and an artificial neural network (ANN) to complete the classification. Yin et al. [7] proposed a method to automatically extract four geometric features from Lissajous figures and used machine learning-based classifiers to identify defects. Boaretto et al. [8] obtained defect features with the double wall double image (DWDI) exposure technique and used a multilayer perceptron (MLP) to classify defect versus no-defect. Zapata et al. [9] selected the shape and direction of weld defects as classification features and proposed an adaptive network-based fuzzy inference system to detect welding defects. Zahran et al. [10] extracted features from the power density spectra (PDS) of the segmented weld areas and used an ANN to match features in order to recognize different defects. In References [11,12], a method based on principal component analysis (PCA) and SVM was proposed; it transforms weld defect images into principal component space through PCA and completes the classification using SVM. However, the above feature extraction and classification methods utilize only the texture or geometric features of the weld defect images, which are not well suited to forming discriminative features for classification.
The history of deep learning originates from neuroscience experiments. As early as 1943, neurologist McCulloch and mathematician Pitts constructed the first artificial neuron model according to the structure and working principle of biological neurons, which was called the "MCP model". Since then, many researchers have proposed various neural network models. However, because of the vanishing gradient problem in the training methods available at that time, it was difficult to use this powerful model. This problem was not solved until the emergence of deep learning methods [13] in 2006. Krizhevsky et al. [14] used AlexNet to make a breakthrough in the ImageNet image classification competition, after which deep learning showed unprecedented development prospects. Researchers subsequently proposed the ZFNet, VGGNet, GoogLeNet, and ResNet models for large-scale image classification. Compared with traditional feature extraction methods, these methods do not require any hand-selected image features; they learn the high-level features of the target from the sample data by supervised learning [15].
Recently, deep learning methods have been widely applied in weld defect detection. Yang et al. [16] proposed a modified convolutional neural network (CNN) to classify X-ray weld images. Hou et al. [17] constructed a deep convolutional neural network (DCNN) to extract high-level features from X-ray images. Khumaidi et al. [18] introduced the idea of a Gaussian kernel into deep learning, with the aim of extracting the main information of the image while minimizing noise and interference and improving the classification accuracy. In Reference [19], Liu Bin et al. used a VGG16-based fully convolutional network with transfer learning to inspect welding defect images, achieving high-precision classification with a relatively small data set. In Reference [20], Hou et al. proposed a deep neural network based on a sparse autoencoder. It learns the feature information of the image in an unsupervised way and then uses a softmax classifier with supervised learning to identify defects such as pore and undercut. In Reference [21], Du et al. adopted a feature pyramid network (FPN) and RoIAlign to detect defects in X-ray images of automobile aluminum casting parts. The above methods usually need to construct a deep CNN architecture and use softmax to handle the classification task. However, when few training samples are available, softmax performs poorly because the learned features are easily confused.
To solve the problems mentioned above, an intelligent recognition method based on deep learning is proposed. Random forest is selected as the final classification algorithm because it has good fault tolerance and low generalization error in classification tasks. The overview of the proposed method is shown in Figure 1. It consists mainly of two modules: a CNN and a random forest. The CNN architecture is constructed to automatically extract high-level features from weld surface defect images, and the random forest is trained on these high-level features to accomplish the classification task. In addition, the weld surface defect images, which include the normal, overlap, spatter, and pore categories, are preprocessed by image enhancement and threshold segmentation. A database of weld surface defects is established from the preprocessed images and is used as input data. Finally, comparative experiments are carried out to verify the effectiveness and practicality of the proposed method. The rest of this paper is organized as follows. Section 2 gives the specific architecture of the CNN module. In Section 3, the classification module based on random forests is described in detail. Weld defect image preprocessing and experiments are given in Section 4. Section 5 summarizes the article.

Background
The basic idea of deep learning is to construct a deep nonlinear network structure. It initializes the weights through unsupervised pre-training and then fine-tunes them through supervised training. The vanishing gradient problem is effectively suppressed by the ReLU activation function. Deep learning gradually transforms the initial low-level features into high-level features through multi-layer processing, which helps to complete complex tasks in image [22], speech [23], and language [24] processing.

Feature Extraction Module
The task of the CNN is to automatically extract high-level features from the weld surface defect images. Inspired by LeNet-5 [32], a deep CNN architecture is constructed by stacking multiple convolutional layers. A typical convolutional layer usually comprises convolution, nonlinear activation, and pooling operations. Figure 2 shows the specific architecture of the CNN, which consists mainly of three convolutional layers, two pooling layers, one fully connected layer, and one softmax layer. The input of the CNN comprises training and validation images of size 80 × 120 (height × width). The convolutional layers are named C1, C2, and C3 and consist of 4, 8, and 16 filters, respectively. The corresponding filter sizes are 5 × 5, 3 × 3, and 2 × 2. Let $x^{l}$ denote the output of convolutional layer $l$; it can be expressed as:

$$x^{l} = f\left(\sum_{j=1}^{J}\sum_{i=1}^{I} w^{l}_{j,i}\, x^{l-1}_{j,i} + b^{l}\right) \qquad (1)$$

where (J, I) denotes the size of the filters, J is the height of the filters, and I is the width of the filters. $b^{l}$ denotes the bias of the convolutional layer, $x^{l-1}$ denotes the output of the previous convolutional layer (indexed by position (j, i) within the filter window), $w^{l}$ denotes the weights of the convolutional layer, and $f(\cdot)$ is the nonlinear activation function. The ReLU activation function is selected and is given by Equation (2):

$$f(x) = \max(0, x) \qquad (2)$$
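The convolution followed by ReLU described above can be illustrated with a minimal pure-Python sketch. It assumes a single-channel input, stride 1, and no padding, none of which is stated in the text; the function names `relu` and `conv2d_valid` and the toy values are hypothetical.

```python
def relu(value):
    """ReLU activation: max(0, value)."""
    return max(0.0, value)


def conv2d_valid(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (stride 1, no padding) and apply ReLU."""
    J, I = len(kernel), len(kernel[0])  # filter height and width
    out_h = len(image) - J + 1
    out_w = len(image[0]) - I + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = bias
            for j in range(J):
                for i in range(I):
                    total += kernel[j][i] * image[r + j][c + i]
            row.append(relu(total))
        output.append(row)
    return output


# Toy 3 x 3 input and 2 x 2 filter (illustrative values only).
image = [[1, 2, 0],
         [0, 1, 3],
         [4, 0, 1]]
kernel = [[1, 0],
          [0, 1]]
feature_map = conv2d_valid(image, kernel, bias=-1.0)
```

In this toy case the pre-activation value at the lower-left position is negative, so ReLU clips it to zero while the other responses pass through unchanged.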
The pooling operation [33] is another important component. The pooling layers are named S1 and S2. Two max-pooling layers with stride 2 are used to combine 2 × 2 patches of the convolutional layers C2 and C3, so that each pooling layer corresponds to the preceding convolutional layer. The neurons in a pooling layer aggregate statistics over specific regions of the convolutional layer in order to down-sample the input feature map. Common pooling operations include average pooling, maximum pooling, and random pooling. The maximum pooling used in this paper is expressed by Equation (3), in which u(n, n) is the window function, applied to compute the maximum value of $a_j$ in the neighborhood. After a series of convolution, activation, and pooling operations, the fully connected layer concatenates the features extracted by convolutional layer C3 and max-pooling layer S2 into a 128-dimensional vector. The output of the fully connected layer can be obtained by:

$$x^{m} = f\left(w^{m} x^{m-1} + b^{m}\right) \qquad (4)$$

where $b^{m}$ denotes the bias of the fully connected layer, $w^{m}$ denotes the weights of the fully connected layer, $x^{m-1}$ denotes the output of the previous max-pooling layer, and $f(\cdot)$ is the ReLU activation function.
The CNN is trained with the cross-entropy loss function. The details of the CNN are given in Table 1.
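As a rough illustration of the two operations in this paragraph, the sketch below implements 2 × 2 max pooling with stride 2 and a fully connected layer with ReLU in plain Python. The helper names `max_pool_2x2` and `dense_relu`, the choice to drop odd trailing rows and columns, and the toy weights are assumptions made for the example.

```python
def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2; odd trailing rows/columns are dropped."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            window = (feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled


def dense_relu(inputs, weights, biases):
    """Fully connected layer with ReLU: one weight row per output unit."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]]
pooled = max_pool_2x2(feature_map)
flat = [value for row in pooled for value in row]   # flatten before the dense layer

# Two output units with illustrative weights; the paper's layer has 128 units.
weights = [[0.5, 0.0, 0.0, 0.25],
           [-1.0, -1.0, -1.0, -1.0]]
biases = [0.5, 0.0]
features = dense_relu(flat, weights, biases)
```

The second unit receives a negative weighted sum and is therefore zeroed by ReLU, which mirrors how the 128-dimensional feature vector can contain inactive entries.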
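To make the layer sizes concrete, the following sketch traces feature-map shapes through the network. It assumes stride-1 convolutions without padding, a single-channel input, and the layer order C1, C2, S1, C3, S2 (inferred from the statement that the pooling layers act on C2 and C3); the actual settings are those of Table 1, so the numbers here are illustrative only.

```python
def conv_out(size, kernel):
    """Output length of a stride-1 convolution without padding (assumed)."""
    return size - kernel + 1


def pool_out(size, window=2, stride=2):
    """Output length of max pooling with a 2 x 2 window and stride 2."""
    return (size - window) // stride + 1


# (name, kind, kernel or window size, number of filters); order is an assumption.
LAYERS = [
    ("C1", "conv", 5, 4),
    ("C2", "conv", 3, 8),
    ("S1", "pool", 2, None),
    ("C3", "conv", 2, 16),
    ("S2", "pool", 2, None),
]


def trace_shapes(height=80, width=120, channels=1):
    """Return (layer, height, width, channels) for the input and every layer."""
    shapes = [("input", height, width, channels)]
    for name, kind, size, filters in LAYERS:
        step = conv_out if kind == "conv" else pool_out
        height, width = step(height, size), step(width, size)
        if filters is not None:
            channels = filters
        shapes.append((name, height, width, channels))
    return shapes


shapes = trace_shapes()
_, h, w, c = shapes[-1]
flattened_size = h * w * c   # number of values fed to the fully connected layer
```

Under these assumptions the S2 output is 18 × 28 × 16, so the fully connected layer would map 8064 values to the 128-dimensional feature vector.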

Random Forest Algorithm
Random forest [34] is a machine learning algorithm proposed by Leo Breiman in 2001. It is an ensemble algorithm based on decision trees. The main idea is to combine multiple weak classifiers into a strong classifier by voting. For the classification task, the random forest algorithm mainly includes three steps: bootstrap sampling, decision tree construction, and voting.
Bootstrap sampling refers to uniform sampling with replacement. It is useful when the data set is small, because it can generate different training sets from the initial data set, which helps to improve the image recognition ability. Decision trees are usually constructed with one of three methods: iterative dichotomiser 3 (ID3), C4.5, or classification and regression tree (CART). This paper uses the CART algorithm to generate decision trees. CART assumes that the decision tree is a binary tree, in which each internal node divides the data into left and right branches according to a threshold on a feature. Generally, the feature with the smallest Gini index in the subset and its corresponding threshold are selected as the optimal feature and the optimal split point, and the data set is allocated to the left and right branch nodes accordingly. Splitting each node on the feature with the smallest Gini index yields a classification tree, and the collection of such trees forms the random forest. The Gini index of feature M is defined by Equation (5). Based on the results of each classification tree, a voting mechanism is used to determine the final classification result. The classification function is expressed as follows:

$$H(x) = \arg\max_{Y} \sum_{i=1}^{T} I\left(h_i(x) = Y\right) \qquad (6)$$

where $I(\cdot)$ denotes the indicator function, which takes the value 0 or 1, $H(x)$ denotes the combined classification model, $h_i$ denotes a single decision tree classification model, $Y$ denotes a class label, and $T$ is the number of decision trees.
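The Gini-based split selection described above can be sketched in a few lines. The example assumes the standard CART form, in which the Gini index of a split is the size-weighted sum of the Gini indices of the two branches; the function names and the toy feature values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index of a label set: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(values, labels, threshold):
    """Size-weighted Gini index after splitting one feature at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values, labels):
    """Midpoint between sorted distinct values that gives the smallest Gini index."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not midpoints:
        return None   # a constant feature cannot be split
    return min(midpoints, key=lambda t: gini_split(values, labels, t))


# One illustrative feature for four samples from two defect classes.
values = [0.1, 0.2, 0.8, 0.9]
labels = ["pore", "pore", "spatter", "spatter"]
threshold = best_threshold(values, labels)
```

Here the split at 0.5 separates the two classes perfectly, so its Gini index is 0, compared with 0.5 for the unsplit set.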

Classification Module
The random forest is trained on the 128-dimensional feature vectors to produce the classification results. The primary task of the random forest algorithm is to create decision trees. A decision tree consists of a root node, several internal nodes, and several leaf nodes. First, T sample sets $D_l$ (l = 1, 2, 3, ..., T), each containing m training samples, are drawn by bootstrap sampling from the deep feature vectors $v_m$ and are used as the root nodes of the decision trees. Root nodes built in this way ensure that the training sample set of each decision tree is different. Second, to avoid over-fitting, features are chosen randomly when establishing the internal nodes of a single decision tree: f features are selected from the h features without replacement. The rule for feature selection is generally $f = \sqrt{h}$, with $f \le h$. Next, the feature attribute with the smallest Gini index in the feature subset f is calculated based on Equation (5). This attribute and its corresponding threshold are taken as the optimal partition attribute and the optimal split point at the non-leaf node. The decision tree stops splitting when the Gini index of the features in the sample set is less than a predetermined threshold. The above steps are repeated until T decision trees are generated.
The other task of the random forest is to make predictions with the trained trees. The classification model consisting of T decision trees uses the voting method of Equation (6) to count the votes of the decision trees, and the class with the largest number of votes is selected as the final classification result.
In summary, the classification process with the random forest is detailed in Algorithm 1.
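The two sources of randomness in this procedure, bootstrap sampling of the training set and random selection of f of the h features, can be sketched as follows. Rounding the square root of h down to an integer is an assumption (the text only gives the square-root rule), and the helper names are hypothetical.

```python
import math
import random


def bootstrap_sample(samples, rng):
    """Draw len(samples) items uniformly with replacement."""
    return [rng.choice(samples) for _ in samples]


def choose_features(h, rng):
    """Pick f = floor(sqrt(h)) distinct feature indices out of h (without replacement)."""
    f = max(1, int(math.sqrt(h)))
    return sorted(rng.sample(range(h), f))


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
samples = list(range(320))             # stand-ins for 320 feature vectors
sample_set = bootstrap_sample(samples, rng)
feature_subset = choose_features(128, rng)   # 128-dimensional CNN features
```

With h = 128 this rule examines 11 candidate features at each split, and because sampling is with replacement each tree sees a training set of the original size that typically repeats some samples and omits others.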
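The voting step can be illustrated with a small sketch in which each "tree" is a hypothetical one-rule function standing in for a trained CART tree; the predicted class is simply the label that receives the most votes.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_predict(trees, x):
    """Collect one prediction per tree and combine them by majority vote."""
    return majority_vote([tree(x) for tree in trees])


# Hypothetical stand-ins for trained decision trees (illustrative rules only).
trees = [
    lambda x: "pore" if x[0] > 0.5 else "normal",
    lambda x: "pore" if x[1] > 0.2 else "spatter",
    lambda x: "overlap" if x[2] > 0.9 else "pore",
]
prediction = forest_predict(trees, [0.7, 0.1, 0.3])
```

Two of the three stand-in trees vote for the same class here, so a single dissenting tree does not change the result, which is the fault tolerance the ensemble relies on.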

Settings and Experimental Environment
In our experiment, the welding equipment used to produce the weld images includes an ABB-IRB2600 industrial robot, a Kemppi-SYN welding machine, a Kemppi-DT400 wire feeder, and a welding torch. The welding parameters are set as follows: the wire feed rate is 4 m/min, the welding speed is 50 cm/min, the shielding gas is CO₂, the gas flow rate is 20 L/min, and the adjustment range of the arc voltage is 16-25 V. All weld images are produced with the same equipment and constant parameters. A total of 400 weld surface images are collected in four categories: normal, overlap, spatter, and pore. Each category contains 100 weld images with different appearances and poses, at a resolution of 120 × 80. Image preprocessing is described in Section 4.2. A database is established from the preprocessed weld surface defect images.
To verify the effectiveness and practicality of the proposed method for weld surface defect identification, comparison experiments with other methods are carried out, as described in Sections 4.3 and 4.4. The experimental environment is a 64-bit Windows 10 system with an i5 CPU (2.30 GHz base frequency) and 8 GB of memory. The software is written in Python. The implementation is based on the Keras framework, with TensorFlow and Theano as backends. For model training, 80 images were randomly selected from each category (320 images in total), and the remaining 80 images (20 per category) were used for model validation.
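The per-category random split can be sketched as below. The file names, the seed, and the function name `split_per_class` are hypothetical; only the counts (100 images per category, 80 of them selected for training) come from the text.

```python
import random


def split_per_class(images_by_class, n_train, seed=0):
    """Randomly pick n_train images per class for training; the rest validate."""
    rng = random.Random(seed)
    train, validation = [], []
    for label, images in images_by_class.items():
        shuffled = images[:]          # copy so the caller's list is left untouched
        rng.shuffle(shuffled)
        train += [(img, label) for img in shuffled[:n_train]]
        validation += [(img, label) for img in shuffled[n_train:]]
    return train, validation


categories = ["normal", "overlap", "spatter", "pore"]
# Placeholder file names standing in for the 100 collected images per category.
images_by_class = {c: [f"{c}_{k:03d}.png" for k in range(100)] for c in categories}
train, validation = split_per_class(images_by_class, n_train=80)
```

Splitting inside each category keeps the four classes balanced in both subsets, which matters for a database of only 400 images.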

Image Preprocessing
The actual welding process is often affected by a complex welding environment, weld parameters, and human factors. The acquired images of weld surface defects usually contain a large amount of redundant information. To extract the weld defect feature information effectively for the CNN, the images need to be preprocessed.
Figure 3 shows some examples of preprocessed weld surface defect images. All weld surface defect images collected in Section 4.1 are preprocessed through filtering, enhancement, and segmentation. First, a median filter is applied to the collected weld surface images to eliminate noise in the original images. Second, to enhance the gray-level values of the weld area, the images are enhanced by gray-level processing. Next, the Otsu algorithm is used to isolate the weld defect area from the other elements in the original images, segmenting an appropriate area of the weld surface defect to facilitate subsequent feature extraction.
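The filtering and segmentation steps can be sketched in plain Python as follows. The 3 × 3 window size, the choice to leave border pixels unchanged, and the 8-bit gray-level range are assumptions, since the text does not state them; the enhancement step is omitted because its exact form is not given.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3 x 3 neighborhood."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]          # border pixels are left unchanged
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = int(median(window))
    return out


def otsu_threshold(image, levels=256):
    """Gray level that maximizes the between-class variance (Otsu's method)."""
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]                        # pixels with gray level <= t
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]   # one salt-noise pixel
denoised = median_filter_3x3(noisy)

gray = [[10, 12, 200], [11, 198, 202], [10, 201, 199]]
t = otsu_threshold(gray)
mask = [[1 if p > t else 0 for p in row] for row in gray]   # 1 marks the bright region
```

The median filter removes the isolated bright pixel without blurring its neighbors, and the Otsu threshold falls in the gap between the dark and bright gray-level clusters.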

Evaluation of Feature Extraction Module
In the feature extraction module, stochastic gradient descent (SGD) is selected as the optimizer of the CNN. The learning rate is adaptive: training starts from a predetermined value of 0.001, which decays with a constant decay value of $1 \times 10^{-8}$. The momentum is 0.9. The CNN is trained on the preprocessed weld surface defect database. The training process is shown in Figure 4. Figure 4a shows the trend of the training and validation accuracy as the number of iterations increases. After fifteen epochs, the CNN achieves a maximum training accuracy of 94.69%. Figure 4b gives the trend of the objective function loss for training and validation as the number of iterations increases. The loss gradually flattens out, which indicates that the CNN has converged. In Reference [6], texture and geometric features need to be described manually. In References [11,12], all images need to be vectorized in advance, and the dimension of the resulting covariance matrix is too large, which slows down feature extraction. In our feature extraction module, the CNN learns high-level features from the images automatically. To evaluate the effectiveness of our feature extraction module, we compared it with traditional image processing methods. As shown in Figure 5, our method extracts high-level features that better represent the semantic information of the image and are easier for the classification algorithm to identify. By contrast, the texture or geometric features segmented by the traditional method are not distinct enough, which can cause feature confusion and reduce the classification accuracy. Furthermore, our feature extraction module avoids unnecessary manual effort and simplifies the process.
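A minimal sketch of the optimizer settings is given below. It assumes the time-based decay rule used by the `decay` argument of the legacy Keras SGD optimizer, lr / (1 + decay × iteration), and the classical momentum update; whether the authors used exactly this rule is an assumption, and the quadratic objective is purely illustrative.

```python
def decayed_lr(initial_lr, decay, iteration):
    """Time-based decay: initial_lr / (1 + decay * iteration)."""
    return initial_lr / (1.0 + decay * iteration)


def sgd_momentum_step(weight, velocity, grad, lr, momentum=0.9):
    """One SGD step with classical momentum."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


# Minimize f(w) = w^2 (gradient 2w) with the stated hyperparameters.
weight, velocity = 1.0, 0.0
for iteration in range(100):
    lr = decayed_lr(0.001, 1e-8, iteration)
    weight, velocity = sgd_momentum_step(weight, velocity, 2.0 * weight, lr)

lr_after_million = decayed_lr(0.001, 1e-8, 1_000_000)
```

Under this rule a decay value of 1 × 10⁻⁸ lowers the learning rate by only about 1% after one million iterations, so over a short training run the rate stays essentially at 0.001.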

Comparison to Other Methods
To evaluate the effectiveness of the classification module, a comparative experiment is carried out. In this section, two machine learning-based classifiers, SVM and softmax, are compared with our classification module for defect recognition. The high-level features extracted in Section 2.2 are used as training data. The random forest, SVM, and softmax classifiers are then trained on the extracted high-level features. The resulting classification accuracies are listed in Table 2. It can be seen from Table 2 that the accuracy of the method combining the CNN and the random forest reaches 0.9875, which is better than that of the other classifiers. This indicates that the random forest has better generalization ability and robustness, and it demonstrates that the method is effective and practical. In addition, the SVM and softmax classifiers also show high classification accuracies, reaching 0.95 and 0.9469, respectively, which confirms that the CNN is effective for feature extraction.

Conclusions
This paper proposes a method based on deep learning to identify weld defects. The high-level features of the weld surface defect images are extracted by the CNN, which fully exploits the ability of the CNN to extract image features and simplifies the feature extraction process. Classification is performed by the random forest algorithm, which has stronger generalization ability and robustness. Experiments were carried out on the weld surface defect database. The results show that the maximum accuracy of the proposed method is 0.9875, which can meet the requirements for classifying weld surface defects in actual production. The effectiveness and practicality of the proposed method are thus verified.
The image feature extraction and recognition of weld surface defects in this paper are all carried out offline, which makes it difficult to meet the real-time requirements of actual industrial production. In addition, the proposed method consists of separate, non-integrated steps, which affects the recognition efficiency to some extent. Therefore, defect identification on weld surface images extracted in real time, together with optimized integration of the algorithms, will be the focus of the next phase.

Figure 1. The overview of the proposed method.

$\mathrm{Gini}(D_j) = 1 - \sum_{i=1}^{N} (p_i)^{2}$, where $D_j$ denotes the jth training set of the sample, and $p_i$ denotes the proportion of the ith class of samples in the training set.

Figure 3. Partial examples of preprocessed results for weld surface defect images, including normal, overlap, spatter, and pore. All weld surface defect images have a resolution of 120 × 80.

Figure 4. (a) The accuracy of training and validation. (b) The objective function loss value of training and validation.

Figure 5. Comparison of the feature extraction module with the traditional method.

Table 1. The details of the CNN architecture.
5: Draw a bootstrap sample $D_l$ from the input data
6: Build a decision tree T on $D_l$ by recursively repeating the following steps for each terminal root
7: Select f features without replacement from the h features
8: Calculate the smallest Gini index of the feature attributes among the feature subset f based on Equation (5)
9: end procedure
10: Output: the ensemble of trees $\{T\}_{1}^{T}$
11: Classification:
12:

Table 2. Accuracy results of different classification methods.