Deep Learning-Based Crack Identification for Steel Pipelines by Extracting Features from 3D Shadow Modeling

Automatic crack identification for pipeline analysis utilizes three-dimensional (3D) image technology to improve the accuracy and reliability of crack identification. A new technique that integrates a deep learning algorithm and 3D shadow modeling (3D-SM) is proposed for the automatic identification of corrosion cracks in pipelines. Since a corrosion crack lies below the surrounding surface, it casts a shadow when it is exposed to a light source. In this study, we analyze the shadow areas of cracks through 3D-SM and identify the evolving cracks through shape analysis of the shadows. To denoise the 3D images, connected domain analysis is implemented so that the shadow groups of the evolving cracks are retained and the scattered shadow groups caused by insignificant defects are eliminated. Moreover, a novel deep neural network is developed to process the 3D images. The proposed automatic crack identification method processes the 3D images efficiently and diagnoses the corrosion cracks accurately. Experimental results show that the proposed method achieves satisfactory performance with 93.53% accuracy and a 92.04% regression rate.


Introduction
Steel pipelines play an important role in gas and liquid transportation over long distances. However, due to the harsh environment of the construction locations, pipelines frequently suffer from structural damage caused by corrosion, cracks, etc. Pipeline corrosion has been recognized for decades as a major source of pipeline deterioration in transmission lines [1]. In common engineering practice, corrosion crack identification relies heavily on manual detection and subjective decisions. In recent years, nondestructive evaluation (NDE) approaches have been proposed for corrosion crack identification in steel pipelines. NDE methods detect corrosion cracks of steel pipelines through different mediums, such as X-ray, gamma-ray radiography, ultrasonic, thermography, eddy current [2,3], fiber optic distributed [4,5] and electrical capacitance sensors [6][7][8][9][10][11][12]. However, because different NDE methods need different heat or wave sources and complicated data analyzers, such applications can encounter various difficulties in terms of operational and monitoring requirements. Moreover, it may not be possible to achieve thorough monitoring of the wide fields [13].
Automatic detection technology for corrosion cracks has attracted increasing attention in practical projects. Early image processing algorithms based on binary image processing (BIP) technology were studied by pioneering researchers to detect the corrosion and cracks of steel pipelines. BIP extracts crack information by determining the optimal threshold, but because of the noise in steel pipeline images, BIP is still unable to identify cracks accurately [14]. Based on the strong edge features of cracks, some researchers have carried out research on crack identification algorithms based on edge features [15], studied image segmentation methods based on fractal features [16], and considered the comprehensive detection of the crack area and boundary information. However, for automatic identification of pipeline cracks, many algorithms suffer from low accuracy and low speed, so real-time automatic processing of pipeline images is difficult to achieve.
For mechanical and civil structures, real-time structural health monitoring is difficult when the types of structures are complicated or the measured signals are corrupted by environmental noise. Thus, model-based structural damage identification approaches are not effective. Deep learning can satisfactorily address this problem owing to its superior adaptive learning from datasets [17][18][19][20][21]. Krizhevsky et al. [22] employed a large and deep convolutional neural network (CNN) to classify 1.2 million high-resolution images. The derivation and application of CNNs were presented by Bouvrie [23]. Real-time vibration-based damage detection and localization methods were proposed using CNNs [24][25][26][27][28][29]. Wavelets were used to indicate damage occurrence in structures under seismic load excitation by monitoring the mutations in the wavelet details of responses [30][31][32]. A new signal-processing algorithm, inspired by the deep learning model, was formed by combining wavelets, neural networks and the Hilbert transform [33][34][35][36][37]. A Bayesian network was studied to evaluate the reliability of specific components according to their serviceability and inter-component correlations [38][39][40][41][42], and a hybrid response surface method was used [43].
Owing to their flourishing development, deep neural networks have shown powerful capabilities to establish surrogate models with high accuracy and stability for crack detection. Taking advantage of deep learning, an automatic crack identification method is proposed. The proposed method integrates the 3D-SM and a convolutional deep neural network to extract the features of the 3D-SM images and identify the corrosion cracks. The proposed method provides a promising way to detect pipeline corrosion cracks regardless of the disturbing backgrounds and conditions of the 3D images of the pipeline's outer surface.

Methodology
The proposed method integrates the 3D-SM with a CNN for automatic crack identification in pipelines. Figure 1 shows a flowchart of the proposed method. To ensure the quality of the images of the pipeline's outer surface, each pixel of the images is covered under the maximum intensity of light beam projection. Once the maximum intensity of all pixels is collected, the collected data are digitalized by converting the images to binary maps (shadow maps). In this digitalization, we compare the light intensity at each pixel of the pipeline surface with the maximum light intensity and label the pixels accordingly. If the maximum light intensity exceeds the pipeline surface intensity, the pixel is marked as a shaded area and labeled with the binary number "0". Otherwise, the pixel is marked as a non-shaded area and labeled with the binary number "1". After the labeling, the shadow maps are fed to a CNN as labeled data. First, we train the network and calculate its input and output weights. Then, we compute the derivative of the error with respect to the weights by the backpropagation (BP) algorithm. In the CNN, the convolution layer is used for up-sampling while the sub-sampling layer is used for down-sampling. The following sections present more in-depth discussions of the proposed work.
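The labeling rule described above can be illustrated with a short sketch. It is only a minimal example, assuming the surface intensities are available as a nested list; the function name `to_shadow_map` and the optional `tolerance` parameter are hypothetical and are not part of the described method.

```python
def to_shadow_map(intensity, max_intensity, tolerance=0.0):
    """Label each pixel: 0 = shaded, 1 = non-shaded.

    A pixel is shaded when the maximum light intensity exceeds the
    intensity measured at that pixel. `tolerance` is a hypothetical
    margin (default 0.0 reproduces the rule as stated in the text).
    """
    return [
        [0 if max_intensity - value > tolerance else 1 for value in row]
        for row in intensity
    ]


if __name__ == "__main__":
    surface = [[10, 7], [10, 3]]
    for row in to_shadow_map(surface, max_intensity=10):
        print(row)
```

With the default tolerance, every pixel darker than the maximum is labeled "0"; a small positive tolerance would be one way to absorb sensor noise, if that were needed.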

Three-Dimensional (3D) Shadow Modeling
The proposed method exploits the shadows that cracks cast because of the height difference between the cracks and the pipeline surface. Cracks are identified by analyzing the shapes of these shadows. The method is developed under the following assumptions:
1. The projected light source (in this work, high-intensity light torches) is infinite and the projected light beams are parallel;
2. The diffusion, reflection and refraction of light are negligible.
According to these assumptions, both corrosion slopes and corrosion cracks can generate shadow areas under the projection light source, as shown in Figure 2. Bidirectional light projection, however, provides an effective way to capture the shadows caused by corrosion cracks while preventing interference from corrosion slopes. Figure 3 shows schematic plots of the induced shadow area under bidirectional projection. Since cracks consist of continuous partial grooves, we utilize bidirectional light projection to prepare the pipeline images for crack identification.
Under bidirectional projection light, the size of the shadow area depends on the projection direction. Therefore, the projection angle (θ) is an important parameter. When the angle is 90°, no shadow areas are generated, whereas when the angle is 0°, infinitely long shadow areas are generated. By changing the projection angle from 0° to 90°, the 3D-SM can simulate a variety of shadow settings for the cracks on the pipeline's surface. Moreover, if the selected projection angle is able to generate the shadow areas of a small crack, it can also generate the shadow areas of other cracks. The main advantage of the 3D-SM simulation is that it can identify various cracks, even micro cracks, by adjusting a single control variable (the projection angle). Figure 4 presents the approach used to define the projection angle (θ) using the projection of two light beams.
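The dependence of the shadow size on θ can be made concrete with a small sketch. It assumes parallel light (the first assumption above), an idealized vertical crack wall of known depth, and θ measured from the pipeline surface; the function name `shadow_length` and the simple depth / tan(θ) relation are illustrative assumptions rather than a formula given in this paper.

```python
import math


def shadow_length(depth_mm, theta_deg):
    """Length of the shadow cast by a vertical wall of the given depth.

    Assumes parallel light arriving at angle theta above the surface.
    theta = 90 degrees gives no shadow; theta = 0 gives an unbounded one.
    """
    if not 0.0 <= theta_deg <= 90.0:
        raise ValueError("theta_deg must lie between 0 and 90 degrees")
    if theta_deg == 0.0:
        return math.inf
    if theta_deg == 90.0:
        return 0.0
    return depth_mm / math.tan(math.radians(theta_deg))


if __name__ == "__main__":
    for theta in (15, 38, 45, 76, 90):
        print(theta, round(shadow_length(1.0, theta), 3))
```

Under these assumptions a shallower crack casts a shorter shadow at the same angle, which agrees with the remark that an angle small enough to reveal a small crack also reveals deeper ones.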
The optimal bidirectional projection direction should be perpendicular to the plane of the crack. However, actual cracks run in various directions, so the optimal bidirectional projection direction is not always the same. To reduce the calculation time, we use two representative pairs of bidirectional projection directions for the 3D-SM simulation, namely lateral projection (x-direction) and longitudinal projection (y-direction). The geometry settings of these two projections are shown in Figure 5. The lateral projection is used to identify longitudinal cracks with a crack angle between approximately 45° and 90°, and the longitudinal projection is used to identify transverse cracks with a crack angle between approximately 0° and 45°.
In the next step, the original 3D image is converted into a binary map that reflects the shadow pattern, where "0" indicates the shaded area and "1" indicates the non-shaded area. A typical binary map is shown in Figure 6.
Let S denote the light projection, P denote the 3D shadow pixel on which the light projection S falls, and L denote the unit vector indicating the direction of light. Sample points are taken along the light projection, with S_k denoting the kth sample point along the light projection S. The resolution of the 3D shadow pixel depends on the sampling interval ΔS (mm). A smaller sampling interval provides a higher resolution but consumes more computational resources. In this paper, we choose ΔS = 0.5 so that the resolution of the 3D shadow pixel is sufficient to consider all pixels in the light projection.
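The sampling procedure can be sketched as a simple ray march. The sketch assumes that the sample points follow S_k = P + k·ΔS·L, which is a natural reading of the definitions above but is an assumption here, and it works on a one-dimensional height profile for brevity. The function name `is_shadowed`, the `pixel_mm` spacing and the nearest-pixel lookup are hypothetical choices.

```python
import math


def is_shadowed(profile, x_index, theta_deg, direction=1,
                pixel_mm=1.0, delta_s=0.5):
    """Return True if the pixel at x_index is shaded.

    profile   : surface heights (mm) along the projection axis
    direction : +1 if the light source lies toward larger x, -1 otherwise
    delta_s   : sampling interval along the ray (mm)
    Sample points are assumed to be S_k = P + k * delta_s * L.
    """
    theta = math.radians(theta_deg)
    lx = direction * math.cos(theta)   # horizontal component of unit vector L
    lz = math.sin(theta)               # vertical component of unit vector L
    px, pz = x_index * pixel_mm, profile[x_index]
    top = max(profile)
    k = 1
    while True:
        sx = px + k * delta_s * lx
        sz = pz + k * delta_s * lz
        idx = int(round(sx / pixel_mm))
        # The ray has left the profile or risen above every surface point.
        if idx < 0 or idx >= len(profile) or sz > top:
            return False
        # The surface at this sample is higher than the ray: light is blocked.
        if profile[idx] > sz:
            return True
        k += 1


if __name__ == "__main__":
    groove = [0, 0, -2, -2, -2, 0, 0]   # a 2 mm deep, 3 mm wide groove
    for side in (+1, -1):
        print(side, [int(not is_shadowed(groove, i, 45, side))
                     for i in range(len(groove))])
```

The printed rows use the paper's labeling ("0" shaded, "1" non-shaded) for the two opposite light directions of one bidirectional pair.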
To obtain the final integrated shadow map, the shadow maps under lateral and longitudinal projection are combined: the binary values of the point (i, j) in the horizontal shadow map and in the vertical shadow map together determine the binary value of the point (i, j) in the integrated shadow map I_c(i, j).
The integrated shadow map is corrupted by noise. Therefore, the next step of the 3D-SM is noise elimination using connected domain analysis [44]. Connected domain analysis makes use of the continuity and geometric properties of corrosion cracks. In general, a crack of concern has a continuous linear shape, and its length is much greater than its width. Hence, interconnected shadow points in the integrated shadow map are classified into groups for crack inspection, and scattered shadow points are considered noise and eliminated. After the noise elimination, cracks are extracted from the shadow groups based on crack characteristics. The corrosion cracks on the pipeline's surface with which we are presently concerned have a slender linear shape, and their height is much lower than that of the pipeline's surface. Moreover, their width is relatively uniform within a small range. According to these crack characteristics, the linear pattern analysis technique is used to filter shadow groups. If a shadow group fails to fulfill any one of the following three criteria, it is considered noise and eliminated.
First, to ensure that the shadow group has a slender linear pattern, a shape aspect ratio is defined. Next, the uniformity of the width is checked through the average profile width difference rate, where W_{i,j} is the width of the jth section of shadow group i; N_i is the total number of profiles of shadow group i; and R_P is the threshold value of the average profile width difference rate, R_P = 1.0. Finally, we must check the angle between the current direction and the next direction in each section of the shadow group. Each direction is described by a vector from the center of the previous section to the center of the current section of the shadow group. We use these vectors to calculate the average trend difference value of the shadow group by Equations (4)-(6).
In these equations, P^t_{i,j} is the center point of the jth section of shadow group i; the vectors V_{i,j} give the current direction and the next direction of the jth section of shadow group i; N_i is the total number of profiles of shadow group i; and R_t is the average direction difference threshold, R_t = π/12.
To ensure that an extracted crack has a slender linear pattern, we use the thresholds given in Equations (3), (4) and (7) to control each quantified index. When a shadow group does not fulfill the above criteria, it is considered a noise group and eliminated. In this study, shadow groups that violate the threshold values are classified as noise. Figure 8 demonstrates an example of cleaning noise shadow groups. Several shadow groups were considered noise groups because they did not fulfill the criteria. The resultant shadow map shown in Figure 8c filters out the unnecessary scattered shadow groups and depicts a clearer crack pattern.
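The grouping and direction checks described in this section can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the integration rule (a pixel is shaded if either projection shades it), the 8-connectivity used for grouping, and all function names are assumptions. Only the threshold R_t = π/12 is taken from the text.

```python
import math
from collections import deque


def integrate(lateral, longitudinal):
    """Combine two shadow maps (assumed rule: shaded in either -> shaded)."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(lateral, longitudinal)]


def shadow_groups(shadow_map):
    """Collect 8-connected groups of shaded pixels (value 0) by BFS."""
    rows, cols = len(shadow_map), len(shadow_map[0])
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if shadow_map[r][c] != 0 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, group = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and shadow_map[ny][nx] == 0):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            groups.append(group)
    return groups


def mean_direction_change(centers):
    """Average angle (rad) between consecutive section-to-section vectors."""
    vectors = [(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(centers, centers[1:])]
    angles = [abs(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
              for (ax, ay), (bx, by) in zip(vectors, vectors[1:])]
    return sum(angles) / len(angles) if angles else 0.0


def has_linear_trend(centers, r_t=math.pi / 12):
    """Direction criterion: keep groups whose mean turn is at most R_t."""
    return mean_direction_change(centers) <= r_t
```

A group whose section centers lie on a straight line has a mean direction change of zero and passes the check, whereas a zigzag group turns by about π/2 at every section and is rejected as noise.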
Appl. Sci. 2021, 11

Convolutional Neural Network (CNN)
In this section, the detailed design of a CNN for corrosion crack identification of pipelines using the 3D-SM images is introduced. A CNN consists of alternating convolution and sub-sampling operations, with a general multi-layer fully connected network as the last layer. Compared with other types of neural network models, the sub-sampling layers and convolutional layers improve the configural and spatial invariance of the neural network and increase its computational efficiency.
In a CNN, the convolution layer is used to extract the representative local features of the previous layer, and the sub-sampling layer is used to reduce the complexity of the network by combining similar features. One of the most useful features of deep learning is its ability to use the outputs of intermediate layers as another group of data. These data can be considered as the features learned through network adaptive learning. The learned features can subsequently be used for similarity comparison. The parameters of the CNN can be learned effectively using the training data.

Convolution Layers and Sub-Sampling Layers
A convolution converts the pixels of the input image in its receptive field into a single value. A convolution layer of a CNN applies a convolution operation to the input and transfers the result to the next layer. The convolutional operation combines the multiple input feature maps with convolutional kernels and outputs the new feature maps before activation [19]. In the proposed method, this process is defined in terms of the following quantities: * denotes the two-dimensional (2D) convolution operation; x_i^{l−1} is the ith output map in layer l−1; k_{ij}^l is the kernel weight; b_j^l is the bias; M_j is a set of input feature maps; and f(·) is a nonlinear function that is applied component-wise.
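The convolution step can be sketched in plain Python. This is a minimal illustration for a single output map: it sums the responses of every input map to its own kernel and adds a bias, which is the quantity passed to the nonlinear function f(·). The function name `conv2d_multichannel`, the kernel size and the stride are assumptions for illustration, not settings reported in this paper. Like most CNN implementations, the sketch does not flip the kernel.

```python
def conv2d_multichannel(image, kernel, bias=0.0, stride=1):
    """Valid convolution of a multi-channel input with one kernel stack.

    image  : image[c][y][x], one 2D map per input channel
    kernel : kernel[c][ky][kx], one square kernel per input channel
    Returns the pre-activation output map as a nested list.
    """
    channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    size = len(kernel[0])
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    output = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            total = bias
            for c in range(channels):
                for ky in range(size):
                    for kx in range(size):
                        total += (image[c][oy * stride + ky][ox * stride + kx]
                                  * kernel[c][ky][kx])
            row.append(total)
        output.append(row)
    return output


if __name__ == "__main__":
    # A 7 x 7 input with three channels and a hypothetical 3 x 3 kernel.
    image = [[[1.0] * 7 for _ in range(7)] for _ in range(3)]
    kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
    result = conv2d_multichannel(image, kernel)
    print(len(result), len(result[0]), result[0][0])
```

With a 7 × 7 × 3 input, a 3 × 3 kernel and stride 1, the output map is 5 × 5; a stride of 2 would give 3 × 3.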
To train the CNN model using the BP algorithm, we need to obtain the gradients of each layer. In the CNN model, a convolution layer l is followed by a down-sampling layer l + 1. For sensitivity evaluation, the nodes of interest in the current layer l are connected to the unit nodes of the next layer l + 1, and these connections are associated with the weights defined at layer l + 1. For example, a down-sampling layer map has weights β, and the previous result is scaled by β to complete the calculation for layer l. Each map j repeats the same computation in the convolutional layer as well as in the sub-sampling layer. In the definition of the sub-sampling layer, • denotes the pointwise product of vectors; u_j^l is the layer's gradient; f(·) is a typical convolutional transformation; and up(·) is an up-sampling operation that tiles each pixel of the input horizontally and vertically n times in the output if the sub-sampling layer subsamples by a factor of n.
Moreover, the gradients with respect to the bias b_j and the kernel weight k_{ij}^l are computed from (p_i^{l−1})_{uv}, the patch in x_i^{l−1} that is multiplied elementwise by k_{ij}^l to calculate the element at (u, v) in the output map x_j^l of the convolutional layer. For a sub-sampling layer, the down-sampled version of the input maps is obtained with down(·), a sub-sampling operation. As in the convolution layers, the weights and bias parameters of the sub-sampling layers are updated using gradients, which are taken with respect to the bias b_j and the kernel weight k_{ij}^l.
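The two resampling operators mentioned above can be sketched directly. The `up` function follows the tiling description in the text; the `down` function assumes block averaging, which is only one possible choice of sub-sampling operation and is an assumption here.

```python
def up(feature_map, n):
    """Tile each pixel n times horizontally and n times vertically."""
    output = []
    for row in feature_map:
        tiled = [value for value in row for _ in range(n)]
        output.extend([tiled[:] for _ in range(n)])
    return output


def down(feature_map, n):
    """Sub-sample by a factor of n using block averages (assumed rule)."""
    rows = len(feature_map) // n
    cols = len(feature_map[0]) // n
    return [
        [sum(feature_map[r * n + dy][c * n + dx]
             for dy in range(n) for dx in range(n)) / (n * n)
         for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    small = [[1, 2], [3, 4]]
    large = up(small, 2)
    for row in large:
        print(row)
    print(down(large, 2))
```

Because `up` repeats each value over an n × n block, applying `down` with the same factor recovers the original map, which is why the two operators pair naturally when sensitivities are passed between a convolution layer and the sub-sampling layer that follows it.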

Learning Combinations of Feature Maps
Let α_{ij} be the weight given to input map i when forming output map j. The output map j is given as a weighted combination of the input maps, subject to constraints on the weights α_{ij}. These constraints are enforced through a softmax operation over a set of implicit weights c_i. The derivative of the softmax operation and the error derivatives with respect to α_i and c_i are then obtained. By applying the chain rule, we can calculate the network error gradient with respect to the implicit weights c_i.
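The softmax parameterization and its chain rule can be sketched as follows for a single output map. The sketch assumes the standard softmax, α_i = exp(c_i) / Σ_k exp(c_k), whose derivative is ∂α_k/∂c_i = δ_{ki} α_i − α_i α_k; the function names are hypothetical and the error derivatives with respect to α are supplied as plain numbers.

```python
import math


def softmax(c):
    """Map implicit weights c to weights alpha that are positive and sum to 1."""
    shift = max(c)                       # subtract the max for numerical stability
    exps = [math.exp(value - shift) for value in c]
    total = sum(exps)
    return [value / total for value in exps]


def softmax_jacobian(alpha):
    """jac[k][i] = d alpha_k / d c_i = delta_ki * alpha_i - alpha_i * alpha_k."""
    n = len(alpha)
    return [[alpha[i] * ((1.0 if i == k else 0.0) - alpha[k])
             for i in range(n)] for k in range(n)]


def grad_wrt_c(alpha, grad_alpha):
    """Chain rule: dE/dc_i = alpha_i * (dE/dalpha_i - sum_k dE/dalpha_k * alpha_k)."""
    weighted = sum(g * a for g, a in zip(grad_alpha, alpha))
    return [a * (g - weighted) for a, g in zip(alpha, grad_alpha)]


if __name__ == "__main__":
    alpha = softmax([0.0, 0.0])
    print(alpha, grad_wrt_c(alpha, [1.0, 0.0]))
```

Working with the unconstrained c_i lets an ordinary gradient step be used while the weights α_i automatically remain between 0 and 1 and sum to one.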

Enforcing Sparse Combinations
Assume that the sparseness constraint on the weight distribution α_i is imposed by adding a regularization penalty term Ω(α_i) to the final error function.
Figure 9 shows an example that demonstrates the convolution process. Herein, the input data form a 7 × 7 × 3 array, that is, 7 × 7 (width × height) pixels with three color channels (blue, red and green).
Figure 10 demonstrates the process of max pooling, which is defined as taking the maximum value within a specific data window. Another pooling method in the CNN model, known as average pooling, takes the average value instead of the maximum value. In this paper, we applied max pooling.
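The two pooling variants can be compared with a short sketch. It assumes non-overlapping square windows; the function name `pool2d` and the 2 × 2 window are illustrative choices, since the window size is not stated here.

```python
def pool2d(feature_map, window=2, mode="max"):
    """Pool non-overlapping window x window blocks of a 2D feature map.

    mode="max" keeps the largest value of each block (max pooling);
    any other mode returns the block average (average pooling).
    """
    output = []
    for y in range(0, len(feature_map) - window + 1, window):
        row = []
        for x in range(0, len(feature_map[0]) - window + 1, window):
            values = [feature_map[y + dy][x + dx]
                      for dy in range(window) for dx in range(window)]
            row.append(max(values) if mode == "max"
                       else sum(values) / len(values))
        output.append(row)
    return output


if __name__ == "__main__":
    fmap = [[1, 3, 2, 4],
            [5, 6, 7, 8],
            [3, 2, 1, 0],
            [1, 2, 3, 4]]
    print(pool2d(fmap, mode="max"))
    print(pool2d(fmap, mode="average"))
```

Max pooling keeps the strongest response in each window, which tends to preserve thin, high-contrast structures such as crack shadows, whereas average pooling smooths them.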

Results and Discussions
The proposed 3D-image-based automatic corrosion crack identification method utilizes a 3D light and shadow model to convert the complex 3D crack images into binary shadow maps. Then, the proposed method denoises the image and extracts cracks from the shadow map. The difference between traditional 2D grayscale images and 3D crack images is that many pipelines that suffer damage, such as cracking and corrosion, have distinctive characteristics in the third dimension, all of which are irretrievable in 2D images. By feeding the dataset of 3D shadow maps to a trained CNN, the shaded areas can be automatically marked in red.

3D Crack Images Database
A total of 900 raw pipeline crack images with sizes of 256 × 256 pixels were used. Among the 900 raw images, 700 randomly chosen raw images were used as the training dataset and the remaining 200 raw images were used as the testing dataset. As shown in Section 3, the projection angle (horizontal and vertical projection angle) is the only parameter that controls the proposed method. Through repeated tests, we determined that the optimal horizontal and vertical projection angles are 38° and 76°, respectively. Figure 11 shows some representative raw training images and the associated shadow maps obtained by the proposed 3D-SM. Some of the raw images had disturbing backgrounds and conditions. Therefore, cleaning the interference noise is challenging and essential for the identification of cracks.
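The random 700/200 split can be reproduced with a few lines. This is only a sketch: the function name `split_dataset`, the integer image identifiers and the fixed `seed` are assumptions, since the way the images were drawn is not described beyond being random.

```python
import random


def split_dataset(image_ids, n_train=700, seed=0):
    """Shuffle the identifiers and split them into training and testing sets."""
    rng = random.Random(seed)       # fixed seed (assumption) for repeatability
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    train_ids, test_ids = split_dataset(range(900))
    print(len(train_ids), len(test_ids))
```

Fixing the seed makes the split repeatable, so that different projection angles can be compared on the same training and testing images.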

Pipeline Corrosion Cracks Identification Using CNN
In the above sections, 3D-SM was studied to identify cracks in pipelines suffering from corrosion. However, because the 3D-SM is affected by considerable noise, is inherently complex during the analysis process, and often ignores some cracks during denoising, a more efficient way of extracting the features of the 3D-SM distribution is employed for the identification of cracks. The schematic diagram of the CNN configurations of the setup, training and testing model is shown in Figure 12. The CNN setup, training and testing model functions are presented in Table 1.

Table 1. CNN setup, training and testing model functions.

cnnsetup
The cnnsetup function is used to set up the feature maps. The kernel window is composed of kernelsize × kernelsize elements, and each element is an independent weight.

cnntrain
The cnntrain function is used to reorder the samples randomly, and to train the network and calculate the weights between the network input and output. Then, the derivative of the error with respect to the weights is calculated by the back propagation (BP) algorithm.

cnnff
The cnnff function uses the neural network to produce a prediction for the input vector. The samples are first reordered and then randomly trained.

cnnbp
In cnnbp, up-sampling is applied, whereas the sub-sampling layer is used for down-sampling.

cnnapplygrads
The cnnapplygrads function is used to update the weights depending on the training, testing, labeled training, and labeled testing datasets.
In this work, the CNN consists of two convolutional layers, two sub-sampling layers, and a fully connected layer. Detailed settings of each layer, including the kernel size, number of tuned parameters and number of connections, are given in this section. The architecture of the convolutional layer and sub-sampling layer is depicted in Figure 13. Herein, the xth layer with a convolutional operation is denoted as $C_x$ and the xth layer with a sub-sampling operation is denoted as $S_x$.
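To make the pairing of a convolutional layer and a sub-sampling layer concrete, the following sketch applies one "valid" sliding-window filter followed by one down-sampling step to a toy input. It is an illustration under stated assumptions, not the network used in this work: the function names are hypothetical, the filter is applied without flipping the kernel, sub-sampling is taken to be a plain mean over non-overlapping blocks, and the trainable coefficient and bias of the sub-sampling layer are omitted.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide the kernel over the image ('valid' region only, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for u in range(out_h):
        row = []
        for v in range(out_w):
            total = bias
            for a in range(kh):
                for b in range(kw):
                    total += image[u + a][v + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out


def subsample_mean(fmap, n):
    """Down-sample by averaging non-overlapping n x n blocks (assumed pooling)."""
    out_h, out_w = len(fmap) // n, len(fmap[0]) // n
    return [
        [
            sum(fmap[u * n + a][v * n + b] for a in range(n) for b in range(n))
            / (n * n)
            for v in range(out_w)
        ]
        for u in range(out_h)
    ]


# Toy 6 x 6 input and 2 x 2 kernel, chosen only for illustration.
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
kernel = [[1.0, 0.0], [0.0, 1.0]]
c1 = conv2d_valid(image, kernel)  # 5 x 5 feature map
s2 = subsample_mean(c1, 2)        # 2 x 2 map; the last row and column are dropped
print(len(c1), len(c1[0]), len(s2), len(s2[0]))
```

With a k × k kernel, the valid output shrinks by k − 1 pixels per side, and sub-sampling by a factor of n divides each side by n.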

Figure 13. Sub-sampling and convolutional layer connection.
As shown in Table 2, C1 is the first convolutional layer composed of six feature maps. By doing the convolutional operation, the features of the original signal can be enhanced and the effects of the interference noise can be reduced. Each neuron of the feature map is connected with a 12 × 12 pixel neighborhood of the input images. The feature map size is 168 × 168 pixels. C1 has 580 tuned parameters (there are four filters in total and each filter has 12 × 12 unit parameter weights and a bias parameter, totaling (12 × 12 + 1) × 4 = 580 tuned parameters). One kernel is used between the input and C1, so there are 580 × (168 × 168) = 16,369,920 connections in total. S2 is a sub-sampling layer for compressing the image while retaining useful information. Two 72 × 72 feature maps are used. Each unit of the feature map is connected to a 6 × 6 neighborhood of C1. S2 has (1 + 1) × 2 = 4 tuned parameters and (6 × 6 + 1) × 2 × (72 × 72) = 383,616 connections. The next two layers, C3 and S4, have similar architectures. The kernel sizes of C3 and S4 are 12 × 12 and 6 × 6, respectively, and the output layer is composed of Euclidean radial basis function units. Images are fed into the network as the input, and the output can be obtained by $O_p = F_n(\cdots(F_2(F_1(X_p W^{(1)})W^{(2)})\cdots)W^{(n)})$, where $X_p$ is the input matrix; $W^{(i)}$ is a matrix of shape $l_x \times l_{x-1}$; and $F_i(\cdot)$ is the activation function at layer i. The error between the output $O_p$ and the desired output $Y_p$ can be calculated by the BP algorithm.
Figure 14 shows some of the crack detection results obtained from the proposed method. In particular, the subplots on the upper row show the input raw images, the subplots on the middle row show the shadow maps obtained from the 3D-SM, and the bottom row shows the automatic crack identification results obtained from the CNN. It is found that the shadow maps successfully depicted the representative features of the corrosion cracks and filtered the insignificant shadow groups. Moreover, the trained CNN provided satisfactory performance in detecting the locations and evolution of the corrosion cracks.
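The parameter and connection counts quoted for C1 and S2 can be checked with a few lines of arithmetic. The sketch below only restates the expressions given in the text; the function names are hypothetical and introduced here for illustration.

```python
def conv_layer_params(kernel_size, n_filters):
    """(k x k weights + 1 bias) per filter."""
    return (kernel_size * kernel_size + 1) * n_filters


def conv_layer_connections(n_params, map_size):
    """Every tuned parameter is used once per output position."""
    return n_params * map_size * map_size


def subsample_params(n_maps):
    """One coefficient and one bias per sub-sampled map."""
    return (1 + 1) * n_maps


def subsample_connections(window, n_maps, map_size):
    """(w x w inputs + 1 bias) per unit, for every unit of every map."""
    return (window * window + 1) * n_maps * map_size * map_size


c1_params = conv_layer_params(kernel_size=12, n_filters=4)
c1_connections = conv_layer_connections(c1_params, map_size=168)
s2_params = subsample_params(n_maps=2)
s2_connections = subsample_connections(window=6, n_maps=2, map_size=72)
print(c1_params, c1_connections, s2_params, s2_connections)
# 580 16369920 4 383616
```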


Evaluation of Accuracy and Reliability of the Present Algorithm
To quantify the performance of the proposed crack identification method, the accuracy rate, regression rate, and F-score were calculated throughout the training process. The accuracy rate is defined as the percentage of correctly classified pixels among all detected pixels in the 3D shadow maps; the regression rate is defined as the percentage of correctly classified cracked pixels out of all actual cracked pixels; and the F-score is defined as a measure of the training accuracy [45,46].
To verify the effectiveness and accuracy of the resultant CNN in pipeline corrosion crack identification, 200 testing images were employed and 81 selected crack samples were utilized for computing the accuracy and regression rate. Figure 15 and Table 3 show the crack identification results for the 81 crack samples. The testing samples are divided into 10 groups of 3D shadow maps, denoted as $G_1, G_2, \ldots, G_{10}$. Four indexes, namely the true-positive rate (TPR), true-negative rate (TNR), false-positive rate (FPR), and false-negative rate (FNR), are computed. They are defined as follows:
True-positive rate (TPR): The element belongs to the target class and is correctly identified and marked as the target class;
True-negative rate (TNR): The element does not belong to the target class and is correctly identified and marked as a non-target class;
False-positive rate (FPR): The element does not belong to the target class but is incorrectly identified and marked as the target class;
False-negative rate (FNR): The element belongs to the target class but is incorrectly identified and marked as a non-target class.
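The four categories can be illustrated by counting elements in a predicted binary crack mask against a reference mask. This is a minimal sketch under assumptions: the masks, the function name `confusion_counts`, and the convention that 1 marks a crack pixel (the target class) are all hypothetical and chosen for illustration.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives over two binary masks."""
    tp = tn = fp = fn = 0
    for pred_row, act_row in zip(predicted, actual):
        for pred, act in zip(pred_row, act_row):
            if pred and act:
                tp += 1  # crack pixel marked as crack
            elif not pred and not act:
                tn += 1  # background pixel marked as background
            elif pred and not act:
                fp += 1  # background pixel marked as crack
            else:
                fn += 1  # crack pixel marked as background
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Toy 3 x 4 masks: 1 = crack (target class), 0 = background.
predicted = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
actual = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
counts = confusion_counts(predicted, actual)
print(counts)
# {'TP': 3, 'TN': 7, 'FP': 1, 'FN': 1}
```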

Based on the definitions of TPR, TNR, FPR, and FNR, the accuracy rate (P), regression rate (R) and F-score (F) can be determined:
$$P = \frac{N_{TP}}{N_{TP} + N_{FP}}, \qquad R = \frac{N_{TP}}{N_{TP} + N_{FN}}, \qquad F = \frac{2PR}{P + R},$$
where $N_{TP}$, $N_{FP}$ and $N_{FN}$ are the number of true-positive, false-positive, and false-negative elements, respectively. Figure 15 shows the resultant accuracy, regression rate and F-score versus the number of epochs (an epoch refers to one cycle through the full training dataset). The results of the ten groups of 3D shadow maps (i.e., $G_1, G_2, \ldots, G_{10}$) are indicated with solid lines with different markers, while the overall performance of all 80 selected shadow maps is indicated with dotted lines. From the plots, it can be concluded that $G_3$ has the highest accuracy and regression rate, meaning that it has the highest detection accuracy for cracks, whereas $G_5$ has the lowest accuracy and regression rate; the accuracy and regression rates of the other groups lie between these values. In general, the differences in the performance curves between the ten groups of 3D shadow maps are due to the differences in noise in each group. Nevertheless, the overall performance of the proposed method is satisfactory. The accuracy, regression rate and F-score of the overall performance are 93.53%, 92.04% and 91.18%, respectively. The results confirm that the proposed method can automatically identify corrosion cracks on pipelines with satisfactory performance regardless of the image backgrounds and conditions.
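The three measures can be computed directly from the element counts. The sketch below follows the verbal definitions given in this section for the accuracy rate and the regression rate, and assumes that the F-score is the usual harmonic mean of the two. The function names and the example counts are hypothetical and are not results from this study.

```python
def accuracy_rate(n_tp, n_fp):
    """Correctly detected crack elements out of all detected elements."""
    detected = n_tp + n_fp
    return n_tp / detected if detected else 0.0


def regression_rate(n_tp, n_fn):
    """Correctly detected crack elements out of all actual crack elements."""
    actual = n_tp + n_fn
    return n_tp / actual if actual else 0.0


def f_score(p, r):
    """Harmonic mean of P and R (assumed form of the F-score)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only.
p = accuracy_rate(n_tp=90, n_fp=10)    # 0.9
r = regression_rate(n_tp=90, n_fn=30)  # 0.75
f = f_score(p, r)
print(round(p, 4), round(r, 4), round(f, 4))
```

The guards against a zero denominator matter for shadow maps in which no element is detected or no crack is present.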
In engineering applications, most studies focus on using artificial neural networks to process images taken from structures such as pipelines, bridges, and tunnels. However, even when deep learning models such as CNNs are used, this approach usually does not generalize well enough to reflect the actual structural health status. By combining the 3D-SM with a CNN, the information in the 3D shadow maps can be extracted and then fed into the CNN for training and testing. This combination can be a useful tool for solving practical crack identification problems of complex structures in mechanical and civil engineering.

Conclusions
In this paper, we proposed an automatic crack identification method to detect corrosion cracks of pipelines using 3D images. The proposed method integrates 3D shadow modeling (3D-SM) and deep learning theory for reliable and efficient analysis. Since the depth of a corrosion crack is below its surrounding areas, a shadow of the corrosion crack can be projected under light sources. Consequently, 3D shadow images of the pipeline's outer surface provide information on the corrosion cracks. Three-dimensional shadow maps were built to reflect the actual status of structures. Then, the proposed method analyzed the shape of the shadow to identify the pipeline's cracks. To avoid losing the necessary information on evolving cracks during the denoising process, an efficient means of extracting the features of 3D shadow maps was used. A CNN was developed to process the 3D shadow maps obtained from the 3D-SM. By feeding the feature maps to the CNN, the evolving pipeline cracks were identified regardless of the disturbing backgrounds and conditions of the original images. The results of this pipeline corrosion crack identification process showed that the automatic crack identification method achieved satisfactory performance with high stability, accuracy and regression rate.
Author Contributions: W.A.A., the first author, was responsible for establishing and applying the new methodology introduced in this work, which combined 3D Shadow Modeling with a Deep Learning algorithm for Crack Identification, and carried out the majority of the research project, including the writing of the manuscript. Both the analytical derivations and the statistical methods were closely tested and confirmed by M.N., who suggested the procedures that were utilized. S.-C.K., R.G. and T.W. worked closely with W.A.A. in the reviewing and editing phase. As a leading scholar and pioneer in SHM and fiber optic sensors, Z.W. offered useful guidance and recommendations that contributed greatly to this research project. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: Publicly available corrosion cracking images were analyzed in this study. These images can be found here: https://www.shutterstock.com/search/, accessed on 18 June 2021.

Conflicts of Interest: The authors declare no conflict of interest.

(i : j): The binary value of the integrated shadow map
$B_c(i, j)$: The binary value of the point (i, j) in the horizontal shadow map
$B_c(i, j)$: The binary value of the point (i, j) in the vertical shadow map
$I_c(i, j)$: The integrated shadow map
$K_i$: The aspect ratio of the shadow group
$L_i$: The total length of the shadow group
$W_i$: The average width of the shadow group
$R_k$: The aspect ratio threshold, $R_k = 10$
$P_i$: The average profile width difference rate of shadow group i
$N_i$: The total number of profiles of shadow group i
$W_{i,j}$: The width of the jth section of shadow group i
$R_P$: The threshold value of the average profile width difference rate, $R_P = 1.0$
$V_{i,j}$: The current direction of the third section of shadow group i
$V_{i,j}$: The next direction of the jth section of shadow group i
$P^{t}_{i,j}$: The center point of the section of shadow group i
$T_i$: The center point of shadow group i
$R_t$: The average direction difference threshold, $R_t = \pi/12$
$M_j$: A set of input feature maps
$b$: The bias for each output feature map
$n$: The subsampling layer number
up(·): An operation of up-sampling which tiles each pixel in the input horizontally and vertically n times in the output if the subsampling layer subsamples by a factor of n
$b_j$: The bias
$k^{l}_{ij}$: The kernel weight
The patch in $x^{l-1}_i$ multiplied by $k^{l}_{ij}$ via the element-wise procedure to calculate the element at (u, v) in the output map $x^{l}_j$ in the convolutional layer
$x^{l}_j$: The input maps in the convolutional layer
β and b: The bias parameters
P: The accuracy rate
R: The regression rate
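The up(·) operation listed above, which tiles each pixel n times horizontally and vertically, can be sketched in a few lines. The function name `up` mirrors the symbol; the nested-list representation of a feature map is an assumption made here to keep the example self-contained.

```python
def up(feature_map, n):
    """Tile every pixel n times horizontally and n times vertically."""
    return [
        [value for value in row for _ in range(n)]  # repeat each pixel n times
        for row in feature_map
        for _ in range(n)                           # repeat each row n times
    ]


small = [[1, 2], [3, 4]]
large = up(small, 2)
for row in large:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```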