An Enhanced U-Net Approach for Segmentation of Aeroengine Hollow Turbine Blade

Abstract: The hollow turbine blade plays an important role in the propulsion of the aeroengine. However, due to its complex hollow structure and nickel-based superalloy material, only industrial computed tomography (ICT) can realize its nondestructive detection with sufficient intuitiveness. The ICT detection precision mainly depends on the segmentation accuracy of the target ICT images. However, because the hollow turbine blade is made of special superalloys and contains many small unique structures, such as film cooling holes and exhaust edges, the ICT image quality of hollow turbine blades is often deficient, with artifacts, low contrast, and inhomogeneity scattered around the blade contour, making it hard for traditional mathematical model-based methods to achieve satisfactory segmentation precision. Therefore, this paper presents a deep learning-based approach: an enhanced U-net with multiscale inputs, dense blocks, a focal loss function, and a residual path in the skip connection to realize high-precision segmentation of the hollow turbine blade. The experimental results show that the proposed enhanced U-net achieves better segmentation accuracy on practical turbine blades than the conventional U-net and traditional mathematical model-based methods.


Introduction
As an important part of the aeroengine, the hollow turbine blade is made of special nickel-based superalloys and has a complex internal multicavity structure with closed curved surfaces to adapt to high-temperature, high-pressure working conditions. Its thin-walled structure makes it hard for traditional detection methods, such as three-coordinate measurement and ultrasound, to detect it qualitatively and quantitatively [1]. Computed tomography (CT) is a promising nondestructive detection method that can "see through" the test object. Hence, CT detection has been widely applied in tasks where traditional methods cannot perform accurate detections, such as material science [2], industrial applications [3], medical imaging [4], geosciences [5], civil engineering [6], etc.
However, due to the hollow turbine blade's complex structure and special material properties, the industrial computed tomography (ICT) images contain blurred, low-contrast small structures whose precise contours are hard to extract [7], which greatly affects the ICT detection accuracy, as shown in Figure 1. Typically, there are two ways to solve this problem: (1) improve the ICT image quality and (2) increase the precision of the image segmentation method. The first approach requires better hardware (X-ray source, flat panel detector, etc.), which is expensive and laborious. Therefore, many efforts have been made to increase the segmentation accuracy of the turbine blade. Typical ICT segmentation methods are derived from natural image processing methods, i.e., they are modified to adapt to ICT image characteristics. Popular ICT image segmentation methods include the thresholding method [8], the morphological method [9], edge detection [2], active contours [10], fuzzy methods [11], etc. Among them, thresholding is simple and fast. However, thresholding cannot deal well with images containing artifacts and noise [12]. Morphological methods can deal with noisy images but are easily affected by artifacts [13]. Edge detection methods are fast but can only handle images without noise, inhomogeneity, and artifacts [14]. Active contours can handle noise and inhomogeneity well but are time-consuming [15]. Fuzzy methods are sensitive to noise and low contrast [16].
With the development of computer technology, deep neural networks have made great progress in the fields of machine learning and computer vision [17]. Using a deep neural network to segment an object from the background amounts to training the network for a specific image segmentation task, where both the network architecture and the training method affect the final segmentation accuracy [18]. Common deep learning architectures include convolutional neural networks (CNNs), recurrent neural networks (RNNs), encoder-decoder models, and generative adversarial networks (GANs) [19]. Image segmentation models are basically derived from these network architectures with the help of encoders, decoders, skip connections, dilation modifications, etc. Typical image segmentation methods are fully convolutional networks (FCNs) [20], encoder-decoder approaches [21], multiscale and pyramid network-based models [22], R-CNN approaches [23], dilated convolutional networks [24], RNN-based approaches [25], and attention- [26] and GAN [27]-based models. Among these network structures, the U-net is an efficient segmentation network that is very suitable for images with microscopic details, such as medical images with blood vessels and ICT images with complex microstructures [28]. Compared with conventional encoder-decoder convolutional neural networks, the bypass connections in the U-net can compensate for the high-frequency details lost during pooling [29].
The most widely applied area of the U-net is medical image segmentation [30]. For example, Baltruschat et al. used the U-net to segment bone implants, achieving the overall best mean IoU of 0.906 [31]. Ghosh et al. combined the U-net with the VGG-16 network to segment brain tumors and achieved a pixel accuracy of 0.9975 [32]. Guo et al. used the U-net for breast ultrasound image segmentation and realized an average IoU of 82.7% (±0.02), while Khaled et al. obtained a mean dice similarity coefficient (DSC) of 0.680 (0.802 for main lesions) for dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) with the help of the U-net [33]. Li et al. also applied the U-net in isotropic quantitative differential phase contrast imaging, the results of which show that the quantitative phase value in the ROI is recovered from 66% to 97% compared with the ground truth [34]. Lee et al. proposed the FUS-net, based on the U-net, for interference filtering in high-intensity-focused ultrasound images, which performs 15% better than stacked autoencoders (SAE) on the evaluated test datasets [35]. Moreover, Rocha et al. compared the U-net with the conventional sliding band filter (SBF) approach, and the experimental results indicate that the U-net-based models yield results closer to the ground truth [36].
Because the conventional U-net passes low-resolution content through the skip connections, the high-resolution edge information of the input image may not be sufficiently extracted, reducing the segmentation accuracy of tiny objects. Many efforts have been made to improve the performance of the conventional U-net. For example, Han et al. proposed a dual-frame U-net, adding a residual path after the pooling and before the unpooling operations for sparse-view CT reconstruction, which achieved a better peak signal-to-noise ratio (PSNR) under a ×2 down-sampling factor compared with the conventional U-net [29]. Seo et al. added a residual path with deconvolution and activation operations to the skip connection of the U-net to avoid duplication of low-resolution feature information, which realized DSCs of 98.51% and 89.72% for liver and liver-tumor segmentation, respectively [37]. In addition, Man et al. applied deformable convolution in the conventional U-net for pancreas segmentation, which can capture the geometry-aware information of the head and tail of the pancreas with a mean DSC of 86.93 ± 4.92% [38]. Hiasa et al. modified the U-net by inserting a dropout layer before each max pooling layer and after each up-convolution layer, which realized a dice coefficient (DC) of 0.891 ± 0.016 and an average symmetric surface distance (ASD) of 0.994 ± 0.230 mm for muscle segmentation [39]. Moreover, He et al. proposed a hierarchically fused U-net incorporating contour awareness for prostate segmentation, which achieved an overall DSC accuracy of 0.878 ± 0.029 [40].
In addition to medical imaging, the U-net is also suitable for industrial applications. For example, Wang et al. used the U-net to remove artifacts produced by the projection of static parts in CT reconstruction, which enabled the recovery of rotating-part details during in situ nondestructive testing of airplane engines [41]. Li et al. realized acoustic interference striation (AIS) recovery using the U-net; test results in range-dependent waveguides with nonlinear internal waves demonstrated its effectiveness under different signal-to-noise ratios and different amplitudes and widths [42]. Xiao et al. realized precise ore mask segmentation (IoU = 92.07%) via a U-net combined with a boundary mask fusion block [43].
In summary, the U-net architecture has good segmentation performance for grayscale images compared to conventional single-resolution neural networks. However, the ICT images of the hollow turbine blade contain severe inhomogeneity and cone beam artifacts [7], which hinder accurate U-net segmentation of the turbine blade edge. At the same time, the turbine blades contain microstructures, such as film cooling holes and exhaust edges, that are not easy for the U-net to detect. Therefore, this paper presents an enhanced U-net with multiscale input and structural modifications to improve the segmentation performance of the conventional U-net. The contribution of this paper is a deep learning approach for the segmentation of the hollow turbine blade, which is a relatively novel approach in the area of aeroengine engineering. Sample expansion and modifications to the U-net are conducted to improve the network's handling of inhomogeneity, noise, and artifacts, providing a feasible solution for similar industrial nondestructive testing problems.
The rest of this paper is arranged as follows. Section 2 introduces our enhanced U-net; Section 3 validates the proposed method through experiments; Section 4 concludes the paper.

Data Source
The schematic diagram of the ICT scanning system is shown in Figure 2. The workpiece to be detected is placed on the rotary table controlled by the computer numerical control (CNC) system. Then the X-ray source projects an X-ray to penetrate the workpiece and leaves a projection on the flat panel detector. By rotating the workpiece through 360 degrees, the projection data of the workpiece in all directions can be obtained. After that, the ICT reconstruction algorithm is used to obtain the CT slice images of the workpiece. The segmentation objects of this paper are the high-pressure turbine blades of the CFM56-7BE and the Pratt & Whitney F100 aeroengine, as shown in Figure 3. The ICT scanning parameters are as follows: equipment model: AX-4000CT; radiation source: Fine-Tec microfocus radiation source; detector: VAREX XRD4343N (formerly PE); scanning parameters: 400 kV voltage and 1.5 mA current.


Data Augmentation
To augment the training data for segmentation of input images, this paper adopted the following augmentation approaches:

• Translation Transformation
The translation transformation of an image generally moves the image as a whole along the X direction, the Y direction, or both at the same time. After the translation, blank pixels are generated, which can be handled by filling with black. Assuming that the entire image moves ∆x along the X direction and ∆y along the Y direction, the transformation from the original pixel (x0, y0) to the transformed point (x, y) can be expressed as Equation (1):

x = x0 + ∆x, y = y0 + ∆y. (1)

• Mirror Transformation
Mirror transformation includes horizontal mirroring and vertical mirroring. The image size remains unchanged before and after the mirror transformation, and no new blank pixels are generated. Suppose the image height is h and the width is w. Then the transformations from the original pixel (x0, y0) to the transformed point (x, y) via horizontal mirroring and vertical mirroring can be expressed as Equations (2) and (3):

x = w − x0, y = y0, (2)
x = x0, y = h − y0. (3)

• Rotation Transformation
Rotation transformation typically rotates the image around its center (xc, yc). Rotating the original image by θ clockwise around the center, the transformation from the original pixel (x0, y0) to the transformed point (x, y) can be expressed as Equation (4):

x = (x0 − xc) cos θ − (y0 − yc) sin θ + xc, y = (x0 − xc) sin θ + (y0 − yc) cos θ + yc. (4)

• Scaling Transformation
The scaling transformation in data augmentation helps the neural network learn features at different scales. The scaled image needs to be cropped or padded with blank pixels (this paper filled the blank pixels with zero) to remain the same size as the original image when the scale ratio is above or below one.
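As an illustration only (the paper's implementation is in MATLAB; the function names `translate`, `mirror_h`, `mirror_v`, and `rotate_cw` are ours), the augmentation transforms above can be sketched with NumPy index arithmetic:

```python
import numpy as np

def translate(img, dx, dy):
    """Shift img by (dx, dy); vacated pixels are filled with black (0), as in Eq. (1)."""
    out = np.zeros_like(img)
    h, w = img.shape
    xs = slice(max(dx, 0), min(w + dx, w))   # destination columns
    ys = slice(max(dy, 0), min(h + dy, h))   # destination rows
    xs0 = slice(max(-dx, 0), min(w - dx, w)) # source columns
    ys0 = slice(max(-dy, 0), min(h - dy, h)) # source rows
    out[ys, xs] = img[ys0, xs0]
    return out

def mirror_h(img):
    """Horizontal mirroring (Eq. (2))."""
    return img[:, ::-1]

def mirror_v(img):
    """Vertical mirroring (Eq. (3))."""
    return img[::-1, :]

def rotate_cw(img, theta_deg):
    """Rotate by theta around the image centre (Eq. (4)), nearest-neighbour sampling;
    pixels mapped from outside the frame are filled with black."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = np.deg2rad(theta_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # inverse map: for each output pixel, find its source pixel
    x0 = np.rint(np.cos(t) * (xs - cx) - np.sin(t) * (ys - cy) + cx).astype(int)
    y0 = np.rint(np.sin(t) * (xs - cx) + np.cos(t) * (ys - cy) + cy).astype(int)
    valid = (x0 >= 0) & (x0 < w) & (y0 >= 0) & (y0 < h)
    out = np.zeros_like(img)
    out[valid] = img[y0[valid], x0[valid]]
    return out
```

In practice each transform is applied to the CT slice and to its ground-truth mask with identical parameters so that the label stays aligned with the image.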

Architecture of the Enhanced U-net
The traditional U-net is an end-to-end encoder-decoder-based image semantic segmentation network, mainly composed of the encoder, the decoder, and the skip connections. The architecture of the U-net resembles the capital letter "U": a symmetrical network with the encoder on the left and the decoder on the right. The encoder is mainly used to extract the image's features, location, semantics, and other information; it is composed of several groups of "convolution + batch normalization + ReLU" operations, each followed by max pooling, to realize the nonlinear expression capability of the network. Unlike the encoder, the decoder is used for up-sampling, restoring, and decoding the learned abstract semantic features via up-convolution operations to gradually recover the image's semantic information. The input of each decoder layer is fused with the features of the corresponding encoder layer through a skip connection to obtain enough image features at different scales. A 1×1 convolution operation connects the output of the decoder to a Softmax operation to map the feature maps learned by the U-net to the number of categories to be segmented. Each pixel in the original image is then classified according to the probability values generated by the network. Compared with conventional fully convolutional networks, the U-net performs well in image contour segmentation due to the deep multiscale fusion of high-level and low-level features through its skip connections.
In the U-net, the deep feature maps near the end of the network learn strong abstract semantic information, whereas the feature maps near the input layer learn detailed information about the input, including target location, edges, and other features. Therefore, the noise, low contrast, and inhomogeneity of the input images will jeopardize the details of the U-net feature maps and significantly affect the overall segmentation accuracy. To solve this problem, this paper proposes an enhanced U-net for the segmentation of the aeroengine hollow turbine blade. The following improvements are applied to the original U-net network.
(1) Multiscale input: To provide the network with enough image information from different scales, this paper adopts multiscale inputs for the network. The advantage of this operation is that the average-pooling down-sampling of the original image does not increase the parameter complexity of the network while the width of the encoder path is extended.
(2) Dense block: Different from the common hierarchical convolutional neural network block, the dense block makes full use of the feature information at every level of the block, while a hierarchical block only uses the feature map from the previous one or two layers without considering lower levels. For each layer, the features of all previous layers in the block are used as input, and its own feature map is used as input for all subsequent layers, which alleviates the gradient vanishing problem to a certain extent and strengthens the full utilization of each feature map. Let x0, x1, . . . , xl−1 represent the outputs of the first l−1 layers in the current block, and let xl represent the output of the l-th convolutional layer; then the dense block can be expressed as Equation (5):

xl = Hl([x0, x1, . . . , xl−1]), (5)

where Hl(·) represents a nonlinear composite function of the l-th convolutional layer and [·] denotes channel-wise concatenation.
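A minimal sketch of the dense connectivity in Equation (5); `H` below is a toy stand-in for the real convolutional composite function Hl, used only to show how feature maps are concatenated and reused:

```python
import numpy as np

def dense_block(x0, num_layers, H):
    """Dense connectivity (Eq. (5)): each layer receives the channel-wise
    concatenation of all earlier outputs, x_l = H_l([x_0, ..., x_{l-1}])."""
    features = [x0]
    for _ in range(num_layers):
        x_l = H(np.concatenate(features, axis=-1))  # concatenate along channels
        features.append(x_l)
    # the block's output is the concatenation of all produced feature maps
    return np.concatenate(features, axis=-1)

# toy composite function: averages the input channels into one output channel
H = lambda x: x.mean(axis=-1, keepdims=True)

x0 = np.ones((8, 8, 4))               # an 8x8 feature map with 4 channels
out = dense_block(x0, num_layers=3, H=H)
print(out.shape)                      # (8, 8, 7): channels grow by 1 per layer
```

In a real dense block, `H` would be a "batch normalization + ReLU + convolution" composite with learned weights; the connectivity pattern is what matters here.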
(3) Focal loss function: For the segmentation task of turbine blades, the proportion of the blade in the input image is relatively small, i.e., the background is much larger than the area occupied by the object, which causes a serious imbalance between positive and negative samples. During training, the model is then more inclined to learn background features, and hard positive samples are easily judged as negative. To solve this imbalance, this paper uses the focal loss function to pay more attention to the contour pixels:

FL(y, ŷ) = −αy(1 − ŷ)^γ log(ŷ) − (1 − α)(1 − y)ŷ^γ log(1 − ŷ). (6)

In Equation (6), y and ŷ represent the true value and the network's predicted value; α is the weighting factor, used to balance positive and negative samples; γ is the modulation coefficient, used to control the weight of easy-to-classify and difficult-to-classify samples. For areas with extremely low contrast on the edge of the turbine blade, setting the modulation coefficient γ enables the network to strengthen the learning of difficult samples during training, thereby improving the accuracy of image segmentation.
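The binary focal loss of Equation (6) can be sketched as follows (illustrative NumPy code, not the authors' implementation; the default α and γ values are common choices, not values reported in the paper):

```python
import numpy as np

def focal_loss(y, y_hat, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss (Eq. (6)): the factors (1 - y_hat)^gamma and y_hat^gamma
    down-weight easy, confidently correct pixels so training focuses on hard ones."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # avoid log(0)
    pos = -alpha * y * (1.0 - y_hat) ** gamma * np.log(y_hat)
    neg = -(1.0 - alpha) * (1.0 - y) * y_hat ** gamma * np.log(1.0 - y_hat)
    return np.mean(pos + neg)

# a confident correct prediction costs far less than a confident wrong one
easy = focal_loss(np.array([1.0]), np.array([0.95]))
hard = focal_loss(np.array([1.0]), np.array([0.05]))
print(easy < hard)  # True
```

Setting γ = 0 recovers an α-weighted cross-entropy, which makes the role of the modulation coefficient easy to verify empirically.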
(4) Residual path in the skip connection [37]: The input of the neural network consists of both high-resolution and low-resolution information. In the conventional U-net, the low-resolution information is passed to the next stage twice (through the skip connection and through pooling), while the high-resolution information is passed only once (through the skip connection). To balance this, a residual path containing a transposed convolution and an activation is placed right after the pooling operation. Subtracting the low-resolution information (recovered by the pooling, transposed convolution, and activation operations) from the full-resolution information (the original information passed through the skip connection) provides enough high-resolution information for the decoder. The overall network structure of the proposed enhanced U-net is shown in Figure 4.
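A toy sketch of the idea behind the residual path, with average pooling and nearest-neighbour upsampling standing in for the learned pooling and transposed-convolution operations (the real path is learned, so this is only the structural intuition): subtracting the recovered low-resolution content from the full-resolution input leaves mostly high-frequency detail for the decoder.

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour upsampling, standing in for the transposed convolution."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def residual_skip(x):
    """Skip-connection feature = full-resolution input minus its recovered
    low-resolution content, i.e., mostly high-frequency detail."""
    low = upsample2(avg_pool2(x))
    return x - low

x = np.add.outer(np.arange(4.0), np.arange(4.0))  # a smooth intensity ramp
print(np.abs(residual_skip(x)).max())             # small: a ramp has little detail
```

A constant image yields an all-zero residual, confirming that only detail that the pooled branch cannot represent survives on the skip path.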

Training
The training platform used in this paper is based on MATLAB R2021a, where the computer is configured with an Intel Core i9-11900K CPU @ 3.5 GHz and an Nvidia RTX A6000 graphics card. The operating system is 64-bit Windows 10.

Training Data Set
The original training data set contained a total of 493 slice images of two turbine blades, with the resolution cropped to 512 × 512. Another 180 and 30 slice images of these two turbine blades were used for validation and testing, respectively. The ground truth of each image is manually labeled as follows: (1) manually draw the contours of the turbine blade in the CT slice image; (2) automatically fill in the area within the contour using the region growing method. Figure 5 shows examples of the original CT slice images and their manually labeled ground truth for both turbine blades.

Figure 5. Training data samples of two turbine blades (original images and ground truth for Turbine Blade 1 and Turbine Blade 2).

Training Parameters
The training algorithm of the proposed network is Adam, which effectively decreases the learning rates of parameters whose gradients and squared gradients are large and increases the learning rates when they are small. The decay rates of the gradient moving average and the squared-gradient moving average are set to 0.9 and 0.999, respectively. The initial learning rate is set to 0.001, and the network is trained for 200 epochs. The minibatch size is set to 8, and the validation frequency is set to every 30 iterations. To improve the training effect, the learning rate is reduced every ten epochs by multiplying it by a drop factor of 0.95.
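The update rule and schedule described above can be sketched as follows (illustrative Python; the paper's training uses MATLAB's built-in solver, and the function names here are ours):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias correction; beta1/beta2 match the paper's
    moving-average decay rates (0.9 and 0.999)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

def lr_schedule(epoch, lr0=0.001, drop=0.95, every=10):
    """Piecewise-constant decay: multiply the rate by 0.95 every 10 epochs."""
    return lr0 * drop ** (epoch // every)

print(lr_schedule(0), lr_schedule(25))  # initial rate, then after two drops
```

Parameters with persistently large gradients accumulate a large second moment v, shrinking their effective step size, which is the adaptive behaviour described above.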

Performance Evaluation
The segmentation results are evaluated by the Jaccard similarity coefficient, defined as

Jaccard(A, T) = |A ∩ T| / |A ∪ T|,

where A represents the segmentation result, T represents the ground truth, ∩ and ∪ are the intersection and union operations, and |·| represents the number of pixels in the set. A higher Jaccard score means a better segmentation result. Similarly, the dice similarity coefficient (DSC) is also used to evaluate the performance of the segmentation method:

DSC(A, T) = 2|A ∩ T| / (|A| + |T|).

Another commonly used indicator is the BF score, which measures how closely the predicted boundary of an object matches the ground truth boundary. Compared with the Jaccard similarity coefficient, the BF score correlates better with human qualitative assessment. The BF score is defined as

BF = 2 · precision · recall / (precision + recall),

where precision is the fraction of detections that are true positives rather than false positives, while recall is the fraction of true positives that are detected rather than missed. With TP, FP, FN, and TN denoting the true positives, false positives, false negatives, and true negatives of the segmentation results, precision = TP/(TP + FP) and recall = TP/(TP + FN).
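The three metrics can be computed directly from binary masks and pixel counts. A minimal NumPy sketch follows (note: the true BF score matches boundary pixels within a distance tolerance, which is omitted here; only the harmonic-mean form is shown):

```python
import numpy as np

def jaccard(a, t):
    """Jaccard index |A ∩ T| / |A ∪ T| for binary masks."""
    a, t = a.astype(bool), t.astype(bool)
    return (a & t).sum() / (a | t).sum()

def dsc(a, t):
    """Dice similarity coefficient 2|A ∩ T| / (|A| + |T|)."""
    a, t = a.astype(bool), t.astype(bool)
    return 2.0 * (a & t).sum() / (a.sum() + t.sum())

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall, as used by the BF score."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

a = np.array([[1, 1], [0, 0]])  # predicted mask
t = np.array([[1, 0], [0, 0]])  # ground-truth mask
print(jaccard(a, t), dsc(a, t))  # 0.5, ≈0.667
```

The example shows the familiar relation DSC = 2J/(1 + J): the Dice score is always at least as large as the Jaccard index for the same pair of masks.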

Segmentation Results
To demonstrate the effectiveness of the proposed method, it was compared with adaptively regularized kernel-based FCM (ARKFCM) [44], DRLSE [45], MAXENTROPY [46], EM/MPM [47], continuous max-flow (CMF) [48], and OTSU [12]. The MAXENTROPY, CMF, and OTSU methods are parameterless; the parameters of ARKFCM, DRLSE, and EM/MPM are listed in Table 1. The proposed method is also compared with the conventional U-net, the dual-frame U-net [29], and the mU-net [37] approaches. The experimental results are depicted in Figure 6. In addition, a quantitative analysis of the experiments is listed in Table 2.
Mathematics 2022, 10, 4230
It can be seen from Figure 6 that the proposed approach realized the best segmentation results compared with the mathematical model-based segmentation methods and the other U-net architectures. The ARKFCM method failed to segment the low-contrast details in the basin and the exhaust edge of both hollow turbine blades; the small film cooling holes in the leading edges were also not segmented. For the DRLSE method, target pixels on the edge of the hollow turbine blade are more likely to be mistakenly identified as background, resulting in a certain dilation of the entire target object compared to the ground truth. This dilation caused by mis-segmentation is even more obvious in the CMF method. Excessive dilation can cause severe deformation of the target, affecting the accuracy of subsequent measurements and registrations of the workpiece. Furthermore, the CMF method also failed to segment details in the blade basin and exhaust edge, as well as the small film cooling holes scattered along the blade's leading edge. The segmentation results of the EM/MPM and MAXENTROPY methods were also ineffective: dilated edges, missing details in the low-contrast blade basin, and unsegmented film cooling holes appeared for both hollow turbine blades. For the OTSU method, the main issue was low accuracy, i.e., the edge details and film cooling holes were not successfully identified.
Compared with the model-based segmentation methods, the comparison U-net architectures and the proposed enhanced U-net realized higher segmentation accuracy and more detail. The conventional U-net has a good segmentation effect on targets with fewer artifacts and on small details with good grayscale distribution, such as the film cooling holes in the leading edge. However, it failed to segment small details with low contrast and artifacts, especially the film cooling holes in the low-contrast leading edge and basin, as depicted in the red rectangles in Figure 6; in fact, it is difficult even for nonprofessionals to distinguish whether these pixels belong to the background or the target. Compared with the dual-frame U-net and mU-net architectures, the proposed approach achieved better segmentation accuracy and edge continuity. The reason for this problem in the conventional U-net is that its receptive field is not large enough, and simply increasing the depth of the network cannot improve the segmentation of tiny details with low contrast and artifacts. Therefore, this paper increases the receptive field of the conventional U-net by using multiscale original images as input and adopting the dense block and the residual path in the skip connection, without increasing the depth of the network, thereby improving the segmentation of details in the hollow turbine blade CT images that are indistinguishable due to low contrast and artifacts.
To quantitatively analyze the segmentation effects of the proposed approach and the comparison methods, the BF score, Jaccard index, and DSC of each segmentation result are listed in Table 2. For samples 1#102, 1#111, 1#113, and 2#741, our proposed approach achieved the highest BF score, Jaccard index, and DSC among all the segmentation results. Although the conventional U-net achieved the highest BF score on sample 2#743, it is only 0.001 higher than that of our proposed approach; moreover, the object contours in the conventional U-net results are less smooth than ours, and the conventional U-net cannot segment the film cooling holes in the leading edge of either sample, demonstrating the superiority of our proposed approach. The statistical t-test differences between each comparison method and our proposed approach are listed in Table 3. The p-values of the traditional mathematical model-based methods are all smaller than 0.01, which means that our proposed approach significantly outperforms these traditional methods. For the conventional U-net, the p-values of the Jaccard and DSC indexes are smaller than 0.05, which means that our proposed approach performs better than the conventional U-net. For the dual-frame U-net and mU-net, the p-values range from 0.45 to 0.98, which means that our proposed enhanced U-net achieves only a slight improvement over them. It is worth noting that the p-value of the dual-frame U-net is higher than that of the mU-net, which indicates that the performance of the dual-frame U-net is closer to that of our proposed approach.
To quantitatively analyze the effectiveness of each modification, four groups of ablation experiments were conducted to verify the respective contributions of the multiscale input, the dense block, the focal loss function, and the residual path in the skip connection to turbine blade segmentation. The results are listed in Table 4.
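The two region-overlap metrics used above can be computed directly from binary masks; note that they are related by DSC = 2J/(1 + J), so they always rank methods identically, while the BF score (a boundary-based F1, omitted from this sketch) can rank them differently:

```python
import numpy as np

def dsc_jaccard(pred, gt):
    """Dice similarity coefficient (DSC) and Jaccard index of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dsc = 2.0 * inter / (pred.sum() + gt.sum())
    jac = inter / union
    return dsc, jac

# Toy 1-D example: 3 pixels overlap out of 4 predicted and 4 true,
# so inter = 3, union = 5 -> DSC = 6/8 = 0.75, Jaccard = 3/5 = 0.6.
pred = np.array([1, 1, 1, 1, 0, 0])
gt = np.array([0, 1, 1, 1, 1, 0])
dsc, jac = dsc_jaccard(pred, gt)
```

In practice, these per-slice scores over the test set are what feed the paired t-tests reported in Table 3.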
Table 4 shows that the dense block and the multiscale input modifications played the most critical roles in improving the segmentation results, while the residual path and focal loss function modifications further enhanced the effect of the proposed architecture.
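The focal loss modification can be sketched as follows. The modulating factor (1 − p_t)^γ shrinks the loss on easy, well-classified pixels so that training concentrates on hard, low-contrast structures such as the film cooling holes. The values γ = 2 and α = 0.25 below are the common defaults from the focal loss literature, not necessarily the settings used in the paper:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t),
    where p_t is the predicted probability of the true class."""
    p = np.clip(p, eps, 1.0 - eps)
    pt = np.where(y == 1, p, 1.0 - p)          # probability of the true class
    at = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return float(np.mean(-at * (1.0 - pt) ** gamma * np.log(pt)))

# A confidently correct pixel contributes far less loss than a hard one,
# which is exactly how the loss steers training toward ambiguous details.
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.55]), np.array([1]))
```

The class weight α also counteracts the foreground/background imbalance typical of blade CT slices, where target pixels are a minority of the image.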

Processing Time
To study the segmentation efficiency of the proposed approach, the running times of the proposed approach and all comparison methods are listed in Table 5. The deep learning-based methods consumed far less time than the conventional model-based methods; even the fastest model-based method, MAXENTROPY, is still an order of magnitude slower than the deep learning-based approaches. Among the U-net architectures, the more layers a network has, the more segmentation time it consumes. It is worth noting, however, that the deep learning algorithms run on a GPU while the conventional model-based methods run on a CPU, so the efficiencies of the two groups are only loosely comparable.
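A per-slice timing harness of the kind behind Table 5 might look like the sketch below; `segment_fn` is a placeholder for any segmentation routine, and this is our illustration rather than the paper's benchmarking code. One subtlety when timing GPU models: kernel launches are asynchronous, so the framework's synchronize call must run before reading the clock, or the model appears faster than it is.

```python
import time

def time_per_slice(segment_fn, slices, warmup=2):
    """Mean wall-clock seconds per slice.
    Warm-up iterations absorb one-off costs (JIT, caches, cuDNN autotuning)
    so they do not skew the average. For GPU models, segment_fn should
    synchronize internally before returning."""
    for s in slices[:warmup]:
        segment_fn(s)
    t0 = time.perf_counter()
    for s in slices:
        segment_fn(s)
    return (time.perf_counter() - t0) / len(slices)

# Usage with a trivial stand-in "segmentation" function.
t = time_per_slice(lambda s: [x > 0.5 for x in s], [[0.1, 0.9, 0.4]] * 10)
```

Because the CPU methods and GPU networks run on different hardware, such numbers characterize deployment cost rather than algorithmic complexity.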

Robustness of the Proposed Architecture
To study the robustness of the proposed approach, it was evaluated on sample 1#268, which suffers from severe inhomogeneity, noise, and low contrast. The comparison with the conventional U-net is depicted in Figure 7, which shows that the proposed approach achieves a better segmentation effect on such low-quality images. In regions with insufficient contrast, the conventional U-net easily mis-segments; for example, in the blade tail, the severe low contrast and inhomogeneity directly affect the U-net's judgment of the edge contour, and pixels that originally belonged to the object were mis-segmented as background.
In summary, the deep learning approaches outperform the conventional model-based methods in segmentation accuracy and in preserving tiny details. The essential reason is that, unlike conventional mathematical model-based segmentation methods, the deep neural network learns the continuity between slice images well. The physical distance between two adjacent layers is very small (often around 0.1 mm), so the interlayer variation of the slice images is correspondingly small. When a layer is affected by uneven contrast or artifacts, conventional mathematical model algorithms are prone to mis-segmentation, whereas the deep neural network learns not only the association between the real image of the current layer and the corresponding ground truth but also the image information of adjacent layers. Therefore, when an ambiguous structure exists in only a few layers, it does not have a large impact on the convergence of the deep neural network.

Limitations
Due to the particular structural and material characteristics of the aeroengine turbine blade, its CT images show very obvious artifacts and inhomogeneity at the edges and inner cavities. Therefore, conventional model-based image segmentation methods are often unsatisfactory on turbine blade CT images, where loss of detail and mis-segmentation frequently occur. Supervised deep learning methods can effectively improve the segmentation of turbine blade CT images, and the experimental results of this paper demonstrate that they surpass the conventional model-based methods in both processing speed and segmentation accuracy. In this paper, several improvements were made on the basis of the conventional U-net architecture, and compared with the conventional U-net, the segmentation accuracy was improved to a certain extent even under the influence of artifacts and inhomogeneity. However, due to the large number of tiny film cooling holes in the turbine blade and the complex exhaust edges at the tail, it remains difficult to segment these details precisely, even with the help of deep learning approaches. These difficult-to-segment film cooling holes and exhaust edges are essential to the turbine blade, playing an important role in heat dissipation under extremely high-temperature and high-pressure conditions. Therefore, further efforts are needed to improve the segmentation accuracy of turbine blade details.

Conclusions
This paper presented an enhanced U-net for the segmentation of the aeroengine hollow turbine blade. The enhanced U-net is modified from the conventional U-net architecture: the multiscale input, dense block, focal loss function, and residual path in the skip connection are added to enlarge the receptive field of the network without increasing its longitudinal depth. Experiments were conducted on a set of ICT slice images of two practical hollow turbine blades, namely, the CFM56-7BE and Pratt & Whitney F100 blades. The experimental results indicate that, compared with typical model-based algorithms and the conventional U-net, our proposed approach achieves the best segmentation accuracy and best preserves tiny details affected by low contrast and artifacts. Future work will focus on improving the segmentation accuracy of tiny structures in the blade, such as the film cooling holes and exhaust edges.