Deep Learning-Based Ultrasonic Testing to Evaluate the Porosity of Additively Manufactured Parts with Rough Surfaces

Ultrasonic testing (UT) has been actively studied to evaluate the porosity of additively manufactured parts. Currently, ultrasonic measurements of as-deposited parts with a rough surface remain problematic because the surface lowers the signal-to-noise ratio (SNR) of ultrasonic signals, which degrades the UT performance. In this study, various deep learning (DL) techniques that can effectively extract the features of defects, even from signals with a low SNR, were applied to UT, and their performance in terms of the porosity evaluation of additively manufactured parts with rough surfaces was investigated. Experimentally, the effects of the processing conditions of additive manufacturing on the resulting porosity were first analyzed using both optical and scanning acoustic microscopy. Second, convolutional neural network (CNN), deep neural network, and multi-layer perceptron models were trained using time-domain ultrasonic signals obtained from additively manufactured specimens with various levels of porosity and surface roughness. The experimental results showed that all the models could evaluate porosity accurately, even that of the as-deposited specimens. In particular, the CNN delivered the best performance at 94.5%. However, conventional UT could not be applied because of the low SNR. The generalization performance when using newly manufactured as-deposited specimens was high at 90%.


Introduction
Additive manufacturing (AM) is the process of depositing materials, layer upon layer, to create an object from a 3D computer-aided design [1][2][3][4]. The distinctive advantages of AM are that it can produce innovative, complex designs and lightweight parts compared with conventional subtractive or casting manufacturing. Owing to these advantages, this manufacturing method has been actively studied in various fields [5][6][7]. A current major concern is manufacturing defects that can occur in the interior of AM parts and their effects on the integrity of these parts [8,9]. Porosity, defined as air-filled cavities inside a material, is a typical manufacturing defect found in AM parts and results from deviation from the optimal AM processing conditions. Because porosity can severely degrade the mechanical properties of AM parts, evaluating the extent of porosity in manufactured parts has been of significant interest to researchers [10][11][12].
Ultrasonic testing (UT) is a well-known non-destructive testing method and can be used to effectively evaluate the porosity [11][12][13] and properties of a material, including its strength, elastic modulus, and material density [14][15][16]. Previous studies have reported testing performance of the three models was evaluated and compared with respect to the surface roughness. Furthermore, the applicability of the conventional UT was also considered for performance comparison. (3) The generalization performance was evaluated using newly manufactured AM specimens that were not used to train the DL model in order to verify the generalizability of the pre-trained model.

Fabrication of Porosity-Induced Specimens with Different Levels of Surface Roughness
Ten AM specimens with various levels of porosity were created using a commercial selective laser melting machine (SLM 280 2.0, SLM Solutions, Lübeck, Germany) and commercially pure Ti powder with a particle diameter range of 20-63 µm. The specimens were 20 mm × 10 mm × 3 mm in size and were numbered from #1 to #10. The porosity content was controlled by varying the AM processing parameters, such as the laser power and laser scanning speed, which are known as the main indices that affect the porosity content of AM specimens. The processing parameters were selected after several preliminary tests and are summarized in Table 1. Generally, increasing the laser power or decreasing the scanning speed causes over-melting porosity. The opposite situation gives rise to porosity with a lack of fusion (LOF) [43]. Recent studies have found that porosity may also develop differently as a result of the variation in other properties such as the conduction-keyhole mode conversion of the melt pool [44], laser absorptivity [45], energy dissipation rate, and interaction time [46]. A detailed analysis of porosity mechanisms is provided in the next section. After specimen fabrication, the surfaces on both sides were finished to obtain the "smooth condition" using wire electrical discharge machining (EDM), where the arithmetic mean roughness (Ra) measured by a general roughness tester was 0.65 µm. To consider different surface roughness conditions, we fabricated 10 additional AM specimens, numbered from #1 to #10, the surfaces of which on both sides had different degrees of roughness. The surface on one side was polished to attain the "medium condition" by a general hand grinder that was used to separate specimens from the baseplate of a 3D printer. The other side was left in the "rough condition", which corresponded to the as-deposited raw surface. The Ra values of these two surfaces were 3.1 µm and 6.4 µm, respectively, and are listed in Table 2 together with those of the #1-#10 specimens.
Except for the roughness conditions, all the other properties were the same as #1-#10 specimens. A photograph of the AM specimens is shown in Figure 1. Only the porosity of the specimens with smooth surfaces was examined with SAM. The training/testing datasets were constructed by using all three surface conditions.

Porosity Examination
SAM was used to quantify the porosity content of the AM specimens. C-mode imaging using a scanning acoustic microscope (HS-1000, Sonix, Springfield, VA, USA) was conducted with a 75 MHz focusing-type transducer. Only AM specimens finished to obtain the "smooth condition" were tested because the scanning quality of this imaging is highly influenced by the surface condition of specimens [47]. The focal point and the C-mode window were positioned at the center of the specimen. The window size was set to 1.5 mm, which corresponded to half the thickness of the specimen, to obtain comprehensive results for porosity. After the SAM analysis, a general optical microscope (OM; GX53, Olympus, Tokyo, Japan) was used to analyze the shape and type of porosity. For the OM analysis, the surfaces of AM specimens with the "smooth condition" were additionally polished to lower their surface roughness to 0.04 µm. Figure 2 shows the C-mode images that were obtained. The images clearly show that the porosities, represented as bright spots, are distributed differently depending on the AM processing conditions. These C-mode images were used to quantify the amount of porosity by using the open-source software ImageJ [48]. Based on the 6 dB drop method, each image was subjected to the binarization process, and the calculated porosity contents, defined as the ratio of the pore area to the total area in two-dimensional space, are summarized in Table 3. Note that these porosity contents are relative values [48]. According to the amount of porosity, they were labeled from "Porosity level 1" to "Porosity level 10". The measured porosity content increases as the specimen number increases. Because specimen #1 was manufactured under the optimal processing conditions, its porosity content was the lowest at 0.7% as determined by SAM. Almost no porosity was observed in the OM image shown in Figure 3a. Under this condition, the volumetric laser energy input (LEI) in the melt pool was 70 J/mm³.
The porosity contents of specimens #2 and #3 were in the range of 2.5-7.5%. The LEI of these specimens was 75 and 90 J/mm³, respectively, which was within the over-melting condition; this causes not only welded particles and wavy surfaces but also entrapped gas, resulting in small pits and gas porosity, as shown in Figure 3b. Specimens #4-#8 were manufactured under LEI conditions of 55-65 J/mm³. Their porosity content was within the range 7.5-27.5%, higher than that of #2 and #3. Generally, more porosity is created under LOF conditions than under over-melting conditions [49]. Insufficient LEI prevents the powder within and between the layers from melting sufficiently, which results in LOF porosity with un-melted powder, as shown in Figure 3c. Although these specimens had similar LEI levels, more pores were observed in specimens #7 and #8 than in #4-#6, as shown in Figure 2. This may be due to the insufficient interaction time of the laser to melt the powder owing to the higher scanning speed [49]. The LEI of specimens #9 and #10 was 85-90 J/mm³, within the over-melting condition. Despite the LEI being similar to that of #2 and #3, the porosity content was significantly higher (over 27.5%). The reason may be a combination of high LEI and low laser power, in which case a shallow melt pool is generated that may not be able to penetrate the previously deposited layers. This may yield a large number of pits with un-melted powder between the layers, with the result that these specimens have the highest porosity, as shown in Figure 3d.
Porosity levels (porosity content range, %): Lev. 4 (7.5-10), Lev. 5 (10-12.5), Lev. 6 (12.5-15), Lev. 7 (22.5-25), Lev. 8 (25-27.5), Lev. 9 (27.5-30), Lev. 10 (30-).
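The binarization step described above (a 6 dB drop threshold followed by the ratio of pore area to total area) can be illustrated with a minimal sketch. It assumes that the threshold is taken relative to the maximum amplitude in the image and that bright pixels are pores; the function name `porosity_fraction` and the small amplitude map are hypothetical and are not measured data or the ImageJ procedure itself.

```python
def porosity_fraction(image, drop_db=6.0):
    """Binarize an amplitude image and return pore area / total area.

    Assumption: a pixel counts as a pore when its amplitude lies within
    `drop_db` of the image maximum (bright spots are pores).
    """
    peak = max(max(row) for row in image)
    threshold = peak * 10 ** (-drop_db / 20.0)  # 6 dB drop is about 0.501 * peak
    total = sum(len(row) for row in image)
    pores = sum(1 for row in image for amp in row if amp >= threshold)
    return pores / total


# Hypothetical 4 x 5 amplitude map in arbitrary units (illustration only).
demo = [
    [0.1, 0.2, 0.1, 0.9, 0.1],
    [0.1, 1.0, 0.2, 0.1, 0.1],
    [0.2, 0.1, 0.1, 0.1, 0.3],
    [0.1, 0.1, 0.6, 0.1, 0.1],
]
demo_fraction = porosity_fraction(demo)
print(f"porosity content: {100 * demo_fraction:.1f}%")
```

Because the result is an area ratio in a single two-dimensional window, it is a relative indicator rather than a volumetric porosity, which matches the note in the text that the reported contents are relative values.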

Ultrasonic Measurements
Ultrasonic measurements were conducted in pulse-echo mode using a contact transducer. This method is a well-known nondestructive testing technique that uses a pulsed signal with a broad bandwidth and several back-wall echo signals [50]. A schematic diagram and an image of the experimental setup are shown in Figure 4. A pulsed voltage signal, generated by a commercial pulser/receiver, was sent to a 5 MHz piezoelectric transducer. A longitudinal wave with a wavelength of approximately 1.2 mm was emitted by the transducer and was then incident on the AM specimen. At this wavelength, the ultrasonic diffraction effect is negligibly small because the ultrasonic wave propagation distance lies within the near-field zone, whose length is given by $D^2/4\lambda = 19$ mm, where $D$ is the transducer diameter and $\lambda$ is the wavelength. The back-wall echo was received by the same transducer, and the ultrasonic signal was displayed and saved on a commercial oscilloscope. This echo signal reflects the effects of porosity in the ultrasonic propagation direction in the form of variations in the ultrasonic arrival time and ultrasonic attenuation [50]. To minimize the ultrasonic measurement errors, a pneumatic device that can apply a consistent pressure of 0.4 MPa was used, such that the contact condition between the transducer and the specimen was maintained consistently in each measurement [50]. Ultrasonic signals were obtained for the three different surface conditions: smooth, medium, and rough. Typical signals obtained for specimens #1 and #1 prepared with three different surface conditions are plotted in Figure 5, where the three measured signals are overlaid. Although the porosity contents of specimens #1 and #1 are the same, the difference in their ultrasonic amplitudes is clearly visible when the three signals are compared. In particular, the amplitude loss is very large for the "rough condition" surface.
The SNRs of these signals were 33 dB, 16 dB, and 10 dB, respectively, as shown in Figure 5. The lower SNRs of the rougher surfaces are attributed to imperfect contact between the transducer and the test specimens. The air gaps caused by the imperfect contact result in an impedance mismatch and in multiple reflections of the incident ultrasonic waves. Consequently, except for the "smooth condition", these additional changes in the properties of the incident ultrasonic waves make it difficult to evaluate the porosity of rough surfaces by using conventional UT, such as ultrasonic velocity and ultrasonic attenuation coefficient measurements [22].
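As a quick plausibility check of the numbers quoted in this section, the sketch below evaluates the near-field length $D^2/4\lambda$ and converts the reported SNRs into amplitude ratios. The transducer diameter and the wave velocity are not stated in the text, so they are back-calculated here from the quoted 19 mm, 1.2 mm, and 5 MHz; the 20·log10 amplitude convention for the SNR is likewise an assumption, since the text does not define how the SNR was computed.

```python
import math


def near_field_length(diameter_mm, lam_mm):
    """Near-field length N = D^2 / (4 * lambda), in mm."""
    return diameter_mm ** 2 / (4.0 * lam_mm)


def db_to_amplitude_ratio(snr_db):
    """Assumed convention: SNR_dB = 20 * log10(A_signal / A_noise)."""
    return 10 ** (snr_db / 20.0)


lam = 1.2  # mm, wavelength quoted in the text
# Velocity implied by lambda * f at 5 MHz (not stated in the text).
implied_velocity = lam * 1e-3 * 5e6  # m/s, about 6000
# Diameter implied by N = 19 mm (not stated in the text).
implied_diameter = math.sqrt(4.0 * lam * 19.0)  # mm, about 9.5

print(f"implied velocity: {implied_velocity:.0f} m/s")
print(f"implied diameter: {implied_diameter:.2f} mm")
print(f"near-field length: {near_field_length(implied_diameter, lam):.1f} mm")
for label, snr in [("smooth", 33), ("medium", 16), ("rough", 10)]:
    print(f"{label}: {snr} dB -> ratio {db_to_amplitude_ratio(snr):.1f}")
```

With a specimen thickness of 3 mm, the round-trip path of the first back-wall echo is 6 mm, well inside the 19 mm near-field length. Under the assumed convention, the drop from 33 dB to 10 dB corresponds to the signal-to-noise amplitude ratio falling from roughly 45 to roughly 3.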

Figure 5. Ultrasonic signals obtained from the three different surface conditions. The signals were cropped to 5000 sampling numbers and were plotted to overlap with each other.

Structures of Deep Learning Models
Neural-network-based artificial intelligence uses groups of interconnected nodes inspired by biological neurons [51,52]. The simplest model is the MLP, which consists of an input layer, an output layer, and one hidden layer, with the nodes of adjacent layers fully connected to each other [53]. A DNN consists of two or more hidden layers in addition to the input and output layers, and its deeper structure can enhance the feature extraction capability. A CNN is a type of MLP in which a feature extractor that requires minimal pre-processing is placed in front of a fully connected neural network [51,54]. The feature extractor consists of one or more convolutional and pooling layers. The high-level feature map obtained from the feature extractor enables the network to be deeper with fewer parameters [55].
In this study, the CNN, DNN, and MLP models were used, and their performance was compared. Among the several types of CNN, a one-dimensional (1D) CNN was used because it is effective at deriving features from short segments of the overall data and accepts any type of signal as the input. This network comprised a 1D array-type input layer, two convolutional layers, two max-pooling layers, a fully connected layer, and an output layer [56]. The input layer was set to (5000 × 1) nodes, which corresponded to the number of samples of the original ultrasonic signals. Note that down-sampling the original signals to fewer input nodes can reduce the computation time during model training; however, it can also decrease the ability of the DL model to extract porosity features from the ultrasonic signals. In the first convolutional layer, a wide kernel of (50 × 1) was used to suppress noise effectively, with 32 feature maps and a stride of (5 × 1). In the second convolutional layer, the kernel size was set to (4 × 1), considerably smaller than in the first layer, to extract a large number of features, with 64 feature maps and a stride of (2 × 1). According to several simulation studies [34,35], these two convolutional layers with different kernel sizes showed good performance in noisy conditions. After each convolutional layer, one pooling layer was used, where both the pooling and stride sizes were (2 × 1). The fully connected layer was set to (1000 × 1) nodes and connected to the output layer based on the softmax function $F(s_i)$ with cross-entropy (CE) loss for classification, given as follows [34]:

$$F(s_i) = \frac{e^{s_i}}{\sum_{j=1}^{K} e^{s_j}}$$

$$\mathrm{CE} = -\sum_{i=1}^{K} t_i \log F(s_i)$$

where $s$ is the predicted output, the subscript $i$ indexes each output class, $K$ is the total number of classes, and $t_i$ is the real output.
The activation function was a rectified linear unit (ReLU), given as follows [57]:

$$f(x) = \max(0, x)$$

To prevent overfitting, dropout regularization [58] with a 70% training probability, a technique that deactivates several nodes during training, was used before and after the fully connected layer. Dropout also improves the robustness of the model. The learning rate was set to 0.001 after several trials. Note that too large a learning rate has an effect similar to that of down-sampling the input nodes. Details of the CNN model are shown in Figure 6a. In the fully connected DNN model, instead of using convolutional layers, the fully connected part was made deeper than in the CNN model: two hidden layers, with (1000 × 1) and (1000 × 1) nodes, were used. The previous simulation studies [34,35] also reported that deeper structures gave the DNN model a better feature extraction ability. In the MLP model, only one hidden layer with (1000 × 1) nodes was used. The other parameters of the DNN and MLP models, such as the number of input and output nodes and the dropout rate, were set to correspond to those of the CNN model, as shown in Figure 6b,c, and Table 4. The ReLU and $F(s_i)$ with CE functions were also used. All the models were designed using TensorFlow and Keras.
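To make the layer dimensions of the 1D CNN described above easier to follow, the sketch below traces the signal length through the two convolution and two pooling stages. It assumes no zero-padding ("valid" convolution and pooling), which the text does not specify; with "same" padding the lengths would differ. The helper name `conv1d_out` and the layer labels are illustrative only.

```python
def conv1d_out(length, kernel, stride):
    """Output length of a 1D convolution or pooling stage without padding."""
    return (length - kernel) // stride + 1


# (name, kernel size, stride, number of feature maps or None for pooling)
layers = [
    ("conv1", 50, 5, 32),
    ("pool1", 2, 2, None),
    ("conv2", 4, 2, 64),
    ("pool2", 2, 2, None),
]

length, channels = 5000, 1  # input layer of (5000 x 1) nodes
shapes = []
for name, kernel, stride, filters in layers:
    length = conv1d_out(length, kernel, stride)
    if filters is not None:
        channels = filters  # pooling keeps the channel count unchanged
    shapes.append((name, length, channels))
    print(f"{name}: length {length}, feature maps {channels}")

flattened = length * channels  # values fed into the 1000-node dense layer
print(f"flattened features: {flattened}")
```

Under this padding assumption, the 5000-sample signal is reduced to 123 positions with 64 feature maps before the fully connected layer, which shows how the wide first kernel with a stride of 5 performs most of the length reduction.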

Procedures to Train and Test the Models
The CNN, DNN, and MLP models were trained using the prepared training dataset to derive each specific function for porosity evaluation, after which the testing performance was compared. Training was conducted as supervised classification for the three models, with the class label as the output. Ten labeled classes, porosity levels 1 to 10, were used, with the levels based on the porosity contents measured with SAM. Ultrasonic signals were acquired from the surfaces with three different roughness levels. For each surface roughness level, 100 ultrasonic signals were measured, of which 80 were randomly extracted and used as the training dataset. The remaining 20 signals were used as the testing dataset. Considering previous research [36], this amount of data is sufficient to form the training and testing datasets. The training and testing datasets for each of the 10 porosity levels and the three different roughness conditions are summarized in Table 5. Figure 7 shows the learning curves of the CNN, DNN, and MLP models for the "rough condition", which represent the testing accuracy and cost as a function of the number of epochs. The testing accuracy was defined as the classification performance at each epoch on the testing data and was calculated as

$$\text{Testing accuracy (\%)} = \frac{m_1}{n_1} \times 100$$

where $m_1$ is the number of testing data points classified correctly and $n_1$ is the total number of testing data points. The cost represents the error between the real output and the predicted output of the tested model on the testing dataset.
Figure 7 shows that the testing accuracy of each of the CNN, DNN, and MLP models reached its maximum level and then oscillated after approximately the 40th epoch. Therefore, we monitored the testing accuracy from the 40th epoch until the end of training, and the epoch that provided the highest testing accuracy was chosen for each model [59]. A commercial CPU device was used to train all models. The computation times for the CNN, DNN, and MLP models were approximately 760, 500, and 96 s, respectively.
Table 6 lists the testing performance of the three models for the various surface roughness levels, and these results are compared in Figure 8. The testing performance results are also listed as confusion matrices in Appendix A. For the "smooth condition" surface, the CNN and DNN models delivered average testing performance of 98.5% and 98.0%, respectively. Although the performance of the MLP model was lower than that of the others, it also performed well at 96.0%. Increased roughness caused the performance of all the models to decrease; however, the accuracy of every model remained at or above 80.5%. The model that delivered the best performance for the "medium condition" and "rough condition" surfaces was the CNN. Its performance decreased only to 97.5% and 94.5% for these two roughness levels, i.e., by 1 and 4 percentage points, respectively, compared with the "smooth condition" surface. The DNN achieved 95.5% and 89.5% for the "medium condition" and "rough condition", respectively, a somewhat larger decrease of 2.5 and 8.5 percentage points. This tendency is more pronounced for the MLP model, whose performance decreased to 92.0% and 80.5%, i.e., by 4 and 15.5 percentage points, respectively. In addition, the performance of all models was lower at porosity levels 1 and 2 than at the other levels, regardless of the surface roughness condition. At porosity level 3 and above, the average performance over all roughness conditions for the CNN, DNN, and MLP models was 99.4%, 96.3%, and 92.5%, respectively. At levels 1 and 2, however, these values decreased to 86.7%, 86.7%, and 77.5%, respectively.
Conventional UT is based on ultrasonic velocity and ultrasonic attenuation coefficient measurements [13]. These methods require not only the first back-wall echo signal but also the second echo to be measured in pulse-echo mode in order to extract the ultrasonic velocity and attenuation coefficient. Comparing the extent to which these parameters vary enables the porosity to be evaluated. The parameters are calculated as follows [13]:

$$v = \frac{d}{\tau}$$

$$\alpha = \frac{1}{d}\ln\frac{A_1}{A_2}$$

where $v$ is the ultrasonic velocity, $d$ is the ultrasonic wave propagation distance corresponding to twice the thickness of the specimen, $\tau$ is the time-of-flight difference between two consecutive echoes, $\alpha$ is the ultrasonic attenuation coefficient, and $A_1$ and $A_2$ are the amplitudes of two consecutive echoes. Generally, the amplitude of the second echo is smaller than that of the first echo because the second echo propagates over a longer distance.
Figure 9 shows two consecutive echoes measured from specimens with the three surface roughness levels. For the "smooth condition", two echoes are clearly observed. However, for the other roughness levels, the second echo is at almost the same level as the background noise owing to the amplitude loss from the rough surfaces. Consequently, rough surface conditions make it difficult to employ conventional UT for porosity evaluation.
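The conventional UT parameters discussed above, the velocity from the time-of-flight difference and the attenuation coefficient from the amplitude ratio of two consecutive echoes, can be computed with a few lines once both echoes are resolvable. The sketch below assumes the natural-logarithm (neper) form of the attenuation coefficient; a decibel form based on 20·log10 is equally common, and the text does not indicate which was used. The time-of-flight and amplitude values are hypothetical, not measured.

```python
import math


def ultrasonic_velocity(distance_mm, tof_us):
    """v = d / tau, in mm/us (numerically equal to km/s)."""
    return distance_mm / tof_us


def attenuation_coefficient(distance_mm, a1, a2):
    """Assumed form: alpha = ln(A1 / A2) / d, in Np/mm."""
    return math.log(a1 / a2) / distance_mm


d = 2 * 3.0  # mm, twice the 3 mm specimen thickness
tau = 1.0    # us, hypothetical time-of-flight difference
a1, a2 = 1.0, 0.5  # hypothetical first and second echo amplitudes

v = ultrasonic_velocity(d, tau)
alpha = attenuation_coefficient(d, a1, a2)
print(f"velocity: {v:.2f} mm/us, attenuation: {alpha:.4f} Np/mm")
```

Both quantities depend on a reliable second echo: when its amplitude approaches the noise floor, as reported for the medium and rough surfaces, neither the time-of-flight difference nor the amplitude ratio can be read with confidence.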
Several reasons could exist for the high performance of the DL models in terms of their porosity evaluation of AM parts with rough surfaces. The first simple reason is their excellent ability to perform feature extraction. The use of DL models with deep and wide structures with hidden nodes is known to be more effective for extracting features than conventional UT [36]. The second reason is that the training dataset of the DL model consists of the raw ultrasonic signals, whereas conventional UT, which includes the ultrasonic velocity and attenuation measurements, only uses the velocity and attenuation coefficient parameters extracted from the ultrasonic signals. When the raw signal is used for training, various properties including not only the velocity and attenuation but also the ultrasonic backscattering and non-linearity can be used as features. Although not to the same extent as the velocity and attenuation, backscattering and non-linearity are also known to be related to the porosity content, which enhances the performance when DL models are used [50]. Our experimental results showed that the rougher the surface, that is, the lower the SNR, the more effective is the DL model. At the same time, the CNN model outperformed the DNN and MLP models because the CNN model, which uses a pre-processor, is beneficial for feature extraction from the waveform even for low SNRs. The waveform of the ultrasonic wave propagating through the porous medium varied locally. As mentioned above, the typical waveform variation is the delay in the arrival time and ultrasonic attenuation owing to local elastic inhomogeneity at the boundary of the pores. When the CNN model is used, both the convolutional and pooling layers in the pre-processor assign a greater weight to this variation in the waveform, thereby enabling the CNN to achieve more effective feature extraction than the other models.
Figure 9. Ultrasonic 1st and 2nd echo signals measured from surfaces with the "smooth condition", "medium condition", and "rough condition".
Note that, in addition to the surface roughness, porosity with an irregular distribution pattern may affect the UT performance. For example, if the porosity is distributed non-uniformly in the direction parallel to the surface to which the ultrasonic transducer is attached, the UT performance may vary depending on the measurement position (where the condition of the surface in contact with the transducer is assumed to be constant). Generally, porosity is non-uniform along the building direction because the cooling rate varies during the AM build. In contrast, the porosity in the plane normal to the building direction is relatively uniform [21]. In our experiments, the ultrasonic measurements were conducted with the transducer attached to the surface normal to the building direction, as shown in Figure 4. In other words, an ultrasonic wave propagating parallel to the building direction is affected by the non-uniform pore distribution along its path; however, the average porosity along this path is almost uniform across the surface to which the transducer is attached. Therefore, the irregular porosity distribution is expected to introduce few errors in the UT performance. However, at low porosity levels, this assumption may not hold. In fact, our experimental results indicated that the performance was slightly lower at porosity levels 1 and 2.

Evaluation of the Generalization Performance
To verify the applicability of the pre-trained model, a generalization performance test was carried out on newly fabricated AM specimens that were not used to train the models. The generalization test was conducted on specimens in the as-deposited condition, i.e., the "rough condition". Only the pre-trained CNN model, which delivered the best performance for this roughness condition, was used. Two new specimens were manufactured using the same AM process but different AM processing parameters. These parameters did not correspond to the processing conditions of the existing 10 AM specimens that were used to train the models. One hundred ultrasonic signals were obtained from each specimen and used as input to the pre-trained model. Figure 10 shows the generalization results for the two AM specimens obtained with the pre-trained CNN model. The model assessed the Test#1 specimen as "Porosity level 2" with the highest probability of 89% and as "Porosity level 1" with the second highest probability of 8%. It rated the Test#2 specimen as "Porosity level 8" with a probability of 91% and as "Porosity level 7" with 7%.
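One plausible way to turn the per-signal outputs into a specimen-level result of this kind is to average the class probabilities over all signals from a specimen. The sketch below assumes this averaging and a softmax output layer; the text does not state how the probabilities in Figure 10 were aggregated, and the function names and example logits are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def specimen_level_distribution(logits):
    """Average the per-signal class probabilities into one distribution per specimen."""
    return softmax(logits).mean(axis=0)


def top_k_levels(dist, k=2):
    """Return the k most probable porosity levels as (level, probability); levels start at 1."""
    order = np.argsort(dist)[::-1][:k]
    return [(int(i) + 1, float(dist[i])) for i in order]


# Hypothetical network outputs for 4 signals and 3 porosity levels (assumed values).
logits = np.array([[0.0, 5.0, 0.0],
                   [0.0, 5.0, 0.0],
                   [5.0, 0.0, 0.0],
                   [0.0, 5.0, 0.0]])
dist = specimen_level_distribution(logits)
print(top_k_levels(dist))
```

In this toy case most signals favor the second class, so level 2 ranks first and level 1 second, mirroring the form of the result reported for Test#1.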
To validate the results obtained with the pre-trained CNN model, SAM was also used to assess the porosity content of the tested specimens. Because SAM cannot be employed to examine as-deposited specimens with rough surfaces, the test specimens were additionally polished using wire EDM. Figure 11 shows the obtained C-mode images. The porosity contents calculated from these images are presented in Table 7 alongside the assessment of the pre-trained CNN model. The calculated porosity contents of Test#1 and Test#2 were 4.3% and 27%, respectively, which fall within the ranges of "Porosity level 2" and "Porosity level 8", respectively. In other words, the SAM results agreed well with the levels to which the CNN model assigned the highest probability. In addition, the average generalization performance for the "rough condition" was 90%, which is slightly lower than the testing performance in Section 3.2. This might be due to differences in the AM processing conditions [1] and the experimental environment.
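As a rough illustration of how a porosity content can be obtained from a C-mode image, the sketch below binarizes the image and reports the share of pixels classified as pores. The threshold, the assumption that pores appear as high-amplitude pixels, and the function name are hypothetical; the image-processing procedure actually used is not described in this section.

```python
import numpy as np


def porosity_percent(image, threshold, pores_are_bright=True):
    """Percentage of pixels classified as pores after thresholding a C-mode image."""
    img = np.asarray(image, dtype=float)
    pore_mask = img >= threshold if pores_are_bright else img <= threshold
    return 100.0 * float(pore_mask.mean())


# Hypothetical 10 x 10 image with a 2 x 2 pore region (4 pore pixels out of 100).
image = np.zeros((10, 10))
image[2:4, 5:7] = 1.0
print(porosity_percent(image, threshold=0.5))
```

The resulting percentage would then be compared with the porosity range that defines each level, as done for Test#1 and Test#2 in Table 7.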

Conclusions
In this work, DL techniques were used in conjunction with UT to evaluate the porosity of AM parts with rough surfaces. Key research outcomes were as follows.
(1) Various porosity mechanisms were investigated through SAM and OM analysis. Porosity contents increased in the order of normal (the relative porosity content measured by SAM: 0.7%), over-melting (4.2%), LOF (16.1%), and over-melting with low laser power conditions (34%).
(2) A comparison of the various DL models showed that all the models achieved an accuracy of at least 80.5%, even for the as-deposited specimens with surfaces in the "rough condition". In particular, the CNN was the most effective at 94.5%. Owing to the low SNR of the measured ultrasonic signals, conventional UT using ultrasonic velocity and attenuation coefficient measurements could not be used to assess "medium condition" and "rough condition" surfaces.
(3) A generalization test was also conducted on newly manufactured as-deposited AM specimens that were not used for training to evaluate the applicability of the pre-trained CNN model. The test results confirmed the model's high evaluation performance of 90.0%, which corresponded well with the results obtained with SAM.
These results suggest that the use of DL could be expected to enhance the UT performance with respect to the porosity evaluation of AM parts, even for as-deposited rough surfaces.
