Fault Detection of a Spherical Tank Using a Genetic Algorithm-Based Hybrid Feature Pool and k-Nearest Neighbor Algorithm

Abstract: Fault detection in metallic structures requires a detailed and discriminative feature pool creation mechanism to develop an effective condition monitoring system. Traditional fault detection methods incorporate handcrafted features from the time, frequency, or time-frequency domains. To explore the salient information provided by acoustic emission (AE) signals, a hybrid feature pool creation and optimal feature subset selection mechanism is proposed for crack detection in a spherical tank. The optimal hybrid feature pool creation process is composed of two major parts: (1) extraction of statistical features from the time and frequency domains, as well as extraction of traditional features associated with the AE signals; and (2) genetic algorithm (GA)-based optimal feature subset selection. The optimal feature subset is then provided to the k-nearest neighbor (k-NN) classifier to distinguish between normal (NC) and crack conditions (CC). Experimental results show that the proposed approach yields an average accuracy of 99.8% for health state classification. To validate the effectiveness of the proposed approach, it is compared to conventional non-linear dimensionality reduction techniques, as well as to classification without a feature selection scheme. The proposed approach outperforms the conventional non-linear dimensionality reduction techniques, achieving at least 2.55% higher classification accuracy.


Introduction
Metallic equipment is used in a multitude of applications in day-to-day life [1]. Within the oil and gas industry (e.g., gasoline, liquid petroleum gas, etc. [2,3]), spherical metal tanks are frequently used for fluid containment due to the potential benefits associated with the shape. As the use of spherical tanks across numerous industries increases, so does the number of accidents associated with these tanks. Such accidents are primarily due to issues such as corrosion, fatigue cracking, and bad installation [4]. Specifically, fatigue cracking leads to leaks and spills from the tank, which may lead to fatal accidents. To avoid such accidents, improved safety precautions and maintenance of the spherical tank are necessary [3]. Qualitative and quantitative evaluations to reduce the risks associated with spherical tanks are needed [3,5]. A reliable crack detection method for a spherical tank comprises the following: extraction of the features associated with the health states of the spherical tank, optimal feature subset selection, and fault detection.
Energies 2019, 12, 991
The existing crack identification algorithms for spherical tanks consist of monitoring incoming signals. However, there is no intelligent, or automated, crack classification method in place for this purpose. In this study, the primary focus is crack classification for the spherical tank, measured through acoustic emission signals. Detection of cracks in their early stages enables necessary measures to be undertaken in a timely fashion, thereby reducing accident occurrence. Acoustic emission (AE) testing is a promising nondestructive technology, capable of providing the information required for crack classification in the incipient stages. Compared to other nondestructive methods, AE is an economical and efficient alternative for recording the data associated with the health state of an object [6]. Additionally, the low-energy components found in AE signals can provide underlying information for data-driven fault identification approaches [7,8]. Due to these benefits, AE signals are used to record data and develop a data-driven crack classification model for spherical tanks.
Traditional data-driven fault identification methodologies rely on two important procedures: handcrafted feature extraction utilizing domain expertise, and detection of the health types using the extracted features. Signal-based health state diagnosis approaches rely primarily on the spectral analysis of the signals [9]. The choice of a signal analysis technique to extract discriminant information from the signals also has an impact on the performance of the fault classification process [10]. Therefore, in this study, the AE signals are analyzed in different domains to explore the detailed intrinsic information contained in the signals. The advantage of analyzing the signals in different domains is the acquisition of multi-domain knowledge of the signals, which would not be possible using a single-domain analysis. The multi-domain analysis enhances the performance of the classification model. To reduce the need for domain expertise, the feature extraction process has been automated using various deep learning approaches [6,11-13]. Though these approaches reduce the feature design effort, handcrafted health patterns are still required here, because the limited amount of data is not sufficient for neural network algorithms to extract meaningful features.
In addition to these issues, some studies [14-17] describe the importance of denoising the collected signals due to the presence of noise. Thresholding, wavelet transform, and empirical mode decomposition are the most popular signal denoising methods [14-16]. The efficacy of these methods deteriorates in the case of short-time transient signals. In a recent study [17], a blind source separation (BSS) technique was proposed to mitigate the challenges of these methods, i.e., threshold value selection [14], the overlapping frequencies issue in wavelet denoising [15], and the mode mixing problem in empirical mode decomposition [16]. It overcomes such challenges, but its performance deteriorates for the nonlinear signals from the spherical tank due to its imprecise source estimation and low separation ability [18]. Thus, the selection of an improper denoising algorithm may result in the loss of useful information and lead to unsatisfactory diagnostic performance. In this study, a hybrid feature pool is designed by considering features extracted from different domains through different techniques, so that the feature pool is enriched with useful information that enhances the final classification performance.
In this study, a data-driven hybrid feature pool mechanism is considered for identifying single faults. The hybrid feature pool creation process is divided into two parts: (a) feature extraction by analyzing the AE signals in different domains; and (b) genetic algorithm (GA)-based feature selection to determine the optimal feature subset. In AE signals, signal parameters such as amplitude, rise time, duration, and counts are very important for the exploration of salient information from the signal [11,19]. Moreover, statistical features from both the time and frequency domains ensure that the feature pool contains detailed salient information about the signals acquired from the spherical tank. By combining these AE signal features, a hybrid feature pool is designed. The designed hybrid feature pool has high dimensionality and may contain redundant information that affects the performance of the classifier. For dimensionality reduction and the avoidance of redundancy in the hybrid feature set, GA is used for automatic feature subset selection through a heuristic search approach (based on evolutionary computation theory) [20]. As a result, this feature selection process is a combination of domain knowledge and an optimal selection process to classify the two types of health condition. Finally, the selected features are passed to the k-nearest neighbor (k-NN) classifier as an input to decide the health condition. The main contributions of this work can be summarized as follows: (1) a hybrid feature pool is designed by combining the traditional features associated with the AE signals and statistical parameters from the time and frequency domains to effectively provide intrinsic and class-wise information of the signals; and (2) a feature selection mechanism is designed to determine the optimal feature subset from the hybrid feature pool by using a GA-based heuristic search approach. In the end, a k-NN classifier is applied to classify the health state, using the selected features as input.
The remainder of the paper is structured as follows. Section 2 provides details of the methodology, including the AE data acquisition system. The analysis of the experimental results and discussions are provided in Section 3, and the paper is concluded in Section 4.

Methodology
In this study, the main objective is to classify the health state of a spherical tank using AE signals. In Figure 1, a block diagram of the overall process is given. The proposed approach is divided into four major blocks: (1) data collection from a real multisensory testbed, (2) hybrid feature pool creation, (3) discriminant feature selection by a genetic algorithm, and (4) k-nearest neighbor-based classification.

Experimental Testbed and Dataset Acquisition
To validate the efficiency of the newly developed AE-based spherical tank crack classification method, tests were performed using a data acquisition system based on the industrial norm provided in ASME BPVC.V-2015 (American Society of Mechanical Engineers (ASME) Boiler & Pressure Vessel Code (BPVC)), as well as a recent study on spherical tank fault diagnosis [21]. A spherical tank consisting of carbon steel (A283 grade C) was used as a testbed to collect the AE signals, as shown in Figure 2. Additionally, a schematic diagram of the self-designed testbed is presented in Figure 3. In the diagram, the locations of a 3 mm pinhole crack (tank bottom) and the four separate channels are clearly visible. A pencil lead test was performed to produce a guided wave through the tank. A peripheral component interconnect bus (PCI-2)-based data acquisition (DAQ) device [22], connected with wideband differential AE sensors (WDI-AST) [23], was used for recording the AE signals [24]. The data acquisition system and channel (sensor) arrangement during the experiment are illustrated in Figures 3 and 4. The particulars of the physical sensors and the PCI board are provided in Table 1.

Hybrid Feature Pool
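It is very difficult to obtain intrinsic information for different health types from a raw signal. To create the health condition-based feature matrix, a hybrid feature pool is designed. The feature pool is created from the features obtained from three different domains, i.e., (1) traditional AE features, (2) statistical time-domain features, and (3) statistical frequency-domain features. According to the industry norm provided by ASME BPVC.V-2015, AE signals provide intrinsic information about a mechanical object. To increase the reliability of the feature extraction pipeline, traditional AE features are considered in addition to typical statistical features (time domain and frequency domain). The process of hybrid feature pool creation is illustrated in the block diagram presented in Figure 5.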

AE Signal Features
For the AE features, the amplitude, rise time, duration, and counts of the signals are calculated. Amplitude determines the detectability of the signal. Thus, for the first feature (F1), the amplitude of the AE signal is considered. To determine the other three features, a threshold value is needed. Therefore, the root mean square (RMS) value of the signal is calculated to determine the threshold value and extract the remaining AE features, i.e., the rise time (F2) and duration (F3) of the signal. Rise time refers to the time delay between the first threshold level crossing and the signal peak amplitude. This parameter is linked to the transmission of the wave between the source of the AE event and the sensor. Similarly, the duration refers to the time between the first and last threshold crossing. Lastly, the counts are calculated (F4) [25]. The details of the AE features are illustrated in Figure 6.
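The four AE features described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the threshold is taken to be equal to the RMS of the signal, crossings are detected on the rectified signal, and counts are taken as the number of upward threshold crossings. The function name `ae_features` and the dictionary keys are hypothetical.

```python
import math


def ae_features(signal, fs):
    """Return amplitude (F1), rise time (F2), duration (F3), counts (F4).

    Assumptions (not confirmed by the source): threshold = RMS of the
    signal, crossings are evaluated on |signal|, and counts are the
    upward crossings of the threshold.
    """
    rectified = [abs(x) for x in signal]
    threshold = math.sqrt(sum(x * x for x in signal) / len(signal))
    amplitude = max(rectified)
    peak_idx = rectified.index(amplitude)

    # Indices of all samples that exceed the threshold.
    above = [i for i, x in enumerate(rectified) if x > threshold]
    if not above:  # flat signal: nothing crosses the threshold
        return {"amplitude": amplitude, "rise_time": 0.0,
                "duration": 0.0, "counts": 0}

    first, last = above[0], above[-1]
    rise_time = (peak_idx - first) / fs   # first crossing -> peak
    duration = (last - first) / fs        # first crossing -> last crossing
    counts = sum(
        1 for i in range(1, len(rectified))
        if rectified[i - 1] <= threshold < rectified[i]
    )
    return {"amplitude": amplitude, "rise_time": rise_time,
            "duration": duration, "counts": counts}
```

For a signal sampled at `fs` Hz, `ae_features(signal, fs)` returns the times in seconds. A fixed threshold set by the acquisition hardware could be substituted for the RMS value without changing the rest of the logic.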

Classical Statistical Features
After the calculation of traditional AE features, the classical statistical features extracted from both the time and frequency domains are considered. The key idea behind such variety in the feature extraction procedure is to include all necessary information about all types of health conditions. These features are considered discriminative given that there is a substantial variation in the magnitude of the signal when impulses appear due to a crack in the spherical tank. From the time domain, the extracted statistical feature parameters are: root mean square (F5), square mean root (F6), peak to peak (F7), kurtosis (F8), skewness (F9), kurtosis factor (F10), 5th normalized moment (F11), crest factor (F12), impulse factor (F13), shape factor (F14), and 6th normalized moment (F15). These 11 time-domain features provide statistical details about the nature of the signals and were found to be relatively good indicators of cracks, owing to their sensitivity [26,27].
Moreover, to create a robust feature pool that contains the maximum amount of information from different domains, statistical features from the frequency domain are also considered. The features from the frequency spectrum obtained through the fast Fourier transform (FFT) provide additional information regarding the PV crack [28,29]. The features extracted from the frequency domain are: RMS of frequency (F16), root variance of frequency (F17), and mean frequency (F18). The fourteen statistical features extracted from the time domain and frequency domain are mathematically described in Table 2.

Table 2. Statistical features from the time and frequency domains (columns: Feature, Equation).
Here, X is the original AE signal and F is the frequency-domain representation of the signal X. N denotes the total number of samples of the signal.
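As a rough illustration of the statistical features F5 to F18, the sketch below uses commonly used textbook definitions. These definitions are assumptions and may differ in detail from the entries of Table 2; in particular, the kurtosis factor is taken here as kurtosis divided by the squared mean square, and the three frequency-domain features are computed on the magnitude spectrum. All function names are hypothetical, and a naive DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath
import math


def _mean(values):
    return sum(values) / len(values)


def _normalized_moment(x, order):
    """Central moment of the given order divided by sigma**order."""
    mu = _mean(x)
    sigma = math.sqrt(_mean([(v - mu) ** 2 for v in x]))
    return _mean([((v - mu) / sigma) ** order for v in x])


def time_domain_features(x):
    """Eleven time-domain statistics (F5-F15) under assumed definitions."""
    abs_x = [abs(v) for v in x]
    rms = math.sqrt(_mean([v * v for v in x]))
    kurtosis = _normalized_moment(x, 4)
    return {
        "rms": rms,
        "square_mean_root": _mean([math.sqrt(v) for v in abs_x]) ** 2,
        "peak_to_peak": max(x) - min(x),
        "kurtosis": kurtosis,
        "skewness": _normalized_moment(x, 3),
        "kurtosis_factor": kurtosis / rms ** 4,
        "moment_5": _normalized_moment(x, 5),
        "crest_factor": max(abs_x) / rms,
        "impulse_factor": max(abs_x) / _mean(abs_x),
        "shape_factor": rms / _mean(abs_x),
        "moment_6": _normalized_moment(x, 6),
    }


def magnitude_spectrum(x):
    """Naive O(N^2) DFT magnitudes for the non-negative frequency bins."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def frequency_domain_features(x):
    """Three spectral statistics (F16-F18) of the magnitude spectrum."""
    f = magnitude_spectrum(x)
    mean_f = _mean(f)
    return {
        "rms_frequency": math.sqrt(_mean([v * v for v in f])),
        "root_variance_frequency": math.sqrt(_mean([(v - mean_f) ** 2 for v in f])),
        "mean_frequency": mean_f,
    }
```

A constant signal has zero standard deviation, so the normalized moments are undefined for it; real AE records would also call for an FFT routine rather than the quadratic-time DFT used here.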

Feature Selection by Genetic Algorithm (GA)
The hybrid feature pool contains traditional AE features along with the time and frequency domain-based statistical features. Thus, the dimensionality of the feature vectors is high. Moreover, due to the large number of features obtained from the four channels in the hybrid feature pool, the data can be redundant [30]. Therefore, it is necessary to determine the optimal feature subset to obtain discriminant information regarding the spherical tank health condition. The optimal subset of features can be determined through different search techniques, i.e., complete, sequential, and heuristic searches [19,30]. However, the brute-force mechanism of the complete search adds computational complexity, and the sequential approach gives no guarantee that the optimal feature subset is selected. On the other hand, as a heuristic search, the GA provides a balance between optimal selection and computational complexity. Therefore, in this study, GA is considered for the selection of the optimal feature subset from the hybrid feature pool.
Based on evolution theory, i.e., selection, crossover, mutation, and replacement, GA determines the best combination of features with the most intrinsic class-wise information. To find the best feature combination that creates separability among classes, the degree of class separation (DCS) is calculated by Equation (1), where ICS is the inter-class separability parameter used to define the distance among different classes. Similarly, WCC is the within-class closeness that defines the closeness of the features within the same class. The Euclidean distance, given in Equation (2), is used to find the distance between two vectors.
A high rating for DCS is achieved when ICS is maximized and/or WCC is minimized. The ICS is determined based on the average distance between the feature vectors of different classes, as given in Equation (3).
Similarly, WCC is obtained through the average distance between the feature vectors of the same class, as given in Equation (4).
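The displayed forms of Equations (1) to (4) are not reproduced in this text. One set of expressions consistent with the verbal description is sketched below; it is an assumption rather than the authors' exact formulation. Here d is the number of features in a vector, x_{i,k} is the k-th feature of vector i, and n_NC and n_FC are taken to be the numbers of feature vectors in the normal and faulty classes.

```latex
\begin{align}
\mathrm{DCS} &= \frac{\mathrm{ICS}}{\mathrm{WCC}} \tag{1}\\
D_{i,j} &= \sqrt{\sum_{k=1}^{d}\left(x_{i,k}-x_{j,k}\right)^{2}} \tag{2}\\
\mathrm{ICS} &= \frac{1}{n_{NC}\,n_{FC}}\sum_{i=1}^{n_{NC}}\sum_{j=1}^{n_{FC}} D_{i,j} \tag{3}\\
\mathrm{WCC} &= \frac{1}{N}\sum_{c\in\{NC,\,FC\}}\frac{1}{n_{c}\left(n_{c}-1\right)}
\sum_{i=1}^{n_{c}}\sum_{\substack{j=1\\ j\neq i}}^{n_{c}} D_{i,j} \tag{4}
\end{align}
```

With these forms, DCS grows when vectors of different classes move apart (larger ICS) or when vectors of the same class cluster more tightly (smaller WCC), which matches the behavior described in the text.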
In these equations, N defines the number of classes. In this study, the experiment is conducted based on two types of health conditions, i.e., normal and faulty. Therefore, N = 2. Additionally, n_NC defines the feature vector for the normal condition, n_FC defines the feature vector for the faulty condition, and D_{i,j} is the Euclidean distance, derived from Equation (2).
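The selection procedure can be illustrated with a small GA over fixed-size feature masks. This is a minimal sketch, not the authors' implementation: the fitness is assumed to be the ratio of the mean between-class distance to the mean within-class distance, and the population size, mutation rate, elitist replacement, and all function names are hypothetical choices. Each class needs at least two feature vectors for the within-class term to be defined.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def dcs(normal, crack, mask):
    """Assumed score: mean between-class / mean within-class distance."""
    idx = [i for i, bit in enumerate(mask) if bit]
    nc = [[row[i] for i in idx] for row in normal]
    cc = [[row[i] for i in idx] for row in crack]
    ics = sum(euclidean(a, b) for a in nc for b in cc) / (len(nc) * len(cc))
    within = []
    for group in (nc, cc):
        pairs = [euclidean(group[i], group[j])
                 for i in range(len(group)) for j in range(i + 1, len(group))]
        within.append(sum(pairs) / len(pairs))
    wcc = sum(within) / len(within)
    return ics / (wcc + 1e-12)  # small constant avoids division by zero


def random_mask(n_features, n_select, rng):
    chosen = set(rng.sample(range(n_features), n_select))
    return [1 if i in chosen else 0 for i in range(n_features)]


def crossover(a, b, n_select, rng):
    """Child keeps n_select features drawn from the union of both parents."""
    union = [i for i in range(len(a)) if a[i] or b[i]]
    chosen = set(rng.sample(union, n_select))
    return [1 if i in chosen else 0 for i in range(len(a))]


def mutate(mask, rate, rng):
    """With probability `rate`, swap one selected feature for an unselected one."""
    mask = mask[:]
    if rng.random() < rate:
        on = [i for i, bit in enumerate(mask) if bit]
        off = [i for i, bit in enumerate(mask) if not bit]
        if on and off:
            mask[rng.choice(on)] = 0
            mask[rng.choice(off)] = 1
    return mask


def ga_select(normal, crack, n_select=2, pop_size=20, generations=50,
              rate=0.3, seed=0):
    """Return the indices of the selected features."""
    rng = random.Random(seed)
    n_features = len(normal[0])

    def fitness(mask):
        return dcs(normal, crack, mask)

    population = [random_mask(n_features, n_select, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]        # selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, n_select, rng), rate, rng))
        population = parents + children              # elitist replacement
    best = max(population, key=fitness)
    return [i for i, bit in enumerate(best) if bit]
```

Fixing the subset size through `n_select` keeps every candidate comparable; a GA that also searched over the subset size would need a fitness that penalizes larger subsets.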

Fault Classification Using k-Nearest Neighbor (k-NN)
The proposed hybrid feature selection model selects the optimal feature matrix for health state classification. In this study, to classify the optimal feature subsets into their respective classes, a k-NN classifier is utilized. The k-NN is one of the most widely used classifiers due to its simple architecture and low computational complexity [7,30,31]. The k-NN classifies samples depending on the votes of the k nearest neighbors, which are defined by a distance measure [30,32]. In k-NN, there is no specific training phase before classification. Instead, any effort to simplify or extract information is made upon classification, and the entire training data set remains in memory during classification. The computational complexity of the classifier is very high when used with high-dimensional feature vectors, unless some dimensionality reduction technique is applied before the classification. For this reason, in this study, we apply a GA-based optimal feature subset selection algorithm to reduce the dimensions of the data. This approach selects the optimal feature subset from the original feature vectors for k-NN to perform the classification task. A visual explanation of the k-NN algorithm is given in Figure 7.
From Figure 7, we can see that the new sample (inside the circle, light blue) should be classified either as class A (red square) or as class B (dark purple star). If k = 3, the new sample belongs to class B, because class B is the majority inside the second circle: there are two stars (class B) and one square (class A). If k = 5, the sample belongs to class A, because three samples from class A and two samples from class B lie inside the outermost black dotted circle.
Thus, in k-NN, two important parameters must be selected to complete the classification task: the value of k, which defines the number of neighbors, and the distance metric. The value of k can be chosen arbitrarily or through a cross-validation process. In this study, we start from an arbitrary value of k and then vary it, validating the classifier output for each value to determine the best k. The Euclidean distance metric is given by Equation (2).
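The voting rule and the search over k can be sketched as follows. This is a minimal illustration using the Euclidean distance and a simple majority vote; the function names and the default candidate range are hypothetical, and the coordinates in the usage note are invented to mimic the k = 3 versus k = 5 situation described for Figure 7.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Label of `query` by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)


def best_k(train_x, train_y, test_x, test_y, candidates=range(1, 9)):
    """Candidate k with the highest accuracy (first one wins on ties)."""
    return max(candidates,
               key=lambda k: knn_accuracy(train_x, train_y, test_x, test_y, k))
```

With two class B points at distances 0.1 and 0.2 from the query and three class A points at 0.3, 0.4, and 0.5, `knn_predict` returns "B" for k = 3 and "A" for k = 5. Odd values of k avoid tied votes in a two-class problem.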

Dataset Description
For an in-depth analysis of the proposed crack classification scheme, the experimental analysis and comparative discussion are presented in this section. In this study, AE signals acquired from a spherical tank were used to conduct the experiment. The 0.1 s signals, sampled at 1 MHz, were recorded under both normal and crack conditions of the spherical tank. The signals were divided into training and test datasets: 60% of the data was used for training the model and 40% was used for testing. The details of the dataset are provided in Table 3. In Figure 8, the time-domain waveforms of the AE signals, with their corresponding frequency-domain spectra for each condition, are presented.
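As a quick check of the numbers above, a 0.1 s record sampled at 1 MHz contains 100,000 samples. The sketch below computes this and shows one way to form a 60/40 split. The shuffling, the seed, and the function names are assumptions, and the number of signals used in the usage note is hypothetical because the actual counts are given in Table 3.

```python
import random


def samples_per_signal(duration_s, fs_hz):
    """Number of samples in one record of the given duration."""
    return round(duration_s * fs_hz)


def split_indices(n_signals, train_fraction=0.6, seed=0):
    """Shuffle signal indices and split them into training and test sets."""
    rng = random.Random(seed)
    indices = list(range(n_signals))
    rng.shuffle(indices)
    n_train = round(n_signals * train_fraction)
    return indices[:n_train], indices[n_train:]
```

For example, `split_indices(100)` yields 60 training indices and 40 test indices, with no index shared between the two sets.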

Performance Analysis of the Discriminant Feature Selection by GA from the Hybrid Feature Pool
A hybrid feature pool creation mechanism is proposed in this study to acquire comprehensive salient information from the AE signals of a spherical tank for a given health condition. In total, eighteen different features are used to create the hybrid feature pool. Of these eighteen features, four (F1-F4) are traditional AE features (illustrated in Figure 6), eleven (F5-F15) are statistical time-domain features, and three (F16-F18) are statistical frequency-domain features. The combination of features from different domains provides valuable intrinsic information associated with different health conditions. To reduce the dimensions of the hybrid feature pool, a heuristic search-based GA is applied with 1000 generations to select an optimal subset of two features for the creation of the final feature matrix. In Figure 9, the optimal features (F1, F5) selected by the GA are presented in a 2D plot. F1 (amplitude) belongs to the traditional AE features and F5 (RMS) belongs to the time-domain statistical feature set. This optimal subset indicates the necessity of including both AE signal features and statistical features in the hybrid feature pool.

Performance Analysis of k-Nearest Neighbor Algorithm
The optimal feature subset selected by GA is applied to the k-NN algorithm for classification into the respective classes (i.e., normal and crack). From the GA-based selected features presented in Figure 9, the separability of the two classes is clearly visible. As a result, placement of an optimal boundary line between the two classes for the k-NN classifier is relatively clear. To draw the decision boundary and evaluate the classification performance of the k-NN classifier, we consider arbitrary values of k (number of neighbors) ranging from 1 to 8. The experiment is repeated multiple times to determine the best value for k. After determining the best value of k based on the test accuracy, we draw the boundary line for classification. From Figure 10a, the optimal value for k is 5 (for k = 5, the accuracy is 100%). Based on this optimal value, the decision boundary is drawn in Figure 10b. The classification performance is obtained by Equation (5).

$$\text{Avg. classification accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{Total number of samples}} \tag{5}$$

To further ensure cross-validation of the result analysis, and to determine the final classification accuracy, we analyzed the results of 10 experiments using the proposed method with an optimal value of k = 5. The classification results for the 10 experiments are given in Figure 11. The final classification accuracy was determined with Equation (6). In our experiment, the final classification accuracy was 99.8%.
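The accuracy in Equation (5) and the aggregation over repeated experiments can be sketched as follows. Equation (6) is not reproduced in this text, so the final accuracy is assumed here to be the arithmetic mean of the per-experiment accuracies. The function names, the choice of "crack" as the positive class, and the run values in the usage note are hypothetical.

```python
def classification_accuracy(y_true, y_pred, positive="crack"):
    """(true positives + true negatives) / total number of samples."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return (tp + tn) / len(y_true)


def final_accuracy(run_accuracies):
    """Assumed aggregation: mean accuracy over the repeated experiments."""
    return sum(run_accuracies) / len(run_accuracies)
```

For instance, four hypothetical runs with accuracies 1.0, 0.99, 1.0, and 0.98 give a final accuracy of 0.9925.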

Comparison Analysis
Several comparisons are conducted to validate the efficiency and robustness of the proposed approach. The comparisons are made using crack classification models in which different dimensionality reduction algorithms are used. In the proposed crack classification methodology, the GA-based feature selection module is replaced with principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) for the selection of optimal feature subsets. The feature subset obtained through each dimensionality reduction algorithm is provided to the same k-NN classifier to evaluate the classification performance. Moreover, a comparison is made with the published study (PCA + k-NN) presented in [27], which was used for fault diagnosis of a mechanical system. In addition, the proposed study is compared with a recent study [17] that uses BSS-based signal denoising and wavelet features for data classification. The comparison results indicate that the proposed approach (GA + k-NN), designed for the crack classification of a spherical tank, outperforms conventional state-of-the-art methods. The (GA + k-NN) approach outperforms the all-features baseline (no feature selection), (PCA + k-NN) [27], and (t-SNE + k-NN) by 15.35%, 8.01%, and 43.55%, respectively. Additionally, it outperforms the BSS-based feature selection approach by 2.55%. The details of the results are given in Table 4.

Conclusions
This paper presented a crack classification model for a spherical tank based on the creation of an optimal hybrid feature pool with the help of a genetic algorithm (GA) and the classification of instances using a k-NN classifier. A hybrid feature pool was created from features extracted from different domains (i.e., traditional AE features, statistical time-domain features, and frequency-domain features). The hybrid feature pool contains 18 features. A heuristic search-based genetic algorithm was then applied to remove redundancy and reduce the dimensionality of the original hybrid feature pool. The GA-based feature selection process extracted two optimal feature subsets. The optimal feature subset was provided to a k-NN classifier, which assigned each instance to its respective class. The proposed crack classification model for the spherical tank yielded an average accuracy of 99.8%. To validate the effectiveness of the proposed model, results were compared to other

Figure 1. Block diagram of the proposed methodology.

Figure 2. The testbed used to collect AE signals during the experiment.

Figure 4. Setup of data acquisition devices and channels.
The hybrid feature pool is created from the features obtained from three different domains, i.e., (1) traditional AE features, (2) statistical time-domain features, and (3) statistical frequency-domain features. According to the industry norm provided by ASME BPVC.V-2015, AE signals provide intrinsic information about a mechanical object. To increase the reliability of the feature extraction pipeline, traditional AE features are considered in addition to typical statistical features (time-domain and frequency-domain). The process of hybrid feature pool creation is illustrated in the block diagram presented in Figure 5.
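As a rough illustration of how features from the three domains can be combined into one pool, the following Python sketch computes a few representative quantities from a single AE signal. It is a sketch under stated assumptions and not the paper's feature list: the particular features, the function names (`ae_features`, `time_features`, `freq_features`, `hybrid_feature_pool`), the detection threshold, and the sampling rate are all hypothetical, and the sketch yields 11 features, whereas the paper's pool contains 18. The signal is assumed to be non-constant so that the standard deviation and RMS are non-zero.

```python
import cmath
import math


def ae_features(x, fs, threshold):
    """Traditional AE hit features relative to a detection threshold (assumed set)."""
    above = [i for i, v in enumerate(x) if abs(v) > threshold]
    peak_idx = max(range(len(x)), key=lambda i: abs(x[i]))
    # Counts: number of upward crossings of the threshold.
    counts = sum(1 for a, b in zip(x, x[1:]) if a <= threshold < b)
    if above:
        duration = (above[-1] - above[0]) / fs
        rise_time = max(peak_idx - above[0], 0) / fs
    else:
        duration = rise_time = 0.0
    return {
        "peak_amplitude": abs(x[peak_idx]),
        "counts": counts,
        "duration": duration,
        "rise_time": rise_time,
        "energy": sum(v * v for v in x) / fs,  # integral of x^2 with dt = 1/fs
    }


def time_features(x):
    """Statistical time-domain features (assumes a non-constant signal)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    rms = math.sqrt(sum(v * v for v in x) / n)
    return {
        "rms": rms,
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "crest_factor": max(abs(v) for v in x) / rms,
    }


def freq_features(x, fs):
    """Statistical frequency-domain features from a plain one-sided DFT."""
    n = len(x)
    half = n // 2
    mags = [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(half + 1)
    ]
    freqs = [k * fs / n for k in range(half + 1)]
    total = sum(mags)
    return {
        "spectral_centroid": sum(f * m for f, m in zip(freqs, mags)) / total,
        "peak_frequency": freqs[max(range(len(mags)), key=mags.__getitem__)],
    }


def hybrid_feature_pool(x, fs, threshold):
    """Concatenate the three feature groups into one pool for a single signal."""
    return {**ae_features(x, fs, threshold), **time_features(x), **freq_features(x, fs)}


if __name__ == "__main__":
    fs = 1000.0  # hypothetical sampling rate in Hz
    # Synthetic burst: a 100 Hz sinusoid with exponential decay.
    x = [math.exp(-t / 50.0) * math.sin(2 * math.pi * 100.0 * t / fs) for t in range(200)]
    for name, value in hybrid_feature_pool(x, fs, threshold=0.1).items():
        print(f"{name}: {value:.4g}")
```

Each signal then contributes one row of the pool, and a feature selection step can operate on the columns. The direct DFT is used only to keep the sketch free of third-party dependencies; an FFT routine would be the usual choice for real AE records.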

Figure 7. Illustration of the k-NN algorithm for a binary classification problem.

Figure 9. Optimal feature set selected by GA for different health conditions.

Figure 10. (a) Different classification accuracies as a function of the number of neighbors (k). The optimal value is k = 5. (b) Drawing of the decision boundary for separation of the two classes after determining the optimal value for k.

Figure 11. Classification performance of the proposed approach determined for 10 experiments.

Table 1. Specifications of the proposed data acquisition system.

Table 2. Statistical features from time and frequency domains for the hybrid feature pool.

Table 3. Details of the health conditions considered for the experiment.

Table 4. Classification accuracy among different methods.