Fault Detection of a Spherical Tank Using a Genetic Algorithm-Based Hybrid Feature Pool and k-Nearest Neighbor Algorithm

Hasan, Md Junayed; Kim, Jong-Myon

doi:10.3390/en12060991

by Md Junayed Hasan and Jong-Myon Kim *

Department of Electrical, Electronics and Computer Engineering, University of Ulsan, Ulsan 44610, Korea

* Author to whom correspondence should be addressed.

Energies 2019, 12(6), 991; https://doi.org/10.3390/en12060991

Submission received: 24 February 2019 / Revised: 9 March 2019 / Accepted: 11 March 2019 / Published: 14 March 2019

(This article belongs to the Special Issue Fault Diagnosis and Fault-Tolerant Control)

Abstract:

Fault detection in metallic structures requires a detailed and discriminative feature pool creation mechanism to develop an effective condition monitoring system. Traditional fault detection methods incorporate handcrafted features from the time, frequency, or time-frequency domains. To explore the salient information provided by acoustic emission (AE) signals, a hybrid of feature pool creation and an optimal feature subset selection mechanism is proposed for crack detection in a spherical tank. The optimal hybrid feature pool creation process is composed of two major parts: (1) extraction of statistical features from the time and frequency domains, as well as extraction of traditional features associated with the AE signals; and (2) genetic algorithm (GA)-based optimal feature subset selection. The optimal feature subset is then provided to the k-nearest neighbor (k-NN) classifier to distinguish between normal conditions (NC) and crack conditions (CC). Experimental results show that the proposed approach yields an average accuracy of 99.8% for health state classification. To validate its effectiveness, the proposed approach is compared to conventional non-linear dimensionality reduction techniques, as well as to approaches without feature selection schemes. The comparison shows that the proposed approach outperforms conventional non-linear dimensionality reduction techniques, achieving at least 2.55% higher classification accuracy.

Keywords:

acoustic emissions; fault diagnosis; genetic algorithm; hybrid feature pool; k-NN classifier; spherical tank; statistical features

1. Introduction

Metallic equipment is used in a multitude of applications in day-to-day life [1]. Within the oil and gas industry (e.g., gasoline, liquid petroleum gas, etc. [2,3]), spherical metal tanks are frequently used for fluid containment due to the potential benefits associated with the shape. As the use of spherical tanks across numerous industries increases, so does the number of accidents associated with these tanks. Such accidents are primarily due to issues such as corrosion, fatigue cracking, bad installation, etc. [4]. Specifically, fatigue cracking leads to leaks and spills from the tank, which may lead to fatal accidents. To avoid such accidents, improved safety precautions and maintenance of the spherical tank are necessary [3]. Qualitative and quantitative evaluations to reduce the risks associated with spherical tanks are needed [3,5]. A reliable crack detection method for a spherical tank comprises the following: extraction of the features associated with the health states of the spherical tank, optimal feature subset selection, and fault detection.

The existing crack identification algorithms for spherical tanks consist of monitoring incoming signals. However, there is no intelligent, or automated, crack classification method in place for this purpose. In this study, the primary focus is crack classification for the spherical tank, measured through acoustic emission signals. Detection of cracks in their early stages enables necessary measures to be undertaken in a timely fashion, thereby reducing accident occurrence. Acoustic emission (AE) signals are a promising nondestructive technology, capable of providing the information required for crack classification in the incipient stages. Compared to other nondestructive methods, AE is an economical and efficient alternative for recording the data associated with the health state of an object [6]. Additionally, the low energy signals found in AE signals can provide underlying information for substantial data-driven fault identification approaches [7,8]. Due to these benefits, AE signals are used to record data and develop a data-driven crack classification model for spherical tanks.

Traditional data-driven fault identification methodologies rely on two important procedures: handcrafted feature extraction utilizing domain expertise; and detection of the health types using the extracted features. Signal-based health state diagnosis approaches rely primarily on the spectral analysis of the signals [9]. The choice of a signal analysis technique to extract discriminant information from the signals also has an impact in the performance of the fault classification process [10]. Therefore, in this study, the AE signals are analyzed in different domains to explore the detailed intrinsic information contained in the signals. The advantage of analyzing the signals in different domains is the acquisition of multi-domain knowledge of the signals, which would otherwise not be possible using a single domain analysis. The multi-domain analysis enhances the performance of the classification model. To reduce the necessity of domain knowledge expertise, the feature extraction process is automated using various deep learning approaches [6,11,12,13]. Though these approaches decrease the feature design process for the signals, unique health patterns are still required due to the limited amount of data, which is otherwise not sufficient for the extraction of meaningful features with neural network algorithms.

In addition to these issues, some studies [14,15,16,17] describe the importance of denoising the collected signals due to the presence of noise. Thresholding, wavelet transform, and empirical mode decomposition are the most popular signal de-noising methods [14,15,16]. The efficacy of these methods deteriorates in the case of short-time transient signals. In a recent study [17], a blind source separation (BSS) technique was proposed to mitigate the challenges of these methods, i.e., threshold value selection [14], addressing the overlapping frequencies issue in wavelet denoising [15], and tackling the mode mixing problem in empirical mode decompositions [16]. It overcomes such challenges, but its performance deteriorates for the nonlinear signals from the spherical tank due to its imprecise source estimation and low separation ability [18]. Thus, the selection of an improper denoising algorithm may result in the loss of useful information and lead to unsatisfactory diagnostic performances. In this study, a hybrid feature pool is designed by considering features extracted from different domains through different techniques, so that the feature pool is enriched in useful information that enhances the final classification performance.

In this study, a data-driven hybrid feature pool mechanism is considered for identifying single faults. The hybrid feature pool creation process is divided into two parts: (a) feature extraction by analyzing the AE signals in different domains; and (b) genetic algorithm (GA)-based feature selection to determine the optimal feature subset. In AE signals, signal parameters such as amplitude, rise time, duration, and counts are very important for the exploration of salient information from the signal [11,19]. Moreover, statistical features from both the time and frequency domains ensure that the feature pool contains detailed salient information about the signals acquired from the spherical tank. By combining these AE signal features, a hybrid feature pool is designed. The designed hybrid feature pool has high dimensionality and may contain redundant information that affects the performance of the classifier. For dimensionality reduction and the avoidance of redundancy in the hybrid feature set, GA is used for automatic feature subset selection through a heuristic search approach (based on evolutionary computation theory) [20]. As a result, this feature selection process combines domain knowledge with an optimal selection process to classify the two types of health condition. Finally, the selected features are passed to the k-nearest neighbor (k-NN) classifier as input to decide the health condition.

The main contributions of this work can be summarized as follows: (1) a hybrid feature pool is designed by combining the traditional features associated with the AE signals and statistical parameters from the time and frequency domains to effectively provide intrinsic and class-wise information of the signals; and (2) a feature selection mechanism is designed to determine the optimal feature subset from the hybrid feature pool by using a GA-based heuristic search approach. In the end, a k-NN classifier is applied to classify the health state, using the selected features as input.

The remainder of the paper is structured as follows. Section 2 provides details of the methodology, including the AE data acquisition system. The analysis of the experimental results and discussions are provided in Section 3, and the paper is finally concluded in Section 4.

2. Methodology

In this study, the main objective is to classify the health state of a spherical tank using AE signals. In Figure 1, a block diagram of the overall process is given. The proposed approach is divided into four major blocks: (1) data collection from a real multisensory testbed, (2) hybrid feature pool, (3) discriminant feature selection by a genetic algorithm, and (4) k-nearest neighbor-based classification.

2.1. Experimental Testbed and Dataset Acquisition

To validate the efficiency of the newly developed AE-based spherical tank crack classification method, tests were performed using a data acquisition system based on the industrial norm provided in ASME BPVC.V-2015 (American Society of Mechanical Engineers (ASME) Boiler & Pressure Vessel Code (BPVC)), as well as a recent study on spherical tank fault diagnosis [21]. A spherical tank consisting of carbon steel (A283 grade C) was used as a testbed to collect the AE signals, as shown in Figure 2. Additionally, a schematic diagram of the self-designed testbed is presented in Figure 3. In the diagram, the locations of a 3 mm pinhole crack (tank bottom) and the four separate channels are clearly visible. A pencil lead test was performed to produce a guided wave through the tank. A peripheral component interconnect bus (PCI-2)-based data acquisition (DAQ) device [22], connected with wideband differential AE sensors (WDI-AST) [23], was used for recording the AE signals [24]. The data acquisition system and channel (sensor) arrangement during the experiment is illustrated in Figure 3 and Figure 4. The particulars of the physical sensors and the PCI board are provided in Table 1.

2.2. Hybrid Feature Pool

It is very difficult to obtain intrinsic information for different health types from a raw signal. To create the health condition-based feature matrix, a hybrid feature pool is designed. The feature pool is created from the features obtained from three different domains, i.e., (1) traditional AE features, (2) statistical time-domain features, and (3) statistical frequency-domain features. According to the industry norm provided by ASME BPVC.V-2015, AE signals provide intrinsic information about a mechanical object. To increase the reliability of the feature extraction pipeline, traditional AE features are considered in addition to typical statistical features (time-domain and frequency domain). The process of hybrid feature pool creation is illustrated in the block diagram presented in Figure 5.

2.3. AE Signal Features

For the AE features, the amplitude, rise time, duration, and counts of the signals are calculated. Amplitude determines the detectability of the signal. Thus, for the first feature (F1), the amplitude of the AE signal is considered. To determine the other three features, a threshold value is needed. Therefore, the root mean square (RMS) value of the signal is calculated to determine the threshold value and extract the remaining AE features, i.e., the rise-time (F2) and duration (F3) of the signal. Rise time refers to the time delay between the first threshold level crossing and the signal peak amplitude. This constraint is linked to the transmission of the wave between the source of the AE event and the sensor. Similarly, the duration refers to the time between the first and last threshold crossing. Lastly, the counts are calculated (F4) [25]. The details of the AE features are illustrated in Figure 6.
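The four AE features described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `ae_features`, the use of the rectified signal for threshold crossings, and the definition of counts as upward crossings of the RMS threshold are all assumptions made for this example.

```python
import math

def ae_features(signal, fs):
    """Return (amplitude, rise_time, duration, counts) for one AE record.

    Assumptions (not taken from the paper): the threshold equals the
    signal RMS, rise time and duration are measured on the rectified
    signal, and counts are upward crossings of the threshold.
    """
    if not signal:
        raise ValueError("signal must not be empty")
    rectified = [abs(v) for v in signal]
    threshold = math.sqrt(sum(v * v for v in signal) / len(signal))

    amplitude = max(rectified)                  # F1: peak amplitude
    peak_idx = rectified.index(amplitude)

    above = [i for i, v in enumerate(rectified) if v > threshold]
    if not above:                               # e.g., a constant signal
        return amplitude, 0.0, 0.0, 0
    first, last = above[0], above[-1]

    rise_time = (peak_idx - first) / fs         # F2: first crossing to peak
    duration = (last - first) / fs              # F3: first to last crossing
    counts = sum(                               # F4: upward threshold crossings
        1 for i in range(1, len(signal))
        if signal[i - 1] <= threshold < signal[i]
    )
    return amplitude, rise_time, duration, counts
```

For the signals in this study, `fs` would be the 1 MHz sampling frequency, so rise time and duration are returned in seconds.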

2.4. Classical Statistical Features

After the calculation of traditional AE features, the classical statistical features extracted from both time and frequency domains are considered. The key concept of such variety in the feature extraction procedure is to include all necessary information about all types of health conditions. These features are considered as discriminative given that there is a substantial variation in the magnitude of the signal when impulses appear due to a crack in the spherical tank. From the time-domain, the extracted statistical feature parameters are: root mean square (F5), square mean root (F6), peak to peak (F7), kurtosis (F8), skewness (F9), kurtosis factor (F10), 5th normalized moment (F11), crest factor (F12), impulse factor (F13), shape factor (F14), and 6th normalized moment (F15). These 11 time-domain features provide the statistical details about the nature of the signals and were discovered to be relatively good quality features for cracks, due to their sensitivities [26,27].
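A sketch of several of these time-domain descriptors is given below. The formulas are the commonly used textbook definitions and are assumed here; the paper's exact definitions are those of its Table 2. The kurtosis factor is omitted because its definition varies between sources, and the function name `time_domain_features` is hypothetical.

```python
import math

def time_domain_features(x):
    """Common statistical descriptors of a non-constant 1-D signal.

    All definitions are assumptions based on common usage.
    """
    n = len(x)
    mean = sum(x) / n
    abs_x = [abs(v) for v in x]
    mean_abs = sum(abs_x) / n
    peak = max(abs_x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    # Square mean root: squared mean of the square roots of |x|
    smr = (sum(math.sqrt(a) for a in abs_x) / n) ** 2
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)

    def normalized_moment(k):
        # k-th central moment divided by std**k
        return sum((v - mean) ** k for v in x) / (n * std ** k)

    return {
        "rms": rms,
        "square_mean_root": smr,
        "peak_to_peak": max(x) - min(x),
        "kurtosis": normalized_moment(4),
        "skewness": normalized_moment(3),
        "fifth_normalized_moment": normalized_moment(5),
        "crest_factor": peak / rms,
        "impulse_factor": peak / mean_abs,
        "shape_factor": rms / mean_abs,
        "sixth_normalized_moment": normalized_moment(6),
    }
```

Impulsive content, such as a burst caused by a crack, raises the peak relative to the RMS, which is why ratios such as the crest and impulse factors are sensitive to it.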

Moreover, to create a robust feature pool that contains the maximum amount of information from different domains, statistical features from the frequency domain are also considered. The features from the frequency spectrum obtained through fast Fourier transformation (FFT) provide additional information regarding the PV crack [28,29]. The features extracted from the frequency domain are: RMS of frequency (F16), root variance of frequency (F17), and mean frequency (F18). Fourteen extracted statistical features from the time-domain and frequency-domain are mathematically described in Table 2.
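The three frequency-domain features can be illustrated with the sketch below. Two conventions exist in the literature (statistics of the spectral magnitudes, or magnitude-weighted statistics of the frequency axis); this example assumes the first, which may differ from the definitions in Table 2. A naive discrete Fourier transform is used only to keep the example dependency-free; an FFT routine would be used for records of realistic length. Function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(x):
    """One-sided DFT magnitude, computed naively in O(n^2) time."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def frequency_domain_features(x):
    """Assumed definitions: statistics taken over the spectral magnitudes."""
    s = magnitude_spectrum(x)
    m = len(s)
    mean_f = sum(s) / m
    rms_f = math.sqrt(sum(v * v for v in s) / m)
    root_var_f = math.sqrt(sum((v - mean_f) ** 2 for v in s) / m)
    return {
        "rms_frequency": rms_f,                  # F16
        "root_variance_frequency": root_var_f,   # F17
        "mean_frequency": mean_f,                # F18
    }
```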

2.5. Feature Selection by Genetic Algorithm (GA)

The hybrid feature pool contains traditional AE features along with the time and frequency domain-based statistical features. Thus, the dimensionality of the feature vectors is high. Moreover, due to the large number of features obtained from the four channels in the hybrid feature pool, data can be redundant [30]. Therefore, it is necessary to determine the optimal feature subset to obtain discriminant information regarding the spherical tank health condition. Through different search techniques, i.e., complete, sequential, and heuristic searches, the optimal subset of features can be determined [19,30]. However, the brute-force mechanism of the complete search adds additional computational complexity, and the sequential approach gives no guarantee that the best optimal feature subset is selected. On the other hand, using a heuristic search, the genetic approach with GA provides a balance between optimal selection and computational complexity. Therefore, in this study, GA is considered for the selection of the optimal feature subset from the hybrid feature pool.

Based on evolutionary operations (selection, crossover, mutation, and replacement), GA determines the combination of features that carries the most intrinsic class-wise information. To find the feature combination that maximizes separability among classes, the degree of class separation (DCS) is calculated by Equation (1):

DCS = \frac{ICS}{WCC},

(1)

where ICS is the inter-class separability parameter used to define the distance among different classes. Similarly, WCC is the within-class closeness that defines the closeness of the features within the same class. The Euclidean distance, given in Equation (2), is used to find the distance between two vectors.

D_{x, y} = \sqrt{\sum_{i = 1}^{n} {(x_{i} - y_{i})}^{2}}

(2)

A high rating for DCS is achieved when ICS is maximized and/or WCC is minimized. The ICS is determined based on the average distance of each feature vector of various classes with Equation (3).

ICS = \frac{1}{{}_{N}c_{2} \cdot n_{N C} \cdot n_{F C}} \sum_{i = 1}^{N} \sum_{j = i + 1}^{N} \sum_{k = 1}^{n_{N C}} \sum_{l = 1}^{n_{F C}} D_{i, j}

(3)

Similarly, WCC is obtained through the average distance of each feature vector of the same class by Equation (4).

WCC = \frac{1}{N \cdot n_{N C} \cdot n_{F C}} \sum_{i = 1}^{N} \sum_{j = 1}^{n_{N C}} \sum_{k = 1}^{n_{F C}} D_{i, j}

(4)

In these equations, N defines the number of classes. In this study, the experiment is conducted based on two types of health conditions, i.e., normal and faulty. Therefore, N = 2. Additionally, n_{N C} is the number of feature vectors for the normal condition, n_{F C} is the number of feature vectors for the faulty condition, and D_{i, j} is the Euclidean distance derived from Equation (2).
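The DCS criterion can be sketched for the two-class case as follows. This is one reading of Equations (1)–(4), stated as an assumption: ICS is taken as the mean Euclidean distance over all normal/faulty pairs, and WCC as the mean distance over all same-class pairs, so the normalization differs slightly from the constants written in Equation (4). The function names are hypothetical, and each class needs at least two vectors.

```python
import math
from itertools import combinations

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors, Equation (2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def degree_of_class_separation(normal, faulty):
    """DCS = ICS / WCC for two lists of feature vectors (assumed reading)."""
    # ICS: mean distance over every (normal, faulty) pair
    between = [euclidean(a, b) for a in normal for b in faulty]
    ics = sum(between) / len(between)
    # WCC: mean distance over every pair drawn from the same class
    within = [
        euclidean(a, b)
        for cls in (normal, faulty)
        for a, b in combinations(cls, 2)
    ]
    wcc = sum(within) / len(within)
    return ics / wcc
```

A feature subset that moves the two classes apart while keeping each class compact yields a larger DCS, which is what the GA seeks to maximize.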

2.6. Fault Classification Using k-Nearest Neighbor (k-NN)

The proposed hybrid feature selection model selects the optimal feature matrix for health state classification. In this study, a k-NN classifier is utilized to classify the optimal feature subsets into their respective classes. The k-NN is one of the most used classifiers in classification problems due to its simple architecture and low implementation complexity [7,30,31]. The k-NN classifies samples depending on the votes of the k nearest neighbors, which are defined by distance parameters [30,32]. In k-NN, there is no specific training phase before classification. Instead, any effort to simplify or extract information is made at classification time, and the entire training data set remains in memory during classification. The computational complexity of the classifier is therefore very high when it is used with high-dimensional feature vectors, unless a dimensionality reduction technique is applied before classification. For this reason, in this study, we apply a GA-based optimal feature subset selection algorithm to reduce the dimensions of the data. This approach selects the optimal feature subset from the original feature vectors for k-NN to perform the classification task. A visual explanation of the k-NN algorithm is given in Figure 7.

From Figure 7, we can see that the new sample (inside the circle, light blue color) should be classified either as class A (red square) or as class B (dark purple star). If k = 3, then the new sample belongs to class B, because class B holds the majority within the circle, i.e., there are two stars (class B samples) and one square (class A sample) inside the second circle. If we instead assign k = 5, then the sample belongs to class A, because three samples from class A and two samples from class B are inside the outermost black dotted circle. Thus, in k-NN, there are two important parameters that must be selected to complete the classification task: the value of k that defines the number of neighbors, and the distance metric. The value of k can be chosen arbitrarily or through a cross-validation process. In this study, we start from an arbitrary value of k and then vary it, validating the output of the classifier for each value to determine the optimal k. The Euclidean distance metric is determined from Equation (2).
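The voting rule described for Figure 7 can be reproduced with a short sketch. The coordinates below are invented to mimic the figure's layout (two class B samples and one class A sample closest to the query, then two more class A samples); they are not data from the study, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Sort training samples by Euclidean distance to the query
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

# Illustrative layout only: distances to the origin are 1.0, 1.5 (class B)
# and 1.2, 2.0, 2.2 (class A)
points = [(1.0, 0.0), (0.0, 1.5), (0.0, -1.2), (2.0, 0.0), (0.0, 2.2)]
labels = ["B", "B", "A", "A", "A"]
query = (0.0, 0.0)
label_k3 = knn_predict(points, labels, query, k=3)  # two B votes, one A vote
label_k5 = knn_predict(points, labels, query, k=5)  # three A votes, two B votes
```

With two classes, an odd k avoids tied votes.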

3. Experimental Result Analysis and Discussion

3.1. Dataset Description

For an in-depth analysis of the proposed crack classification scheme, the experimental analysis and comparative discussion are presented in this section. In this study, AE signals acquired from a spherical tank were used to conduct the experiment. The 0.1 s signals, with 1 MHz sampling frequency, were recorded under both normal and crack conditions of the spherical tank. The signals were divided into training and test datasets, of which 60% of the data was considered for training the model and 40% was used for testing. The details of the dataset are provided in Table 3.

In Figure 8, the time-domain waveforms of the AE signals, with their corresponding frequency-domain spectra for each condition, are presented.

3.2. Performance Analysis of the Discriminant Feature Selection by GA from the Hybrid Feature Pool

A hybrid feature pool creation mechanism is proposed in this study to acquire comprehensive salient information from the AE signals of a spherical tank for a given health condition. In total, eighteen different features are used to create the hybrid feature pool. Of these eighteen features, four (F1–F4) are traditional AE features (illustrated in Figure 6), eleven (F5–F15) are statistical time-domain features, and three (F16–F18) are statistical frequency-domain features. The combination of features from different domains provides valuable intrinsic information associated with different health conditions. To reduce the dimensions of the hybrid feature pool, a heuristic search-based GA is applied with 1000 generations to select an optimal subset of two features for the creation of the final feature matrix. In Figure 9, the optimal features (F1, F5) selected by the GA are presented in a 2D plot. F1 (amplitude) belongs to the traditional AE features and F5 (RMS) belongs to the time-domain statistical feature set. This optimal subset indicates the necessity of including both AE signal features and statistical features in the hybrid feature pool.
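A GA-based search for a fixed-size feature subset can be sketched as below. The paper states only the number of generations, the subset size, and the DCS criterion; the population size, tournament selection, union-based crossover, mutation rate, and elitism used here are assumptions chosen for illustration, and `ga_select` is a hypothetical name. The `fitness` argument is any function that scores a tuple of feature indices, for example the DCS of the data restricted to those features.

```python
import random

def ga_select(n_features, subset_size, fitness, generations=50,
              pop_size=20, mutation_rate=0.2, seed=0):
    """Search for the feature-index subset that maximizes `fitness`.

    Requires subset_size < n_features and pop_size >= 3.
    """
    rng = random.Random(seed)

    def random_subset():
        return tuple(sorted(rng.sample(range(n_features), subset_size)))

    def crossover(a, b):
        # Child draws its features from the union of both parents
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, subset_size)))

    def mutate(s):
        if rng.random() >= mutation_rate:
            return s
        s = list(s)
        unused = [f for f in range(n_features) if f not in s]
        s[rng.randrange(subset_size)] = rng.choice(unused)
        return tuple(sorted(s))

    def tournament(population):
        # Best of three randomly drawn individuals
        return max(rng.sample(population, 3), key=fitness)

    population = [random_subset() for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [
            mutate(crossover(tournament(population), tournament(population)))
            for _ in range(pop_size - 1)
        ]
        population = children + [best]   # elitism: keep the best so far
        best = max(population, key=fitness)
    return best
```

In the setting of this study, the call would use `n_features=18`, `subset_size=2`, and `generations=1000`. Because the best individual is always carried over, the returned fitness never decreases as the number of generations grows.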

3.3. Performance Analysis of k-Nearest Neighbor Algorithm

The optimal feature subset selected by GA is passed to the k-NN algorithm for classification into the respective classes (i.e., normal and crack). From the GA-based selected features presented in Figure 9, the separability of the two classes is clearly visible. As a result, placement of an optimal boundary line between the two classes for the k-NN classifier is relatively clear. To draw the decision boundary and evaluate the classification performance of the k-NN classifier, we consider values of k (number of neighbors) ranging from 1 to 8. The experiment is repeated multiple times to determine the best value for k. After determining the best value of k based on the test accuracy, we draw the boundary line for classification. From Figure 10a, the optimal value for k is 5 (for k = 5, the accuracy is 100%). Based on this optimal value, the decision boundary is drawn in Figure 10b. The classification performance is obtained by Equation (5).

\text{Avg. classification accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{Total number of samples}}

(5)

To further ensure cross-validation of the result analysis, and to determine the final classification accuracy, we analyzed the results of 10 experiments using the proposed method with an optimal value of k = 5. The classification results for the 10 experiments are given in Figure 11. The final classification accuracy was determined with Equation (6). In our experiment, the final classification accuracy was 99.8%.

\text{Final classification accuracy} = \frac{\sum \text{Accuracy of each experiment}}{\text{Total number of experiments}}

(6)
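Equations (5) and (6) translate directly into code. The sketch below assumes string class labels with "crack" as the positive class; the function names and the sample values are hypothetical and are not results from the study.

```python
def classification_accuracy(y_true, y_pred, positive="crack"):
    """Equation (5): (true positives + true negatives) / total samples."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return (tp + tn) / len(pairs)

def final_accuracy(accuracies):
    """Equation (6): mean accuracy over repeated experiments."""
    return sum(accuracies) / len(accuracies)

# Hypothetical labels: three of the four predictions are correct
acc = classification_accuracy(
    ["normal", "crack", "crack", "normal"],
    ["normal", "crack", "normal", "normal"],
)
```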

3.4. Comparison Analysis

Several comparisons are conducted to validate the efficiency and robustness of the proposed approach. The comparisons are made using crack classification models in which different dimensionality reduction algorithms are used. In the proposed crack classification methodology, the GA-based feature selection module is replaced with principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) for the selection of optimal feature subsets. The feature subset produced by each dimensionality reduction algorithm is provided to the same k-NN classifier to evaluate the classification performance. Moreover, a comparison is made with the published study (PCA + k-NN) presented in [27], which was used for fault diagnosis of a mechanical system. In addition, the proposed study is compared with a recent study [17] that uses BSS-based signal denoising and wavelet features for data classification. The comparison results indicate that the proposed approach (GA + k-NN), designed for the crack classification of a spherical tank, outperforms conventional state-of-the-art methods. The (GA + k-NN) approach outperforms the use of all features without selection, (PCA + k-NN) [27], and (t-SNE + k-NN) by 15.35%, 8.01%, and 43.55%, respectively. Additionally, it outperforms the BSS-based approach [17] by 2.55%. The details of the results are given in Table 4.

4. Conclusions

This paper presented a crack classification model for a spherical tank based on optimal hybrid feature pool creation with the help of a genetic algorithm (GA) and classification of the instances using a k-NN classifier. A hybrid feature pool was created by considering features extracted from different domains (i.e., traditional AE features, statistical time-domain features, and frequency-domain features). The hybrid feature pool contains 18 features from these domains. A heuristic search-based genetic algorithm was then applied to remove data redundancy and to reduce the dimensions of the original hybrid feature pool. The GA-based feature selection process selected an optimal subset of two features, which was provided to a k-NN classifier for classification into the respective classes. The proposed crack classification model for the spherical tank yielded an average accuracy of 99.8%. To validate the effectiveness of the proposed model, results were compared to other state-of-the-art fault diagnosis algorithms, with and without incorporation of feature selection schemes. The comparison results showed that the proposed model outperforms conventional algorithms by providing at least 2.55% better classification accuracy. In the future, the proposed method can be extended to identify multiple cracks in the spherical tank.

Author Contributions

M.J.H. and J.-M.K. contributed equally to the conception of the idea, as well as implementing and analyzing the experimental results, and writing the manuscript.

Funding

This work was supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry & Energy (MOTIE) of the Republic of Korea (No. 20162220100050).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Saidur, R. A review on electrical motors energy use and energy savings. Renew. Sustain. Energy Rev. 2010, 14, 877–898.
2. Barker, G. Baker Gas Storage Tank. In The Engineer’s Guide to Plant Layout and Piping Design for the Oil and Gas Industries; Gulf Professional Publishing: Houston, TX, USA, 2018; pp. 361–380.
3. Luo, T.; Wu, C.; Duan, L. Fishbone diagram and risk matrix analysis method and its application in safety assessment of natural gas spherical tank. J. Clean. Prod. 2018, 174, 296–304.
4. Morofuji, K.; Tsui, N.; Yamada, M.; Maie, A.; Yuyama, S.; Li, Z.W. Quantitative Study of Acoustic Emission Due To Leaks From Water Tanks. Group 2003, 21, 213–222.
5. Korkmaz, K.A.; Sari, A.; Carhoglu, A.I. Seismic risk assessment of storage tanks in Turkish industrial facilities. J. Loss Prev. Process Ind. 2011, 24, 314–320.
6. Li, W.; Dai, G.; Wang, Y.L.F. Study of Tank Acoustic Emission Testing Signals Analysis Method Based on Wavelet Neural Network. In Proceedings of the ASME 2011 Pressure Vessels and Piping Conference, Baltimore, MD, USA, 17–21 July 2011.
7. Pandya, D.H.; Upadhyay, S.H.; Harsha, S.P. Fault diagnosis of rolling element bearing with intrinsic mode function of acoustic emission data using APF-KNN. Exp. Syst. Appl. 2013, 40, 4137–4145.
8. Niknam, S.A.; Songmene, V.; Au, Y.H.J. The use of acoustic emission information to distinguish between dry and lubricated rolling element bearings in low-speed rotating machines. Int. J. Adv. Manuf. Technol. 2013, 69, 2679–2689.
9. Kang, M.; Kim, J.; Kim, J. High-Performance and Energy-Efficient Fault Diagnosis Using Effective Envelope Analysis Processing Unit. IEEE Trans. Power Electron. 2015, 30, 2763–2776.
10. Amar, M.; Gondal, I.; Wilson, C. Vibration spectrum imaging: A novel bearing fault classification approach. IEEE Trans. Ind. Electron. 2015, 62, 494–502.
11. Sohaib, M.; Kim, C.-H.; Kim, J.-M. A Hybrid Feature Model and Deep-Learning-Based Bearing Fault Diagnosis. Sensors 2017, 17, 2876.
12. Islam, M.M.; Kim, J.M. Motor Bearing Fault Diagnosis Using Deep Convolutional Neural Networks with 2D Analysis of Vibration Sign; Springer International Publishing: New York, NY, USA, 2003; Volume 2671, ISBN 978-3-540-40300-5.
13. Tra, V.; Kim, J.; Khan, S.A.; Kim, J.-M. Bearing Fault Diagnosis under Variable Speed Using Convolutional Neural Networks and the Stochastic Diagonal Levenberg-Marquardt Algorithm. Sensors 2017, 17, 2834.
14. Chen, S.W.; Chen, Y.H. Hardware design and implementation of a wavelet de-noising procedure for medical signal preprocessing. Sensors (Switzerland) 2015, 15, 26396–26414.
15. Nguyen, P.; Kang, M.; Kim, J.M.; Ahn, B.H.; Ha, J.M.; Choi, B.K. Robust condition monitoring of rolling element bearings using de-noising and envelope analysis with signal decomposition techniques. Exp. Syst. Appl. 2015, 42, 9024–9032.
16. Lei, Y.; Li, N.; Lin, J.; Wang, S. Fault diagnosis of rotating machinery based on an adaptive ensemble empirical mode decomposition. Sensors (Switzerland) 2013, 13, 16950–16964.
17. Duong, B.-P.; Kim, J.-Y.; Kim, J.-M.; Sohaib, M.; Tra, V. Improving the Performance of Storage Tank Fault Diagnosis by Removing Unwanted Components and Utilizing Wavelet-Based Features. Entropy 2019, 21, 145.
18. He, J.; Song, Y.; Du, P.; Xu, L. Analysis of Single Channel Blind Source Separation Algorithm for Chaotic Signals. Math. Probl. Eng. 2018, 2018.
19. Kang, M.; Kim, J.; Wills, L.M.; Kim, J.M. Time-varying and multiresolution envelope analysis and discriminative feature analysis for bearing fault diagnosis. IEEE Trans. Ind. Electron. 2015, 62, 7749–7761.
20. Islam, M.; Sohaib, M.; Kim, J.; Kim, J.-M. Crack Classification of a Pressure Vessel Using Feature Selection and Deep Learning Methods. Sensors 2018, 18, 4379.
21. Liu, G.; Yu, Z.; Liang, X.; Ye, C. Vibration-Based Structural Damage Identification and Evaluation for Cylindrical Shells Using Modified Transfer Entropy Theory. J. Press. Vessel Technol. 2018, 140, 61204–61214.
22. Physicalacoustics - pci 2. Available online: https://www.physicalacoustics.com/by-product/pci-2/ (accessed on 5 January 2019).
23. Physicalacoustics - sensors. Available online: https://www.physicalacoustics.com/by-product/sensors/WDI-AST-100-900-kHz-Wideband-Differential-AE-Sensor (accessed on 5 January 2019).
24. Sohaib, M.; Islam, M.; Kim, J.; Jeon, D.-C.; Kim, J.-M. Leakage Detection of a Spherical Water Storage Tank in a Chemical Industry Using Acoustic Emissions. Appl. Sci. 2019, 9, 196.
25. Center, N. Resource AE Signal Features. Available online: https://www.nde-ed.org/EducationResources/CommunityCollege/Other Methods/AE/AE_Signal Features.php (accessed on 5 January 2019).
26. Zou, S.; Yan, F.; Yang, G.; Sun, W. The identification of the deformation stage of a metal specimen based on acoustic emission data analysis. Sensors (Switzerland) 2017, 17, 789.
27. Kang, M.; Islam, M.R.; Kim, J.; Kim, J.M.; Pecht, M. A Hybrid Feature Selection Scheme for Reducing Diagnostic Performance Deterioration Caused by Outliers in Data-Driven Diagnostics. IEEE Trans. Ind. Electron. 2016, 63, 3299–3310.
28. Shen, J.; Chang, H.; Li, Y. Pressure vessel state investigation based upon the least squares support vector machine. Math. Comput. Model. 2011, 54, 883–887.
29. Bornn, L.; Farrar, C.R.; Park, G.; Farinholt, K. Structural Health Monitoring With Autoregressive Support Vector Machines. J. Vib. Acoust. 2009, 131, 21004–21009.
30. Islam, R.; Khan, S.A.; Kim, J.M. Discriminant Feature Distribution Analysis-Based Hybrid Feature Selection for Online Bearing Fault Diagnosis in Induction Motors. J. Sensors 2016, 2016, 1–16.
31. Yigit, H. A weighting approach for KNN classifier. In Proceedings of the 2013 International Conference on Electronics, Computer and Computation, Ankara, Turkey, 7–9 November 2013.
32. Chen, X.; Xu, J.; Guo, W. The research about video surveillance platform based on cloud computing. In Proceedings of the 2013 International Conference on Machine Learning and Cybernetics, Tianjin, China, 14–17 July 2013.

Figure 1. Block diagram of the proposed methodology.

Figure 2. The testbed used to collect AE signals during the experiment.

Figure 3. Schematic diagram for a spherical tank testbed.

Figure 4. Setup of data acquisition devices and channels.

Figure 5. Hybrid feature pool creation process.

Figure 6. Considered AE signal features.

Figure 7. Illustration of the k-NN algorithm for a binary classification problem.

Figure 8. AE signals associated with the normal condition (NC) and the crack condition (CC) in the time and frequency domains: (a) NC, time domain; (b) NC, frequency domain; (c) CC, time domain; and (d) CC, frequency domain.

Figure 9. Optimal feature set selected by GA for different health conditions.

Figure 10. (a) Different classification accuracies as a function of the number of neighbors (k). The optimal value is k = 5. (b) Drawing of the decision boundary for separation of the two classes after determining the optimal value for k.
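
The kind of sweep over k summarized in Figure 10a can be sketched as follows. This is only a minimal illustration, assuming Euclidean distance, majority voting, and leave-one-out evaluation; the toy two-dimensional points and the helper names `knn_predict` and `loo_accuracy` are hypothetical and are not the AE feature vectors or the evaluation protocol used in the article.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Label a query point by majority vote among its k nearest neighbors."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(xs, ys, k):
    """Leave-one-out accuracy: classify each point using all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)

# Hypothetical toy data: two well-separated clusters of five points each.
xs = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
      (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
ys = ["NC"] * 5 + ["CC"] * 5

# Odd values of k avoid tied votes in a two-class problem.
for k in (1, 3, 5, 7, 9):
    print(k, loo_accuracy(xs, ys, k))
```

On this toy set the accuracy stays perfect for small k and collapses at k = 9, because each held-out point then has only four same-class neighbors against five from the other class. The appropriate k for real data has to be found empirically, as the figure does.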

Figure 11. Classification performance of the proposed approach determined for 10 experiments.

Table 1. Specifications of the proposed data acquisition system.

Device	Specifications
WDI-AST sensor [23]	Peak sensitivity [V/µbar]: −25 dB; Operating frequency range: 200 to 900 kHz; Directionality: ±1.5 dB; Resonant frequency: 650 kHz
PCI 2 [22]	18-bit 40 MHz A/D conversion; AE input: 2 channels; Sensor testing: AST built-in; Dynamic range: >85 dB

Table 2. Statistical features from time and frequency domains for the hybrid feature pool.

Time-Domain Features
Feature	Equation	Feature	Equation	Feature	Equation
F5	$\sqrt{\frac{1}{N} \sum_{i=1}^{N} X_{i}^{2}}$	F6	$\left(\frac{1}{N} \sum_{i=1}^{N} \sqrt{|X_{i}|}\right)^{2}$	F7	$\max(|X|) - \min(|X|)$
F8	$\frac{1}{N} \sum_{i=1}^{N} \left(\frac{X_{i} - \bar{X}}{\sigma}\right)^{4}$	F9	$\frac{1}{N} \sum_{i=1}^{N} \left(\frac{X_{i} - \bar{X}}{\sigma}\right)^{3}$	F10	$\frac{F5}{F5^{4}}$
F11	$\frac{1}{N} \sum_{i=1}^{N} \left(\frac{X_{i} - \bar{X}}{\sigma}\right)^{5}$	F12	$\frac{\max(|X|)}{F5}$	F13	$\frac{\max(|X|)}{\frac{1}{N} \sum_{i=1}^{N} |X_{i}|}$
F14	$\frac{F5}{\frac{1}{N} \sum_{i=1}^{N} |X_{i}|}$	F15	$\frac{1}{N} \sum_{i=1}^{N} \left(\frac{X_{i} - \bar{X}}{\sigma}\right)^{6}$	-	-
Frequency-Domain Features
F16	$\sqrt{\frac{1}{N} \sum_{i=1}^{N} F_{i}^{2}}$	F17	$\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(F_{i} - \mathrm{mean}(|F_{i}|)\right)^{2}}$	F18	$\frac{1}{N} \sum_{i=1}^{N} F_{i}$

Here, X is the original AE signal, F is the frequency-domain representation (spectrum) of X, and N denotes the total number of samples of the signal. $\bar{X}$ and $\sigma$ denote the mean and standard deviation of X, respectively.
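
As a rough illustration of how the statistical features in Table 2 might be computed, the sketch below implements them with the Python standard library only. Several points are assumptions rather than details given in the article: the function names are hypothetical, σ is taken as the population standard deviation (division by N), the square root in F5, F16 and F17 is taken over the whole mean (the usual root-mean-square form), and the frequency-domain function expects a real-valued magnitude spectrum that has already been computed. F10 is left out because, as printed, its expression depends on F5 alone.

```python
import math

def time_domain_features(x):
    """Time-domain features of Table 2 (F10 omitted) for a sample sequence x."""
    n = len(x)
    mean = sum(x) / n
    # Assumption: population standard deviation (divide by N).
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    abs_x = [abs(v) for v in x]
    mean_abs = sum(abs_x) / n

    def std_moment(p):
        # p-th standardized moment: (1/N) * sum(((x_i - mean) / sigma) ** p)
        return sum(((v - mean) / sigma) ** p for v in x) / n

    f5 = math.sqrt(sum(v * v for v in x) / n)          # root mean square
    f6 = (sum(math.sqrt(a) for a in abs_x) / n) ** 2   # squared mean root amplitude
    return {
        "F5": f5,
        "F6": f6,
        "F7": max(abs_x) - min(abs_x),
        "F8": std_moment(4),
        "F9": std_moment(3),
        "F11": std_moment(5),
        "F12": max(abs_x) / f5,
        "F13": max(abs_x) / mean_abs,
        "F14": f5 / mean_abs,
        "F15": std_moment(6),
    }

def frequency_domain_features(spectrum):
    """Frequency-domain features F16 to F18 for a real-valued magnitude spectrum."""
    n = len(spectrum)
    mean_abs = sum(abs(f) for f in spectrum) / n
    return {
        "F16": math.sqrt(sum(f * f for f in spectrum) / n),
        "F17": math.sqrt(sum((f - mean_abs) ** 2 for f in spectrum) / n),
        "F18": sum(spectrum) / n,
    }

# Tiny hand-checkable inputs (not AE data).
print(time_domain_features([0.0, 2.0]))
print(frequency_domain_features([1.0, 3.0]))
```

A constant signal has σ = 0, so the standardized moments are undefined for it; a production implementation would need to guard against that case.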

Table 3. Details of the health conditions considered for the experiment.

Health Condition	Crack Type	Crack Size (mm)	Fault Location	Sampling Frequency	Number of Channels
Normal Condition (NC)	No Crack	No Crack	No Crack	1 MHz	4
Crack Condition (CC)	Crack	3	Bottom of the Spherical Tank	1 MHz	4

Table 4. Classification accuracy among different methods.

Approach	Classification Accuracy (%)	Improvement (%)
All features	84.45	15.35
PCA + k-NN [27]	91.79	8.01
t-SNE + k-NN	56.25	43.55
BSS + Wavelet features [17]	97.25	2.55
GA + k-NN (proposed)	99.8	-
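
The "Improvement (%)" column in Table 4 is the difference, in percentage points, between the accuracy of the proposed approach and that of each compared approach. The short check below reproduces the column from the accuracy values in the table; the variable names are illustrative only.

```python
# Classification accuracies (%) copied from Table 4.
baselines = {
    "All features": 84.45,
    "PCA + k-NN": 91.79,
    "t-SNE + k-NN": 56.25,
    "BSS + Wavelet features": 97.25,
}
proposed = 99.8  # GA + k-NN (proposed)

# Improvement = proposed accuracy minus baseline accuracy (percentage points).
improvements = {name: round(proposed - acc, 2) for name, acc in baselines.items()}

for name, gain in improvements.items():
    print(f"{name}: {gain:.2f}")
```

The smallest gap, 2.55 percentage points over the BSS + wavelet features approach, is the "at least 2.55% higher" figure quoted in the abstract.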

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

