Article

Detecting and Learning Unknown Fault States by Automatically Finding the Optimal Number of Clusters for Online Bearing Fault Diagnosis

Department of Electrical, Electronics and Computer Engineering, University of Ulsan, Ulsan 44610, Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2019, 9(11), 2326; https://doi.org/10.3390/app9112326
Submission received: 1 May 2019 / Revised: 26 May 2019 / Accepted: 31 May 2019 / Published: 6 June 2019
(This article belongs to the Special Issue Fault Diagnosis of Rotating Machine)

Featured Application

The model proposed in this paper is intended for bearing fault diagnosis in industrial rotating machinery. A conventional fault diagnosis model can only predict bearing faults contained in its predefined set of stored fault information. The proposed approach provides an online fault diagnosis process in which unknown faults are detected and incorporated into the knowledge of the diagnosis system.

Abstract

This paper proposes an online fault diagnosis system for bearings that detects emerging fault modes and then updates the diagnostic system knowledge (DSK) to incorporate information about the newly detected fault modes. New fault modes are detected using k-means clustering along with a new cluster evaluation method, the multivariate probability density function's cluster distribution factor (MPDFCDF). In the proposed model, a heterogeneous pool of features is constructed from the signal, and a hybrid feature selection model is adopted to select optimal features for learning the existing fault modes. The proposed online fault diagnosis system detects new fault modes in unknown signals using k-means clustering with the help of the proposed MPDFCDF cluster evaluation method. The DSK is updated whenever new fault modes are detected, and the updated DSK is used to classify faults using the k-nearest neighbor (k-NN) classifier. The proposed model is evaluated using acoustic emission signals acquired from low-speed rolling element bearings with different fault modes and severities under different rotational speeds. Experimental results show that the MPDFCDF cluster evaluation method can detect the optimal number of fault clusters, and that the proposed online diagnosis model can detect newly emerging faults and update the DSK effectively, improving the average classification performance.

1. Introduction

The industrial motor is a vital piece of equipment in modern industrial applications such as pumps, air compressors, steel mills, paper mills, and wind turbines. It is perhaps the most commonly used rotating machine, with the primary goal of rotating manufacturing machinery under different load conditions [1]. The bearing, in turn, is the most critical component of the industrial motor. Under continuous heavy workloads, rapid voltage pulse rises, and contamination of the operating environment, metal fatigue in different parts of the bearing appears in the form of surface cracks [2]. The gradual increase in crack severity becomes a major cause of bearing failure [2,3]. Ultimately, unattended bearing-related failures account for a significant portion of functional breakdowns in induction motors as well as shutdowns of industrial processes.
To overcome this situation and reduce the tremendous losses in industrial production, several types of bearing condition monitoring and early fault detection models have been developed. Conventional model-based condition monitoring is the most popular among them; however, it demands deep knowledge of the whole industrial process [4,5]. Alternatively, modern developments in sensor technology, information processing techniques, and machine learning have enabled efficient data-driven bearing fault diagnosis models that exploit the large amounts of signal data produced during continuous industrial operation [4].
In the data-driven approach, data are collected from the actual system environment using different types of sensors, and normal and abnormal conditions are distinguished by analysis without developing a complex model of the system. Several types of sensors are used to collect data from the bearing, i.e., current sensors, thermal sensors, vibration acceleration sensors, and acoustic emission sensors. Motor current signature analysis (MCSA) is adopted for fault diagnosis [6,7] because it is sensitive to different failures of the industrial motor and provides nonintrusive monitoring. Vibration signal analysis-based fault diagnosis is also widely used and may provide the most information about bearing failures [8,9,10]. Both vibration analysis and MCSA show good performance for the early detection of bearing defects. However, these approaches show only suboptimal performance in modern complex industrial systems because of their difficulty in capturing dynamic activities at weak vibration and current levels, respectively. Acoustic emission (AE) is a popular alternative for fault diagnosis due to its high operating frequency range and its ability to capture low-energy signals [4,11,12,13]. AE can identify bearing abnormalities before they become visible at the vibration acceleration range. Thus, AE is used to monitor for incipient bearing defects in this paper.
To develop efficient fault diagnosis techniques, many researchers have proposed data-driven bearing fault diagnosis models. Thomas et al. proposed a heterogeneous feature extraction process using the envelope power spectrum, time- and frequency-domain statistical analysis, and wavelet packet decomposition [14]. The authors adopt a simple greedy search model, which is computationally expensive, and use k-NN, a feedforward network, and a support vector machine (SVM) in a supervised manner. Islam et al. proposed a hybrid feature selection algorithm and discriminant heterogeneous feature analysis for bearing fault diagnosis [15]. In that work, an acoustic emission sensor is used to acquire fault data, a novel hybrid feature selection model selects optimal features from the extracted heterogeneous feature vector, and k-NN identifies the bearing fault. In [16], Kun et al. proposed adaptive frequency band selection for an empirical wavelet decomposition model for bearing fault diagnosis, in which the harmonic significance index and particle swarm optimization are used to select an optimal sub-band. Many such studies introduce traditional fault diagnosis models and achieve high accuracy in detecting faults covered by the trained knowledge.
Traditional fault diagnosis methods use prior knowledge of failure modes to diagnose the bearing condition. However, in complex modern industrial processes, bearings often operate continuously under heavy loads, which causes material fatigue and leads to faults (i.e., cracks or spalls). It is therefore difficult to have complete failure knowledge in advance, which means traditional diagnostic systems make decisions based on incomplete knowledge. Due to the progressive nature of bearing faults, new fault modes can appear during continuous operation, and ignoring information about new faults degrades the effectiveness of fault diagnosis systems. In contrast, an online fault diagnosis system continually incorporates new faults to enhance its fault knowledge for reliable diagnosis [17]. Jiang et al. proposed a progressive fault diagnosis model in which a new hybrid ensemble auto-encoder (HEAE) is used to increase feature quality for fault detection [18], and several classification algorithms are used to validate its performance. However, this work still relies on a pre-trained background model to identify new faults, and the fault classification model is not updated to incorporate new fault information.
To address this issue, an online fault diagnosis model is proposed in this paper that can detect new faults as they appear using an unsupervised clustering technique, update its knowledge in real time, and use the updated knowledge for reliable diagnosis. The number of faults can be determined from a set of condition monitoring data through clustering. The k-means clustering algorithm is a simple and popular way to solve the unsupervised clustering problem [19] without the need for training on existing knowledge. It can, therefore, be used to cluster unknown bearing signals and detect new faults. The k-means algorithm first selects k initial centroids as the centers of k clusters. Each sample is then assigned to the nearest cluster through an iterative process. The algorithm is sensitive to the initial centroids, which are generally selected randomly [19,20]. Several methods have been proposed to select optimal centroids. Bradley et al. proposed a cluster center initialization algorithm, which divides the overall dataset into subsamples using k-means and then selects final centroids based on minimum clustering error [21]. Likas et al. introduced global k-means, a tree-based center initialization algorithm that selects initial centroids independently of initial position and empirical parameters [22]. Arthur et al. proposed k-means++, which yields better clustering performance through careful random seeding [23]. These algorithms, however, suffer from randomness, are computationally expensive, or introduce new parameters. In [20], Khan proposed a simple and effective initial centroid selection method that reduces randomness, improves clustering performance, and does not introduce any empirical variables. Thus, the method proposed in [20] is used as a baseline for initial centroid selection.
In k-means clustering, the number of possible clusters k is an important parameter. Since the number of fault modes in a bearing cannot be known in advance, an efficient cluster evaluation technique is required to determine k. The silhouette coefficient is a popular cluster evaluation metric [24,25] that quantifies the proximity of points in a cluster to one another, which helps in identifying the number of visible clusters. A silhouette value is calculated for each point by measuring its average dissimilarity ai to other points in the same cluster and its minimum average dissimilarity bi to points in other clusters. The silhouette coefficient, calculated by averaging all silhouette values, indicates the number of clusters in the data. In [26], Rahman et al. proposed a computationally simpler cluster evaluation algorithm called the compactness and separation measure of clusters (COSES). It measures cluster separability as the minimum distance between cluster seeds and cluster compactness as the average of the distances of all samples of a cluster to its seed.
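As a concrete illustration of the silhouette idea described above, the coefficient can be computed directly from pairwise distances. The following sketch (plain NumPy; function and variable names are ours, not from [24,25]) shows that the correct two-cluster labeling of well-separated data scores higher than an arbitrary four-way split of the same points:

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette value over all samples (pure-NumPy sketch)."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    vals = []
    for i in range(n):
        same = labels == labels[i]
        a = D[i, same & (np.arange(n) != i)].mean()  # mean intra-cluster dissimilarity
        b = min(D[i, labels == c].mean()             # nearest other-cluster dissimilarity
                for c in set(labels) if c != labels[i])
        vals.append((b - a) / max(a, b))
    return float(np.mean(vals))

# Two tight, well-separated groups: the correct 2-cluster labeling scores
# higher than an arbitrary 4-way split of the same points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.2, (20, 2)), rng.normal(5, 0.2, (20, 2))])
labels2 = np.repeat([0, 1], 20)
labels4 = np.repeat([0, 1, 2, 3], 10)
print(silhouette(X, labels2) > silhouette(X, labels4))  # True
```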
However, cluster evaluation techniques based on average distances are unsuitable for evaluating the feature distribution of bearings, because samples at larger distances from the cluster center tend to dominate those lying closer to it, and the uneven distribution of densities across the feature space can misrepresent the actual number of clusters in the data. To address these issues, a new cluster evaluation algorithm, i.e., multivariate probability density function’s cluster distribution factor (MPDFCDF), is proposed here. The MPDFCDF identifies the optimal number of clusters and uses that information to detect new faults. Once a new fault mode is detected, the online diagnostic system knowledge (DSK) is updated through heterogeneous feature pool extraction and optimal fault signature selection based on existing knowledge and newly detected fault signals. Subsequently, the online fault diagnosis module uses k-NN for classification.
Key contributions of this paper are:
  • Since it is difficult to know in advance what types of faults a healthy bearing can experience, traditional offline fault diagnosis systems classify faults on the basis of incomplete knowledge. To address this issue, we propose an online bearing fault diagnosis model that first detects unknown fault modes in real-time using k-means clustering and continually updates the DSK for more reliable fault diagnosis.
  • It is difficult to determine the number of discernible fault modes through k-means clustering alone without a principled way to define kkmeans. To address this issue, we propose a new cluster evaluation method, the MPDFCDF, to identify the optimal number of clusters (kopt) in the fault signatures.
  • To evaluate our proposed model, we recorded bearing signals for different shaft speeds and constructed a heterogeneous fault signatures pool to extract the maximum possible fault information. We used a hybrid fault signature selection to create discriminative fault signatures and build a DSK. Finally, we used the k-NN classification algorithm to estimate the classification performance.
The rest of this paper is organized as follows. Section 2 describes the data acquisition from a test rig. Section 3 provides details of the proposed online fault diagnosis model, including the DSK updating process, unknown fault mode detection using the proposed MPDFCDF cluster evaluation method, and bearing fault diagnosis. Section 4 presents the experimental results and analysis of the proposed model, and Section 5 concludes the paper.

2. Data Acquisition

To conduct the experiment, AE signals of different faults are collected from the collaborative experimental setup at the Intelligence Dynamic Lab (Gyeongsang National University, Korea). Figure 1 depicts the block diagram of the experimental setup for collecting acoustic emission (AE) signals from bearings. In this setup, an induction motor is connected to a drive-end shaft, and a non-drive-end shaft is attached to the drive-end shaft through a gearbox with a 1.52:1 gear reduction ratio. Both the drive-end and non-drive-end shafts use cylindrical rolling element bearings (model FAG NJ206-E-TVP2), where the number of rolling elements is 13, the contact angle (αcangle) is 0°, and the roller diameter (Bd) and pitch diameter (Pd) are 9 mm and 46.5 mm, respectively [27]. Figure 2 presents the parts and dimensions of the cylindrical rolling element bearing. A general-purpose wide-band frequency AE sensor (WSα) from Physical Acoustics Corporation (PAC) is used [28]; it has an operating frequency range of 100–1000 kHz, a peak sensitivity of 55 dB, and a resonant frequency of 125 kHz. The AE sensor is attached to the top of the non-drive-end bearing housing, and the collected signals are sampled at 250 kHz using a PCI-2 based system [28]. Figure 3 shows the experimental setup and data acquisition system. Bearing faults are seeded defects or cracks of different sizes (small crack: 3 mm; big crack: 12 mm), as shown in Figure 4, where the fault names are Bearing Crack Outer (BCO), Bearing Crack Inner (BCI), and Bearing Crack Roller (BCR).

3. Proposed Online Fault Diagnosis Model

The proposed model for online fault diagnosis consists of three processes, as shown in Figure 5. An initialization process builds the initial DSK with a heterogeneous feature vector corresponding to a healthy bearing signal (FFB). An online bearing fault detection and diagnosis process detects new fault modes in unknown signals using k-means clustering and the proposed MPDFCDF cluster evaluation method. If new fault modes are detected, an updating process incorporates the newly detected fault class into the DSK, and the updated DSK is then used for fault diagnosis.

3.1. Heterogeneous Feature Pool Configuration

To discriminate fault signatures, we extract the maximum possible fault information [14,15]. Different paradigms exist for the extraction of heterogeneous fault features, in which statistical features from time-domain and frequency-domain signals are used to uncover meaningful information for fault diagnosis. We use statistical features from the time-domain AE signal: f1, root-mean-square (RMS); f2, square root of the amplitude (SRA); f3, kurtosis (KV); f4, skewness (SV); f5, peak-to-peak (PPV); f6, crest factor (CF); f7, impulse factor (IF); f8, margin factor (MF); f9, shape factor (SF); f10, kurtosis factor (KF); and frequency-domain features of the signal, including f11, frequency center (FC); f12, RMS frequency (RMSF); and f13, root variance frequency (RVF) [4]. Equations (1) and (2) present the mathematical definitions of the extracted statistical features.
$$\begin{aligned}
&\mathrm{RMS}=\left(\frac{1}{N}\sum_{i=1}^{N}x_i^2\right)^{1/2},\quad
\mathrm{SRA}=\left(\frac{1}{N}\sum_{i=1}^{N}\sqrt{|x_i|}\right)^{2},\quad
\mathrm{KV}=\frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i-\bar{x}}{\sigma}\right)^{4},\quad
\mathrm{SV}=\frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i-\bar{x}}{\sigma}\right)^{3},\\
&\mathrm{PPV}=\max(x_i)-\min(x_i),\quad
\mathrm{CF}=\frac{\max(|x_i|)}{\left(\frac{1}{N}\sum_{i=1}^{N}x_i^2\right)^{1/2}},\quad
\mathrm{IF}=\frac{\max(|x_i|)}{\frac{1}{N}\sum_{i=1}^{N}|x_i|},\quad
\mathrm{MF}=\frac{\max(|x_i|)}{\left(\frac{1}{N}\sum_{i=1}^{N}\sqrt{|x_i|}\right)^{2}},\\
&\mathrm{SF}=\frac{\left(\frac{1}{N}\sum_{i=1}^{N}x_i^2\right)^{1/2}}{\frac{1}{N}\sum_{i=1}^{N}|x_i|},\quad
\mathrm{KF}=\frac{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i-\bar{x}}{\sigma}\right)^{4}}{\left(\frac{1}{N}\sum_{i=1}^{N}x_i^2\right)^{2}}
\end{aligned}\tag{1}$$
$$\mathrm{FC}=\frac{1}{N}\sum_{i=1}^{N}f_i,\quad
\mathrm{RMSF}=\left(\frac{1}{N}\sum_{i=1}^{N}f_i^2\right)^{1/2},\quad
\mathrm{RVF}=\left(\frac{1}{N}\sum_{i=1}^{N}\left(f_i-\mathrm{FC}\right)^2\right)^{1/2}\tag{2}$$
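The statistical features of Equations (1) and (2) can be sketched as follows. This is an illustrative NumPy implementation (function names are ours); the frequency-domain features are computed here over the magnitude spectrum of the signal, which is one plausible reading of Equation (2):

```python
import numpy as np

def time_domain_features(x):
    """Statistical features f1-f10 of Equation (1) for one signal window."""
    x = np.asarray(x, dtype=float)
    mean, std = x.mean(), x.std()
    rms = np.sqrt(np.mean(x**2))
    sra = np.mean(np.sqrt(np.abs(x)))**2             # square root of the amplitude
    kv = np.mean(((x - mean) / std)**4)              # kurtosis value
    sv = np.mean(((x - mean) / std)**3)              # skewness value
    ppv = x.max() - x.min()                          # peak-to-peak value
    cf = np.max(np.abs(x)) / rms                     # crest factor
    if_ = np.max(np.abs(x)) / np.mean(np.abs(x))     # impulse factor
    mf = np.max(np.abs(x)) / sra                     # margin factor
    sf = rms / np.mean(np.abs(x))                    # shape factor
    kf = kv / rms**4                                 # kurtosis factor
    return [rms, sra, kv, sv, ppv, cf, if_, mf, sf, kf]

def frequency_domain_features(x):
    """FC, RMSF, RVF of Equation (2), over the magnitude spectrum of x."""
    s = np.abs(np.fft.rfft(x))
    fc = s.mean()
    rmsf = np.sqrt(np.mean(s**2))
    rvf = np.sqrt(np.mean((s - fc)**2))
    return [fc, rmsf, rvf]

# Sanity check on a pure sine: RMS ~ 0.707, PPV ~ 2, CF ~ sqrt(2).
x = np.sin(np.linspace(0, 20 * np.pi, 1000))
feats = time_domain_features(x) + frequency_domain_features(x)  # f1-f13
```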
To identify particular failure modes, bearing defect properties must be observed at different signal frequencies. Therefore, we extract RMSF values from frequency ranges that include harmonics of the following bearing defect frequencies [29]: Ball pass frequency of the outer raceway (BPFO), ball pass frequency of the inner raceway (BPFI), ball spin frequency (BSF), and fundamental train frequency (FTF). They are calculated as follows:
$$\begin{aligned}
\mathrm{BPFO}&=\frac{N_{rollers}\times F_{shaft}}{2}\left(1-\frac{B_d}{P_d}\cos\alpha_{cangle}\right), &
\mathrm{BPFI}&=\frac{N_{rollers}\times F_{shaft}}{2}\left(1+\frac{B_d}{P_d}\cos\alpha_{cangle}\right),\\
\mathrm{BSF}&=\frac{P_d\times F_{shaft}}{2\times B_d}\left(1-\left(\frac{B_d}{P_d}\cos\alpha_{cangle}\right)^2\right), &
\mathrm{FTF}&=\frac{F_{shaft}}{2}\left(1-\frac{B_d}{P_d}\cos\alpha_{cangle}\right),
\end{aligned}\tag{3}$$
where Nrollers is the number of rollers, Fshaft is the shaft speed in hertz, αcangle is the contact angle, Bd is the roller diameter, and Pd is the pitch diameter.
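Equation (3) can be computed directly from the bearing geometry. The sketch below uses the FAG NJ206-E-TVP2 parameters from Section 2 as defaults (the function name and defaults are ours):

```python
import math

def defect_frequencies(f_shaft, n_rollers=13, bd=9.0, pd=46.5, alpha_deg=0.0):
    """Bearing characteristic defect frequencies of Equation (3).
    Defaults follow the FAG NJ206-E-TVP2 geometry given in Section 2."""
    ratio = (bd / pd) * math.cos(math.radians(alpha_deg))
    bpfo = n_rollers * f_shaft / 2 * (1 - ratio)   # outer-raceway pass frequency
    bpfi = n_rollers * f_shaft / 2 * (1 + ratio)   # inner-raceway pass frequency
    bsf = (pd * f_shaft) / (2 * bd) * (1 - ratio**2)  # ball spin frequency
    ftf = f_shaft / 2 * (1 - ratio)                # fundamental train frequency
    return {"BPFO": bpfo, "BPFI": bpfi, "BSF": bsf, "FTF": ftf}

# At 500 rpm the shaft frequency is 500/60 Hz (about 8.33 Hz).
freqs = defect_frequencies(500 / 60)
```

Note that BPFO + BPFI = Nrollers × Fshaft and BPFO = Nrollers × FTF, which gives a quick consistency check on the geometry.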
RMSF features are calculated by first determining the power spectral envelope of the AE signal and then identifying the defect frequency ranges of BPFO, BPFI, and 2 × BSF up to their third harmonics. Except for BPFO, which is caused by the force of the roller on the stationary outer raceway fault, these frequencies can experience some modulation. BPFI and its harmonics are amplitude-modulated by Fshaft, and 2 × BSF is modulated by the rotational speed of the cage (FTF). Hence, amplitude modulation produces sidebands as an explicit symptom of inner-raceway and roller faults in bearings. Furthermore, random variations in the analytic bearing defect frequencies on the order of 1–2% are observed [30], which are accounted for by the factor RVorder = 2% [27]. Frequency ranges for outer-related defects (NOFreqR,i), inner-related defects (NIFreqR,i), and roller-related defects (NRFreqR,i) are computed using Equation (4) and depicted in Figure 6.
$$\begin{aligned}
NO_{FreqR,i}&=(1-RV_{order})(\mathrm{BPFO}\cdot i)\sim(1+RV_{order})(\mathrm{BPFO}\cdot i)\\
NI_{FreqR,i}&=(1-RV_{order})(\mathrm{BPFI}\cdot i-2F_{shaft})\sim(1+RV_{order})(\mathrm{BPFI}\cdot i+2F_{shaft})\\
NR_{FreqR,i}&=(1-RV_{order})(2\times\mathrm{BSF}\cdot i-2\,\mathrm{FTF})\sim(1+RV_{order})(2\times\mathrm{BSF}\cdot i+2\,\mathrm{FTF})
\end{aligned}\tag{4}$$
where i represents the harmonic number. Using these frequency ranges, we compute nine RMSF values for the three types of defect frequencies up to their third harmonics: RMSFBPFI1 (f14), RMSFBPFI2 (f15), RMSFBPFI3 (f16), RMSFBPFO1 (f17), RMSFBPFO2 (f18), RMSFBPFO3 (f19), RMSF2×BSF1 (f20), RMSF2×BSF2 (f21), RMSF2×BSF3 (f22). In total, we extract 22 features from each signal to create its heterogeneous feature pool.
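The defect frequency ranges of Equation (4), which yield the nine bands behind features f14–f22, can be sketched as a self-contained function (names and defaults are ours; the frequencies are recomputed inline from the Section 2 geometry):

```python
import math

def rmsf_feature_ranges(f_shaft, n_rollers=13, bd=9.0, pd=46.5,
                        alpha_deg=0.0, rv_order=0.02, harmonics=3):
    """Frequency search ranges of Equation (4): BPFO harmonics, BPFI
    harmonics with +/- 2*Fshaft sidebands, and 2*BSF harmonics with
    +/- 2*FTF sidebands, each widened by the RVorder = 2% tolerance."""
    ratio = (bd / pd) * math.cos(math.radians(alpha_deg))
    bpfo = n_rollers * f_shaft / 2 * (1 - ratio)
    bpfi = n_rollers * f_shaft / 2 * (1 + ratio)
    bsf = (pd * f_shaft) / (2 * bd) * (1 - ratio**2)
    ftf = f_shaft / 2 * (1 - ratio)
    ranges = {}
    for i in range(1, harmonics + 1):
        ranges[f"BPFO{i}"] = ((1 - rv_order) * bpfo * i,
                              (1 + rv_order) * bpfo * i)
        ranges[f"BPFI{i}"] = ((1 - rv_order) * (bpfi * i - 2 * f_shaft),
                              (1 + rv_order) * (bpfi * i + 2 * f_shaft))
        ranges[f"2BSF{i}"] = ((1 - rv_order) * (2 * bsf * i - 2 * ftf),
                              (1 + rv_order) * (2 * bsf * i + 2 * ftf))
    return ranges

ranges = rmsf_feature_ranges(500 / 60)  # nine (low, high) bands for f14-f22
```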

3.2. DSK Construction by Selecting Discriminant Fault Signatures of Detected Fault Modes

Discriminant features of candidate data classes are important to effectively distribute class samples and improve classification accuracy. However, the classifier knowledge varies depending on the number of emerging fault classes. In a high-dimensional feature space, features differ in their ability to effectively discriminate a given fault condition. Irrelevant or redundant features degrade the performance of predictive models (e.g., supervised and unsupervised classifiers). To address this issue, we use a discriminant fault signature selection process to build a DSK of detected fault modes.
Feature selection involves the evaluation of feature subsets based on processes that are generally either wrapper or filter based [31]. However, diagnostic performance can be improved by combining the advantages of both the wrapper- and filter-based approaches [15,32,33]. In this study, we introduce a hybrid feature selection algorithm that uses an effective combination of filter and wrapper methods to select the most discriminant signatures of emerging faults. In the selection process, the extracted features of detected fault classes (Ndetected_class × Nsample × Nfeature) are divided into two halves, where one half is used for the filter selection process and the other half is used for the wrapper selection process (Nfeature is the size of the original feature vector). In the filter selection process, (Ndetected_class × Nsample/2 × Nfeature) samples are randomly partitioned for k-fold cross-validation kcv, where k = 2. Optimal feature sets are selected using sequential forward selection (SFS) with feature subset evaluation. The filter selection process selects 10 suboptimal feature subsets from Niteration × kcv runs, where Niteration = 5.
The 10 suboptimal subsets are combined to create a feature occurrence histogram. Subsequently, in the wrapper selection process, the algorithm selects feature combinations from different levels of the feature occurrence histogram and estimates the average classification accuracy of the k-NN classifier. In the classification process, 50% of the (Ndetected_class × Nsample/2 × Nfeature) samples are used as a training set and the rest are used as a test set. The wrapper selection process picks the best feature combination based on the maximum average classification accuracy. Figure 7 illustrates the overall flow of the DSK construction by selecting discriminant fault signatures of detected fault modes.
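The SFS step of the filter selection process can be sketched as a greedy loop over candidate features with a generic subset-evaluation callback. This is a minimal illustration, not the paper's exact procedure; the function names and the toy Fisher-like objective are ours:

```python
import numpy as np

def sequential_forward_selection(X, y, evaluate, n_select):
    """Greedy SFS: at each step add the feature whose inclusion maximizes
    the subset objective evaluate(X_subset, y)."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_select:
        best_j = max(remaining,
                     key=lambda j: evaluate(X[:, selected + [j]], y))
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Toy check: feature 2 separates the two classes, features 0-1 are noise.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 3))
X[:, 2] += 5 * y  # class-dependent shift on feature 2 only

def fisher_like(Xs, y):
    """Simple separability score: class-mean distance over average spread."""
    m0, m1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    return np.linalg.norm(m0 - m1) / (Xs.std(axis=0).mean() + 1e-12)

print(sequential_forward_selection(X, y, fisher_like, 1))  # [2]
```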
Usually, the SFS algorithm evaluates different subsets and selects an optimal subset based on an objective value. In this study, we adopt the maximum class separability (MCS) feature subset evaluation algorithm proposed by Islam et al., which is illustrated in Figure 8 [34]. According to this algorithm, the final objective value is calculated as Equation (5).
$$\mathrm{objective\_value}=\frac{\mathrm{between\_class\_separability}}{\mathrm{within\_class\_compactness}}\tag{5}$$
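A minimal sketch of an objective of this form, assuming between-class separability is measured by centroid distances and within-class compactness by average scatter (the exact formulation of the MCS algorithm in [34] may differ in detail):

```python
import numpy as np

def mcs_objective(X, y):
    """Illustrative MCS-style objective (Equation (5)): mean between-class
    centroid distance divided by mean within-class scatter."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    within = np.mean([np.linalg.norm(X[y == c] - m, axis=1).mean()
                      for c, m in zip(classes, means)])
    between = np.mean([np.linalg.norm(mi - mj)
                       for i, mi in enumerate(means) for mj in means[i + 1:]])
    return between / (within + 1e-12)

# Well-separated classes should score far higher than overlapping ones.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 25)
X_apart = rng.normal(size=(50, 2))
X_apart[y == 1] += 8.0
X_mixed = rng.normal(size=(50, 2))
```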

3.3. Proposed New Fault Mode Detection

As mentioned in Section 1, a new fault can appear or the severity of an existing crack can increase during the diagnosis process. Unsupervised clustering algorithms are appropriate for detecting an unknown number of faults in a fault signal. The process of our proposed online fault diagnosis system is illustrated in Figure 9. This process compares the optimal features extracted from unknown signals with the existing DSK to identify the number of fault modes.

3.3.1. k-Means Clustering

k-means clustering is the simplest unsupervised clustering algorithm and has been adopted in several application domains [35,36]. In k-means clustering, k centroids are initially selected at random, and data samples {x1, …xn} are assigned to disjoint clusters corresponding to the nearest centroid based on a clustering criterion function F [19], which is generally the squared distance between each sample xi and the centroid cj of cluster Cj, as given in Equation (6)
$$F(c_1,\ldots,c_k)=\sum_{j=1}^{k}\sum_{i=1}^{N_j}\left\|x_{i,j}-c_j\right\|^2\tag{6}$$
where c1 … ck are the centroids of clusters C1 … Ck, respectively, k is the total number of clusters, and Nj is the total number of samples assigned to the jth cluster. After data samples are assigned to clusters, centroids are relocated based on their membership information.
This process is repeated until there is no change in the value of F [37]. Since the k-means clustering algorithm is sensitive to the k initial centroids, it may converge to a suboptimal solution [20,38]. Many algorithms have been devised to reduce this sensitivity, but they inevitably increase the computational cost or introduce new variables. In this study, we use the initial centroid selection method of Khan [20], which considers the deepest valleys and highest gaps between clusters in a dataset. This algorithm sorts the data samples in ascending order of their magnitude d1, … dn, then calculates the Euclidean distance between each pair of consecutive points, Di = dist(di, di+1), where i = 1 … n − 1. Next, the distances Di are sorted in descending order while keeping their original indices i so that the position numbers can be identified after further processing. The algorithm selects the first k − 1 distances, where k is the user-defined number of clusters, and the points corresponding to the selected distances are combined into an upper bound set of k data groups {i1 … ik−1, n} and a lower bound set of k data groups {1, i1 + 1, … ik−1 + 1}. Finally, center medians are calculated using the data points between the lower and upper bound positions. These center medians are the initial k centroids and are used as the initial seeds for the k-means clustering process. This process is illustrated in Figure 10.
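For 1-D data, the gap-based seeding described above can be sketched in a few lines (an illustration of the idea in [20]; the function name is ours):

```python
import numpy as np

def initial_centroids_1d(data, k):
    """Gap-based seeding sketch for 1-D data: sort samples, split at the
    k-1 largest consecutive gaps, and take each segment's median as an
    initial centroid."""
    d = np.sort(np.asarray(data, dtype=float))
    gaps = np.diff(d)                               # D_i = d_{i+1} - d_i
    cut_idx = np.sort(np.argsort(gaps)[-(k - 1):])  # positions of the largest gaps
    segments = np.split(d, cut_idx + 1)             # k contiguous data groups
    return np.array([np.median(s) for s in segments])

data = [1.0, 1.2, 0.9, 10.0, 10.3, 9.8, 20.1, 19.9, 20.4]
print(initial_centroids_1d(data, 3))  # one seed near each of 1, 10, 20
```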

3.3.2. The Proposed MPDFCDF Cluster Evaluation Method

As mentioned in Section 1, k-means clustering requires kkmeans to be specified in advance. However, it is difficult to know the number of clusters (fault modes) of unknown signals in an online fault detection and diagnosis system. This problem can be resolved by repeating the clustering process for different values of kkmeans (e.g., kkmeans = 2 … 7) and then determining kopt through an efficient cluster evaluation method. The elbow method is a well-known unsupervised cluster evaluation technique for this purpose [39,40]. It is a visual evaluation technique in which the number of clusters is increased one by one and a cost function of the clusters, essentially the sum of squared errors, is evaluated at every step. As the number of clusters increases, the cost function first decreases dramatically and then levels off, and the optimal number of clusters is read from the bend in the resulting curve. However, this visual determination is often ambiguous, and evaluating the cost function of the clusters is challenging.
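The elbow cost can be sketched by running a plain Lloyd's k-means for increasing k and recording the sum of squared errors (a minimal illustration; the deterministic strided seeding is our simplification for repeatability):

```python
import numpy as np

def kmeans_sse(X, k, iters=50):
    """Plain Lloyd's k-means; returns the final sum of squared errors,
    the cost that the elbow method tracks as k grows."""
    centers = X[:: max(1, len(X) // k)][:k].astype(float).copy()
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2),
                           axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    resid = np.linalg.norm(X - centers[labels], axis=1)
    return float(np.sum(resid ** 2))

# Three tight blobs: SSE drops steeply up to k = 3, then flattens.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.2, (30, 2)) for c in (0.0, 5.0, 10.0)])
sse = {k: kmeans_sse(X, k) for k in range(1, 7)}
```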
The silhouette coefficient [24] and COSES [26] methods work as a cluster evaluation for well-separated data. However, data from faulty bearings have an unusual and complex distribution, which is not handled well by average-distance-based cluster evaluation methods. In Figure 11a, data are well separated into easily identifiable clusters, but in Figure 11b, two data groups are in close proximity to each other but far away from the other groups. In such cases, the larger distance dominates over the smaller distance, leading to incorrect identification of the number of clusters. Furthermore, unequal densities of data groups could bias the cluster evaluation. To address these issues, the MPDFCDF is proposed, which considers the multivariate Gaussian distribution and probability density functions (PDFs) of the data in feature space.
The PDF ρ(x) can be used to express the statistical distribution of data in clusters. The univariate PDF of a normal distribution is given by Equation (7), where µ is the mean, and σ2 is the variance
$$\rho(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).\tag{7}$$
However, real-world problem sets exhibit behavior that is more suitably represented by multivariate distributions, such as the multivariate normal distribution given by Equation (8)
$$\rho(X)=\frac{1}{\sqrt{|\Sigma|(2\pi)^d}}\exp\left(-\frac{1}{2}(X-\mu)\Sigma^{-1}(X-\mu)^{\top}\right)\tag{8}$$
where d is the dimension of data, X and µ are 1 × d vectors, and ∑ is the d × d symmetric positive definite covariance matrix.
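Equation (8) can be evaluated directly; with a diagonal covariance it factorizes into a product of the univariate densities of Equation (7). A minimal sketch (the function name is ours):

```python
import numpy as np

def mvn_pdf(X, mu, cov):
    """Multivariate normal density of Equation (8) at a single point X."""
    d = len(mu)
    diff = np.asarray(X, dtype=float) - np.asarray(mu, dtype=float)
    norm = 1.0 / np.sqrt(np.linalg.det(cov) * (2 * np.pi) ** d)
    return float(norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff))

# At the mean with identity covariance in 2-D, the density is 1/(2*pi).
p = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
print(p)  # 1 / (2*pi) ~ 0.1592
```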
The properties of these distributions and the distance between their means represent key information that can help in identifying kopt. If samples are distributed in a scattered way, the probability that data exist in dense areas is low, and the deviation of samples from the mean is high. On the other hand, if samples are distributed in a compact way, the probability that data exist in dense areas is high and the deviation of samples from the mean is low. Thus, the ratio of the highest PDF value to the variance of the distribution is a good measure of the cluster density, whereas distances between means of clusters measure the cluster separation. kopt is identified by evaluating clustered samples on these criteria, using different values of k. The MPDFCDF cluster evaluation method, thus, consists of the following steps, which are illustrated in Figure 12:
  • Step 1: Classify samples belonging to k clusters by k-means clustering;
  • Step 2: Calculate the mean µ and covariance ∑ of all clusters;
  • Step 3: Calculate the PDFs for all clusters using Equation (8);
  • Step 4: Calculate the local distribution factor for each cluster;
    $$\mathrm{Local\_Distribution\_Factor}_c=\frac{\max(\rho_c(X))}{2\times\max(\Sigma_c)};$$
    the 2-sigma rule [41] is used here, which states that 95% of samples exist within 2-sigma of the mean and the remaining 5% can be regarded as outliers.
  • Step 5: Calculate the global density factor for the distribution
    Global Density Factor = min(Local_Distribution_Factor);
  • Step 6: Calculate the global separability factor
    Global Separability Factor = min(Inter_Cluster_Dist)
    where
    $$\mathrm{Inter\_Cluster\_Dist}(c_i,c_j)=\sqrt{(c_i-c_j)^2};$$
  • Step 7: Calculate the MPDFCDF
    MPDFCDF = |Global Density Factor − Global Separability Factor|
    where the minimum MPDFCDF value identifies kopt.
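The steps above can be sketched as a single scoring function for one clustering result. This is an illustration of Steps 2–7 under our reading of the factors; the exact normalizations in the paper may differ, and at least two clusters with multiple samples each are assumed:

```python
import numpy as np

def mpdfcdf(X, labels):
    """Sketch of the MPDFCDF score (Steps 2-7) for one clustering result."""
    d = X.shape[1]
    centers, local = [], []
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu = Xc.mean(axis=0)
        cov = np.cov(Xc.T) + 1e-9 * np.eye(d)      # Step 2: mean and covariance
        inv, det = np.linalg.inv(cov), np.linalg.det(cov)
        diff = Xc - mu
        quad = np.sum(diff @ inv * diff, axis=1)
        pdf = np.exp(-0.5 * quad) / np.sqrt(det * (2 * np.pi) ** d)  # Step 3
        local.append(pdf.max() / (2 * cov.max()))  # Step 4: local distribution factor
        centers.append(mu)
    gdf = min(local)                               # Step 5: global density factor
    centers = np.array(centers)
    gsf = min(np.linalg.norm(ci - cj)              # Step 6: global separability factor
              for i, ci in enumerate(centers) for cj in centers[i + 1:])
    return float(abs(gdf - gsf))                   # Step 7: MPDFCDF

# Score one candidate clustering of two well-separated groups; in the full
# method this is repeated for each k and the minimum identifies kopt.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (40, 2)), rng.normal(6, 0.3, (40, 2))])
labels = np.repeat([0, 1], 40)
score = mpdfcdf(X, labels)
```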

3.4. System Update

In the proposed online fault detection and diagnosis model, known fault data from the DSK are combined with the optimal fault signature of unknown signals to detect new fault modes. k-means clustering and the proposed MPDFCDF cluster evaluation method are used to determine the number of fault modes (kopt) in the combined data. If kopt is larger than the number of known faults Fdsk, the model automatically updates the DSK to incorporate the newly detected fault class.

3.5. Fault Classification Using k-NN

The proposed online fault diagnosis model classifies unknown samples using k-NN [11,42]. We validate the model by estimating the generalized classification accuracy through k-fold cross-validation. The k-NN classifier classifies a sample based on the majority class among its kknn nearest neighbors, which are determined using distance criteria. Several distance criteria have been used in previous studies, such as the Euclidean distance, correlation between samples, city block distance, cosine distance, and Hamming distance. In this study, we use the Euclidean distance, which is the most widely used distance criterion and can be formulated as Equation (9).
$$dist(x_i,x_j)=\sqrt{\sum_{d=1}^{D}\left(x_{i,d}-x_{j,d}\right)^2}\tag{9}$$
where dist(xi, xj) represents the Euclidean distance between two data points xi and xj, and samples are represented by D feature dimensions.
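The k-NN rule with the Euclidean distance of Equation (9) can be sketched as follows (a minimal NumPy illustration; the function and variable names are ours):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority-vote k-NN using the Euclidean distance of Equation (9)."""
    preds = []
    for x in np.atleast_2d(X_test):
        d = np.linalg.norm(X_train - x, axis=1)    # distances to all training samples
        nearest = y_train[np.argsort(d)[:k]]       # labels of the k nearest neighbors
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])      # majority class
    return np.array(preds)

# Two separated classes; a query near class 1 is labeled 1.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, [[4.8, 5.2]], k=3))  # [1]
```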

4. Experimental Results and Analysis

4.1. Experimental Datasets

To evaluate the proposed model, signals were collected from a fault-free bearing (FFB) and from faulty bearings at various shaft speeds. In this study, six fault modes are considered: a 3-mm crack on the outer raceway (BCO 3 mm), a 3-mm crack on the inner raceway (BCI 3 mm), a 3-mm crack on the roller (BCR 3 mm), a 12-mm crack on the outer raceway (BCO 12 mm), a 12-mm crack on the inner raceway (BCI 12 mm), and a 10-mm crack on the roller (BCR 10 mm). Signals were collected at different rotational speeds (300 rpm to 500 rpm). The AE data are divided into five datasets based on the shaft rotational speed, where each dataset contains a total of 630 AE signals of 5 s duration each (i.e., 90 AE signals for the FFB and 90 for each of the six bearing faults). Table 1 presents the details of the experimental datasets.
To validate the proposed model, five test cases are defined to progressively introduce new faults (small or big cracks) into the system, as shown in Table 2. To diagnose the continuous growth of different faults in practice, the data acquisition process would have to run for a very long time (several years) over the operating life of a bearing, which is impractical. Thus, to validate the proposed online bearing fault diagnosis model and its ability to identify new faults, signals of different faults are incorporated in progressive test cases. Finally, the proposed model is validated on the five considered datasets. Each dataset is divided into two halves, one for analysis and the other for evaluation. The analysis dataset, containing 45 of the 90 signals from each class, is used to detect fault modes and update the DSK. The evaluation dataset, also containing 45 samples from each class, is used for diagnosis. The system is assumed to start with diagnostic knowledge of a healthy machine only, originating from 30 FFB signals, where each signal is represented by a 22-dimensional feature vector. Each test case uses 30 randomly selected samples from the corresponding fault class in the analysis dataset to evaluate the unknown fault mode detection process and update the system. For evaluation, we randomly select 30 samples from each candidate fault class in the evaluation dataset.
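The per-class split described above can be sketched as follows; the seed and index arrays are illustrative rather than taken from the experiments.

```python
import numpy as np

rng = np.random.default_rng(42)               # illustrative seed
per_class = np.arange(90)                     # 90 AE signals per class
analysis, evaluation = per_class[:45], per_class[45:]
# 30 random analysis samples drive fault-mode detection and DSK updates;
# 30 random evaluation samples are used for diagnosis.
detect_set = rng.choice(analysis, size=30, replace=False)
eval_set = rng.choice(evaluation, size=30, replace=False)
```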

4.2. Identification of the Optimal Number of Clusters Kopt Using the MPDFCDF Cluster Evaluation Method

The proposed MPDFCDF cluster evaluation method determines the optimal number of clusters by considering cluster density and inter-cluster separation based on the multivariate normal PDF. In Figure 13, the MPDFCDF is compared with two state-of-the-art cluster evaluation methods, the compactness and separation measure of clusters (COSES) and the silhouette coefficient, for five sample distributions containing multiple fault signals.

4.3. New Fault Mode Detection and System Update for Online Fault Diagnosis

Initially, we assume that the system only has knowledge of a healthy machine. Thus, the DSK includes 22 features for each of the 30 FFB signals. For each test case, the system extracts an optimal number of features from the unknown signals and adds those features to the DSK to form the analysis set. The new fault mode detection module detects kopt, which represents the number of fault modes in the analysis set. If kopt is greater than Fdsk, the number of known fault modes in the current DSK, the new fault signatures are added to the DSK.
As a central part of the online diagnosis system, the DSK updating process extracts a heterogeneous feature pool from the signals of all detected fault modes. The hybrid feature selection algorithm selects the discriminant fault signatures and constructs a DSK, which is used in the evaluation process. Table 3 summarizes the optimal feature subsets for different test cases with different datasets. After completing each test case, the DSK is represented as Cdsk × Ndsk × Fdsk, where Cdsk is the number of emerging fault classes, Ndsk is the number of samples in each class, and Fdsk is the number of discriminant fault signature variables.
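The Cdsk × Ndsk × Fdsk representation and its flattening for the classifier can be sketched as follows; the concrete sizes are a hypothetical example.

```python
import numpy as np

# Hypothetical DSK after a test case: Cdsk = 4 fault classes, Ndsk = 30
# samples per class, Fdsk = 3 discriminant fault-signature variables.
Cdsk, Ndsk, Fdsk = 4, 30, 3
dsk = np.zeros((Cdsk, Ndsk, Fdsk))

# Flatten to (Cdsk * Ndsk) x Fdsk with one class label per sample,
# the shape consumed by the k-NN training step.
train_X = dsk.reshape(Cdsk * Ndsk, Fdsk)
labels = np.repeat(np.arange(Cdsk), Ndsk)
```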

4.4. Effectiveness of Online Fault Diagnosis

The effectiveness of the proposed online fault diagnosis system is evaluated by considering its classification performance using the k-NN classifier. For each test case, the DSK is used as a training set for the k-NN classifier, where kknn is set to 5 (the number of nearest neighbors is chosen as the nearest integer to √Ndsk [25]). The evaluation process includes five progressive test cases t, each containing an evaluation set Ceval,t × Neval,t. To calculate the classification accuracy before updating the system, the DSK of the previous test case, Cdsk,t−1 × Ndsk,t−1 × Fdsk,t−1, is used as a training set and Ceval,t × Neval,t × Fdsk,t−1 is used as a test set. To calculate the classification accuracy after updating the system, the DSK of the current test case, Cdsk,t × Ndsk,t × Fdsk,t, is used as a training set and Ceval,t × Neval,t × Fdsk,t is used as a test set. In the classification process, we use k-fold cross-validation (k-cv), where kcv is set to 3. Thus, Neval,t is randomly divided into thirds, and each third is used for testing during one k-cv iteration. The cross-validation process is executed Niteration times (Niteration = 10 in this study), and the generalized classification performance is calculated by averaging the classification accuracies. Table 4 presents the diagnosis performance for the five test cases in terms of the average sensitivity per class and the average classification accuracy (ACA), defined in Equations (10) and (11), respectively.
Sensitivity = NTP / (NTP + NFN) × 100 (%),
ACA = ( ∑classes NTP / Nsamples ) × 100 (%)
where NTP is the number of correctly classified samples in a fault class, NFN is the number of incorrectly classified samples in a fault class, and Nsamples is the total number of test samples.
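Equations (10) and (11) amount to per-class recall and overall accuracy, and can be computed as follows; the function names are illustrative.

```python
import numpy as np

def sensitivity(y_true, y_pred, cls):
    # Equation (10): NTP / (NTP + NFN) within one fault class.
    mask = (y_true == cls)
    return 100.0 * (y_pred[mask] == cls).mean()

def aca(y_true, y_pred):
    # Equation (11): correctly classified samples over all test samples.
    return 100.0 * (y_pred == y_true).mean()
```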
Figure 14 shows the average classification performance for the different datasets. The proposed method reliably detects unknown faults and updates the DSK (i.e., the fault modes and optimal fault signatures), and the updated DSK then yields improved classification performance. For each test case, when new faults are introduced, the classification performance initially degrades because the system does not yet have enough information about the new faults; it improves once the DSK is updated with the new fault modes. The classification accuracy is high for test case 1 because it involves only two well-separated classes. For test case 3, the accuracy is almost 100% because the unknown data contain no new faults. Otherwise, the classification performance decreases as new faults (small or big cracks) appear; after updating the DSK, however, it improves by about 21%. Furthermore, classification accuracies for more severe faults (e.g., bigger cracks) are better than those for less severe faults (e.g., smaller cracks) because more severe faults produce more separable features. These experiments show that the proposed model can detect newly encountered faults in the data. In a real monitoring environment such as a factory, however, bearing quality degrades very slowly. According to the American Bearing Manufacturers Association (ABMA) and the American Rolling Bearing Company, the minimum L10 life of a bearing operated continuously 24 h a day is 60,000 h, which is around seven years. The proposed model could therefore be deployed in a real industrial environment in which a chunk of data is captured at a fixed interval, e.g., once a day, once a week, once a month, or less frequently. The automatic diagnosis model would analyze the collected data and identify any new fault or abnormality of the bearing. Newly detected fault modes would then be incorporated into the database of the diagnosis model, and the system could continue running until the bearing quality falls below a tolerable level.

5. Conclusions

Reliable and early bearing fault diagnosis can reduce manufacturing losses by decreasing the possibility of an unexpected breakdown of an industrial motor. Traditional fault diagnosis models work on predefined and incomplete knowledge of bearing faults. However, due to the progressive nature of fault growth during operation under heavy load, new faults may appear on the surface of the bearing. To address these issues, this paper proposed a reliable online fault diagnosis system that detects unknown fault modes over the operating life of the bearing and updates the diagnostic system knowledge (DSK) in real time. A heterogeneous feature vector is extracted from the AE signals collected from the bearing to train the diagnosis model. To detect an unknown number of faults, the k-means unsupervised clustering algorithm is used with an automatically determined number of clusters. However, determining the correct number of clusters kopt is challenging due to the dynamic behavior of faults. For this reason, a new MPDFCDF cluster evaluation method is introduced to calculate kopt and establish the existence of new fault modes. When a new fault mode is detected, the system automatically updates the DSK. Finally, the hybrid feature selection method selects optimal features from the updated fault database, and the k-NN classification model identifies faults in unknown signals. To evaluate the proposed model, sensor signals with multiple faults, different fault severities, and different rotational speeds were considered. According to the experimental results, the proposed model showed much better diagnosis performance after incorporating newly detected faults. In addition, the proposed MPDFCDF algorithm outperformed existing cluster evaluation methods in terms of finding the optimal number of fault modes in the data.

Author Contributions

All authors contributed equally to the conception of the idea, as well as implementing and analyzing the experimental results, and writing the manuscript.

Funding

This work was supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry & Energy (MOTIE) of Korea (No. 20181510102160), and in part by MOTIE and the Korea Institute for Advancement of Technology (KIAT) through the Encouragement Program for the Industries of the Economic Cooperation Region (P0006123).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, P.; Du, Y.; Habetler, T.G.; Lu, B. A survey of condition monitoring and protection methods for medium-voltage induction motors. IEEE Trans. Ind. Appl. 2011, 47, 34–46. [Google Scholar] [CrossRef]
  2. Kang, M.; Kim, J.; Kim, J.-M.; Tan, A.C.C.; Kim, E.Y.; Choi, B.-K. Reliable Fault Diagnosis for Low-Speed Bearings Using Individually Trained Support Vector Machines with Kernel Discriminative Feature Analysis. IEEE Trans. Power Electron. 2015, 30, 2786–2797. [Google Scholar] [CrossRef]
  3. Xu, Y.; Zhang, K.; Ma, C.; Cui, L.; Tian, W. Adaptive Kurtogram and its applications in rolling bearing fault diagnosis. Mech. Syst. Signal Process. 2019, 130, 87–107. [Google Scholar] [CrossRef]
  4. Kang, M.; Islam, M.R.; Kim, J.; Kim, J.; Pecht, M. A Hybrid Feature Selection Scheme for Reducing Diagnostic Performance Deterioration Caused by Outliers in Data-Driven Diagnostics. IEEE Trans. Ind. Electron. 2016, 63, 3299–3310. [Google Scholar] [CrossRef]
  5. Seshadrinath, J.; Singh, B.; Panigrahi, B.K. Vibration analysis based interturn fault diagnosis in induction machines. IEEE Trans. Ind. Inform. 2014, 10, 340–350. [Google Scholar] [CrossRef]
  6. Zhou, W.; Lu, B.; Habetler, T.G.; Harley, R.G. Incipient Bearing Fault Detection via Motor Stator Current Noise Cancellation Using Wiener Filter. IEEE Trans. Ind. Appl. 2009, 45, 1309–1317. [Google Scholar] [CrossRef]
  7. Zhou, W.; Habetler, T.G.; Harley, R.G. Bearing Fault Detection via Stator Current Noise Cancellation and Statistical Control. IEEE Trans. Ind. Electron. 2008, 55, 4260–4269. [Google Scholar] [CrossRef]
  8. Berry, J.E. How to Track Rolling Element Bearing Health with Vibration Signature Analysis. Sound Vib. 1991, 11, 24–35. [Google Scholar]
  9. Jiang, F.; Zhu, Z.; Li, W.; Ren, Y.; Zhou, G.; Chang, Y. A Fusion Feature Extraction Method Using EEMD and Correlation Coefficient Analysis for Bearing Fault Diagnosis. Appl. Sci. 2018, 8, 1621. [Google Scholar] [CrossRef]
  10. Seshadrinath, J.; Singh, B.; Panigrahi, B.K. Investigation of vibration signatures for multiple fault diagnosis in variable frequency drives using complex wavelets. IEEE Trans. Power Electron. 2014, 29, 936–945. [Google Scholar] [CrossRef]
  11. Pandya, D.H.; Upadhyay, S.H.; Harsha, S.P. Fault diagnosis of rolling element bearing with intrinsic mode function of acoustic emission data using APF-KNN. Expert Syst. Appl. 2013, 40, 4137–4145. [Google Scholar] [CrossRef]
  12. Niknam, S.A.; Songmene, V.; Au, Y.H.J. The Use of Acoustic Emission Information to Distinguish Between Dry and Lubricated Rolling Element Bearings in Low-Speed Rotating Machines. Int. J. Adv. Manuf. Technol. 2013, 69, 2679–2689. [Google Scholar] [CrossRef]
  13. Eftekharnejad, B.; Carrasco, M.R.; Charnley, B.; Mba, D. The Application of Spectral Kurtosis on Acoustic Emission and Vibrations from a Defective Bearing. Mech. Syst. Signal Process. 2011, 25, 266–284. [Google Scholar] [CrossRef]
  14. Rauber, T.W.; Boldt, F.A.; Varejao, F.M. Heterogeneous Feature Models and Feature Selection Applied to Bearing Fault Diagnosis. IEEE Trans. Ind. Electron. 2015, 62, 637–646. [Google Scholar] [CrossRef]
  15. Islam, R.; Khan, S.A.; Kim, J.-M. Discriminant Feature Distribution Analysis-Based Hybrid Feature Selection for Online Bearing Fault Diagnosis in Induction Motors. J. Sens. 2016, 2016, 1–16. [Google Scholar] [CrossRef]
  16. Yu, K.; Lin, T.R.; Tan, J.; Ma, H. An adaptive sensitive frequency band selection method for empirical wavelet transform and its application in bearing fault diagnosis. Measurement 2019, 134, 375–384. [Google Scholar] [CrossRef]
  17. Yin, G.; Zhang, Y.-T.; Li, Z.-N.; Ren, G.-Q.; Fan, H.-B. Online fault diagnosis method based on Incremental Support Vector Data Description and Extreme Learning Machine with incremental output structure. Neurocomputing 2014, 128, 224–231. [Google Scholar] [CrossRef]
  18. Jiang, W.; Zhou, J.; Liu, H.; Shan, Y. A multi-step progressive fault diagnosis method for rolling element bearing based on energy entropy theory and hybrid ensemble auto-encoder. ISA Trans. 2019, 87, 235–250. [Google Scholar] [CrossRef] [PubMed]
  19. Yiakopoulos, C.T.; Gryllias, K.C.; Antoniadis, I.A. Rolling element bearing fault detection in industrial environments based on a K-means clustering approach. Expert Syst. Appl. 2011, 38, 2888–2911. [Google Scholar] [CrossRef]
  20. Khan, F. An initial seed selection algorithm for k-means clustering of geo referenced data to improve replicability of cluster assignments for mapping application. Appl. Soft Comput. 2012, 12, 3698–3700. [Google Scholar] [CrossRef]
  21. Bradley, P.S.; Fayyad, U.M. Refining initial points for K-means clustering. In Proceedings of the Fifteenth International Conference on Machine Learning, San Francisco, CA, USA, 24–27 July 1998; pp. 91–99. [Google Scholar]
  22. Likas, A.; Vlassis, N.; Verbeek, J.J. The global K-means clustering algorithm. Pattern Recognit. 2003, 36, 451–461. [Google Scholar] [CrossRef]
  23. Arthur, D.; Vassilvitskii, S. K-means ++: The Advantages of Careful Seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, 7–9 January 2007; pp. 1027–1035. [Google Scholar]
  24. Kaufman, L.; Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2005. [Google Scholar]
  25. Tan, P.-N.; Steinbach, M.; Kumar, V. Introduction to Data Mining, 1st ed.; Pearson Addison Wessley: Boston, MA, USA, 2005. [Google Scholar]
  26. Rahman, M.A.; Islam, M.Z. A hybrid clustering technique combining a novel genetic algorithm with K-Means. Knowl.-Based Syst. 2014, 71, 345–365. [Google Scholar] [CrossRef]
  27. Kang, M.; Kim, J.; Wills, L.M.; Kim, J.-M. Time-Varying and Multiresolution Envelope Analysis and Discriminative Feature Analysis for Bearing Fault Diagnosis. IEEE Trans. Ind. Electron. 2015, 62, 7749–7761. [Google Scholar]
  28. WS Sensor, General Purpose Wideband Sensor. Available online: http://www.physicalacoustics.com/content/literature/sensors/Model_WSa.pdf (accessed on 20 May 2019).
  29. Bediaga, I.; Mendizabal, X.; Arnaiz, A.; Munoa, J. Ball Bearing Damage Detection Using Traditional Signal Processing Algorithms. IEEE Instrum. Meas. Mag. 2013, 16, 20–25. [Google Scholar] [CrossRef]
  30. Randall, R.B.; Antoni, J. Rolling Element Bearing Diagnostics—A Tutorial. Mech. Syst. Signal Process. 2011, 25, 485–520. [Google Scholar] [CrossRef]
  31. Li, B.; Zhang, P.-L.; Tian, H.; Mi, S.-S.; Liu, D.-S.; Ren, G.-Q. A New Feature Extraction and Selection Scheme for Hybrid Fault Diagnosis of Gearbox. Expert Syst. Appl. 2011, 38, 10000–10009. [Google Scholar] [CrossRef]
  32. Liu, C.; Jiang, D.; Yang, W. Global Geometric Similarity Scheme for Feature Selection in Fault Diagnosis. Expert Syst. Appl. 2014, 41, 3585–3595. [Google Scholar] [CrossRef]
  33. Li, Z.; Yan, X.; Tian, Z.; Yuan, C.; Peng, Z.; Li, L. Blind Vibration Component Separation and Nonlinear Feature Extraction Applied to the Non stationary Vibration Signals for the Gearbox Multi-Fault Diagnosis. Measurement 2013, 46, 259–271. [Google Scholar] [CrossRef]
  34. Islam, R.; Khan, S.A.; Kim, J.-M. Maximum class separability-based discriminant feature selection using a GA for reliable fault diagnosis of induction motors. Lect. Notes Artif. Intell. (LNAI) 2015, 9227, 526–537. [Google Scholar]
  35. Steinley, D. K-means clustering: A half-century synthesis. Br. J. Math. Stat. Psychol. 2006, 59, 1–34. [Google Scholar] [CrossRef]
  36. Wu, X.; Kumar, V.; Quinlan, J.R.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, G.J.; Ng, A.; Liu, B.; Yu, P.S.; et al. Top 10 algorithms in data mining. Knowl. Inf. Syst. 2008, 14, 1–37. [Google Scholar] [CrossRef]
  37. Lloyd, S.P. Least Squares Quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137. [Google Scholar] [CrossRef]
  38. Naldi, M.C.; Campello, R.J.G.B. Comparison of distributed evolutionary k-means clustering algorithms. Neurocomputing 2015, 163, 78–93. [Google Scholar] [CrossRef]
  39. Zhang, Y.; Mańdziuk, J.; Quek, C.H.; Goh, B.W. Curvature-based method for determining the number of clusters. Inf. Sci. 2017, 415–416, 414–428. [Google Scholar] [CrossRef]
  40. Yahyaoui, H.; Own, H.S. Unsupervised clustering of service performance behaviors. Inf. Sci. 2018, 422, 558–571. [Google Scholar] [CrossRef]
  41. Kazmier, L.J. Schaum’s Outline of Business Statistics; McGraw Hill Professional: New York, NY, USA, 2009; p. 359. [Google Scholar]
  42. Yigit, H. A weighting approach for KNN classifier. In Proceedings of the 2013 International Conference on Electronics, Computer and Computation (ICECCO), Ankara, Turkey, 7–9 November 2013; pp. 228–231. [Google Scholar]
Figure 1. Block diagram of the experimental setup.
Figure 2. Cylindrical rolling element bearings, (a) parts of the bearing, (b) measurement of bearing.
Figure 3. (a) Bearing fault signal collection test rig and (b) data acquisition system.
Figure 4. Faulty bearing parts with different crack severities. (a) BCO (bearing crack on the outer surface) 3 mm, (b) BCI (bearing crack on the inner surface) 3 mm, (c) BCR (bearing crack on the roller surface) 3 mm, (d) BCO 12 mm, (e) BCI 12 mm, (f) BCR 10 mm.
Figure 5. Flow diagram of the proposed online bearing fault diagnosis model, where Fdsk is the number of fault modes in the diagnostic system knowledge (DSK).
Figure 6. Calculation of NOFreqR,i, NIFreqR,i, and NRFreqR,i. (a) Defect frequencies of BCI; (b) defect frequencies of BCO; (c) defect frequencies of BCR.
Figure 7. The flow of DSK construction by selecting discriminant fault signatures of detected fault modes.
Figure 8. Feature subset evaluation algorithm: (a) Within-class compactness value, and (b) between-class distance value.
Figure 9. Proposed fault mode detection using k-means clustering and a novel cluster evaluation algorithm to detect the optimal number of clusters k.
Figure 10. Initial centroids selection process with an example.
Figure 11. Two cluster distribution examples, (a) well separated data into easily identifiable clusters and (b) two data groups are in close proximity to each other but far away from the other groups.
Figure 12. Steps of the multivariate probability density function’s cluster distribution factor (MPDFCDF) cluster evaluation method.
Figure 13. (a–e) Sample distributions of different datasets and selection of the optimal kopt (big blue circle) using (f–j) the compactness and separation measure of clusters (COSES) method, (k–o) the silhouette coefficient, and (p–t) the proposed MPDFCDF cluster evaluation method.
Figure 14. Diagnosis performance improvement for different test cases with different conditional datasets, i.e., (a) Dataset-1: 300 RPM, (b) Dataset-2: 350 RPM, (c) Dataset-3: 400 RPM, (d) Dataset-4: 450 RPM, and (e) Dataset-5: 500 RPM. The y-axis represents average classification accuracy (ACA) (%).
Table 1. Various datasets with the crack severity specifications considered in this study.
Dataset | Dataset 1 | Dataset 2 | Dataset 3 | Dataset 4 | Dataset 5
Average RPM | 300 rpm | 350 rpm | 400 rpm | 450 rpm | 500 rpm
Fault severity (small crack) | Crack length: 3 mm, width: 0.35 mm, depth: 0.3 mm on the outer raceway, inner raceway, and roller
Fault severity (big crack) | Crack length: 12 mm on the outer and inner raceways and 10 mm on the roller, width: 0.49 mm, depth: 0.5 mm
Table 2. Organization of different test cases.
Test Case | Unknown Signals | System Update | System Condition after Test Case
Initial | - | - | FFB
Test Case 1 | BCI 3 mm | Update system knowledge | FFB, BCI 3 mm
Test Case 2 | BCO 3 mm, BCR 3 mm | Update system knowledge | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm
Test Case 3 | BCI 3 mm, BCR 3 mm | No update | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm
Test Case 4 | BCI 12 mm | Update system knowledge | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm, BCI 12 mm
Test Case 5 | BCO 12 mm, BCR 12 mm | Update system knowledge | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm, BCI 12 mm, BCO 12 mm, BCR 12 mm (final)
Table 3. Optimal feature vector for different test cases of different datasets.
Each cell lists the selected features, with the resulting system knowledge (Cdsk × Ndsk × Fdsk) in parentheses.
Dataset | Initial | Test Case 1 | Test Case 2 | Test Case 3 | Test Case 4 | Test Case 5
1 | f1~f22 (1 × 30 × 22) | f1, f12, f13, f15 (2 × 30 × 4) | f2, f9, f10 (4 × 30 × 3) | f2, f9, f10 (4 × 30 × 3) | f2, f9 (5 × 30 × 2) | f2, f9, f13 (7 × 30 × 3)
2 | f1~f22 (1 × 30 × 22) | f2, f9, f14 (2 × 30 × 3) | f2, f9, f22 (4 × 30 × 3) | f2, f9, f22 (4 × 30 × 3) | f2, f9, f13 (5 × 30 × 3) | f2, f9, f11, f22 (7 × 30 × 4)
3 | f1~f22 (1 × 30 × 22) | f2, f15, f16, f20 (2 × 30 × 4) | f2, f16 (4 × 30 × 2) | f2, f16 (4 × 30 × 2) | f2, f11, f15, f16 (5 × 30 × 4) | f2, f9, f11 (7 × 30 × 3)
4 | f1~f22 (1 × 30 × 22) | f2, f9, f13 (2 × 30 × 3) | f2, f9, f20, f21 (4 × 30 × 4) | f2, f9, f20, f21 (4 × 30 × 4) | f2, f9 (5 × 30 × 2) | f2, f9, f13 (7 × 30 × 3)
5 | f1~f22 (1 × 30 × 22) | f2, f9 (2 × 30 × 2) | f2, f9, f20, f21 (4 × 30 × 4) | f2, f9, f20, f21 (4 × 30 × 4) | f2, f9, f22 (5 × 30 × 3) | f2, f11 (7 × 30 × 2)
Table 4. Diagnosis performance of five successive test cases of datasets 1 to 5 in terms of average sensitivity per class.
Values are average sensitivities per class (%), with standard deviations in parentheses.
Test Case | Condition | FFB | BCI 3 mm | BCO 3 mm | BCR 3 mm | BCI 12 mm | BCO 12 mm | BCR 10 mm | Average
Dataset-1: 300 RPM
Case-1 | Before update | 100.00 (0.0) | 100.00 (0.0) | - | - | - | - | - | 100.00
Case-1 | Updated | 100.00 (0.0) | 100.00 (0.0) | - | - | - | - | - | 100.00
Case-2 | Before update | 90.45 (2.3) | 91.23 (2.1) | 65.45 (4.6) | 69.56 (4.3) | - | - | - | 79.17
Case-2 | Updated | 100.00 (0.0) | 94.25 (1.3) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 98.56
Case-3 | Before update | 99.58 (0.7) | 94.36 (1.8) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 98.49
Case-3 | Updated | 100.00 (0.0) | 95.12 (1.1) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 98.78
Case-4 | Before update | 93.67 (2.1) | 94.26 (1.9) | 90.23 (2.4) | 88.59 (3.9) | 73.62 (4.3) | - | - | 88.07
Case-4 | Updated | 100.00 (0.0) | 93.50 (1.4) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | 98.70
Case-5 | Before update | 96.56 (1.6) | 95.64 (1.7) | 91.56 (2.1) | 93.12 (1.8) | 98.26 (0.9) | 79.46 (4.1) | 82.69 (3.7) | 91.04
Case-5 | Updated | 100.00 (0.0) | 95.25 (1.6) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 99.32
Dataset-2: 350 RPM
Case-1 | Before update | 100.00 (0.0) | 97.78 (1.1) | - | - | - | - | - | 98.89
Case-1 | Updated | 100.00 (0.0) | 100.00 (0.0) | - | - | - | - | - | 100.00
Case-2 | Before update | 82.23 (3.2) | 73.12 (3.9) | 75.94 (3.6) | 75.45 (3.3) | - | - | - | 76.69
Case-2 | Updated | 100.00 (0.0) | 89.85 (2.1) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 97.46
Case-3 | Before update | 100.00 (0.0) | 92.35 (1.6) | 100.00 (0.0) | 99.56 (0.7) | - | - | - | 97.98
Case-3 | Updated | 100.00 (0.0) | 92.00 (1.5) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 98.00
Case-4 | Before update | 93.42 (1.5) | 93.26 (1.6) | 91.86 (2.1) | 94.56 (1.3) | 89.00 (2.0) | - | - | 92.42
Case-4 | Updated | 100.00 (0.0) | 97.78 (0.9) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | 99.56
Case-5 | Before update | 92.65 (1.4) | 94.00 (1.0) | 92.15 (1.8) | 94.22 (1.6) | 88.25 (2.5) | 82.22 (2.8) | 86.00 (2.4) | 89.93
Case-5 | Updated | 100.00 (0.0) | 98.00 (0.9) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 99.71
Dataset-3: 400 RPM
Case-1 | Before update | 96.35 (0.6) | 100.00 (0.0) | - | - | - | - | - | 98.18
Case-1 | Updated | 100.00 (0.0) | 100.00 (0.0) | - | - | - | - | - | 100.00
Case-2 | Before update | 85.00 (2.1) | 82.56 (2.8) | 77.26 (3.2) | 78.95 (2.7) | - | - | - | 80.94
Case-2 | Updated | 100.00 (0.0) | 98.25 (1.1) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 99.56
Case-3 | Before update | 100.00 (0.0) | 99.56 (0.4) | 100.00 (0.0) | 99.98 (0.1) | - | - | - | 99.89
Case-3 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 100.00
Case-4 | Before update | 97.25 (1.3) | 96.00 (1.7) | 94.56 (1.6) | 97.15 (1.0) | 92.14 (1.7) | - | - | 95.42
Case-4 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | 100.00
Case-5 | Before update | 95.45 (1.2) | 96.25 (1.1) | 96.00 (1.0) | 97.43 (0.8) | 95.64 (1.5) | 82.22 (2.8) | 86.00 (2.9) | 92.71
Case-5 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00
Dataset-4: 450 RPM
Case-1 | Before update | 99.45 (0.7) | 99.00 (0.6) | - | - | - | - | - | 99.23
Case-1 | Updated | 100.00 (0.0) | 100.00 (0.0) | - | - | - | - | - | 100.00
Case-2 | Before update | 88.00 (2.4) | 86.59 (2.3) | 82.45 (2.8) | 83.12 (2.8) | - | - | - | 85.04
Case-2 | Updated | 100.00 (0.0) | 98.00 (1.6) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 99.50
Case-3 | Before update | 98.60 (1.5) | 99.45 (0.4) | 100.00 (0.0) | 99.85 (0.2) | - | - | - | 99.48
Case-3 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 100.00
Case-4 | Before update | 97.46 (1.3) | 98.12 (1.2) | 97.23 (1.5) | 98.50 (1.3) | 93.00 (1.6) | - | - | 96.86
Case-4 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | 100.00
Case-5 | Before update | 95.89 (1.4) | 98.26 (1.1) | 96.23 (1.9) | 94.53 (2.1) | 97.58 (1.4) | 95.68 (1.6) | 88.26 (2.2) | 95.20
Case-5 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00
Dataset-5: 500 RPM
Case-1 | Before update | 97.78 (1.6) | 100.00 (0.0) | - | - | - | - | - | 98.89
Case-1 | Updated | 100.00 (0.0) | 100.00 (0.0) | - | - | - | - | - | 100.00
Case-2 | Before update | 89.00 (1.9) | 87.50 (2.1) | 85.17 (2.3) | 85.85 (2.2) | - | - | - | 86.88
Case-2 | Updated | 100.00 (0.0) | 98.00 (1.6) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 99.50
Case-3 | Before update | 100.00 (0.0) | 99.98 (0.3) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 100.00
Case-3 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | - | 100.00
Case-4 | Before update | 97.50 (1.3) | 97.80 (1.4) | 97.43 (1.2) | 98.30 (1.2) | 94.50 (1.9) | - | - | 97.11
Case-4 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | - | - | 100.00
Case-5 | Before update | 97.12 (1.6) | 98.60 (1.4) | 95.26 (1.8) | 98.20 (1.1) | 96.80 (1.3) | 87.50 (2.6) | 90.76 (1.8) | 94.89
Case-5 | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00

Share and Cite

MDPI and ACS Style

Islam, M.R.; Kim, Y.-H.; Kim, J.-Y.; Kim, J.-M. Detecting and Learning Unknown Fault States by Automatically Finding the Optimal Number of Clusters for Online Bearing Fault Diagnosis. Appl. Sci. 2019, 9, 2326. https://doi.org/10.3390/app9112326
