# Detecting and Learning Unknown Fault States by Automatically Finding the Optimal Number of Clusters for Online Bearing Fault Diagnosis


## Abstract


## Featured Application

**This paper proposes a model for bearing fault diagnosis in industrial rotating machinery. A conventional fault diagnosis model can only predict bearing faults from a predefined set of stored fault information. The proposed approach provides an online fault diagnosis process in which unknown faults are detected and added to the knowledge of the diagnosis system.**

## 1. Introduction

$a_i$ to other points in the same cluster and its minimum average dissimilarity $b_i$ to points in other clusters. The silhouette coefficient, calculated by averaging all silhouette values, gives the number of clusters in the data. In [26], Rahman et al. proposed a computationally simpler cluster evaluation algorithm called the compactness and separation measure of clusters (COSES). It calculates cluster separability by measuring the minimum distance between cluster seeds and calculates compactness by averaging the sum of the distances of all samples of a cluster to its seed.

- Since it is difficult to know in advance what types of faults a healthy bearing can experience, traditional offline fault diagnosis systems classify faults on the basis of incomplete knowledge. To address this issue, we propose an online bearing fault diagnosis model that first detects unknown fault modes in real time using k-means clustering and then continually updates the diagnostic system knowledge (DSK) for more reliable fault diagnosis.
- It is difficult to determine the number of discernible fault modes through k-means clustering alone without knowing how to define $k_{kmeans}$. To address this issue, we propose a new cluster evaluation method, the MPDFCDF, to identify the optimal number of clusters ($k_{opt}$) in the fault signatures.
- To evaluate our proposed model, we recorded bearing signals for different shaft speeds and constructed a heterogeneous fault signature pool to extract the maximum possible fault information. We used a hybrid fault signature selection to create discriminative fault signatures and build a DSK. Finally, we used the k-NN classification algorithm to estimate the classification performance.

## 2. Data Acquisition

The contact angle ($\alpha_{cangle}$) is 0°, and the roller diameter ($B_d$) and pitch diameter ($P_d$) are 9 mm and 46.5 mm, respectively [27]. Figure 2 shows the parts and dimensions of the cylindrical rolling element bearing. In this setup, a general-purpose wide-band acoustic emission (AE) sensor (WSα) from Physical Acoustics Corporation (PAC) is used [28]. The AE sensor has an operating frequency range of 100–1000 kHz, a peak sensitivity of 55 dB, and a resonant frequency of 125 kHz. The sensor is attached to the top of the non-drive-end bearing housing, and the collected signals are sampled at 250 kHz using a PCI-2 based system [28]. Figure 3 shows the experimental setup and data acquisition system. Bearing faults are seeded defects or cracks of different sizes (small crack: 3 mm; big crack: 12 mm), as shown in Figure 4, where the fault names are Bearing Crack Outer (BCO), Bearing Crack Inner (BCI), and Bearing Crack Roller (BCR).

## 3. Proposed Online Fault Diagnosis Model

#### 3.1. Heterogeneous Feature Pool Configuration

$f_1$, root-mean-square (RMS); $f_2$, square root of the amplitude (SRA); $f_3$, kurtosis (KV); $f_4$, skewness (SV); $f_5$, peak-to-peak (PPV); $f_6$, crest factor (CF); $f_7$, impulse factor (IF); $f_8$, margin factor (MF); $f_9$, shape factor (SF); $f_{10}$, kurtosis factor (KF); and frequency-domain features of the signal, including $f_{11}$, frequency center (FC); $f_{12}$, RMS frequency (RMSF); and $f_{13}$, root variance frequency (RVF) [4]. Equations (1) and (2) present mathematical equations describing the extracted statistical features.
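As an illustration of how such statistical features can be computed, the following is a minimal sketch using common textbook definitions. These definitions are an assumption and may differ in detail from Equations (1) and (2); the function names are hypothetical, and only a subset of the 13 features is shown.

```python
import numpy as np


def time_domain_features(x):
    """Compute a subset of the time-domain features for a 1-D signal.

    Assumption: common textbook definitions, not necessarily Equation (1).
    """
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    mean, std = x.mean(), x.std()
    rms = np.sqrt(np.mean(x ** 2))            # root-mean-square
    sra = np.mean(np.sqrt(abs_x)) ** 2        # square root of the amplitude
    peak = abs_x.max()
    return {
        "rms": rms,
        "sra": sra,
        "kurtosis": np.mean(((x - mean) / std) ** 4),
        "skewness": np.mean(((x - mean) / std) ** 3),
        "peak_to_peak": x.max() - x.min(),
        "crest_factor": peak / rms,
        "impulse_factor": peak / abs_x.mean(),
        "margin_factor": peak / sra,
        "shape_factor": rms / abs_x.mean(),
    }


def frequency_domain_features(x, fs):
    """Compute FC, RMSF, and RVF from the magnitude spectrum.

    Assumption: spectrum-weighted moments of frequency; fs is the sampling rate in Hz.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    total = spectrum.sum()
    fc = np.sum(freqs * spectrum) / total                         # frequency center
    rmsf = np.sqrt(np.sum(freqs ** 2 * spectrum) / total)         # RMS frequency
    rvf = np.sqrt(np.sum((freqs - fc) ** 2 * spectrum) / total)   # root variance frequency
    return {"fc": fc, "rmsf": rmsf, "rvf": rvf}
```

For a pure sinusoid the frequency center and RMS frequency both coincide with the tone frequency, which gives a quick sanity check of the implementation.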

$N_{rollers}$ is the number of rollers, $F_{shaft}$ is the shaft speed in hertz, $\alpha_{cangle}$ is the contact angle, $B_d$ is the roller diameter, and $P_d$ is the pitch diameter.
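The parameters above are the inputs of the standard kinematic bearing defect frequency formulas. The sketch below uses the widely published textbook forms for BPFO, BPFI, BSF, and FTF, which is an assumption about the equation these symbols belong to; the function name and the roller count used in the usage note are hypothetical.

```python
import math


def bearing_defect_frequencies(n_rollers, f_shaft, contact_angle_deg, roller_d, pitch_d):
    """Return the characteristic defect frequencies in hertz.

    Assumption: textbook kinematic formulas with a stationary outer race.
    roller_d and pitch_d must share the same unit (e.g., mm).
    """
    ratio = (roller_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    return {
        "BPFO": 0.5 * n_rollers * f_shaft * (1.0 - ratio),                # outer raceway
        "BPFI": 0.5 * n_rollers * f_shaft * (1.0 + ratio),                # inner raceway
        "BSF": 0.5 * (pitch_d / roller_d) * f_shaft * (1.0 - ratio ** 2),  # roller spin
        "FTF": 0.5 * f_shaft * (1.0 - ratio),                             # cage
    }
```

For example, `bearing_defect_frequencies(13, 300 / 60, 0.0, 9.0, 46.5)` evaluates the formulas for a 300 RPM shaft with the roller and pitch diameters reported in Section 2; the roller count of 13 is a placeholder, not a value taken from the paper.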

$F_{shaft}$, and 2 × BSF is modulated by the rotational speed of the cage (FTF). Hence, amplitude modulation produces sidebands as an explicit symptom of inner-raceway and roller faults in bearings. Furthermore, random variations in the analytic bearing defect frequencies on the order of 1–2% are observed [30], which are accounted for by the factor $RV_{order}$ = 2% [27]. Frequency ranges for outer-related defects ($N_{OFreqR,i}$), inner-related defects ($N_{IFreqR,i}$), and roller-related defects ($N_{RFreqR,i}$) are computed using Equation (4) and depicted in Figure 6.

$RMSF_{BPFI_1}$ ($f_{14}$), $RMSF_{BPFI_2}$ ($f_{15}$), $RMSF_{BPFI_3}$ ($f_{16}$), $RMSF_{BPFO_1}$ ($f_{17}$), $RMSF_{BPFO_2}$ ($f_{18}$), $RMSF_{BPFO_3}$ ($f_{19}$), $RMSF_{2 \times BSF_1}$ ($f_{20}$), $RMSF_{2 \times BSF_2}$ ($f_{21}$), $RMSF_{2 \times BSF_3}$ ($f_{22}$). In total, we extract 22 features from each signal to create its heterogeneous feature pool.

#### 3.2. DSK Construction by Selecting Discriminant Fault Signatures of Detected Fault Modes

($N_{detected\_class} \times N_{sample} \times N_{feature}$) are divided into two halves, where one half is used for the filter selection process and the other half is used for the wrapper selection process ($N_{feature}$ is the size of the original feature vector). In the filter selection process, ($N_{detected\_class} \times N_{sample}/2 \times N_{feature}$) are randomly divided for k-fold cross-validation ($k_{cv}$), where k = 2. Optimal feature sets are selected using sequential forward selection (SFS) with feature subset evaluation. The filter selection process selects 10 suboptimal feature subsets from $N_{iteration} \times k_{cv}$ runs (5 × 2 = 10), where $N_{iteration}$ = 5.

($N_{detected\_class} \times N_{sample}/2 \times N_{feature}$) are used as a training set and the rest are used as a testing set. The wrapper selection process picks the best feature combination according to the maximum average classification accuracy. Figure 7 illustrates the overall flow of the DSK construction by selecting discriminant fault signatures of detected fault modes.

#### 3.3. Proposed New Fault Mode Detection

#### 3.3.1. k-Means Clustering

$\{x_1, \ldots, x_n\}$ are assigned to disjoint clusters corresponding to the nearest centroid based on a clustering criterion function F [19], which is generally the squared distance between each sample $x_i$ and the centroid $c_j$ of cluster $C_j$, as given in Equation (6).

$c_1, \ldots, c_k$ are the centroids of clusters $C_1, \ldots, C_k$, respectively, k is the total number of clusters, and $N_j$ is the total number of samples assigned to the jth cluster. After data samples are assigned to clusters, centroids are relocated based on their membership information.

$d_1, \ldots, d_n$, then calculates the Euclidean distance between each pair of consecutive points, $D_i = dist(d_i - d_{i+1})$, where i = 1 … n − 1. Next, the distances $D_i$ are sorted in descending order while keeping their original indices i, so that the corresponding positions can be identified after further processing. The algorithm selects the first k − 1 distances, where k is the user-defined number of clusters, and the points corresponding to the selected distances are combined into an upper bound set of k data groups $\{i_1, \ldots, i_{k-1}, n\}$ and a lower bound set of k data groups $\{1, i_1 + 1, \ldots, i_{k-1} + 1\}$. Finally, center medians are calculated using the data points between the lower and upper bound positions. These center medians are the initial k centroids and are used as the initial seeds for the k-means clustering process. This process is illustrated in Figure 10.
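The seeding procedure described above can be sketched as follows. Because the ordering of the points $d_1, \ldots, d_n$ is not fully specified in the surviving text, this sketch assumes the points are first sorted by Euclidean norm; that ordering, the function names, and the plain Lloyd-style iteration are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def initial_seeds(data, k):
    """Pick k initial centroids as medians of groups split at the k-1 largest gaps."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    # Assumption: order the points by Euclidean norm before measuring gaps.
    pts = data[np.argsort(np.linalg.norm(data, axis=1))]
    gaps = np.linalg.norm(pts[1:] - pts[:-1], axis=1)   # D_i between consecutive points
    cut = np.sort(np.argsort(gaps)[::-1][:k - 1])       # positions of the k-1 largest gaps
    upper = np.append(cut, len(pts) - 1)                # last index of each group
    lower = np.insert(cut + 1, 0, 0)                    # first index of each group
    return np.array([np.median(pts[lo:hi + 1], axis=0) for lo, hi in zip(lower, upper)])


def kmeans(data, seeds, n_iter=100):
    """Plain k-means iteration started from the given seeds."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    centroids = np.array(seeds, dtype=float)
    for _ in range(n_iter):
        # Assign every sample to its nearest centroid.
        dist = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Relocate each centroid to the mean of its members (keep it if the cluster is empty).
        updated = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids
```

Because the seeds are derived deterministically from the data, repeated runs on the same samples start from the same centroids, which is the replicability benefit this kind of seeding is meant to provide.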

#### 3.3.2. The Proposed MPDFCDF Cluster Evaluation Method

$k_{kmeans}$ to be specified in advance. However, it is difficult to know the number of clusters (fault modes) of unknown signals in an online fault detection and diagnosis system. This problem can be resolved by repeating the clustering process for different values of $k_{kmeans}$ (e.g., $k_{kmeans}$ = 2 … 7) and then determining $k_{opt}$ through an efficient cluster evaluation method. The elbow method is a well-known unsupervised cluster evaluation technique for this purpose [39,40]. It is a visual technique in which the number of clusters is increased one at a time and a cost function of the clusters is evaluated at every step. The cost function is the sum of squared errors of the clusters. As the number of clusters increases, the value of the cost function first decreases sharply and then levels off, and the optimal number of clusters is read from the bend in the curve. However, this visual determination is often ambiguous, and evaluating the cost function of the clusters is challenging.
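The elbow cost function mentioned above, the sum of squared errors of the clusters, can be sketched as follows. The function name is hypothetical, and the cluster labels are assumed to come from any k-means run.

```python
import numpy as np


def sum_of_squared_errors(X, labels):
    """Sum of squared distances from every sample to the mean of its own cluster."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        total += float(((members - members.mean(axis=0)) ** 2).sum())
    return total
```

Evaluating this cost for each candidate $k_{kmeans}$ and plotting it against k yields the elbow curve; the position of the bend still has to be judged by eye, which is the ambiguity noted above.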

$\sigma^2$ is the variance

$k_{opt}$. If samples are distributed in a scattered way, the probability that data exist in dense areas is low, and the deviation of samples from the mean is high. On the other hand, if samples are distributed in a compact way, the probability that data exist in dense areas is high and the deviation of samples from the mean is low. Thus, the ratio of the highest PDF value to the variance of the distribution is a good measure of the cluster density, whereas distances between means of clusters measure the cluster separation. $k_{opt}$ is identified by evaluating clustered samples on these criteria, using different values of k. The MPDFCDF cluster evaluation method, thus, consists of the following steps, which are illustrated in Figure 12:

- Step 1: Classify samples belonging to k clusters by k-means clustering;
- Step 2: Calculate the mean µ and covariance ∑ of all clusters;
- Step 3: Calculate the PDFs for all clusters using Equation (8);
- Step 4: Calculate the local distribution factor for each cluster: $$\mathrm{Local\_Distribution\_Factor}_{c} = \frac{\max(\rho_{c}(X))}{2 \times \max(\Sigma_{c})};$$
- Step 5: Calculate the global density factor for the distribution: $$\mathrm{Global\_Density\_Factor} = \min(\mathrm{Local\_Distribution\_Factor});$$
- Step 6: Calculate the global separability factor: $$\mathrm{Global\_Separability\_Factor} = \min(\mathrm{Inter\_Cluster\_Dist}), \qquad \mathrm{Inter\_Cluster\_Dist}(c_{i}, c_{j}) = \sqrt{(c_{i} - c_{j})^{2}};$$
- Step 7: Calculate the MPDFCDF: $$\mathrm{MPDFCDF} = \mathrm{abs}(\mathrm{Global\_Density\_Factor} - \mathrm{Global\_Separability\_Factor})$$

$k_{opt}$.
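Steps 2 to 7 above can be sketched for a single clustering result as follows. This is a minimal sketch that assumes Equation (8) is the multivariate Gaussian density, takes max(Σ) as the largest entry of the cluster covariance matrix, and measures inter-cluster distance between cluster means; the function names are hypothetical. The rule for choosing $k_{opt}$ from the scores of different k values is not reproduced here.

```python
import numpy as np


def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X (assumed form of Equation (8))."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    expo = -0.5 * np.einsum("ij,jk,ik->i", diff, inv, diff)
    return np.exp(expo) / norm


def mpdfcdf_score(X, labels):
    """Compute the MPDFCDF value for samples X (n x d) with given cluster labels.

    Requires at least two clusters, each with a non-singular covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    means, local_factors = [], []
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu = Xc.mean(axis=0)                              # Step 2: cluster mean
        cov = np.atleast_2d(np.cov(Xc, rowvar=False))     # Step 2: cluster covariance
        rho = gaussian_pdf(Xc, mu, cov)                   # Step 3: PDF of the cluster samples
        local_factors.append(rho.max() / (2.0 * cov.max()))  # Step 4
        means.append(mu)
    global_density = min(local_factors)                   # Step 5
    global_separability = min(                            # Step 6
        np.linalg.norm(means[i] - means[j])
        for i in range(len(means))
        for j in range(i + 1, len(means))
    )
    return abs(global_density - global_separability)      # Step 7
```

In use, the score would be computed once per candidate k (for example k = 2 … 7) from the labels returned by k-means, and the resulting values compared across k.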

#### 3.4. System Update

($k_{opt}$) in the combined data. If $k_{opt}$ is larger than the number of known faults $F_{dsk}$, the model automatically updates the DSK to incorporate the newly detected fault class.

#### 3.5. Fault Classification Using k-NN

$k_{knn}$ nearest neighbors, which are determined using distance criteria. Several distance criteria have been used in previous studies, such as the Euclidean distance, correlation between samples, city block distance, cosine distance, and Hamming distance. In this study, we use the Euclidean distance, which is the most widely used distance criterion and can be formulated as Equation (9).

($x_i - x_j$) represents the Euclidean distance between two data points $x_i$ and $x_j$, and samples are represented by d feature dimensions.
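A k-NN prediction with the Euclidean distance can be sketched as below. This is an illustrative majority-vote implementation under the stated distance criterion, not the authors' code; the function name and the example labels are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(train_X, train_y, query, k):
    """Predict the label of one query sample by majority vote of its k nearest neighbors."""
    train_X = np.asarray(train_X, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance over the d feature dimensions.
    dists = np.sqrt(((train_X - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

With the DSK as the training set, each unknown sample would be labeled by calling `knn_predict` with the chosen $k_{knn}$.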

## 4. Experimental Results and Analysis

#### 4.1. Experimental Datasets

#### 4.2. Identification of the Optimal Number of Clusters $k_{opt}$ Using the MPDFCDF Cluster Evaluation Method

#### 4.3. New Fault Mode Detection and System Update for Online Fault Diagnosis

$k_{opt}$, which represents the number of fault modes in the analysis set. If $k_{opt}$ is greater than $F_{dsk}$, the number of fault modes in the current DSK, the new fault signatures are added to the DSK.

$C_{dsk} \times N_{dsk} \times F_{dsk}$, where $C_{dsk}$ is the number of emerging fault classes, $N_{dsk}$ is the number of samples in each class, and $F_{dsk}$ is the number of discriminant fault signature variables.

#### 4.4. Effectiveness of Online Fault Diagnosis

$k_{knn}$ is set to 5 (the number of nearest neighbors k is equal to the nearest integer to $\sqrt{N_{dsk}}$ [25]). The evaluation process includes five progressive test cases t, each containing an evaluation set $C_{eval,t} \times N_{eval,t}$. To calculate the classification accuracy before updating the system, the DSK of the previous test case, $C_{dsk,t-1} \times N_{dsk,t-1} \times F_{dsk,t-1}$, is used as a training set and $C_{eval,t} \times N_{eval,t} \times F_{dsk,t-1}$ is used as a test set. To calculate the classification accuracy after updating the system, the DSK of the current test case, $C_{dsk,t} \times N_{dsk,t} \times F_{dsk,t}$, is used as a training set and $C_{eval,t} \times N_{eval,t} \times F_{dsk,t}$ is used as a test set. In the classification process, we use k-cv (cross-validation), where $k_{cv}$ is set to 3. Thus, $N_{eval,t}$ is randomly divided into thirds, and each third is used for testing during each k-cv iteration. The cross-validation process is executed $N_{iteration}$ times ($N_{iteration}$ = 10 in this study), and the generalized classification performance is calculated by averaging the classification accuracy. Table 4 presents the diagnosis performance for the five test cases in terms of the average sensitivity per class and the average classification accuracy (ACA), defined in Equations (10) and (11), respectively.

$N_{TP}$ is the number of correctly classified samples in a fault class, $N_{FN}$ is the number of incorrectly classified samples in a fault class, and $N_{samples}$ is the total number of test samples.
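Using the quantities defined above, the two metrics can be sketched as follows. The sketch assumes the usual forms, sensitivity = $N_{TP} / (N_{TP} + N_{FN})$ per class and ACA = (sum of $N_{TP}$ over all classes) / $N_{samples}$, both expressed in percent; these forms are an assumption about Equations (10) and (11), and the function names are hypothetical.

```python
import numpy as np


def sensitivity_per_class(y_true, y_pred):
    """Percentage of correctly classified samples for each true class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    result = {}
    for c in np.unique(y_true):
        mask = y_true == c
        n_tp = int(np.sum(y_pred[mask] == c))   # correctly classified samples of class c
        n_fn = int(np.sum(y_pred[mask] != c))   # misclassified samples of class c
        result[c.item()] = 100.0 * n_tp / (n_tp + n_fn)
    return result


def average_classification_accuracy(y_true, y_pred):
    """Percentage of all test samples that were classified correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return 100.0 * float(np.sum(y_true == y_pred)) / len(y_true)
```

When every class has the same number of test samples, the ACA equals the mean of the per-class sensitivities.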

## 5. Conclusions

$k_{opt}$ is challenging due to the dynamic behavior of faults. For this reason, a new MPDFCDF cluster evaluation method is introduced to calculate $k_{opt}$ and establish the existence of new fault modes. When a new fault mode is detected, the system automatically updates the DSK. Finally, the optimal feature selection method is used to select optimal features from the updated fault database, and the k-NN classification model is adopted to identify faults in the unknown signals. To evaluate the proposed model, sensor signals with multiple faults, different fault severities, and different rotation speeds were considered. According to the experimental results, the proposed model showed much better diagnosis performance after incorporating the detected new fault. In addition, the proposed MPDFCDF algorithm outperformed existing cluster evaluation methods in terms of finding the optimal number of fault modes in data.

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Zhang, P.; Du, Y.; Habetler, T.G.; Lu, B. A survey of condition monitoring and protection methods for medium-voltage induction motors. IEEE Trans. Ind. Appl. **2011**, 47, 34–46.
2. Kang, M.; Kim, J.; Kim, J.-M.; Tan, A.C.C.; Kim, E.Y.; Choi, B.-K. Reliable Fault Diagnosis for Low-Speed Bearings Using Individually Trained Support Vector Machines with Kernel Discriminative Feature Analysis. IEEE Trans. Power Electron. **2015**, 30, 2786–2797.
3. Xu, Y.; Zhang, K.; Ma, C.; Cui, L.; Tian, W. Adaptive Kurtogram and its applications in rolling bearing fault diagnosis. Mech. Syst. Signal Process. **2019**, 130, 87–107.
4. Kang, M.; Islam, M.R.; Kim, J.; Kim, J.; Pecht, M. A Hybrid Feature Selection Scheme for Reducing Diagnostic Performance Deterioration Caused by Outliers in Data-Driven Diagnostics. IEEE Trans. Ind. Electron. **2016**, 63, 3299–3310.
5. Seshadrinath, J.; Singh, B.; Panigrahi, B.K. Vibration analysis based interturn fault diagnosis in induction machines. IEEE Trans. Ind. Inform. **2014**, 10, 340–350.
6. Zhou, W.; Lu, B.; Habetler, T.G.; Harley, R.G. Incipient Bearing Fault Detection via Motor Stator Current Noise Cancellation Using Wiener Filter. IEEE Trans. Ind. Appl. **2009**, 45, 1309–1317.
7. Zhou, W.; Habetler, T.G.; Harley, R.G. Bearing Fault Detection via Stator Current Noise Cancellation and Statistical Control. IEEE Trans. Ind. Electron. **2008**, 55, 4260–4269.
8. Berry, J.E. How to Track Rolling Element Bearing Health with Vibration Signature Analysis. Sound Vib. **1991**, 11, 24–35.
9. Jiang, F.; Zhu, Z.; Li, W.; Ren, Y.; Zhou, G.; Chang, Y. A Fusion Feature Extraction Method Using EEMD and Correlation Coefficient Analysis for Bearing Fault Diagnosis. Appl. Sci. **2018**, 8, 1621.
10. Seshadrinath, J.; Singh, B.; Panigrahi, B.K. Investigation of vibration signatures for multiple fault diagnosis in variable frequency drives using complex wavelets. IEEE Trans. Power Electron. **2014**, 29, 936–945.
11. Pandya, D.H.; Upadhyay, S.H.; Harsha, S.P. Fault diagnosis of rolling element bearing with intrinsic mode function of acoustic emission data using APF-KNN. Expert Syst. Appl. **2013**, 40, 4137–4145.
12. Niknam, S.A.; Songmene, V.; Au, Y.H.J. The Use of Acoustic Emission Information to Distinguish Between Dry and Lubricated Rolling Element Bearings in Low-Speed Rotating Machines. Int. J. Adv. Manuf. Technol. **2013**, 69, 2679–2689.
13. Eftekharnejad, B.; Carrasco, M.R.; Charnley, B.; Mba, D. The Application of Spectral Kurtosis on Acoustic Emission and Vibrations from a Defective Bearing. Mech. Syst. Signal Process. **2011**, 25, 266–284.
14. Rauber, T.W.; Boldt, F.A.; Varejao, F.M. Heterogeneous Feature Models and Feature Selection Applied to Bearing Fault Diagnosis. IEEE Trans. Ind. Electron. **2015**, 62, 637–646.
15. Islam, R.; Khan, S.A.; Kim, J.-M. Discriminant Feature Distribution Analysis-Based Hybrid Feature Selection for Online Bearing Fault Diagnosis in Induction Motors. J. Sens. **2016**, 2016, 1–16.
16. Yu, K.; Lin, T.R.; Tan, J.; Ma, H. An adaptive sensitive frequency band selection method for empirical wavelet transform and its application in bearing fault diagnosis. Measurement **2019**, 134, 375–384.
17. Yin, G.; Zhang, Y.-T.; Li, Z.-N.; Ren, G.-Q.; Fan, H.-B. Online fault diagnosis method based on Incremental Support Vector Data Description and Extreme Learning Machine with incremental output structure. Neurocomputing **2014**, 128, 224–231.
18. Jiang, W.; Zhou, J.; Liu, H.; Shan, Y. A multi-step progressive fault diagnosis method for rolling element bearing based on energy entropy theory and hybrid ensemble auto-encoder. ISA Trans. **2019**, 87, 235–250.
19. Yiakopoulos, C.T.; Gryllias, K.C.; Antoniadis, I.A. Rolling element bearing fault detection in industrial environments based on a K-means clustering approach. Expert Syst. Appl. **2011**, 38, 2888–2911.
20. Khan, F. An initial seed selection algorithm for k-means clustering of geo referenced data to improve replicability of cluster assignments for mapping application. Appl. Soft Comput. **2012**, 12, 3698–3700.
21. Bradley, P.S.; Fayyad, U.M. Refining initial points for K-means clustering. In Proceedings of the Fifteenth International Conference on Machine Learning, San Francisco, CA, USA, 24–27 July 1998; pp. 91–99.
22. Likas, A.; Vlassis, N.; Verbeek, J.J. The global K-means clustering algorithm. Pattern Recognit. **2003**, 36, 451–461.
23. Arthur, D.; Vassilvitskii, S. K-means++: The Advantages of Careful Seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, 7–9 January 2007; pp. 1027–1035.
24. Kaufman, L.; Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2005.
25. Tan, P.-N.; Steinbach, M.; Kumar, V. Introduction to Data Mining, 1st ed.; Pearson Addison Wesley: Boston, MA, USA, 2005.
26. Rahman, M.A.; Islam, M.Z. A hybrid clustering technique combining a novel genetic algorithm with K-Means. Knowl.-Based Syst. **2014**, 71, 345–365.
27. Kang, M.; Kim, J.; Wills, L.M.; Kim, J.-M. Time-Varying and Multiresolution Envelope Analysis and Discriminative Feature Analysis for Bearing Fault Diagnosis. IEEE Trans. Power Electron. **2015**, 62, 7749–7761.
28. WS Sensor, General Purpose Wideband Sensor. Available online: http://www.physicalacoustics.com/content/literature/sensors/Model_WSa.pdf (accessed on 20 May 2019).
29. Bediaga, I.; Mendizabal, X.; Arnaiz, A.; Munoa, J. Ball Bearing Damage Detection Using Traditional Signal Processing Algorithms. IEEE Instrum. Meas. Mag. **2013**, 16, 20–25.
30. Randall, R.B.; Antoni, J. Rolling Element Bearing Diagnostics—A Tutorial. Mech. Syst. Signal Process. **2011**, 25, 485–520.
31. Li, B.; Zhang, P.-L.; Tian, H.; Mi, S.-S.; Liu, D.-S.; Ren, G.-Q. A New Feature Extraction and Selection Scheme for Hybrid Fault Diagnosis of Gearbox. Expert Syst. Appl. **2011**, 38, 10000–10009.
32. Liu, C.; Jiang, D.; Yang, W. Global Geometric Similarity Scheme for Feature Selection in Fault Diagnosis. Expert Syst. Appl. **2014**, 41, 3585–3595.
33. Li, Z.; Yan, X.; Tian, Z.; Yuan, C.; Peng, Z.; Li, L. Blind Vibration Component Separation and Nonlinear Feature Extraction Applied to the Nonstationary Vibration Signals for the Gearbox Multi-Fault Diagnosis. Measurement **2013**, 46, 259–271.
34. Islam, R.; Khan, S.A.; Kim, J.-M. Maximum class separability-based discriminant feature selection using a GA for reliable fault diagnosis of induction motors. Lect. Notes Artif. Intell. (LNAI) **2015**, 9227, 526–537.
35. Steinley, D. K-means clustering: A half-century synthesis. Br. J. Math. Stat. Psychol. **2006**, 59, 1–34.
36. Wu, X.; Kumar, V.; Quinlan, J.R.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, G.J.; Ng, A.; Liu, B.; Yu, P.S.; et al. Top 10 algorithms in data mining. Knowl. Inf. Syst. **2008**, 14, 1–37.
37. Lloyd, S.P. Least Squares Quantization in PCM. IEEE Trans. Inf. Theory **1982**, 28, 129–137.
38. Naldi, M.C.; Campello, R.J.G.B. Comparison of distributed evolutionary k-means clustering algorithms. Neurocomputing **2015**, 163, 78–93.
39. Zhang, Y.; Mańdziuk, J.; Quek, C.H.; Goh, B.W. Curvature-based method for determining the number of clusters. Inf. Sci. **2017**, 415–416, 414–428.
40. Yahyaoui, H.; Own, H.S. Unsupervised clustering of service performance behaviors. Inf. Sci. **2018**, 422, 558–571.
41. Kazmier, L.J. Schaum’s Outline of Business Statistics; McGraw Hill Professional: New York, NY, USA, 2009; p. 359.
42. Yigit, H. A weighting approach for KNN classifier. In Proceedings of the 2013 International Conference on Electronics, Computer and Computation (ICECCO), Ankara, Turkey, 7–9 November 2013; pp. 228–231.

**Figure 2.** Cylindrical rolling element bearings: (**a**) parts of the bearing, (**b**) measurements of the bearing.

**Figure 4.** Faulty bearing parts with different crack severities: (**a**) BCO (bearing crack on the outer surface) 3 mm, (**b**) BCI (bearing crack on the inner surface) 3 mm, (**c**) BCR (bearing crack on the roller surface) 3 mm, (**d**) BCO 12 mm, (**e**) BCI 12 mm, (**f**) BCR 10 mm.

**Figure 5.** Flow diagram of the proposed online bearing fault diagnosis model, where $F_{dsk}$ is the number of fault modes in the diagnostic system knowledge (DSK).

**Figure 6.** Calculation of $N_{OFreqR,i}$, $N_{IFreqR,i}$, and $N_{RFreqR,i}$: (**a**) defect frequencies of BCI; (**b**) defect frequencies of BCO; (**c**) defect frequencies of BCR.

**Figure 7.** The flow of DSK construction by selecting discriminant fault signatures of detected fault modes.

**Figure 8.** Feature subset evaluation algorithm: (**a**) within-class compactness value and (**b**) between-class distance value.

**Figure 9.** Proposed fault mode detection using k-means clustering and a novel cluster evaluation algorithm to detect the optimal number of clusters k.

**Figure 11.** Two cluster distribution examples: (**a**) data well separated into easily identifiable clusters and (**b**) two data groups in close proximity to each other but far away from the other groups.

**Figure 12.** Steps of the multivariate probability density function’s cluster distribution factor (MPDFCDF) cluster evaluation method.

**Figure 13.** (**a**–**e**) Sample distributions of the different datasets, and selection of the optimal $k_{opt}$ (large blue circle) using (**f**–**j**) the compactness and separation measure of clusters (COSES) method, (**k**–**o**) the silhouette coefficient, and (**p**–**t**) the proposed MPDFCDF cluster evaluation method.

**Figure 14.** Diagnosis performance improvement for different test cases with different conditional datasets: (**a**) Dataset-1: 300 RPM, (**b**) Dataset-2: 350 RPM, (**c**) Dataset-3: 400 RPM, (**d**) Dataset-4: 450 RPM, and (**e**) Dataset-5: 500 RPM. The y-axis represents average classification accuracy (ACA) (%).

| | Dataset 1 | Dataset 2 | Dataset 3 | Dataset 4 | Dataset 5 |
|---|---|---|---|---|---|
| Average RPM | 300 rpm | 350 rpm | 400 rpm | 450 rpm | 500 rpm |
| Fault severity: small crack | Crack length: 3 mm, width: 0.35 mm, depth: 0.3 mm on outer raceway, inner raceway, and roller | | | | |
| Fault severity: big crack | Crack length: 12 mm on outer and inner raceways and 10 mm on roller, width: 0.49 mm, depth: 0.5 mm | | | | |

| | Initial | Test Case 1 | Test Case 2 | Test Case 3 | Test Case 4 | Test Case 5 | Final |
|---|---|---|---|---|---|---|---|
| System Condition | FFB | FFB | FFB, BCI 3 mm | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm, BCI 12 mm | FFB, BCI 3 mm, BCO 3 mm, BCR 3 mm, BCI 12 mm, BCO 12 mm, BCR 12 mm |
| Unknown Signals | | BCI 3 mm | BCO 3 mm, BCR 3 mm | BCI 3 mm, BCR 3 mm | BCI 12 mm | BCO 12 mm, BCR 12 mm | |
| Action After Test Case | | Update system knowledge | Update system knowledge | No update | Update system knowledge | Update system knowledge | |

| Dataset | Initial: Features | Initial: System Knowledge | Test Case 1: Features | Test Case 1: System Knowledge | Test Case 2: Features | Test Case 2: System Knowledge | Test Case 3: Features | Test Case 3: System Knowledge | Test Case 4: Features | Test Case 4: System Knowledge | Test Case 5: Features | Test Case 5: System Knowledge |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | f1–f22 | 1 × 30 × 22 | f1, f12, f13, f15 | 2 × 30 × 4 | f2, f9, f10 | 4 × 30 × 3 | f2, f9, f10 | 4 × 30 × 3 | f2, f9 | 5 × 30 × 2 | f2, f9, f13 | 7 × 30 × 3 |
| 2 | f1–f22 | 1 × 30 × 22 | f2, f9, f14 | 2 × 30 × 3 | f2, f9, f22 | 4 × 30 × 3 | f2, f9, f22 | 4 × 30 × 3 | f2, f9, f13 | 5 × 30 × 3 | f2, f9, f11, f22 | 7 × 30 × 4 |
| 3 | f1–f22 | 1 × 30 × 22 | f2, f15, f16, f20 | 2 × 30 × 4 | f2, f16 | 4 × 30 × 2 | f2, f16 | 4 × 30 × 2 | f2, f11, f15, f16 | 5 × 30 × 4 | f2, f9, f11 | 7 × 30 × 3 |
| 4 | f1–f22 | 1 × 30 × 22 | f2, f9, f13 | 2 × 30 × 3 | f2, f9, f20, f21 | 4 × 30 × 4 | f2, f9, f20, f21 | 4 × 30 × 4 | f2, f9 | 5 × 30 × 2 | f2, f9, f13 | 7 × 30 × 3 |
| 5 | f1–f22 | 1 × 30 × 22 | f2, f9 | 2 × 30 × 2 | f2, f9, f20, f21 | 4 × 30 × 4 | f2, f9, f20, f21 | 4 × 30 × 4 | f2, f9, f22 | 5 × 30 × 3 | f2, f11 | 7 × 30 × 2 |

**Table 4.** Diagnosis performance of five successive test cases of datasets 1 to 5 in terms of average sensitivity per class (%), with the standard deviation in parentheses.

| Dataset | Test Case | System Condition | FFB | BCI 3 mm | BCO 3 mm | BCR 3 mm | BCI 12 mm | BCO 12 mm | BCR 10 mm | Average |
|---|---|---|---|---|---|---|---|---|---|---|
| Dataset-1: 300 RPM | Case-1 | Before Update | 100.00 (0.0) | 100.00 (0.0) | | | | | | 100.00 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | | | | | | 100.00 |
| | Case-2 | Before Update | 90.45 (2.3) | 91.23 (2.1) | 65.45 (4.6) | 69.56 (4.3) | | | | 79.17 |
| | | Updated | 100.00 (0.0) | 94.25 (1.3) | 100.00 (0.0) | 100.00 (0.0) | | | | 98.56 |
| | Case-3 | Before Update | 99.58 (0.7) | 94.36 (1.8) | 100.00 (0.0) | 100.00 (0.0) | | | | 98.49 |
| | | Updated | 100.00 (0.0) | 95.12 (1.1) | 100.00 (0.0) | 100.00 (0.0) | | | | 98.78 |
| | Case-4 | Before Update | 93.67 (2.1) | 94.26 (1.9) | 90.23 (2.4) | 88.59 (3.9) | 73.62 (4.3) | | | 88.07 |
| | | Updated | 100.00 (0.0) | 93.50 (1.4) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | 98.70 |
| | Case-5 | Before Update | 96.56 (1.6) | 95.64 (1.7) | 91.56 (2.1) | 93.12 (1.8) | 98.26 (0.9) | 79.46 (4.1) | 82.69 (3.7) | 91.04 |
| | | Updated | 100.00 (0.0) | 95.25 (1.6) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 99.32 |
| Dataset-2: 350 RPM | Case-1 | Before Update | 100.00 (0.0) | 97.78 (1.1) | | | | | | 98.89 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | | | | | | 100.00 |
| | Case-2 | Before Update | 82.23 (3.2) | 73.12 (3.9) | 75.94 (3.6) | 75.45 (3.3) | | | | 76.69 |
| | | Updated | 100.00 (0.0) | 89.85 (2.1) | 100.00 (0.0) | 100.00 (0.0) | | | | 97.46 |
| | Case-3 | Before Update | 100.00 (0.0) | 92.35 (1.6) | 100.00 (0.0) | 99.56 (0.7) | | | | 97.98 |
| | | Updated | 100.00 (0.0) | 92.00 (1.5) | 100.00 (0.0) | 100.00 (0.0) | | | | 98.00 |
| | Case-4 | Before Update | 93.42 (1.5) | 93.26 (1.6) | 91.86 (2.1) | 94.56 (1.3) | 89.00 (2.0) | | | 92.42 |
| | | Updated | 100.00 (0.0) | 97.78 (0.9) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | 99.56 |
| | Case-5 | Before Update | 92.65 (1.4) | 94.00 (1.0) | 92.15 (1.8) | 94.22 (1.6) | 88.25 (2.5) | 82.22 (2.8) | 86.00 (2.4) | 89.93 |
| | | Updated | 100.00 (0.0) | 98.00 (0.9) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 99.71 |
| Dataset-3: 400 RPM | Case-1 | Before Update | 96.35 (0.6) | 100.00 (0.0) | | | | | | 98.18 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | | | | | | 100.00 |
| | Case-2 | Before Update | 85.00 (2.1) | 82.56 (2.8) | 77.26 (3.2) | 78.95 (2.7) | | | | 80.94 |
| | | Updated | 100.00 (0.0) | 98.25 (1.1) | 100.00 (0.0) | 100.00 (0.0) | | | | 99.56 |
| | Case-3 | Before Update | 100.00 (0.0) | 99.56 (0.4) | 100.00 (0.0) | 99.98 (0.1) | | | | 99.89 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | | 100.00 |
| | Case-4 | Before Update | 97.25 (1.3) | 96.00 (1.7) | 94.56 (1.6) | 97.15 (1.0) | 92.14 (1.7) | | | 95.42 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | 100.00 |
| | Case-5 | Before Update | 95.45 (1.2) | 96.25 (1.1) | 96.00 (1.0) | 97.43 (0.8) | 95.64 (1.5) | 82.22 (2.8) | 86.00 (2.9) | 92.71 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 |
| Dataset-4: 450 RPM | Case-1 | Before Update | 99.45 (0.7) | 99.00 (0.6) | | | | | | 99.23 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | | | | | | 100.00 |
| | Case-2 | Before Update | 88.00 (2.4) | 86.59 (2.3) | 82.45 (2.8) | 83.12 (2.8) | | | | 85.04 |
| | | Updated | 100.00 (0.0) | 98.00 (1.6) | 100.00 (0.0) | 100.00 (0.0) | | | | 99.50 |
| | Case-3 | Before Update | 98.60 (1.5) | 99.45 (0.4) | 100.00 (0.0) | 99.85 (0.2) | | | | 99.48 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | | 100.00 |
| | Case-4 | Before Update | 97.46 (1.3) | 98.12 (1.2) | 97.23 (1.5) | 98.50 (1.3) | 93.00 (1.6) | | | 96.86 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | 100.00 |
| | Case-5 | Before Update | 95.89 (1.4) | 98.26 (1.1) | 96.23 (1.9) | 94.53 (2.1) | 97.58 (1.4) | 95.68 (1.6) | 88.26 (2.2) | 95.20 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 |
| Dataset-5: 500 RPM | Case-1 | Before Update | 97.78 (1.6) | 100.00 (0.0) | | | | | | 98.89 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | | | | | | 100.00 |
| | Case-2 | Before Update | 89.00 (1.9) | 87.50 (2.1) | 85.17 (2.3) | 85.85 (2.2) | | | | 86.88 |
| | | Updated | 100.00 (0.0) | 98.00 (1.6) | 100.00 (0.0) | 100.00 (0.0) | | | | 99.50 |
| | Case-3 | Before Update | 100.00 (0.0) | 99.98 (0.3) | 100.00 (0.0) | 100.00 (0.0) | | | | 100.00 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | | 100.00 |
| | Case-4 | Before Update | 97.50 (1.3) | 97.80 (1.4) | 97.43 (1.2) | 98.30 (1.2) | 94.50 (1.9) | | | 97.11 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | | | 100.00 |
| | Case-5 | Before Update | 97.12 (1.6) | 98.60 (1.4) | 95.26 (1.8) | 98.20 (1.1) | 96.80 (1.3) | 87.50 (2.6) | 90.76 (1.8) | 94.89 |
| | | Updated | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 (0.0) | 100.00 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Islam, M.R.; Kim, Y.-H.; Kim, J.-Y.; Kim, J.-M. Detecting and Learning Unknown Fault States by Automatically Finding the Optimal Number of Clusters for Online Bearing Fault Diagnosis. *Appl. Sci.* **2019**, *9*, 2326.
https://doi.org/10.3390/app9112326
