Fuzzy C-Means Based Clustering and Rule Formation Approach for Classification of Bearing Faults Using Discrete Wavelet Transform

Rolling bearings are considered the heart of rotating machinery, and early fault diagnosis is one of the biggest challenges during operation. Because of complicated mechanical assemblies, detecting advancing faults and faults at the incipient stage is difficult and tedious. This work presents a fuzzy rule based classification of bearing faults from vibration measurements using the Fuzzy C-means (FCM) clustering method. Experiments were conducted to collect the vibration signals of a normal bearing and of bearings with an inner race fault, an outer race fault and a ball fault. The Discrete Wavelet Transform (DWT) is used to decompose the vibration signals into different frequency bands. To detect early faults in the bearings, various statistical features are extracted from the decomposed signal in each frequency band. Based on the extracted features, FCM clustering with suitable membership functions is used to classify the faults, and a fuzzy rule base is developed for each class of bearing fault using labeled data. The experimental results show that the proposed method is able to classify the condition of the bearing using the extracted features. The proposed FCM based clustering and classification model is easy to interpret and implement for monitoring the condition of rolling bearings at an early stage, and it helps in taking preventive action before a large-scale failure.


Introduction
Bearings are among the most critical elements in rotating machinery systems. Early bearing fault detection and diagnosis are very important for preventing critical system failures, as any fault in a rolling element bearing can lead to sudden, unwarned machine breakdowns and unpredicted failures, which cause huge losses and affect the production rate. Vibration measurement using an accelerometer is commonly used for fault detection in bearings. To determine the presence of a bearing fault, existing fault diagnostic methods use the fault characteristic frequencies and their sensitivity to geometrical parameters of the bearing such as inner race diameter, outer race diameter, pitch circle diameter, rolling element diameter, radial clearance, etc. However, defects in a bearing cause variation in the frequency at which it operates, and this signal is modulated by the natural frequency of the bearing. Also, it is difficult to accurately identify the occurrence of a bearing fault at its incipient stage because of the low signal-to-noise ratio of the vibration signals. Hence, there is a need for improved fault diagnosis methods that identify and classify bearing faults at an early stage.

[...] the proposed methodology for fault diagnosis of roller element bearings. Section 4 describes the results and inferences of the proposed methodology. Major conclusions are presented in Section 5.

Computation 2019, 7, x FOR PEER REVIEW

Experimental Arrangement
In the present work, a test rig was developed for conducting vibration measurements on a standard roller element ball bearing, as shown in Figure 1. The test rig comprises a shaft-bearing mechanical system driven by a three-phase 0.5 HP AC motor, a shaft coupling, the roller element ball bearing and a bearing housing. The specifications of the ball bearing are provided in Table 1. To study defects at an early stage, a cut of 0.5 mm was created using electro-discharge machining on the inner race and the outer race of the bearing.

[Table 1: specifications of the ball bearing; surviving entry: Kg: 0.110]

In this work, an accelerometer (model: Kistler Type 8730A500, Kistler Instrument Corp., NY, USA) with a sensitivity of 10 mV/g and an acceleration range of ±500 g was mounted on the bearing housing to acquire the acceleration signals from the bearing. A PC based Data Acquisition (DAQ) system with LabVIEW software was used to obtain vibration signals for the No-Fault (NF) bearing and for the faulty bearings: Inner Race Fault (IRF), Outer Race Fault (ORF) and Ball Fault (BF). For each bearing condition, a signal of 120,000 data points was collected at a sampling rate of 12 kHz for the duration of 29 s and stored on the PC for further processing. The acquired vibration signals from the test rig are shown in Figure 2. The collected signals are divided into bins of 10,000 samples for feature extraction and analysis of bearing faults.

A closer view of the acquired vibration signal for the different bearing conditions is shown in Figure 3. The vibration signal of the healthy bearing is periodic with low-magnitude peaks, whereas dispersed, non-periodic peaks of relatively large amplitude are seen for the inner race defect. The vibration signals for the ball defect and the outer race defect show relatively high-amplitude vibration with chaotic and intermittent behavior. Because faults in bearing components such as the inner race, outer race and ball produce an unknown impulse response in the acquired vibration signal, the magnitude of energy changes in the frequency bands related to these faults [26], and further analysis of the vibration data is required for bearing fault detection and classification.
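The division of each recorded signal into bins of 10,000 samples can be sketched as follows. This is a minimal illustration, assuming consecutive, non-overlapping bins (the text does not say whether the bins overlap); the function name `split_into_bins` and the all-zero stand-in signal are hypothetical.

```python
def split_into_bins(signal, bin_size=10_000):
    """Split a 1-D sequence into consecutive, non-overlapping bins.

    Samples left over after the last full bin are dropped (an assumption).
    """
    n_bins = len(signal) // bin_size
    return [signal[i * bin_size:(i + 1) * bin_size] for i in range(n_bins)]


# Stand-in for one recorded bearing condition (120,000 samples).
signal = [0.0] * 120_000
bins = split_into_bins(signal)
print(len(bins), len(bins[0]))  # 12 10000
```

With 120,000 samples per bearing condition, this gives 12 bins per condition, each of which would then be decomposed and summarized by its statistical features.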

Proposed Methodology for Bearing Fault Classification
In the present work, a time-frequency domain technique, the wavelet transform, is used for analyzing and detecting bearing faults because of the non-stationary and nonlinear characteristics of the vibration signals. Figure 4 depicts the schematic diagram of the proposed method for fault classification of rolling bearings. The steps involved in the proposed method are given below and are explained in the subsequent sections: (i) decomposing the vibration signal into N levels using filtering and decimation to obtain the approximation and detail coefficients; (ii) extracting the statistical features from the DWT coefficients; (iii) applying the Fuzzy C-Means clustering approach for grading the bearing faults using suitable fuzzy membership functions.

Decomposition of Vibration Signal Using Discrete Wavelet Transform
A wavelet is defined as a vanishing wave in an oscillatory motion whose energy is concentrated in time. The continuous wavelet transform (CWT) of a time-varying signal $f(t)$ is the sum over all time of the signal multiplied by scaled, shifted versions of the wavelet function $\psi(t)$, as given by Equation (1).

$$CWT(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt \tag{1}$$

Here the parameters $a$ and $b$ are the dilation (scale) and translation of the wavelet, respectively. By generating daughter wavelets $\psi_{a,b}(t)$ from the mother wavelet $\psi(t)$, more time-frequency information can be extracted, which is limited to finite space.

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{t-b}{a}\right)$$

The discrete wavelet transform (DWT) is derived from the discretization of $CWT(a,b)$. The most common discretization is dyadic, given also by McFadden and Smith [27]: the DWT uses powers of 2 as scale and position values, with $a = 2^{j}$ and $b = k \cdot 2^{j}$.

$$DWT(j,k) = \frac{1}{\sqrt{2^{j}}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t - k\,2^{j}}{2^{j}}\right) dt$$

The DWT of a signal is calculated by passing it through a series of filters. The signal samples are passed through a low pass filter and a high pass filter in parallel. The decomposition is iterated on successive approximations, so that the signal is broken down into many lower-resolution components. The approximation coefficients and detail coefficients are obtained from the low pass filter and the high pass filter respectively, and the procedure is represented as a decomposition tree in Figure 5. In the present work, statistical features are extracted from the DWT detail coefficients (cD1, cD2, cD3, cD4) and approximation coefficients (cA1, cA2, cA3, cA4) to detect the bearing faults, as explained in the next section.
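The filter-and-decimate decomposition described above can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: it assumes periodic (wrap-around) boundary handling, whereas library implementations often default to other extensions and then return slightly different coefficient lengths. The filter taps are the commonly tabulated 'db4' low-pass coefficients rounded to 10 decimals, and the names `dwt_step`, `wavedec` and `detail_band_hz` are hypothetical helpers.

```python
import math

# Daubechies-4 ('db4') low-pass decomposition filter (8 taps),
# commonly tabulated values rounded to 10 decimal places.
DB4_LO = [
    -0.0105974018, 0.0328830117, 0.0308413818, -0.1870348117,
    -0.0279837694, 0.6308807679, 0.7148465706, 0.2303778133,
]
# High-pass filter from the quadrature-mirror relation g[n] = (-1)^n h[L-1-n].
DB4_HI = [(-1) ** n * DB4_LO[len(DB4_LO) - 1 - n] for n in range(len(DB4_LO))]


def dwt_step(x):
    """One level: low/high-pass filtering followed by decimation by 2."""
    n = len(x)
    approx, detail = [], []
    for k in range(n // 2):
        # Periodic wrap-around at the signal boundary (an assumption).
        window = [x[(2 * k + i) % n] for i in range(len(DB4_LO))]
        approx.append(sum(h * v for h, v in zip(DB4_LO, window)))
        detail.append(sum(g * v for g, v in zip(DB4_HI, window)))
    return approx, detail


def wavedec(x, levels=4):
    """Return ([cA1..cAN], [cD1..cDN]); each level re-decomposes the last cA."""
    approximations, details = [], []
    current = list(x)
    for _ in range(levels):
        current, d = dwt_step(current)
        approximations.append(current)
        details.append(d)
    return approximations, details


def detail_band_hz(fs, level):
    """Nominal frequency band (low, high) covered by cD<level>."""
    return fs / 2 ** (level + 1), fs / 2 ** level


fs = 12_000  # sampling rate used in the experiments
x = [math.sin(2 * math.pi * 500 * t / fs) for t in range(64)]
cA, cD = wavedec(x, levels=4)
print([len(c) for c in cD])   # [32, 16, 8, 4]
print(detail_band_hz(fs, 1))  # (3000.0, 6000.0)
```

At the 12 kHz sampling rate, the nominal bands of cD1 to cD4 are 3 to 6 kHz, 1.5 to 3 kHz, 0.75 to 1.5 kHz and 375 to 750 Hz, with cA4 covering 0 to 375 Hz. Real filters are not ideal, so neighboring bands overlap somewhat.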

Extraction of Statistical Features for Different Bearing Faults Using DWT Coefficients
Statistical features such as mean, variance, kurtosis value (KV), root mean square value (RMS), peak-to-peak value (PPV), shape factor (SF), crest factor (CF) and impulse factor (IF) are sensitive to bearing faults that are impulsive in nature; they are therefore extracted from the detail and approximation coefficients of the wavelet transform [28]. Table 2 lists the standard formulas used for calculating the statistical features [12].

Table 2. List of statistical features and formulas. (Columns: S. No, Notation, Feature, Formula.)

In the present work, the statistical features X = {x_1, x_2, ..., x_n} are extracted from the detail coefficients (cD1, cD2, cD3, cD4) and the approximation coefficients (cA1, cA2, cA3, cA4) of the wavelet transform, and sample values are shown in Table 3.

Table 3. List of statistical features and the output classes for different bearing conditions. (Columns: Extracted Features from DWT Coefficients, Output Classes.)
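The eight features can be computed from one set of wavelet coefficients roughly as follows. The exact formulas of Table 2 are not available here, so this sketch assumes the commonly used definitions: population variance, non-excess kurtosis, shape factor = RMS / mean absolute value, crest factor = peak / RMS, and impulse factor = peak / mean absolute value. The function name `statistical_features` is hypothetical.

```python
import math


def statistical_features(coeffs):
    """Eight statistical features of one coefficient vector (assumed definitions)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    variance = sum((v - mean) ** 2 for v in coeffs) / n  # population variance
    rms = math.sqrt(sum(v * v for v in coeffs) / n)
    abs_mean = sum(abs(v) for v in coeffs) / n
    peak = max(abs(v) for v in coeffs)
    fourth_moment = sum((v - mean) ** 4 for v in coeffs) / n
    return {
        "mean": mean,
        "variance": variance,
        "kurtosis": fourth_moment / variance ** 2,  # non-excess kurtosis
        "rms": rms,
        "peak_to_peak": max(coeffs) - min(coeffs),
        "shape_factor": rms / abs_mean,
        "crest_factor": peak / rms,
        "impulse_factor": peak / abs_mean,
    }


# Toy coefficient vector; real inputs would be cD1..cD4 and cA1..cA4.
features = statistical_features([1.0, -1.0, 1.0, -1.0])
print(features["rms"], features["peak_to_peak"])  # 1.0 2.0
```

With these definitions the impulse factor equals the crest factor times the shape factor. A constant or all-zero input would cause a division by zero, so a practical version would need to guard against that case.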

Fuzzy C-Means Clustering (FCM) of Extracted Statistical Features
As incipient bearing faults are difficult to detect because of the low signal-to-noise ratio of the vibration signals, crisp clustering fails: the boundaries between the clusters of different bearing faults are vague and ambiguous. Hence FCM is applied for the classification of bearing faults in the present work. It is one of the most common clustering algorithms and minimizes an objective function built on the Euclidean distance between each sample and all cluster centers [29]. Let $X = \{x_1, x_2, \ldots, x_n\}$ be the statistical feature vectors of the given dataset with $n$ samples to be analyzed, which can be divided into $c$ classes, and let $V = \{v_1, v_2, \ldots, v_c\}$ be the set of cluster centers of $X$ in $p$-dimensional space. Here $n$ is the number of objects, $p$ is the number of features, and $c$ is the number of partitions or clusters.

For each integer $c$, $2 \le c < n$, let $V_{cn}$ be the vector space of $c \times n$ matrices with entries in [0,1], and let $u_{ik}$ denote the $ik$-th element of any $U \in V_{cn}$. Here $u_{ik} = u_i(x_k)$ is called the grade of membership of $x_k$ in the fuzzy set $u_i$. Consider the following subset of $V_{cn}$:

$$M_{fc} = \left\{ U \in V_{cn} : \sum_{i=1}^{c} u_{ik} = 1 \ \ \forall k; \quad 0 < \sum_{k=1}^{n} u_{ik} < n \ \ \forall i \right\}$$

Each $U \in M_{fc}$ is called a fuzzy c-partition of $X$; $M_{fc}$ is the fuzzy c-partition space associated with $X$.

To obtain the optimum fuzzy partition, the least squares objective function $J$ is formulated and must be minimized, as given by Equation (5).

$$J_m(U,V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^{m}\, \lVert x_k - v_i \rVert^{2} \tag{5}$$

Here $m$ is the fuzzifier parameter (or weighting exponent), a real number greater than 1 ($1 < m < \infty$). As $m$ approaches 1 the clustering tends to become crisp, and as $m$ goes to infinity it becomes increasingly fuzzy; usually $m$ is fixed at 2. $U = \{u_{ik}\}$ is the membership matrix, with $u_{ik} \in [0,1]$ denoting the degree of membership of the $k$-th pattern in the $i$-th cluster, and $V = \{v_1, v_2, \ldots, v_c\}$ is the vector of $c$ cluster centers. These $v_i$ are interpreted as clusters defined by their companion $U$ matrix, and the optimal solution is found to be:

$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{2/(m-1)} \right]^{-1}, \qquad v_i = \frac{\sum_{k=1}^{n} (u_{ik})^{m}\, x_k}{\sum_{k=1}^{n} (u_{ik})^{m}}$$

Suppose that, under given conditions, the cluster centers have been determined from the features of the training data sets. Then all subsequent observations can be graded using the following equation:

$$u_{i0} = \left[ \sum_{j=1}^{c} \left( \frac{\lVert x_0 - v_i \rVert}{\lVert x_0 - v_j \rVert} \right)^{2/(m-1)} \right]^{-1}$$

where $u_{i0}$ is the fuzzy grade of the current observation being assigned to the $i$-th pattern and $x_0$ is the current observation. The iteration terminates when $\max_{i,k} \lvert u_{ik}^{(t+1)} - u_{ik}^{(t)} \rvert < \varepsilon$, where $\varepsilon$ is the termination criterion between 0 and 1 and $t$ is the iteration step. For the above feature sets, the cluster centers are calculated using Equations (6) and (7). Using the calculated cluster centers and the formulated membership functions for the statistical feature vectors, fuzzy inference rules are formulated for the classification of bearing faults.
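The alternating update of memberships and cluster centers can be sketched as below. This is a minimal pure-Python illustration of standard FCM, not the authors' MATLAB implementation; the deterministic choice of initial centers, the default `eps` and `max_iter`, and the toy two-dimensional data are assumptions made only so the example is reproducible.

```python
import math


def memberships(samples, centers, m):
    """u[i][k]: membership of sample k in cluster i, from distances to the centers."""
    c = len(centers)
    u = [[0.0] * len(samples) for _ in range(c)]
    for k, x in enumerate(samples):
        d = [math.dist(x, v) for v in centers]
        if min(d) == 0.0:
            # Sample coincides with one or more centers: share membership among them.
            zeros = [i for i in range(c) if d[i] == 0.0]
            for i in zeros:
                u[i][k] = 1.0 / len(zeros)
            continue
        for i in range(c):
            u[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
    return u


def fcm(samples, c, m=2.0, eps=1e-5, max_iter=100):
    """Fuzzy C-means for c >= 2 clusters; returns (centers, u)."""
    n, p = len(samples), len(samples[0])
    # Deterministic start (an assumption): c samples spread evenly over the list.
    centers = [list(samples[(i * (n - 1)) // (c - 1)]) for i in range(c)]
    u = memberships(samples, centers, m)
    for _ in range(max_iter):
        # Center update: mean of the samples weighted by u_ik^m.
        centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * samples[k][d] for k in range(n)) / total for d in range(p)]
            )
        new_u = memberships(samples, centers, m)
        change = max(abs(new_u[i][k] - u[i][k]) for i in range(c) for k in range(n))
        u = new_u
        if change < eps:  # termination criterion on the membership matrix
            break
    return centers, u


# Toy data: two well-separated groups of 2-D feature vectors.
samples = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [5.0, 5.1], [5.1, 4.9], [4.9, 5.0]]
centers, u = fcm(samples, c=2)
# Grade a new observation against the trained centers.
u_new = memberships([[4.8, 5.2]], centers, 2.0)
print([round(u_new[i][0], 3) for i in range(2)])
```

In the setting of this paper, `samples` would hold the eight-feature vectors and `c` would be 4, one cluster per bearing condition.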

Generation of Fuzzy Rules Using Cluster Centers
In the present work, a supervised fuzzy rule formation approach is followed for the classification of bearing faults based on the estimated cluster centers [30,31]. For the given input statistical feature vectors, the output classes are labeled as shown in Table 3. Mamdani type fuzzy rules are extracted from the vector of centroids of the c clusters, V = {v_1, v_2, ..., v_c}, calculated using FCM for the feature vectors X of the different bearing faults.

For an input vector in the input space X^p there exist p variables x_1, x_2, ..., x_p, each defined on an interval X_j = [a_j, b_j], a_j < b_j. If a cluster center is found to be in the group of data for class 'c', then fuzzy sets A_ij on X_j are developed for that cluster. In this work, the Gaussian function is used as the membership function for the input feature vectors and the output classes. Based on the estimated cluster center, the following fuzzy rule is extracted and assigned to class 'c':

Rule i: if x_1 is A_i1 and x_2 is A_i2 ... and x_p is A_ip, then the bearing fault class is 'c'.

Here x_j is the j-th input feature and A_ij is the membership function in the i-th rule associated with the j-th input feature. An input data vector is assigned to class 'c' if the fuzzy rules give it its highest membership in class 'c'.
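The rule evaluation can be illustrated with a simplified sketch: one rule per class, Gaussian membership functions centered on the cluster center of each feature, the rule's "and" taken as the minimum, and the class chosen by the highest rule strength. This omits the output membership functions and defuzzification of a full Mamdani system. The cluster centers, the widths `sigmas` and the two-feature input below are hypothetical values, not those of the paper.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership of x in a fuzzy set centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def classify(features, rule_centers, sigmas):
    """Return (best_label, strengths) for one feature vector.

    rule_centers maps a class label to its cluster center (one value per feature).
    """
    strengths = {}
    for label, centers in rule_centers.items():
        # "and" in the rule antecedent is taken as the minimum (a common choice).
        strengths[label] = min(
            gaussian(x, c, s) for x, c, s in zip(features, centers, sigmas)
        )
    best = max(strengths, key=strengths.get)
    return best, strengths


# Hypothetical two-feature cluster centers; NOT the values of Table 4.
rule_centers = {
    "NF": [0.1, 3.0],
    "IRF": [0.3, 4.5],
    "ORF": [0.6, 6.0],
    "BF": [1.5, 9.0],
}
sigmas = [0.2, 1.0]  # assumed width per feature
label, strengths = classify([0.55, 6.2], rule_centers, sigmas)
print(label)  # ORF
```

Taking the minimum over the antecedents means a single badly matching feature is enough to suppress a rule, which keeps each rule's outcome easy to trace back to individual features.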

Results and Discussion
To validate the proposed detection and clustering methodology, experimental data were acquired from the experimental setup for different bearing fault conditions, and the results are presented in this section. The vibration signals of the bearings are decomposed using the 4th order Daubechies wavelet ('db4'), and statistical features are extracted from the wavelet coefficients. Further, the cluster centers are identified from the extracted statistical features, and fuzzy rules are extracted for the labeled statistical feature data of the different bearing faults. A fuzzy rule base and inference system is developed in the MATLAB software environment, and the results are validated on test data.

Wavelet Decomposition of Vibration Signal with Different Bearing Faults
The vibration data for each class of bearing fault were decomposed using the wavelet transform, and the wavelet coefficients were calculated, as they provide a compact representation of the energy distribution of the vibration signal in the time and frequency domains. The approximation (cA) and detail (cD) coefficients of the wavelet transform at each level of decomposition are graphically represented in Figure 6. It is noticed from Figure 6 that the decomposed signal obtained from the no-fault bearing has the smallest magnitude, ranging from −0.1 to 0.1 for the approximation coefficient at the 4th level. The ball fault has the largest magnitude range, from −2 to 2, whereas the inner race fault and the outer race fault have magnitude ranges of −0.2 to 0.2 and −0.5 to 0.5 respectively. These results indicate that the magnitudes of the wavelet coefficients are sensitive to the different bearing faults.

Statistical Feature Extraction Using Wavelet Coefficients
As feature extraction yields quantitative information about the faults from the decomposed signal, the statistical features are calculated directly from the approximation and detail wavelet coefficients and are listed for the different bearing conditions in Table 3. The magnitudes of extracted statistical features such as mean, variance, kurtosis and RMS are smaller for the no-fault condition and show an increasing trend for the faulty conditions of the bearings, as shown in Figure 7a. From Figure 7b, it can be seen that the impulse factor (IF) of the wavelet coefficients is higher in magnitude than the other statistical features. The magnitudes of the statistical features are lower for the no-fault bearing condition than for the ball fault condition. These results show the sensitivity of the detail and approximation coefficients of the DWT for identifying the different bearing fault conditions.

Determination of Cluster Centers Using Proposed FCM Method
The cluster centers for the extracted statistical features are calculated using Equation (6) and Equation (7) for the four types of bearing conditions and are shown as an (8 × 4) data matrix in Table 4. Based on the cluster centers, membership values are assigned to the individual statistical features using the Gaussian membership function of Equation (8), as shown in Figure 8. The variation in membership values using the Gaussian membership function for all the experimental data of statistical features is shown in Figure 9a. The surface and contour plot of the Gaussian membership function is given in Figure 9b; it clearly shows four peaks and zones, which describe the four bearing conditions determined by the fuzzy clustering method. It is also seen that the magnitude of the membership values varies depending on the position of the cluster center. Sample numerical values of the membership function of the given statistical feature data for the four states of the bearing, no-fault (A), inner race fault (B), outer race fault (C) and ball fault (D), are given in Table 5. When a data point is nearer to the cluster center, its membership value is higher, and the value reaches a minimum as the point moves further away from the corresponding cluster.

Table 5. Sample membership values of feature data for different fault classes.
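The dependence of the membership value on the distance from the cluster center can be made concrete with a short worked example. It assumes that the Gaussian membership function has the usual form, with the cluster center $v$ of a feature as its mean and a width $\sigma$; the symbol $\sigma$ and the evaluation points are illustrative choices, not values taken from the paper.

```latex
\mu(x) = \exp\!\left(-\frac{(x - v)^{2}}{2\sigma^{2}}\right), \qquad
\mu(v) = 1, \qquad
\mu(v \pm \sigma) = e^{-1/2} \approx 0.607, \qquad
\mu(v \pm 2\sigma) = e^{-2} \approx 0.135
```

Under this form, a feature value lying exactly on the cluster center has full membership, and the membership falls to roughly 0.61 and 0.14 at one and two widths away from it.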

Development of Fuzzy Rule Base and Inferencing System
Based on the estimated cluster centers of the input statistical features, the fault classes are labelled for each bearing fault class, and fuzzy rules are extracted using a Mamdani type fuzzy inferencing system, as shown in Figure 10a. For a given magnitude of the input statistical features, namely mean, variance, kurtosis value (KV), root mean square value (RMS), peak-to-peak value (PPV), shape factor (SF), crest factor (CF) and impulse factor (IF), the corresponding cluster is identified using FCM and the output class is assigned by the proposed fuzzy inferencing system. The following four rules are extracted using the estimated cluster centers for the bearing classes and are given in Table 6.

Table 6. Fuzzy rule base for bearing fault classification. (Columns: Rule No, Generated Fuzzy Rules Using FCM.)

From the extracted fuzzy rules given in Table 6, it can be noted that the rule base for bearing fault classification is simple, computationally efficient and linguistically interpretable.

Validation of the Proposed Approach
To validate the proposed approach, 4 samples of the feature vector were randomly picked from each class of bearing fault for testing and validation purposes. The classification results are shown in Table 7, and the overall performance of bearing fault classification on these test samples is found to be 100%. These results indicate that the proposed approach is useful for classification and condition monitoring of bearing states.

Figure 11. Fuzzy rule inferencing system for bearing fault classification.

Figure 11 shows the output of the proposed fuzzy rule based inferencing system for the given input features. The method identifies and assigns the class of the bearing fault based on the highest membership for the output class. It can be seen from Table 7 that the class predicted by the proposed method matches the actual class of the bearing fault for all the input values given to the proposed fuzzy inferencing system.
The performance of the proposed fuzzy rule-based classifier model is found to be better than that of SVM and neural network classifiers [32]. The fuzzy rule based classification method is also more efficient and easier to use than a neural network, typically producing good results without trial and error. In addition, the proposed fuzzy rule-based classifiers are easy to interpret, verify, and extend.

Conclusions
The roller element bearing forms the core of rotating machinery, and its monitoring has always been of significant research interest. To enhance the capability of bearing fault classification, this paper proposes a fuzzy rule based classification approach using the discrete wavelet transform (DWT) and Fuzzy C-means clustering (FCM) to identify fault types. The fault classification results show that the proposed FCM based rule formation approach identifies the fault categories of the rolling-element bearing accurately and grades the condition of the bearing. From the experimental results, the overall performance of clustering and classification of bearing faults is found to be 100% on the test samples for identifying the states of the roller element bearings. The fuzzy c-means clustering method is direct and easy to implement. It is evident from this work that, with the integration of fuzzy sets and a fuzzy rule base, FCM is a powerful tool for the classification and grading of bearing faults. The proposed system can be readily implemented for early bearing fault identification through online monitoring, since DWT-FCM requires only a small amount of computation.