Rolling-Element Bearing Fault Diagnosis Using Advanced Machine Learning-Based Observer

Abstract: Rotating machines represent a class of nonlinear, uncertain, and multiple-degrees-of-freedom systems that are used in various applications. The complexity of the system's dynamic behavior and uncertainty result in substantial challenges for fault estimation, detection, and identification in rotating machines. To address these challenges, this paper proposes a novel technique for fault diagnosis of a rolling-element bearing (REB), founded on a machine-learning-based advanced fuzzy sliding mode observer. First, an ARX-Laguerre algorithm is presented to model the bearing in the presence of noise and uncertainty. In addition, a fuzzy algorithm is applied to the ARX-Laguerre technique to increase the system's modeling accuracy. Next, the conventional sliding mode observer is applied to resolve the problems of fault estimation in a complex system with a high degree of uncertainty, such as rotating machinery. To address the problem of chattering that is inherent in the conventional sliding mode observer, the higher-order super-twisting (advanced) technique is introduced in this study. In addition, the fuzzy method is applied to the advanced sliding mode observer to improve the accuracy of fault estimation in uncertain conditions. As a result, the advanced fuzzy sliding mode observer adaptively improves the reliability, robustness, and estimation accuracy of rolling-element bearing fault estimation. Then, the residual signal delivered by the proposed methodology is split into windows, and each window is characterized by a numerical parameter. Finally, a machine learning technique, called a decision tree, adaptively derives the threshold values that are used for fault detection and fault identification in this study. The effectiveness of the proposed algorithm is validated using a publicly available vibration dataset of Case Western Reserve University. The experimental results show that the machine learning-based advanced fuzzy sliding mode observation methodology significantly improves the reliability and accuracy of the fault estimation, detection, and identification of rolling element bearing faults under variable crack sizes and load conditions.


Introduction
Rolling element bearings (REBs) have been extensively used in several industries, such as the automotive, steam and gas turbine, and power generation industries, to improve efficiency by reducing friction [1,2]. The complexity of the required tasks, together with the time-varying and nonlinear parameters of rolling bearings, makes their fault estimation, detection, and identification highly challenging. Fault estimation, detection, and identification (further referred to as FEDI) are essential for preventing the bearing's destruction. Here, the fault estimation technique is used to estimate the (fault) signal so that the various bearing conditions can be differentiated, the fault detection algorithm is used to distinguish normal from abnormal conditions, and the fault identification technique is used to identify the specific types of faults in the REBs. Bearing failures are divided into four main groups, i.e., inner race faults, outer race faults, ball or rolling-element faults, and cage faults, which are referred to in this research as IF, OF, BF, and CF, respectively. To analyze the fault conditions in a bearing, various REB condition monitoring techniques, such as vibration, motor current signature analysis (MCSA), and acoustic emission (AE) measurements, have been reported [3]. This research exploits vibration measurements since these signals are suitable for FEDI.
Numerous procedures have been recently presented for FEDI in various systems, which can be divided into four groups: (a) model-based (MB) techniques, (b) signal-based (SB) approaches, (c) knowledge-based (KB) procedures, and (d) hybrid methods [1,4]. Some recently published representative examples of model-based techniques have been reported in [1,5–7]. Model-based techniques for fault diagnosis identify the faults by modeling the system's dynamics using mathematical modeling or system identification techniques with a small dataset. Despite the advantages of the MB method, such as reliability and robustness, limited accuracy is the main drawback of this technique [1]. Signal-based fault diagnosis approaches extract fault features and differentiate the health conditions of the system by applying various signal processing techniques to the acquired signals. Some recently published representative examples of signal-based approaches can be found in [8–17]. When using the traditional fault diagnosis frameworks (i.e., knowledge-based methods), it is crucial to select an appropriate signal processing technique for extracting the representative fault features that are used as inputs to the decision-making approaches. Recently, various types of advanced signal processing approaches were applied for rotating machinery fault diagnosis. These techniques include the conventional wavelet transform [9], wavelet-based kurtogram [10], empirical mode decomposition [11], ensemble empirical mode decomposition [12] and their modifications [13–15], as well as relatively new methods for detecting and extracting the repetitive transients caused by mechanical faults, such as the spectral l2/l1 norm [16] and the spectral Gini index [17]. These methods are essential for feature engineering; however, in the proposed methodology, the signal processing step is replaced by the observation technique (advanced fuzzy sliding mode observer), which is used to provide new insights and demonstrate the applicability of the control theory field for solving the problems of mechanical fault diagnosis. Knowledge-based approaches were originally based on the idea of transferring industrial knowledge and expertise in fault diagnosis from humans to the machine by creating algorithms that performed fault identification using decision tables and rules. However, the trend in KB approaches has recently shifted toward artificial intelligence techniques that are used for automatically extracting valuable features and making decisions about the fault conditions of the system [8,18,19]. To address the issues of model-based approaches and knowledge-based techniques, hybrid fault diagnosis can be introduced for bearing fault diagnosis. In the hybrid approach, various algorithms from different groups can be used in parallel to improve the performance of the fault diagnosis method. In this research, a robust hybrid fault diagnosis algorithm is presented using a machine learning-based advanced fuzzy sliding mode observer for the REB.
The main challenge in designing a sliding mode observer is system modeling. Mathematical system modeling and system identification are the main frameworks for modeling complex systems such as bearings [5,6]. In mathematical system modeling, the Lagrange technique can be used for modeling the REBs [6]. Despite the reliability and accuracy of physical system modeling, this technique has drawbacks in uncertain and noisy conditions. To address this issue, system identification techniques are widely applied, as indicated in [5,20,21]. To estimate the system using a system identification technique, various orthonormal procedures such as Auto Regressive with eXogenous input (ARX), orthonormal function bases (OFB), and generalized orthonormal bases (GOB) methods have been used [20,21]. Despite several advantages of orthonormal techniques compared to the classical algorithms, these techniques have two important drawbacks. The first is the challenge of finding the optimal orthonormal values, and the second is related to the restrictions in decoupled systems [6,7,20,21]. To address these issues, ensure an efficient complexity reduction, and reduce the system's estimation order, the ARX-Laguerre technique has been applied in [20,21]. Despite the advantage of complexity reduction in ARX-Laguerre, its estimation accuracy in nonlinear systems remains a challenge. To increase the accuracy of the ARX-Laguerre technique, the T-S fuzzy ARX-Laguerre (FAL) technique is presented in the literature [5]. In our work, this technique is used to estimate the vibration signals of REBs.
One of the main model-based techniques that can be used for FEDI is the observation technique [1]. The observation techniques can be categorized into two main groups: (a) linear observers (e.g., proportional-integral (PI) and proportional multi-integral (PMI)) and (b) nonlinear observers (e.g., feedback linearization observer (FLO), sliding mode observer (SMO), and fuzzy observer (FO)). To reduce the observation error, the linear observer only uses the feedback term. Limited robustness and reliability in uncertain conditions are the main drawbacks of linear observers [21,22]. Nonlinear observers can be designed by deriving the dynamic system's behavior in parallel with the linear observer to increase the robustness and reliability of the linear observer for fault estimation [6,7]. In this research, a nonlinear observer is used for FEDI. Despite the improved accuracy obtained from using the FLO, robustness is the main issue of this technique [7]. The extended FLO and SMO have been presented by researchers to improve the robustness of observers [5–7]. The SMO is a robust technique for FEDI in nonlinear and complex systems (e.g., rolling bearings) that operate in uncertain and noisy conditions. The nonlinear switching function in the SMO is defined to drive the output estimation error toward zero. This technique can perform FEDI according to adaptive updates of the observer parameters, which can significantly improve the performance of FEDI in nonlinear systems [23,24]. Though the SMO increases the robustness, this scheme suffers from the chattering phenomenon and from reduced fault estimation accuracy in the presence of uncertainties and unknown conditions. The chattering phenomenon (high-frequency oscillation) is one of the significant disadvantages of the SMO. Its main effects are serious mechanical problems, such as heating of the mechanical components and saturation. To decrease the chattering, the higher-order SMO (HSMO) is presented and reported in [25,26]. To increase the performance of the HSMO, different techniques, such as the quasi-continuous (QC) algorithm [27], the suboptimal (SO) method [28], and the twisting technique (TW) [29], have been introduced. The main challenge of the QC-HSMO, SO-HSMO, and TW-HSMO approaches is that they require the first-order derivative of the sliding variable. To address this issue, a higher-order super-twisting (advanced) SMO (ASMO) for measurable and unmeasurable state observers has been reported [6]. Despite the stability, robustness, and chattering attenuation of the ASMO, this method suffers from a somewhat reduced fault estimation accuracy. Therefore, in this research, the fuzzy technique is applied to the ASMO to increase the estimation accuracy, which yields the advanced fuzzy SMO (AFSMO).
Once the observer is designed, the decision regarding the REB condition should be made. There are various conventional techniques that can be applied for decision-making, such as decision tables, rule-based reasoning, and case-based reasoning [30]; however, the solutions provided by artificial intelligence (AI) are now frequently applied to resolve the problems of fault diagnosis. Machine learning (ML) is one of the fields of AI that introduces some of the most popular techniques for decision-making, such as support vector machines (SVMs) [31] and artificial neural networks (ANNs) [32,33]. To make a decision on the particular faulty condition, the SVM first attempts to find an optimal hyperplane that best separates the feature parameters corresponding to data instances of different classes. Then, when a new data sample appears, the SVM determines on which side of the hyperplane the sample lies and assigns the class corresponding to that location. When an ANN is used in fault identification, the network learns the optimal weights of its neurons during the back-propagation procedure to minimize the loss function and meet the target values given a set of input attributes (i.e., feature set) corresponding to different faulty conditions. These days, the trend in AI applications has shifted toward the deep learning (DL) approach, which focuses on learning data representations to achieve the target goals. This shift is primarily enabled by the significant increase in the computational capabilities of modern computer systems. The most popular DL-based solutions used for solving different problems in condition monitoring are convolutional neural networks [34] (fault diagnosis), autoencoders [19] (fault diagnosis, feature extraction, and data augmentation), generative adversarial networks [35] (data augmentation), and recurrent neural networks [36] (fault prediction). The principles of DL-based solutions are similar to those of the ANN; they adapt the weights of neurons and tune the hyperparameters to meet the requirements of the task. However, unlike ANNs, deep networks are characterized by a large number of hidden layers and nodes, which necessitates huge datasets for these networks to generalize well. Also, the computational time of DL-based methods is significantly higher than that of the conventional ML-based approaches. In this work, we employ an ML-based classification technique called a decision tree (DT) [37–39] to complete the proposed fault diagnosis methodology and implement the decision-making procedure for REB fault detection and identification.
During training, the conventional DT [40] algorithm learns relatively quickly and derives a logical set of easily interpretable rules that can be used for decision-making regarding the REB faulty conditions, while providing insights into the quality of the fault estimation procedure performed by the advanced fuzzy SMO (AFSMO) in the previous step.
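As a rough illustration of how a decision tree derives an interpretable threshold rule, the sketch below searches for the single split on one feature that minimizes the weighted Gini impurity, which is the basic operation at each tree node. It is a minimal sketch, not the implementation used in this paper: the function names (`gini`, `best_threshold`) and the energy values are hypothetical and chosen only for demonstration.

```python
import numpy as np

def gini(labels):
    """Gini impurity of an integer label array (0.0 for an empty or pure set)."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - np.sum(p ** 2)

def best_threshold(feature, labels):
    """Return (threshold, impurity) of the split minimizing the weighted Gini impurity."""
    order = np.argsort(feature)
    x, y = feature[order], labels[order]
    best_t, best_score = None, np.inf
    for i in range(1, x.size):
        if x[i] == x[i - 1]:
            continue  # no split is possible between identical values
        t = 0.5 * (x[i] + x[i - 1])  # candidate threshold: midpoint of neighbors
        left, right = y[:i], y[i:]
        score = (left.size * gini(left) + right.size * gini(right)) / y.size
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Hypothetical window-energy values of a residual signal (illustration only).
normal_energy = np.array([0.010, 0.012, 0.009, 0.011])
faulty_energy = np.array([0.150, 0.210, 0.180, 0.170])
feature = np.concatenate([normal_energy, faulty_energy])
labels = np.array([0] * 4 + [1] * 4)  # 0 = normal, 1 = faulty

threshold, impurity = best_threshold(feature, labels)
print(f"rule: window is faulty if energy > {threshold:.3f} (impurity {impurity:.2f})")
```

A full decision tree repeats this search recursively on each resulting subset, which is how multiple thresholds for fault identification can be obtained from the same feature.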
Figure 1 illustrates the complete block diagram of the proposed algorithm for FEDI of the bearing. This figure indicates that the algorithm has three main parts. In the first step, the system is modeled using the fuzzy ARX-Laguerre (FAL) technique. This part itself has three steps: (i) modeling the bearing based on the filtered ARX method, (ii) modifying the performance of the filtered ARX technique based on an orthonormal function and designing the ARX-Laguerre method, and (iii) improving the accuracy and performance of ARX-Laguerre bearing modeling based on the fuzzy ARX-Laguerre technique. In the second step, the AFSMO is designed for accurate fault estimation and improved performance of the decision component. The second step has three sub-blocks: (i) run the SMO, (ii) reduce the chattering and increase the robustness by improving the SMO with the advanced technique to obtain the ASMO, and (iii) increase the fault estimation accuracy by applying the fuzzy algorithm to the ASMO. Despite the advantages of the ASMO regarding robustness and reliability, it suffers from a suboptimal fault estimation accuracy. To address this issue, a fuzzy algorithm is used in parallel with the advanced SMO for the bearings. In the third step, the faults are detected and identified based on the DT machine learning algorithm. The decision-making for fault detection and identification of REBs contains three parts applied in sequence: (i) the residual generator, (ii) window characterization, and (iii) derivation of the logical decision rules for fault detection and identification using DTs. In the residual generator, the residual signals are calculated in various (normal and abnormal) conditions. Next, these residual signals are cut into windows of equal size, and an amplitude-dependent feature parameter is extracted to quantitatively characterize the obtained windows. Finally, the fault detection and identification are accomplished by the logical decision rules delivered by the DT technique. Specifically, the decision about the fault condition is made by comparing the value of the feature parameter extracted from a window of the residual signal with the learned threshold provided by the DT classification algorithm. This paper has the following contributions:

I. A robust technique for modeling the vibration signals in bearings based on the fuzzy ARX-Laguerre approach is proposed.
II. The estimation accuracy of the higher-order sliding mode observer for vibration signals is improved by the T-S fuzzy algorithm.
III. The performance of fault detection and identification by the proposed hybrid observer is improved by the decision tree technique; hence, a new machine learning-based hybrid observer is introduced in this paper.

The rest of this paper is organized as follows: Section 2 provides insights into the Case Western Reserve University (CWRU) benchmark dataset used in this paper. In Section 3, the bearing is modeled based on the FAL procedure. Section 4 includes two main steps. In the first step, the AFSMO is utilized for fault estimation. In the second step, the decision tree algorithm is used for fault detection and identification. In Section 5, the fault detection, estimation, and identification results for the bearing are analyzed. Finally, the conclusions are provided in the last section.

Dataset
In this research, the Case Western Reserve University (CWRU) benchmark bearing dataset [2,6] is used to experimentally evaluate the effectiveness of the machine learning-based AFSMO that is diagrammed in Figure 1. Figure 2 presents the experimental data acquisition system used to acquire the signals in various conditions. To collect the vibration signals in normal and abnormal conditions, a vibration sensor (accelerometer) is used; the designation 6205-2RS JEM SKF refers to the tested drive-end bearing. The sensor records the normal condition and single-point faults that are seeded on the drive-end bearing at various locations, namely the ball fault (BF), outer fault (OF), and inner fault (IF). The data are sampled at 12 kHz under four different motor loads, with rotational speeds ranging from 1730 rpm to 1790 rpm. Table 1 illustrates the data description of the CWRU benchmark bearing dataset. Four different datasets are defined in this table, which are categorized by motor load from 0 to 3 hp. Moreover, the vibration signals in the faulty (BF, OF, IF) conditions cover three different crack sizes (i.e., 0.007, 0.014, and 0.021 inches in diameter).



Rolling-Element-Bearing Modeling
In this research, a hybrid technique is proposed for FEDI. The model-based approach is the core of the proposed method. First, the Lagrangian formulation based on potential energy, kinetic energy, and generalized forces can be expressed as Equation (1) [41].
In Equation (1), $K$, $P$, $Q_i$, and $q_i$ are the kinetic energy, potential energy, generalized force, and generalized coordinate, respectively. The energy equation can be obtained by the derivative of Equation (1) and is expressed as Equation (2) [6,41], where $F(q)$, $A(q)$, $B(q)$, $C(q)$, $\phi$, and $(\Delta A, \Delta B, \Delta C)$ are the force vector, mass vector, time-variant stiffness matrix, time-variant damping matrix, fault (IF, OF, BF) vector, and unknown modeling parameters for the mass, stiffness, and damping matrices, respectively. If $\Delta = \Delta A(q)[\ddot{q}] + \Delta B(q)[\dot{q}] + \Delta C(q)[q]$, the dynamic equation of the bearing can be represented as Equation (3). If $\omega(q, \dot{q}) = B(q)[\dot{q}] + C(q)[q]$ and $\psi(q, \dot{q}) = A^{-1}(q)(\Delta + \phi)$ represent the uncertainties and faults, the bearing dynamic equation is rewritten as Equation (4).
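Equations (1)-(4) can be written out from the definitions above. The following is a hedged reconstruction, assuming the standard Euler-Lagrange form for Equation (1) and assuming that the uncertainty and fault terms enter additively on the left-hand side; the sign convention of $\Delta$, $\phi$, and $\psi$ is an assumption and may differ from the original formulation in [6,41].

```latex
\begin{align}
\frac{d}{dt}\!\left(\frac{\partial K}{\partial \dot{q}_i}\right)
  - \frac{\partial K}{\partial q_i} + \frac{\partial P}{\partial q_i} &= Q_i \tag{1}\\
\left[A(q)+\Delta A(q)\right]\ddot{q}
  + \left[B(q)+\Delta B(q)\right]\dot{q}
  + \left[C(q)+\Delta C(q)\right]q + \phi &= F(q) \tag{2}\\
A(q)\ddot{q} + B(q)\dot{q} + C(q)q + \Delta + \phi &= F(q) \tag{3}\\
\ddot{q} &= A^{-1}(q)\left[F(q) - \omega(q,\dot{q})\right] - \psi(q,\dot{q}) \tag{4}
\end{align}
```

Under these assumptions, Equation (3) follows from Equation (2) by collecting the three uncertain terms into $\Delta$, and Equation (4) follows from Equation (3) by substituting $\omega(q,\dot{q})$ and $\psi(q,\dot{q})$ and multiplying by $A^{-1}(q)$.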
Despite several advantages of mathematical system modeling, in most complicated systems, such as bearings, the precise mathematical formulation of the energy and force is nonlinear and complicated. In addition, mathematical modeling is not accurate in the presence of uncertainty. Moreover, the dynamic behavior of the bearing in theory and in practice may differ, which causes challenges in system modeling for FEDI. Therefore, the fuzzy ARX-Laguerre (FAL) technique is used for REB modeling. This system modeling technique has three main steps. In the first step, the ARX system model is defined. To improve the robustness and reliability of the system modeling, the ARX-Laguerre technique is introduced in the second step. In the third step, the fuzzy technique is introduced to improve the modeling accuracy of the ARX-Laguerre technique. The mathematical formulation of an ARX system model is given in Equation (5) [20–22], whose symbols denote the output, the model parameters $(\varepsilon_x, \varepsilon_y)$, the input, and the orders of the system $(\delta_x, \delta_y)$, respectively. The model parameters $\varepsilon_x$, $\varepsilon_y$ are represented by Equation (6).
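To make the ARX step concrete, the sketch below fits an ARX model by ordinary least squares, assuming the usual difference-equation form $y(k) = \sum_j \varepsilon_y(j)\, y(k-j) + \sum_j \varepsilon_x(j)\, x(k-j)$. The function name `fit_arx`, the toy system, and its coefficients are hypothetical; the paper's Equation (5) may use a different parameterization.

```python
import numpy as np

def fit_arx(y, x, order_y, order_x):
    """Least-squares estimate of ARX coefficients (eps_y, eps_x).

    Assumed model: y(k) = sum_j eps_y[j-1] * y(k-j) + sum_j eps_x[j-1] * x(k-j)
    """
    start = max(order_y, order_x)
    rows = []
    for k in range(start, len(y)):
        past_y = [y[k - j] for j in range(1, order_y + 1)]
        past_x = [x[k - j] for j in range(1, order_x + 1)]
        rows.append(past_y + past_x)  # one regressor row per time step
    phi = np.array(rows)
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta[:order_y], theta[order_y:]

# Toy second-order system with known coefficients (illustration only).
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.zeros(500)
for k in range(2, 500):
    y[k] = 0.6 * y[k - 1] - 0.2 * y[k - 2] + 0.5 * x[k - 1]

eps_y, eps_x = fit_arx(y, x, order_y=2, order_x=1)
print("eps_y:", eps_y, "eps_x:", eps_x)
```

Because the toy data are noise-free, the estimated coefficients should match the generating ones closely; with measured vibration data the fit would only be approximate, which is the motivation for the Laguerre and fuzzy refinements described next.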
In Equation (6), $\omega_{n,x}$, $\omega_{n,y}$; $\gamma_x$, $\gamma_y$; and $J^{x}_{n}$, $J^{y}_{n}$ are the coefficients of the Fourier decomposition, the orthonormal bases, and the orthonormal functions, respectively. In the next step, ARX-Laguerre system modeling is used to obtain robust system modeling. The input and output orthonormal functions are represented in Equation (7).
Appl. Sci. 2019, 9, 5404
In Equation (7), $O_{n,fo}$ and $O_{n,fi}$ are the filtered orthonormal functions for the output and input, respectively. The ARX orthonormal function is represented in Equation (8).
In addition, the Laguerre technique is defined in Equation (9).
In Equation (9), $L^{x}_{n}$ and $L^{y}_{n}$ are the input and output Laguerre functions, respectively. Based on Equations (7) and (9), the input and output orthonormal functions are modified as given in Equation (10).
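The Laguerre functions act as a bank of orthonormal filters applied to the input and output signals. The sketch below implements a generic discrete-time Laguerre filter bank through its standard recursion and checks the orthonormality of the impulse responses. It is a sketch under assumptions: the function name `laguerre_filter_bank`, the pole value, and the order are hypothetical, and the exact form of Equation (9) in this paper may differ.

```python
import numpy as np

def laguerre_filter_bank(u, pole, order):
    """Filter signal u through `order` discrete Laguerre filters with a real pole.

    Transfer functions: L_0(z) = sqrt(1 - pole^2) / (z - pole),
                        L_n(z) = L_{n-1}(z) * (1 - pole * z) / (z - pole).
    Returns an array of shape (order, len(u)).
    """
    u = np.asarray(u, dtype=float)
    n = u.size
    out = np.zeros((order, n))
    gain = np.sqrt(1.0 - pole ** 2)
    for k in range(n - 1):
        # First-order low-pass stage.
        out[0, k + 1] = pole * out[0, k] + gain * u[k]
        # Each further stage is the previous one passed through an all-pass section.
        for j in range(1, order):
            out[j, k + 1] = pole * out[j, k] + out[j - 1, k] - pole * out[j - 1, k + 1]
    return out

# Impulse responses of the first four Laguerre filters (hypothetical pole of 0.5).
impulse = np.zeros(400)
impulse[0] = 1.0
bank = laguerre_filter_bank(impulse, pole=0.5, order=4)

# The Gram matrix of the impulse responses should be close to the identity.
gram = bank @ bank.T
print(np.round(gram, 6))
```

The orthonormality is what allows a small number of Laguerre coefficients to replace a long ARX regressor, which is the complexity reduction attributed to the ARX-Laguerre technique.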
In Equation (10), $O_{n,fo}$ and $O_{n,fi}$ are the modified orthonormal functions for the output and input, respectively. Thus, the ARX-Laguerre system modeling and estimation is represented in Equation (11) [20].
In Equation (11), $(\Psi(z^{-1}), \Lambda(z^{-1}))$ and $(W(k), I(k))$ are the polynomial variables and filtering signals, respectively, and are defined in Equation (12).
Therefore, the state-space ARX-Laguerre technique is represented as Equation (13).
In Equation (13), the quantities are the system's state, the modeling coefficients ($\sigma_s$, $\sigma_o$, $\sigma_i$), the measured output, the input, the uncertainties and fault, and the Fourier coefficients $(\lambda)^T$, respectively. Here, the state system modeling coefficient $\sigma_s$ is represented by Equation (14), where $O_{N_x,N_y}$ and $O_{N_y,N_x}$ are null matrices and $\sigma_{so}$ and $\sigma_{si}$ are represented in Equations (15) and (16), respectively.
The output system modeling coefficient $\sigma_o$ is represented in Equation (17).
In addition, the input system modeling coefficient $\sigma_i$ is represented in Equation (18).
To improve the ARX-Laguerre system modeling accuracy, the fuzzy technique is recommended in this research. The fuzzy ARX-Laguerre (FAL) system modeling is defined by Equation (19).
In Equation (19), $\sigma_f$ and $J_f$ are the fuzzy coefficient and fuzzy function for system estimation, respectively. The fuzzy if-then rule in this research is defined as follows.
If the first input is the linguistic variable for the first input and the second input is the linguistic variable for the second input, then the output is the linguistic variable for the output. (20)
The membership functions of the fuzzy set for the system estimation error ($e$) on the interval [−0.3, 0.3] are Gaussian, and the linguistic variables are negative high (NH), negative medium (NM), negative low (NL), zero (Z), positive low (PL), positive medium (PM), and positive high (PH). The fuzzy membership functions for the change of the system estimation error ($\dot{e}$) on the interval [−0.1, 0.1] are Gaussian, with the linguistic variables NH, NM, NL, Z, PL, PM, and PH. In addition, the membership functions for $J_f$ on the interval [−0.08, 0.08] are Gaussian, with the fuzzy sets NH, NM, NL, Z, PL, PM, and PH. Table 2 illustrates the fuzzy rule table for the FAL system estimation technique. According to this table, the error, the change of error, and the fuzzy system estimation each have seven linguistic variables. Therefore, the fuzzy technique used to improve the system modeling has 49 rules. Based on these 49 rules, the fuzzy technique improves the accuracy of the system modeling to achieve the minimum estimation error.
Figures 3 and 4 show the estimation accuracy and errors of REB modeling for the normal and faulty conditions based on the proposed FAL technique, the ARX-Laguerre system modeling technique, and the ARX system estimation method. Based on these figures, the system modeling accuracy of the proposed fuzzy ARX-Laguerre technique is higher than that of the ARX-Laguerre and ARX system modeling techniques. Regarding Figure 3, the error rate of the proposed fuzzy ARX-Laguerre technique is close to zero.
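The structure of this 49-rule fuzzy system can be sketched as follows. The intervals and the seven linguistic variables come from the text; everything else is an assumption made for illustration: the evenly spaced Gaussian centers, their width, the diagonal rule table (the actual entries of Table 2 are not reproduced here), and the zero-order Takagi-Sugeno weighted-average output with singleton consequents.

```python
import numpy as np

LABELS = ["NH", "NM", "NL", "Z", "PL", "PM", "PH"]

def gaussian_memberships(value, low, high):
    """Membership degrees of `value` in seven Gaussian sets spread over [low, high]."""
    centers = np.linspace(low, high, len(LABELS))
    sigma = (high - low) / (2.0 * (len(LABELS) - 1))  # assumed width: half the spacing
    return np.exp(-0.5 * ((value - centers) / sigma) ** 2)

# Hypothetical rule table: output index grows with the sum of the input indices.
# This only stands in for Table 2, whose actual entries may differ.
RULES = np.array([[min(max(i + j - 3, 0), 6) for j in range(7)] for i in range(7)])
OUT_CENTERS = np.linspace(-0.08, 0.08, len(LABELS))  # output sets on [-0.08, 0.08]

def fuzzy_correction(error, d_error):
    """Weighted-average output of the 7 x 7 = 49 rules for the given inputs."""
    mu_e = gaussian_memberships(np.clip(error, -0.3, 0.3), -0.3, 0.3)
    mu_de = gaussian_memberships(np.clip(d_error, -0.1, 0.1), -0.1, 0.1)
    strength = np.outer(mu_e, mu_de)  # firing strength of each rule (product inference)
    return float(np.sum(strength * OUT_CENTERS[RULES]) / np.sum(strength))

print(RULES.size, "rules")
print("correction at (0.2, 0.05):", fuzzy_correction(0.2, 0.05))
```

With this symmetric stand-in table the correction is zero when both the error and its change are zero, and it grows in magnitude as the estimation error moves away from zero.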

Proposed Algorithm for Fault Diagnosis in Rolling-Element Bearing
According to Figure 1, the fuzzy ARX-Laguerre method is used to model the bearing in normal and abnormal conditions. The next step focuses on the design of the advanced fuzzy sliding mode observation technique, which comprises three main blocks: (a) applying the SMO for fault estimation, (b) improving the robustness and chattering attenuation based on the higher-order super-twisting (advanced) SMO (ASMO), and (c) improving the accuracy of the fault estimation technique by applying the T-S fuzzy algorithm to the ASMO.

Advanced Sliding Mode Observer for Fault Estimation
Based on Equation (19) for FAL modeling of the REB, the SMO is proposed as Equation (21).
In addition, the fault estimation based on the SMO is defined by Equation (22), where $\hat{J}_{s-SMO}(k)$, $\hat{J}_{o-SMO}(k)$, $\hat{\phi}_{SMO}(k)$, and $(\sigma_s, \sigma_o, \sigma_i, \sigma_p, \sigma_{SM})$ are the sliding mode system state estimation, sliding mode output estimation, sliding mode fault estimation, and coefficients, respectively. The SMO is a robust and stable technique for fault diagnosis in bearings; however, this technique suffers from the chattering phenomenon. To reduce the effect of the chattering, the function in Equation (23) is introduced, where $Z$ and $\sigma_x$ are the new observation definition and coefficient, respectively. If the unknown conditions $(\Delta + \phi)$ are estimated, the SMO error goes to zero in finite time, as expressed by Equation (24).
In Equation (24), $\sigma_{x_0}$ and $\dot{\delta}$ are the coefficient and the variable of the super-twisting technique, respectively.
Therefore, based on Equation (23), this technique can improve the performance of fault estimation in unknown conditions. To reduce the chattering phenomenon, the ASMO state-space and fault estimation equations are represented in Equations (25) and (26).
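The idea behind the super-twisting injection can be illustrated on a toy problem. The sketch below is not the ASMO of Equations (25) and (26): it uses a hypothetical scalar plant $\dot{x} = -x + f(t)$ with an unknown fault $f(t)$, hypothetical gains `k1` and `k2`, and forward-Euler integration. The discontinuous sign term is integrated before it reaches the observer, so the injected signal is continuous, and the integral state converges to an estimate of the fault.

```python
import numpy as np

def super_twisting_observer(k1=1.5, k2=1.1, dt=1e-3, t_end=10.0):
    """Simulate a scalar super-twisting observer that estimates an unknown fault.

    Plant (toy):  x' = -x + f(t), with x measured and f unknown.
    Observer:     x_hat' = -x_hat + k1 * sqrt(|e|) * sign(e) + w
                  w'     = k2 * sign(e),   e = x - x_hat
    After convergence, w tracks f.
    """
    steps = int(round(t_end / dt))
    t = np.arange(steps) * dt
    fault = 0.5 * np.sin(t)  # toy fault with |df/dt| <= 0.5 < k2
    x, x_hat, w = 0.5, 0.0, 0.0
    fault_hat = np.zeros(steps)
    for k in range(steps):
        e = x - x_hat
        fault_hat[k] = w
        injection = k1 * np.sqrt(abs(e)) * np.sign(e) + w  # continuous correction term
        x_next = x + dt * (-x + fault[k])                  # plant step (forward Euler)
        x_hat += dt * (-x_hat + injection)                 # observer state step
        w += dt * k2 * np.sign(e)                          # integral of the switching term
        x = x_next
    return t, fault, fault_hat

t, fault, fault_hat = super_twisting_observer()
tail = t > 8.0
print("max fault estimation error after 8 s:", np.max(np.abs(fault_hat[tail] - fault[tail])))
```

In this toy setting, the gain `k2` must exceed the bound on the rate of change of the fault for the estimate to converge; the gains of the actual observer are set by the slope coefficients discussed below.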
In Equations (25) and (26), $J_{s-HSMO}(k)$, $J_{o-HSMO}(k)$, and $\phi_{HSMO}(k)$ are the state estimation error, measured output estimation error, and fault estimation error based on the advanced SMO, respectively. To guarantee stability and robustness, the sliding surface slope coefficients are represented by the following equations [6].
Here, $\Gamma$ is the bound of the estimation error; based on Equation (26), it can be determined from the nonlinear part of the estimation error. Though the ASMO improves the robustness and attenuates the chattering for FEDI, it fails to improve the fault estimation accuracy. Therefore, the T-S fuzzy algorithm is used to improve the ASMO. The T-S fuzzy technique is represented by the following equation [42].
Here, $r_x(k)$, $\alpha_f$, $TH_n$, $TH_f$, and $(\varsigma_n, \varsigma_f)$ are the residual signal, fuzzy estimation function, threshold value for the normal condition, threshold value for the faulty condition, and normal and abnormal coefficients for the T-S fuzzy observer, respectively. Therefore, the proposed advanced fuzzy SMO and the fault estimation are represented in Equations (30) and (31).
In Equations (30) and (31), $\hat{J}_{s-fHSMO}$, $\hat{J}_{o-fHSMO}$, $\hat{\phi}_{fHSMO}$, and $\sigma_{fHSM}$ are the advanced fuzzy sliding mode system state estimation, advanced fuzzy sliding mode output estimation, advanced fuzzy sliding mode fault estimation, and coefficients, respectively. After estimating the output based on the proposed AFSMO, the residual signal is calculated by Equation (32).
Figures 5-7 illustrate the residual signal in normal and abnormal conditions based on the SMO, ASMO, and AFSMO (the proposed method). Based on these figures, the difference between the various signal states is clearer for the AFSMO than for the ASMO and SMO. In the next part, the procedure of fault detection and identification is performed and described using the decision tree machine learning technique.
Figures 5-7 illustrate the residual signal in normal and abnormal conditions based on the SMO, ASMO, and AFSMO (the proposed method).Based on these figures, the difference between various states of signals in the AFSMO is more clear than those of the ASMO and SMO.In the next part, the procedure of fault detection and identification is performed and described using the decision tree machine learning technique.
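As a minimal illustration of the residual generation step, the sketch below computes a residual as the sample-wise difference between a measured signal and its estimate. The function name `residual_signal`, the sign convention (measured minus estimated), and the toy values are assumptions made for illustration; the observer that produces the estimate is not modeled here.

```python
def residual_signal(measured, estimated):
    """Return the sample-wise error between a measured signal and its estimate.

    Assumption: the residual is taken as measured minus estimated.
    """
    if len(measured) != len(estimated):
        raise ValueError("signals must have the same length")
    return [y - y_hat for y, y_hat in zip(measured, estimated)]


# Toy values only; a real residual would use the vibration signal and the
# output estimated by the observer.
measured = [0.10, -0.20, 0.35, -0.05]
estimated = [0.08, -0.25, 0.30, -0.05]
residual = residual_signal(measured, estimated)
```

A residual close to zero indicates that the observer tracks the signal well, whereas larger positive or negative values indicate a mismatch between the two.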

Residual Signal Characterization
Once the residual signal is obtained based on Equation (32), it can be used to perform the fault detection and diagnosis process. To apply the DT approach to fault diagnosis, a numerical feature parameter that is sensitive to changing signal conditions is essential. Because the residual signal represents the error between the original signal and the signal estimated by the proposed observer, it cannot be quantified by most of the conventional feature parameters that are used in fault diagnosis. However, parameters that are sensitive to the amplitude can be applied to characterize the residual (i.e., the error) signal.
To characterize the residual signal by a numerical parameter, we first split the time sequence into windows of equal size [22]. Then, the feature parameter called the energy of the signal is extracted from these windows to deliver a numerical value that characterizes "the amount of error" in a particular window. The formulation of the energy feature parameter is provided below:

$$E = \sum_{i=1}^{N} r_{x_i}^{2} \qquad (33)$$

Here, $r_{x_i}$ is the ith sample of the residual signal and $N$ is the total number of samples in the window. When different types of mechanical faults appear in REBs, the amplitudes of the vibration signals change drastically. This behavior strongly affects the residuals as well. Thus, the energy is a reasonable numerical feature for characterizing the windows of the residual signals because its value is closely related to the amplitude of the sequence being investigated. Moreover, since the residual represents the error between two signals, its samples can be both positive and negative. Because of the squared term in Equation (33), the energy feature accounts for error values located on both sides of zero.
Figure 8 shows the residual signal characterization based on the energy for normal and faulty conditions. As can be seen from this figure, the feature parameters extracted for the different system conditions are well separable in the 1D feature space, which means that the proposed methodology can be efficiently applied to REB fault diagnosis.
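The windowing and energy extraction described above can be sketched as follows. This is a minimal example assuming non-overlapping windows and assuming that a trailing partial window is discarded; the function name `window_energies` and the toy residual values are hypothetical.

```python
def window_energies(residual, window_size):
    """Split a residual into equal, non-overlapping windows and return
    the energy (sum of squared samples) of each window.

    Assumption: samples left over after the last full window are dropped.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    n_windows = len(residual) // window_size
    energies = []
    for w in range(n_windows):
        window = residual[w * window_size:(w + 1) * window_size]
        energies.append(sum(r * r for r in window))
    return energies


# Toy residual with positive and negative errors; the last sample does not
# fill a complete window and is therefore ignored.
residual = [1.0, -1.0, 2.0, -2.0, 0.5, 0.5, 3.0]
energies = window_energies(residual, window_size=2)
```

Because every sample is squared, errors of opposite sign contribute equally, so a window with large residual amplitudes yields a large energy regardless of the sign of the error.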

Decision Tree-Based Fault Diagnosis
Once the windows of the residual signals have been characterized using the energy feature parameter, these feature parameters are fed into the classifier, which is built using the DT. The DT [40], like random forests (i.e., ensembles of DTs) [43,44], is a machine learning technique that can perform both classification and regression tasks, and its decisions are fairly intuitive and easy to interpret. Because of these properties, decision trees are considered white-box models; they automatically derive general classification rules that work well for the data in the training set based on specific attributes (i.e., feature parameters). Moreover, the obtained decision rules can be applied manually to classify different samples if necessary.
During training of the DT to perform fault detection and identification, an error-based heuristic criterion called the Gini Diversity Index (GDI) [45], also known in the literature as the Gini impurity, is applied in this paper. The main advantage of the GDI criterion is that a DT trained with this metric tends to isolate the most frequent data class in its own branch of the tree, which is useful for evaluating the quality of the features in the training set and their separability.
The GDI is defined as follows:

$$GDI_i = 1 - \sum_{k=1}^{n} p_{i,k}^{2} \qquad (34)$$

where $GDI_i$ is the Gini score for the ith leaf of the DT, $n$ is the total number of classes (i.e., bearing faults) presented in the data, and $p_{i,k}$ is the ratio of the instances belonging to class k among all instances in the ith leaf.
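As a small worked example of the Gini score, the sketch below evaluates it for a few leaves using the standard Gini impurity, that is, one minus the sum of squared class ratios. The function name `gini_diversity_index` and the class labels are illustrative assumptions.

```python
from collections import Counter


def gini_diversity_index(labels):
    """Gini score of one leaf: 1 minus the sum of squared class ratios."""
    if not labels:
        raise ValueError("a leaf must contain at least one sample")
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


# A pure leaf (one class only) scores 0.
pure_leaf = gini_diversity_index(["normal"] * 4)

# A leaf with three "normal" windows and one "inner" window.
mixed_leaf = gini_diversity_index(["normal", "normal", "normal", "inner"])

# A leaf with four equally represented classes reaches 1 - 1/4.
uniform_leaf = gini_diversity_index(["normal", "ball", "inner", "outer"])
```

The score grows as the classes in a leaf become more evenly mixed, which is why minimizing it pushes each leaf toward a single bearing condition.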
The training procedure of the DT can be summarized as follows. Given a specific feature parameter and the GDI criterion, the training goal is to find a split of the feature values that induces a binary partition of the data samples with the minimum GDI, where the weights of this criterion are given by the number of data samples that fall into each of the two branches [46]. A leaf node is said to be "pure" when all the training samples it applies to belong to the same class. After the DT has been trained with the energy values extracted from the windows of the residual signals, we obtain a set of rules for differentiating the various bearing conditions. This set of rules is based on the threshold values automatically learned by the DT to minimize the weighted sum of the Gini scores of its leaves. A new data sample can then be classified by tracing a route from the root to one of the leaves of the tree while comparing its energy value with the learned threshold at each node along the route. In this paper, the number of leaves is equal to the number of signal classes presented in the dataset, which is four. Equations (35) and (36) are used to formulate the respective fault detection and fault diagnosis based on the DT algorithm.
Here, $\zeta_n$, $\zeta_b$, and $\zeta_i$ are the normal condition threshold, ball fault condition threshold, and inner fault condition threshold, respectively; in Equation (36), the outer fault is assigned by comparing the energy $E$ with all three thresholds. The proposed DT-based AFSMO for bearing FEDI is summarized in Algorithm 1, and an example of the DT trained using the energy values extracted from the residual signal windows for the 0.021-inch crack size condition under a 3 hp torque load is presented in Figure 9.

Algorithm 1. DT-based AFSMO for bearing FEDI.
3: Run the ASMO to reduce the chattering phenomenon (25, 26)
4: Run the AFSMO to increase the fault estimation (30, 31)
5: Run the residual generator (32)
6: Run the residual signal characterization by energy (33)
7: Run the learning process of decision trees (34)
8: Apply the classification rules delivered by decision trees for fault detection and identification (35, 36)
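The learning and rule-application steps of the decision tree can be sketched on a single energy feature as follows. This is a simplified illustration, not the implementation used in the paper: the tree is grown greedily by choosing, at every node, the threshold that minimizes the weighted Gini score, and the helper names (`gini`, `best_split`, `build_tree`, `classify`), the class labels, and the toy energy values are all assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini score of a set of labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def best_split(samples):
    """Return the threshold with the minimum weighted Gini score, or None.

    samples: list of (energy, label) pairs. Candidate thresholds are the
    midpoints between consecutive distinct energy values.
    """
    pairs = sorted(samples)
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical energies
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = (low + high) / 2.0, score
    return best_threshold


def build_tree(samples):
    """Grow the tree until every leaf is pure or cannot be split further."""
    labels = [label for _, label in samples]
    if len(set(labels)) == 1:
        return labels[0]
    threshold = best_split(samples)
    if threshold is None:
        return Counter(labels).most_common(1)[0][0]
    left = [s for s in samples if s[0] <= threshold]
    right = [s for s in samples if s[0] > threshold]
    return {"threshold": threshold,
            "left": build_tree(left),
            "right": build_tree(right)}


def classify(tree, energy):
    """Trace a route from the root to a leaf by comparing with thresholds."""
    while isinstance(tree, dict):
        tree = tree["left"] if energy <= tree["threshold"] else tree["right"]
    return tree


# Toy, well-separated energies (illustrative values, not CWRU results).
training = [(0.1, "normal"), (0.2, "normal"),
            (1.0, "ball fault"), (1.2, "ball fault"),
            (5.0, "inner fault"), (5.5, "inner fault"),
            (20.0, "outer fault"), (22.0, "outer fault")]
tree = build_tree(training)
predicted = classify(tree, 8.0)
```

With four separable classes, the grown tree ends up with four pure leaves and three learned thresholds, which mirrors the role of the three threshold values in the fault detection and identification rules. The ordering of the classes along the energy axis in this toy data is an assumption.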

Experimental Results
In this section, the fault diagnosis capabilities of the proposed technique are validated against two state-of-the-art approaches that have reported results for the same publicly available dataset. Specifically, the first method used for comparison is the ASMO and the second is the SMO. The final fault diagnosis performance is expressed in terms of the average classification accuracy (ACA) as follows:

$$ACA = \frac{\sum T_p}{N_{samples}} \times 100\%$$

where $T_p$ is the number of true positive predictions (i.e., the number of data samples of a specific class m correctly identified as belonging to class m), summed over all classes, and $N_{samples}$ is the total number of samples available in the particular dataset. To provide a comprehensive performance evaluation, two case studies are considered in this section: evaluating the fault diagnosis accuracies on the crack-variant datasets provided by CWRU and on the created custom load-variant datasets.
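As a brief illustration of this metric, the sketch below reads the ACA as the total number of true positive predictions over all classes divided by the total number of samples, expressed in percent. This reading, the function name, and the toy labels are assumptions made for the example.

```python
def average_classification_accuracy(true_labels, predicted_labels):
    """Return the percentage of samples whose predicted class is correct."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    true_positives = sum(
        1 for truth, guess in zip(true_labels, predicted_labels) if truth == guess
    )
    return 100.0 * true_positives / len(true_labels)


# Toy labels: four of the five windows are classified correctly.
truth = ["normal", "ball", "inner", "outer", "outer"]
guess = ["normal", "ball", "inner", "outer", "inner"]
aca = average_classification_accuracy(truth, guess)
```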

Crack-Variant Datasets
In this subsection, we investigate the fault identification capabilities of the proposed methodology on four crack-variant datasets, where the torque load remains fixed. In other words, each of the four datasets corresponds to a particular load level and contains three subsets, where each subset is formed from the data instances collected under a specific crack size. In this experiment, each subset comprises $N_f \times N_w$ data samples, where $N_f$ is the number of faulty conditions available in the CWRU benchmark dataset and $N_w$ is the number of windows (equal to 100, Section 4.2) cut from the residual signals delivered by the AFSMO in the previous step. The fault diagnosis results obtained for the REB datasets with various crack diameters under fixed load conditions are tabulated in Tables 3-6. These results show that the proposed methodology based on the AFSMO and DT clearly outperforms the techniques used for comparison in this paper for all the available fault types and fault severity degrees (i.e., crack sizes).
Despite these advantages, a significant drop in classification accuracy, to around 84%, can be observed for the outer fault (OR Fault) under a 0 hp load with a crack size of 0.007 inches (Table 3). This performance drop can be explained as follows. During the analysis of this behavior, it was discovered that the CWRU signals acquired under these conditions for the OR Fault and the IR Fault (inner fault) overlap significantly. Since the proposed and referenced methodologies create the model of the system being investigated from the actual data collected from the testbed, they tend to transfer some of the uncertainty present in the original data into the created model. Because of this overlap in the originally collected data, the residual signals provided by the proposed methodology, and the energy values extracted to characterize the windows of these residuals, also overlap. However, the proposed technique improved the fault identification for the ball fault (BF) and inner fault (IF) compared to the counterparts used in this paper. Thus, even with the significant accuracy decrease for the outer fault (OF) condition, the proposed methodology demonstrates better classification performance in terms of the ACA.

Load-Variant Datasets
In this subsection, based on Table 1, we reconfigured the available CWRU dataset and created three custom load-variant datasets to validate the robustness of the proposed fault identification technique under changing load conditions. In more detail, each of the three new datasets corresponds to one of the three crack sizes (i.e., 0.007, 0.014, and 0.021 inches in diameter) and contains $N_f \times N_{load} \times N_w$ data instances, where $N_f$ is the number of fault conditions presented in the available CWRU dataset, $N_{load}$ is the number of load levels, and $N_w$ is the number of windows cut from the residual signals delivered by the AFSMO. The results obtained during this experiment are presented in Table 7. These results allow us to conclude that the proposed method is highly robust to changing experimental conditions, such as torque loads and rotating speeds. The proposed approach outperformed its counterparts; its lowest average classification accuracy in this case study, 93.8%, was achieved for the crack with a diameter of 0.007 inches under different load levels.

Conclusions
This paper proposed a decision tree-based advanced fuzzy SMO (AFSMO) to perform reliable FEDI in a rotating machine. The rolling-element bearing (REB) is a nonlinear system that is analyzed here under uncertain conditions. To improve the system modeling accuracy, the fuzzy ARX-Laguerre technique is presented in the first step. In the second step, the SMO is designed for the FEDI. To reduce the chattering and increase the estimation accuracy of the SMO, the ASMO is considered in the third step. To increase the fault estimation accuracy, the T-S fuzzy algorithm is applied to the ASMO in the fourth step, which yields the AFSMO. To perform fault detection and identification in the presence of uncertainties, a decision tree is applied to the AFSMO residual to derive the threshold values for fault diagnosis under various crack sizes and motor speed conditions.
The effectiveness of the proposed algorithm is validated using a publicly available vibration dataset of CWRU. The proposed decision tree-based AFSMO outperformed the ASMO and SMO in terms of classification accuracy. The proposed method improved the average fault identification performance by about 4.5%, 5.3%, and 4.8% compared with the ASMO for the crack sizes of 0.007, 0.014, and 0.021 inches, respectively. In addition, the AFSMO improved the average fault identification performance by about 8.8%, 7.6%, and 8.1% compared with the SMO for the crack sizes of 0.007, 0.014, and 0.021 inches, respectively. Moreover, the proposed technique demonstrated its robustness in REB fault detection and identification under changing operating conditions, such as variable load levels and variable rotating speeds. Based on these results, we can conclude that the proposed methodology is highly efficient in diagnosing bearing faults. However, to ensure robustness and efficacy in an industrial environment, it is crucial to reduce the system modeling error. To address this issue, we will focus on developing an adaptive algorithm in our future work. Moreover, we will explore and evaluate the application of the proposed methodology in conjunction with various vibration health indicators for REB fault prognosis on run-to-failure experimental data.

Figure 3. The estimation accuracies in the normal condition: (a) the real and estimated signal, and (b) the estimation signal's error.


Figure 4. The estimation accuracies in the abnormal condition: (a) real and estimated signals by three techniques, and (b) the estimation signal's error by three techniques.


Figure 8. The energy of the residual signal based on the AFSMO: (a) original, (b) zoom frame.


Figure 9. The decision tree trained for detecting and diagnosing various bearing faults using the energy values of the residual signal windows for the 0.021-inch crack size condition under a 3 hp torque load.


Table 1. The detailed description of the Case Western Reserve University (CWRU) datasets in normal and abnormal conditions. Columns: Dataset, Conditions, Load (hp), Crack Sizes (in).


Table 2. The fuzzy rule table for system modeling based on the fuzzy ARX-Laguerre (FAL) technique.

Table 3. The accuracy of the FEDI when the torque load is 0 hp.

Table 4. The accuracy of the FEDI when the torque load is 1 hp.

Table 5. The accuracy of the FEDI when the torque load is 2 hp.

Table 6. The accuracy of the FEDI when the torque load is 3 hp.

Table 7. The accuracy of the FEDI for the custom load-variant datasets.