Article

Simultaneous-Fault Diagnosis of Gearboxes Using Probabilistic Committee Machine

Department of Electromechanical Engineering, University of Macau, Macao, China
*
Author to whom correspondence should be addressed.
Sensors 2016, 16(2), 185; https://doi.org/10.3390/s16020185
Submission received: 23 October 2015 / Revised: 21 January 2016 / Accepted: 22 January 2016 / Published: 2 February 2016
(This article belongs to the Section Physical Sensors)

Abstract

This study combines signal de-noising, feature extraction, two pairwise-coupled relevance vector machines (PCRVMs) and particle swarm optimization (PSO) for parameter optimization to form an intelligent diagnostic framework for gearbox fault detection. First, the sensor signals are de-noised with the wavelet threshold method to lower the noise level. Then, the Hilbert-Huang transform (HHT) and energy pattern calculation are applied to extract the fault features from the de-noised signals. After that, an eleven-dimensional vector, which consists of the energies of nine intrinsic mode functions (IMFs), the maximum value of the HHT marginal spectrum and its corresponding frequency component, is obtained to represent the features of each gearbox fault. The two PCRVMs serve as two different fault detection committee members, and they are trained using vibration and sound signals, respectively. The individual diagnostic results from the committee members are then combined by a new probabilistic ensemble method, which can improve the overall diagnostic accuracy and increase the number of detectable faults as compared to individual classifiers acting alone. The effectiveness of the proposed framework is experimentally verified using test cases. The experimental results show that the proposed framework is superior to existing single classifiers in terms of diagnostic accuracy for both single and simultaneous faults in the gearbox.

1. Introduction

In rotating machinery, gearboxes are widely used to transmit power from the prime mover to the load. If any failure occurs in the gearbox, it may interrupt normal machine operation and endanger users. Consequently, it is of great significance to develop a reliable and accurate intelligent system to diagnose the main components of the gearbox, such as gears and bearings. There are two main challenges in gearbox diagnosis. One is the existence of simultaneous faults, that is, multiple single faults that appear concurrently. The other is that no single sensor can detect all the machine faults. To accurately detect more faults, many kinds of sensors and signals may be involved at the same time. However, it is difficult to analyze different kinds of signals simultaneously and make a decision. In the literature [1,2,3,4,5,6,7], various gearbox diagnostic systems have been proposed. In these systems, the fault diagnosis procedures are mainly divided into two stages: (1) signal processing and (2) fault identification/classification.
The existing problems in the signal processing of these systems are that the signals usually contain high-dimensional data and suffer from background noise interference, which degrades the accuracy and lengthens the fault identification time. Besides, the gearbox usually has many rotating components working together, such as bearings, gears and spindles, so the diagnosis of the gearbox is a simultaneous-fault problem. In traditional gearbox fault diagnostic methods, each simultaneous fault is usually treated as an independent label for the classifier, which results in a high cost of acquiring an exponentially increasing number of simultaneous-fault signals. For example, with d single-faults (labels) and one normal condition, there are 2^d − (d + 1) artificial simultaneous-fault labels [8,9,10]. To solve this problem, an effective signal de-noising method and a proper feature extraction technique, which can find the single-fault pattern features in simultaneous-fault patterns, are studied together.
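As a quick check of this count (a worked example only; the value d = 4 is chosen purely for illustration), the simultaneous-fault labels are the subsets of the d single faults that contain at least two faults:

$$2^{d} - (d + 1) = \sum_{k=2}^{d} \binom{d}{k}, \qquad d = 4:\; 2^{4} - 5 = 11 = \binom{4}{2} + \binom{4}{3} + \binom{4}{4} = 6 + 4 + 1$$

The count grows roughly as 2^d, which is why collecting training data for every combination quickly becomes impractical.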
Currently, some methods, including spectral subtraction, least squares, and wavelet threshold methods, are widely used for signal de-noising [11,12]. In order to effectively de-noise the non-stationary signals of a gearbox, a soft threshold method based on the discrete wavelet transform (DWT) is adopted in this study due to its popularity.
References [8,9,10] reported that a simultaneous-fault symptom can be identified from single-fault patterns alone, provided that the classifier is trained with a proper feature extraction technique; this saves the considerable resources otherwise needed to collect a large combination of simultaneous-fault training data. Existing feature extraction techniques are therefore reviewed here. At present, there exist many methods to extract features from fault signals, such as the Fourier transform, short time Fourier transform (STFT), and wavelet transform. The Fourier transform is only suitable for analyzing stationary signals. However, the signals of rotating gears and bearings are non-stationary, which makes the Fourier transform unsuitable for this application. The time-frequency analysis methods, such as STFT and the wavelet transform, can process non-stationary signals, but they all have limitations. STFT is limited in non-stationary signal processing because it uses a fixed time window, which makes it impossible to achieve good resolution in the time and frequency domains at the same time. The drawback of the wavelet transform is that it suffers from energy leakage, because any signal component that does not correlate well with the shape of the wavelet basis function will be masked or completely ignored. In contrast to STFT and the wavelet transform, the Hilbert-Huang transform (H-HT) is the latest time-frequency signal processing technique to analyze nonlinear and non-stationary signals. The first step of a typical H-HT process is to employ the empirical mode decomposition (EMD) algorithm to decompose a complicated signal into a series of intrinsic mode functions (IMFs), which contain the local characteristics of the original signal at different time scales, and then a Hilbert transform is applied to each IMF for Hilbert spectrum analysis.
The high time-frequency resolution of the H-HT method can effectively describe how the frequency composition changes with time, which makes it a good approach for analyzing non-stationary signals. Even though H-HT has been applied to many applications, particularly in fault detection and diagnosis [13,14], it has some disadvantages: (1) the issue of mode mixing; and (2) redundant intrinsic mode functions that easily appear at low frequencies, which can distort the processed result [15]. To overcome these disadvantages, this study applies ensemble empirical mode decomposition (EEMD), an improved EMD method, to deal with the mode mixing problem, and uses the correlation coefficient method to eliminate the redundant IMFs. The EEMD-based H-HT is hereafter referred to as HHT. It is well known that different fault conditions show different amplitude- and phase-frequency characteristics in the frequency domain. In other words, fault signal energies in some frequency bands may be enhanced, while the others are restrained. It is reasonable to assume that there are certain corresponding relationships between the signal energy changes in the frequency bands and the fault phenomena. Therefore, on the basis of HHT, energy patterns of the selected intrinsic mode function components are considered in this study to further extract representative fault features from the gearbox vibration and sound signals.
In [1,3,5], most of the existing fault classification systems for rotating machinery are constructed from a single classifier which is trained on one type of signal. However, a single classifier-based fault diagnostic system may not give reliable diagnostic results because a universal classifier is difficult to develop, especially when the data available for training the classifier are not abundant. Furthermore, a single classifier can only be trained on one type of signal, and one type of signal alone may not be able to cover all the faults. To let a fault classification system generate more reliable diagnostic results and diagnose more faults, this paper proposes a new probabilistic committee machine (PCM) to combine the diagnostic results from vibration and sound signals. For gearboxes, vibration and sound signals are usually used to identify the faults because those signals are easily acquired and highly related to the condition of the gearbox [16,17,18,19,20,21]. The committee machine concept involves combining the results acquired by individual classifiers so as to obtain a group decision that is superior to that of any individual classifier acting alone [22,23,24], in the same way that a group decision is usually better than a single person's decision.
Moreover, a proper classifier must be able to offer the probabilities of all possible faults so that the user can at least trace the other possible faults according to the rank of their probabilities when the fault(s) predicted by the classifier are incorrect. Therefore, it is logical to employ a probabilistic classifier for each member in the committee machine for simultaneous-fault diagnosis of the gearbox. Currently, there are two common probabilistic classifiers available in the relevant literature: the probabilistic neural network (PNN) [25,26] and the relevance vector machine (RVM) [27,28]. The main drawback of PNN lies in the limited number of inputs, because the complexity of the network and the training time are heavily related to the number of inputs. Hence, RVM is selected as the probabilistic classifier to build each committee member in this study. Generally, the aforementioned probabilistic classifiers are designed for binary classification. Nevertheless, most practical applications are multi-class classification problems. The one-versus-all strategy is usually employed to handle the multi-class classification problem. However, this strategy does not consider the correlation between every pair of faults or labels, which was verified to produce a large region of indecision [29]. To solve the multi-class classification problem effectively and generate a probability, a suitable pairwise coupling strategy is adopted for the above probabilistic classifiers to generate a pairwise-coupled probabilistic neural network (PCPNN) and a pairwise-coupled relevance vector machine (PCRVM).
After determining the methods of signal de-noising, feature extraction and committee members, there are still two major factors, the decision threshold ε and member weight w, affecting the system accuracy in the proposed framework. The probabilistic committee machine only produces the probability of occurrence of each fault. To determine the occurrence of the faults, a decision threshold must be applied to those probabilities (e.g., output probabilistic vector P = [0.35, 0.58, 0.48, 0.83], if ε = 0.5, fault labels (2, 4) are considered as faults). Besides, different committee members usually have various reliabilities, so a fair committee machine should assign different weights to their committee members. Hence, an efficient searching algorithm, particle swarm optimization (PSO) [30,31], to determine optimal member weights and decision threshold is considered in the proposed framework. Finally, a fair measure, F-measure, is employed to evaluate the performance of the proposed diagnostic framework.
In a nutshell, this paper proposes a new framework which can diagnose simultaneous faults in the gearbox while being trained using only single-fault patterns. Besides, the proposed framework can provide the probabilities of all possible faults so that users can trace the other possible faults according to the rank of probabilities when the diagnostic result is incorrect. Furthermore, the proposed framework can generate a more reliable diagnostic result and diagnose more faults by simultaneously analyzing vibration and sound signals. Even though the authors proposed a similar framework for simultaneous-fault diagnosis of automotive engines in [21], the framework proposed here is targeted at the gearbox system. Moreover, the signal patterns used in this application are totally different from the ones in [21]. The proposed framework is designed based on vibration and sound signals rather than the air ratio, ignition and acoustic signals of the previous framework. Besides, the engine signals acquired in [21] do not consider the issue of background noise, which can degrade the accuracy of the diagnostic system. Furthermore, the feature extraction and selection methods in [21] rely on EMD with domain knowledge and on sample entropy, which are old, time-consuming, poorly supported by reference materials, and carry a risk of mode mixing. Finally, the objective function in [21] is not well defined and therefore cannot achieve good diagnostic accuracy. Consequently, the framework in [21] cannot be directly applied and is modified significantly to suit the gearbox, particularly in the phases of data processing and feature selection. Table 1 summarizes the differences between the diagnostic framework in [21] and this study.
Table 1. Differences of diagnostic framework between reference [21] and this study.
| Differences | Reference [21] | Present Study |
|---|---|---|
| Application | Automotive engine | Gearbox |
| Signal patterns | Air ratio, ignition and acoustic signals | Vibration and sound signals |
| Signal de-noising | None | Wavelet threshold |
| Feature extraction | EMD and domain knowledge | EEMD-based Hilbert-Huang transform and energy pattern |
| Feature selection (IMF selection) | Value of sample entropy | Correlation coefficient |
| Objective function | Fme 0.925 ± 0.025 | Fme 0.9 |
This paper is organized as follows: Section 2 presents the proposed framework and related techniques. The experimental setup and data pre-processing are discussed in Section 3. Section 4 discusses the experimental results and a comparison with other approaches. Finally, conclusions are given in Section 5.

2. Proposed Framework

The proposed PCM framework for the gearbox simultaneous-fault diagnosis, evaluation approach and its construction method are illustrated in Figure 1. The framework consists of four sub-modules: (1) data processing; (2) probabilistic committee machine; (3) parameter optimization; and (4) performance evaluation. The details of the four sub-modules in the framework are discussed in the following sub-sections.
Figure 1. Proposed framework of gearbox simultaneous-fault diagnosis using probabilistic committee machine.
In this case study, signal features are extracted from two kinds of signals xk (k = 1, 2), namely the vibration and sound signals, which are denoted as x1 and x2, respectively. Taking the vibration signal as an example, the signal x1, including both single-fault patterns (S) and simultaneous-fault patterns (SM), goes through de-noising and feature extraction. After the data processing, the processed dataset is divided into three independent groups, the training, validation, and test datasets, which are named x1-PTra, x1-PVal, and x1-PTes, respectively. x1-PVal and x1-PTes contain both single-fault and simultaneous-fault patterns, while x1-PTra contains single-fault patterns only. The divided datasets are used to train, validate, and test the proposed framework.

2.1. Data Processing

2.1.1. Signal De-Noising

The acquired signals display interference from background noise. To decrease the interference, the acquired signals have to be de-noised. A discrete wavelet transform (DWT) technique, which is an effective de-noising technique for non-stationary signals [11,13], is selected in this paper. The DWT can be defined as:
$$\mathrm{DWT}(s, R) = \frac{1}{\sqrt{2^{s}}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - 2^{s}R}{2^{s}}\right) dt \tag{1}$$
where s and R are integers, $2^{s}$ and $2^{s}R$ represent the scale and translation parameters respectively, $\psi$ represents the mother wavelet and $\psi^{*}$ is the complex conjugate of $\psi$. The original time-domain signal xk = x(t) passes through a set of low-pass and high-pass filters, emerging as low-frequency (approximations, $a^{*}$) and high-frequency (details, $d_i^{*}$) signals. Therefore, the original signal x(t) can be written as:
$$x(t) = a_n^{*} + \sum_{i=1}^{n} d_i^{*} \tag{2}$$
The DWT-based de-noising technique is performed in three steps: (1) signal decomposition; (2) determination of the threshold and nonlinear shrinking of the coefficients; and (3) signal reconstruction. In the family of mother wavelets, the Daubechies wavelet (Db) is the most popular one and hence it is employed in this study. Moreover, the soft-thresholded signal is defined as $\mathrm{sign}(x(t))\,(|x(t)| - T)$ if $|x(t)| > T$, and 0 otherwise, where T denotes a universal threshold equal to $\sqrt{2\log(\mathrm{length}(x(t)))}$. The details of the de-noising are described in Section 3.2.
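The shrinking step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the sample coefficients are hypothetical, the threshold is applied to a list of detail coefficients (the usual practice, assumed here), and no noise-level scaling is included.

```python
import math

def universal_threshold(n):
    """Universal threshold T = sqrt(2 * log(n)) for a signal of n samples."""
    return math.sqrt(2.0 * math.log(n))

def soft_threshold(coeffs, T):
    """Soft thresholding: shrink each value toward zero by T and
    set values with magnitude <= T to zero."""
    out = []
    for c in coeffs:
        if abs(c) > T:
            out.append(math.copysign(abs(c) - T, c))
        else:
            out.append(0.0)
    return out

details = [3.0, -0.5, 0.2, -4.0]   # made-up detail coefficients
T = 1.0                            # fixed here only to keep the numbers easy to follow
print(soft_threshold(details, T))  # [2.0, 0.0, 0.0, -3.0]
```

In a complete pipeline the threshold would come from `universal_threshold(len(signal))`, and the shrunken coefficients would be passed to the wavelet reconstruction step.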

2.1.2. Feature Extraction Based on Hilbert-Huang Transform

The Hilbert-Huang transform (HHT) used in this paper combines EEMD and the Hilbert transform. EEMD defines the true IMFs as the ensemble mean of a number of trials, each of which consists of the decomposition of the signal plus a white noise of finite amplitude. In most cases, the standard deviation of the added noise ranges from 0.1 to 0.4 [32]. The EEMD algorithm [33] is given as follows:
(1) Initialize the number of ensemble trials J and the amplitude of the added white noise, and set j = 1.
(2) Perform the jth trial on the white noise-added signal. A white noise series with the given amplitude is added to the investigated signal:
$$x'_j = x(t)' + n_j \tag{3}$$
where $n_j$ represents the jth added white noise series, x(t)' is the de-noised signal and $x'_j$ denotes the noise-added signal of the jth trial.
(3) With the EMD method, the noise-added signal $x'_j$ is decomposed into I IMFs $c_{i,j}(t)$, for i = 1, 2, …, I, where $c_{i,j}$ represents the ith IMF of the jth trial, and I is the number of IMFs.
(4) If j < J, let j = j + 1 and repeat Steps 2 and 3 with a different white noise series each time until j = J.
(5) Calculate the ensemble mean $\bar{c}_i$ of the J trials for each IMF:
$$\bar{c}_i = \frac{1}{J} \sum_{j=1}^{J} c_{i,j}, \qquad i = 1, 2, \ldots, I \tag{4}$$
(6) Report the mean $\bar{c}_i$ of each of the I IMFs as the final IMFs.
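The ensemble loop in Steps 1 to 6 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: `emd` is a caller-supplied decomposition routine (hypothetical interface), `noise_amp` is treated as the absolute standard deviation of the added Gaussian noise, and `toy_emd` is only a placeholder that is not a real EMD.

```python
import random

def eemd(signal, emd, n_trials, noise_amp, n_imfs, seed=0):
    """Ensemble mean of IMFs over n_trials noise-added copies of `signal`.

    `emd(x, n_imfs)` is a hypothetical caller-supplied EMD routine that
    returns a list of n_imfs lists, each with the same length as x.
    """
    rng = random.Random(seed)
    length = len(signal)
    sums = [[0.0] * length for _ in range(n_imfs)]
    for _ in range(n_trials):
        # Step 2: add a fresh white noise series to the signal.
        noisy = [x + rng.gauss(0.0, noise_amp) for x in signal]
        # Step 3: decompose the noise-added signal into IMFs.
        imfs = emd(noisy, n_imfs)
        for i in range(n_imfs):
            for t in range(length):
                sums[i][t] += imfs[i][t]
    # Step 5: ensemble mean of each IMF over all trials.
    return [[s / n_trials for s in row] for row in sums]

def toy_emd(x, n_imfs):
    """Placeholder decomposition (NOT a real EMD): splits x into equal parts."""
    return [[v / n_imfs for v in x] for _ in range(n_imfs)]

mean_imfs = eemd([1.0, 2.0, 3.0], toy_emd, n_trials=3, noise_amp=0.0, n_imfs=2)
```

Passing the decomposition as an argument keeps the averaging logic separate from whichever EMD implementation is actually used.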
Applying the Hilbert transform to each IMF, and calculating the instantaneous frequency $\omega_j(t)$ and amplitude $A_j(t)$, the Hilbert spectrum of x(t)', $H(\omega, t)$, is then calculated by the following equation:
$$H(\omega, t) = \mathrm{Re} \sum_{j=1}^{I} A_j(t) \exp\left(i \int \omega_j(t)\, dt\right) \tag{5}$$
Accordingly, the marginal spectrum of the Hilbert-Huang transform, $h(\omega)$, can be defined as the spectrum integrated with respect to time t, i.e.:
$$h(\omega) = \int_{0}^{l} H(\omega, t)\, dt \tag{6}$$
where $h(\omega)$ reflects how the amplitude changes with frequency over the entire frequency range, and l is the length of the signal x(t)'. The instantaneous frequency of an IMF, which is obtained from the Hilbert transform, is well localized in the time-frequency domain and reveals important characteristics of the signal.
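A common discrete reading of the marginal spectrum is to accumulate each IMF's instantaneous amplitude into the frequency bin that holds its instantaneous frequency. The Python sketch below assumes the amplitudes and frequencies have already been computed by a Hilbert transform; the function name, the uniform binning and the sample values are illustrative assumptions.

```python
def marginal_spectrum(amps, freqs, f_max, n_bins, dt):
    """Discrete marginal spectrum: for every IMF j and time sample t,
    add A_j(t) * dt to the frequency bin that contains w_j(t).

    amps[j][t] and freqs[j][t] are the instantaneous amplitude and
    frequency of IMF j (assumed precomputed); bins cover [0, f_max).
    """
    width = f_max / n_bins
    h = [0.0] * n_bins
    for a_j, w_j in zip(amps, freqs):      # loop over IMFs
        for a, w in zip(a_j, w_j):         # loop over time samples
            if 0.0 <= w < f_max:
                h[int(w / width)] += a * dt
    return h

# One IMF with constant amplitude 2 at 10 Hz over four samples of 0.5 s each.
h = marginal_spectrum([[2.0] * 4], [[10.0] * 4], f_max=20.0, n_bins=4, dt=0.5)
print(h)  # [0.0, 0.0, 4.0, 0.0]
```

The maximum of `h` and the frequency of the bin where it occurs correspond to the two marginal-spectrum features mentioned in the abstract.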

2.2. Probabilistic Committee Machine

PCM is a group decision method which combines the results from individual classifiers and generates performance superior to that of any individual classifier acting alone. As mentioned previously, RVM is selected for constructing the probabilistic fault classifier. To solve the multi-label classification problem effectively, RVM adopts a pairwise coupling strategy, and the resulting classifier is named PCRVM. Moreover, a new ensemble method is proposed to combine the outputs of the committee members. In the proposed ensemble method, the committee members should be assigned suitable weights since every member/classifier in the group usually has its own strength. The details of the PCRVM algorithm and the ensemble method are described in the following sections.

2.2.1. Relevance Vector Machine

RVM is a statistical learning method utilizing a Bayesian learning framework and popular kernels. In this research, RVM is used to predict the posterior probability of each fault $t_n$ for unseen symptoms f based on experimental data. Given a set of training data (f, t) = {$f_n$, $t_n$}, n = 1 to N, where $t_n \in \{0, 1\}$ and N is the number of training data, RVM follows the statistical convention and generalizes the linear model by applying the logistic sigmoid function $\sigma(y(f)) = 1 / (1 + \exp(-y(f)))$ to the predicted decision y(f). Adopting the Bernoulli distribution for $P(t|F)$, the likelihood of the data is written as:
$$P(t|F, \theta) = \prod_{n=1}^{N} \sigma\{y(f_n; \theta)\}^{t_n} \left[1 - \sigma\{y(f_n; \theta)\}\right]^{1 - t_n}, \quad \text{where } y(f; \theta) = \sum_{i=1}^{N} \theta_i K(f, f_i) + \theta_0 \tag{7}$$
where $\theta = (\theta_0, \theta_1, \ldots, \theta_N)^T$ is a weight vector and K is a kernel function. In the open literature, three kinds of kernel functions, radial basis function (RBF), polynomial, and Gaussian kernels, are available. Among these kernel functions, the Gaussian kernel is the most popular kernel function in RVM for classification in industrial applications [34].
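To make the model concrete, the following Python sketch evaluates $\sigma(y(f; \theta))$ for an unseen feature vector once the weights are known. It is a sketch only: the source does not give the kernel's exact form, so the Gaussian kernel $\exp(-\|f - g\|^2 / (2\,\text{width}^2))$ and the `width` parameter are assumptions, as are the function names.

```python
import math

def gaussian_kernel(f, g, width=1.0):
    """Assumed Gaussian kernel: exp(-||f - g||^2 / (2 * width^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(f, g))
    return math.exp(-sq / (2.0 * width ** 2))

def rvm_probability(f, train_features, theta, width=1.0):
    """Posterior fault probability sigma(y(f; theta)), where
    y = sum_i theta_i * K(f, f_i) + theta_0 and theta[0] is the bias theta_0."""
    y = theta[0] + sum(
        th * gaussian_kernel(f, fi, width)
        for th, fi in zip(theta[1:], train_features)
    )
    return 1.0 / (1.0 + math.exp(-y))

# With all weights at zero the model is undecided.
print(rvm_probability([0.2, 0.4], [[0.0, 0.0], [1.0, 1.0]], [0.0, 0.0, 0.0]))  # 0.5
```

After training, most weights of an RVM are driven to zero, so in practice the sum runs only over the few remaining relevance vectors.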
The optimal weight vector $\theta^{*}$ for the given dataset needs to be computed so as to maximize the posterior probability $P(\theta|t, F, \alpha) \propto P(t|F, \theta) P(\theta|\alpha)$, with α = [α0, α1, …, αN] a vector of N + 1 hyperparameters. However, the weights cannot be determined analytically. Thus, the following approximation procedure, which is based on Laplace's method, is chosen:
(1) For the current fixed values of α, the most probable weights $\theta_{MP}$ are found. Since $P(\theta|t, F, \alpha) \propto P(t|F, \theta) P(\theta|\alpha)$, this step is equivalent to the following maximization:
$$\theta_{MP} = \arg\max_{\theta} \log\{P(t|F, \theta) P(\theta|\alpha)\} = \arg\max_{\theta} \left\{ \sum_{n=1}^{N} \left[ t_n \log d_n + (1 - t_n) \log(1 - d_n) \right] - \frac{1}{2} \theta^T A \theta \right\} \tag{8}$$
where $d_n = \sigma\{y(f_n; \theta)\}$ and $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$.
(2) Laplace's method is simply a quadratic approximation to the log-posterior (equivalently, a Gaussian approximation to the posterior) around the mode of the weights $\theta_{MP}$. The log-posterior in Equation (8) is differentiated twice to give:
$$\nabla_{\theta} \nabla_{\theta} \log P(\theta|t, F, \alpha) \big|_{\theta_{MP}} = -(\Phi^T B \Phi + A) \tag{9}$$
where $B = \mathrm{diag}(\beta_1, \beta_2, \ldots, \beta_N)$ is a diagonal matrix with $\beta_n = \sigma\{y(f_n; \theta)\}[1 - \sigma\{y(f_n; \theta)\}]$ and $\Phi$ is an N × (N + 1) design matrix with $\Phi_{nm} = K(f_n, f_{m-1})$ and $\Phi_{n0} = 1$, n = 1 to N, and m = 1 to N + 1. By negating and inverting the matrix in Equation (9), the covariance matrix $\Sigma = (\Phi^T B \Phi + A)^{-1}$ can be obtained.
(3) The hyperparameter vector α is updated using an iterative re-estimation equation. First, $\alpha_i$ is initialized with a random guess, then $\gamma_i = 1 - \alpha_i \Sigma_{ii}$ is calculated, where $\Sigma_{ii}$ is the ith diagonal element of the covariance matrix $\Sigma$. Then, $\alpha_i$ is re-estimated as follows:
$$\alpha_i^{new} = \frac{\gamma_i}{u_i^2} \tag{10}$$
where $u = \theta_{MP} = \Sigma \Phi^T B t$. Next, $\alpha_i \leftarrow \alpha_i^{new}$ is set, and $\gamma_i$ and $\alpha_i^{new}$ are re-estimated until convergence. Finally, $\theta = \theta_{MP}$ is set, so that the classification model $y(f; \theta) = \sum_{i=1}^{N} \theta_i K(f, f_i) + \theta_0$ is obtained.
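The re-estimation in Step 3 reduces to element-wise arithmetic once the diagonal of the covariance matrix and the most probable weights are available. The Python sketch below covers only that single update, not the full training loop; the function name and the sample numbers are illustrative, and it assumes no weight is exactly zero.

```python
def reestimate_alpha(alpha, sigma_diag, u):
    """One hyperparameter update: gamma_i = 1 - alpha_i * Sigma_ii,
    then alpha_i_new = gamma_i / u_i^2.

    alpha:      current hyperparameters
    sigma_diag: diagonal elements Sigma_ii of the covariance matrix
    u:          most probable weights (theta_MP); assumed non-zero here
    """
    new_alpha = []
    for a, s, w in zip(alpha, sigma_diag, u):
        gamma = 1.0 - a * s
        new_alpha.append(gamma / (w * w))
    return new_alpha

print(reestimate_alpha([2.0, 1.0], [0.25, 0.5], [0.5, 1.0]))  # [2.0, 0.5]
```

In a full implementation this update alternates with recomputing the most probable weights and the covariance matrix until the hyperparameters stop changing.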

2.2.2. Pairwise-Coupled Relevance Vector Machine as Committee Member

Traditional machine learning methods are designed only for binary classification, in which the output is either positive (+1) or negative (−1). However, most practical problems require multi-class classification as well as a probabilistic output. Usually, the one-versus-all strategy is employed to deal with multi-class classification problems. It constructs a group of classifiers lclass = [C1, C2, …, Cd] in a d-label classification problem. The one-versus-all strategy is simple and easy to implement; however, it generally gives a poor result [29,35] because it does not consider the pairwise correlations, which causes a much larger indecisive region than the pairwise coupling strategy (using one-versus-one), as shown in Figure 2. The pairwise coupling strategy also constructs a group of classifiers lclass = [C1, C2, …, Cd] in a d-label classification problem. However, each Ci = [Ci1, Ci2, …, Cid] is composed of a set of d − 1 different pairwise classifiers Cij, i ≠ j. Since Cij and Cji are complementary, there are in total d(d − 1)/2 classifiers in lclass, as shown in Figure 3. To solve the multi-class classification and probabilistic output problems, a pairwise coupling strategy is adopted for the RVM and PNN classifiers. The strategy combines the outputs of every pair of classes to re-estimate the overall probability for a new instance.
Figure 2. Indecisive regions (shaded regions) using one-vs-all (left) and pairwise coupling (right).
Figure 3. Pairwise coupling strategy of probabilistic classification.
There are several available methods for the pairwise coupling strategy [29], which are, however, unsuitable for simultaneous-fault diagnosis because of the constraint $\sum_i \rho_i = 1$, where $\rho_i$ is the probability of the ith label. Note that, by the nature of simultaneous-fault diagnosis, $\sum_i \rho_i$ is not necessarily equal to 1. Therefore, the following simple pairwise coupling strategy for simultaneous-fault diagnosis is proposed. Every $\rho_i$ is calculated as:
$$\rho_i = C_i(x) = \frac{\sum_{j=1,\, j \neq i}^{d} n_{ij}\, C_{ij}(x)}{\sum_{j=1,\, j \neq i}^{d} n_{ij}} = \frac{\sum_{j=1,\, j \neq i}^{d} n_{ij}\, \rho_{ij}}{\sum_{j=1,\, j \neq i}^{d} n_{ij}} \tag{11}$$
where $n_{ij}$ is the number of training feature vectors with either the ith or jth label. Hence, the probability can be accurately estimated from $\rho_{ij} = C_{ij}(x)$ because the pairwise correlation between the labels is taken into account. With the above pairwise coupling strategy, the proposed probabilistic committee member, PCRVM, can estimate the probability vector ρ with a high level of accuracy.
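The coupling rule can be sketched in a few lines of Python. The pairwise outputs and sample counts below are made-up numbers for d = 3 labels, and the function name is hypothetical; the diagonal entries are placeholders that are never read.

```python
def pairwise_coupled_probabilities(rho_pair, n_pair):
    """rho_i = sum_{j != i} n_ij * rho_ij / sum_{j != i} n_ij.

    rho_pair[i][j]: output of pairwise classifier C_ij for the new instance
    n_pair[i][j]:   number of training vectors with label i or label j
    """
    d = len(rho_pair)
    rho = []
    for i in range(d):
        num = sum(n_pair[i][j] * rho_pair[i][j] for j in range(d) if j != i)
        den = sum(n_pair[i][j] for j in range(d) if j != i)
        rho.append(num / den)
    return rho

# Complementary pairwise outputs: rho_pair[j][i] = 1 - rho_pair[i][j].
rho_pair = [[0.0, 0.9, 0.8],
            [0.1, 0.0, 0.6],
            [0.2, 0.4, 0.0]]
n_pair = [[0, 20, 20],
          [20, 0, 20],
          [20, 20, 0]]
rho = pairwise_coupled_probabilities(rho_pair, n_pair)
```

With these numbers the label probabilities come out at roughly 0.85, 0.35 and 0.30, which sum to about 1.5 rather than 1, as simultaneous-fault diagnosis allows.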
After designing the pairwise coupling strategy for each probabilistic classifier, a new ensemble method is proposed to combine the result from each committee member with optimal weight.

2.2.3. Ensemble Method

One of the most frequently used ensemble methods is weighted averaging. In this method, every committee member has an appropriate weight related to its ability. However, the weighted averaging method cannot give a fair result when the committee members have unbalanced sensitivities to faults. For example, when committee member 1 is not trained on a dataset containing the fault d5, the fault d5 usually cannot be predicted by committee member 1, as demonstrated in Table 2. However, the weighted averaging method still uses the unpredictable output to calculate the overall average, resulting in an unfair or unpredictable result.
To overcome the above problem, a novel ensemble method with optimal weights and predefined null outputs is proposed, which is given by Equation (12). In Equation (12), $\rho_{j\text{-}i}$ is set to zero when the jth classifier cannot make a diagnosis for the ith fault label (i.e., the jth classifier is not trained on the ith single-fault). In this way, the proposed method can overcome the problem of the traditional weighted averaging method, which is one of the main contributions of this research. The probability of the ith fault is expressed as:
$$P_i = \frac{\sum_{j=1}^{k} w_{j\text{-}opt}\, \rho_{j\text{-}i}}{\sum_{j=1}^{k} f(w_{j\text{-}opt})}, \quad i = 1, 2, \ldots, d,\; j = 1, 2, \ldots, k, \qquad \text{subject to } f(w_{j\text{-}opt}) = \begin{cases} 0 & \text{if } \rho_{j\text{-}i} = 0 \\ w_{j\text{-}opt} & \text{otherwise} \end{cases} \tag{12}$$
where $w_{j\text{-}opt} \in [0, 1]$ is the optimal weight for the jth committee member, j = 1 to k, k is the number of committee members, and the sum of the weights need not equal 1. $\rho_{j\text{-}i} \in [0, 1]$ is the probability estimated by the jth classifier for the ith single-fault, i = 1 to d, where d is the total number of detectable single-faults. Finally, the probabilistic outputs of the classifiers are combined with the optimal weights to generate the probability vector P = [P1, P2, ..., Pd].
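The effect of the null outputs is easiest to see numerically. The Python sketch below follows the rule in Equation (12); the weights and probabilities are invented for illustration, the function name is hypothetical, and returning 0.0 when no member can diagnose a fault is an added assumption that the source does not address.

```python
def ensemble_probability(rho, weights):
    """Weighted average in which a member's weight is dropped from the
    denominator whenever its output for that fault is the null value 0.

    rho[j][i]:  probability from committee member j for fault i
                (0.0 marks a fault the member was not trained on)
    weights[j]: weight of committee member j
    """
    k, d = len(rho), len(rho[0])
    P = []
    for i in range(d):
        num = sum(weights[j] * rho[j][i] for j in range(k))
        den = sum(weights[j] for j in range(k) if rho[j][i] != 0.0)
        # Assumption: report 0.0 if no member can diagnose fault i.
        P.append(num / den if den > 0.0 else 0.0)
    return P

weights = [0.6, 0.8]
rho = [[0.9, 0.0],   # member 1 cannot diagnose the second fault
       [0.7, 0.5]]
P = ensemble_probability(rho, weights)
```

Here the second fault receives 0.4 / 0.8 = 0.5, whereas a plain weighted average would dilute it to 0.4 / 1.4, about 0.29. Note that under this rule a member that genuinely outputs a probability of exactly 0 is treated the same as an untrained member.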
Table 2. Issue of weighted averaging method for balanced and unbalanced committee member sensitivities to gearbox faults.
| Balanced Member Sensitivities to Gearbox Faults | Committee Member 1 | Committee Member 2 | Average Output Probability (P3) for d3 |
|---|---|---|---|
| Fault d3 | trained | trained | $P_3 = \frac{w_1 \rho_{1\text{-}3} + w_2 \rho_{2\text{-}3}}{w_1 + w_2} \in [0, 1]$; P3 is a reasonable result |
| Output probability for d3 for an unseen case | $\rho_{1\text{-}3} \in [0, 1]$ | $\rho_{2\text{-}3} \in [0, 1]$ | |

| Unbalanced Member Sensitivities to Gearbox Faults | Committee Member 1 | Committee Member 2 | Average Output Probability (P5) for d5 |
|---|---|---|---|
| Fault d5 | Unable to train | trained | $P_5 = \frac{w_1 \rho_{1\text{-}5} + w_2 \rho_{2\text{-}5}}{w_1 + w_2}$; P5 is an unfair/unpredictable result |
| Output probability for d5 for an unseen case | $\rho_{1\text{-}5}$ is unpredictable | $\rho_{2\text{-}5} \in [0, 1]$ | |
Remark: w1 and w2 are weights for Committee members 1 and 2 respectively; P3 and P5 are average output probabilities for d3 and d5 respectively.
In this application, the processed training datasets xk-PTra, are employed to train probabilistic classifiers (PCRVM) respectively. The workflow of the PCM is shown in Figure 4.
Figure 4. Procedure for training probabilistic committee machine.

2.3. Parameter Optimization

The probability vector P = [P1, P2, …, Pd] can be provided to the user as a quantitative measure for reference and further processing. However, human experts generally cannot identify the number of simultaneous-faults directly based on the output probability of each fault. Therefore, a decision threshold (DT) ε is introduced to identify the simultaneous-faults from P such that:
$$y_i = \begin{cases} 1 & \text{if } P_i \geq \varepsilon \\ 0 & \text{otherwise} \end{cases} \tag{13}$$
where $\varepsilon \in [0, 1]$ and 1 denotes that the corresponding fault occurs. For example, given an unseen input x, if P = [0.72, 0.42, 0.51, 0.81, 0.39] and ε = 0.5, then y = DT(P) = [1, 0, 1, 1, 0]. Therefore, the unseen x is diagnosed as a simultaneous-fault for the labels (1, 3, 4).
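The thresholding step and the example above can be reproduced with a short Python sketch. The function name is hypothetical, and treating a probability exactly equal to ε as a fault is an assumption.

```python
def decide(P, eps):
    """Decision threshold: y_i = 1 if P_i >= eps, otherwise 0."""
    return [1 if p >= eps else 0 for p in P]

P = [0.72, 0.42, 0.51, 0.81, 0.39]
y = decide(P, 0.5)
faults = [i + 1 for i, flag in enumerate(y) if flag]  # 1-based fault labels
print(y, faults)  # [1, 0, 1, 1, 0] [1, 3, 4]
```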
Obviously, the weight and the decision threshold are the major factors affecting the classification accuracy. By reviewing the literature [30,31], it is seen that PSO has the same effectiveness as a typical optimization method, genetic algorithms, in finding the global optimal solution, but with better computational efficiency. Hence, PSO is adopted to determine the best weights wopt and decision threshold εopt in this study.

Particle Swarm Optimization

PSO is a population-based optimizer. The population is regarded as a swarm and the individuals are considered as particles. For a z-dimensional search space and a swarm consisting of H particles, the ith particle can be represented by a z-dimensional vector ui = (ui1, ui2, …, uiz), the velocity of this particle by a z-dimensional vector vi = (vi1, vi2, …, viz), and the best previous position encountered by this particle by pi = (pi1, pi2, …, piz). Let g represent the index of the particle that attains the best previous position among all the particles in the swarm. Then, the swarm is manipulated in accordance with the following equations:
$$v_i(j + 1) = W_f\, v_i(j) + q_1 r_1 \left[ p_i(j) - u_i(j) \right] + q_2 r_2 \left[ p_g(j) - u_i(j) \right] \tag{14}$$
$$u_i(j + 1) = u_i(j) + v_i(j + 1) \tag{15}$$
where i = 1, 2, …, H is the particle index, j is the iteration index, Wf is the weight factor, q1 and q2 are positive constants, and r1 and r2 are random numbers drawn from [0, 1]. The selection of the above parameters was presented in [36]. With reference to the literature, Table 3 shows the PSO parameters selected for this case study.
Table 3. PSO parameters.
Number of generations1000
Population size50
Wf0.9
q12
q22
To evaluate the fitness of each iteration, a common evaluation method called F-measure [37] and an objective function described in Section 2.4 are employed. The procedure of the proposed PSO approach is illustrated in Figure 5, which is performed in three steps:
(1) Initializing the parameters of PSO: The candidate weights (w1, w2) and decision threshold are randomly selected from the interval [0, 1].
(2) Calculating the output of F-measure: Following the procedure in Figure 5, the candidate weights and decision threshold are entered into the PCM model and Equation (13), respectively.
(3) Comparing the output of F-measure with the objective function: If the F-measure satisfies the objective function, the corresponding weights and decision threshold are taken as the optimal parameters; otherwise, PSO updates the weights and decision threshold based on Equations (14) and (15), and then repeats Steps 2 and 3. When the search reaches the preset number of generations or satisfies the objective function, the weights and decision threshold that give the highest F-measure are taken as the optimal parameters.
Figure 5. Procedure for optimization of committee member weights and decision threshold.
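The three steps can be sketched in Python as a generic PSO loop with an early stop once the fitness reaches a target. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: `pso_maximize` and `toy_fitness` are hypothetical names, the fitness function stands in for the F-measure of the PCM model, initial velocities are set to zero, and positions are clamped to [0, 1]. The default parameters follow Table 3.

```python
import random


def pso_maximize(fitness, dim, target, generations=1000, swarm_size=50,
                 wf=0.9, q1=2.0, q2=2.0, seed=0):
    """Maximize `fitness` over [0, 1]^dim; stop early once `target` is reached."""
    rng = random.Random(seed)
    # Step 1: random initial positions in [0, 1]; zero velocities (assumption).
    u = [[rng.random() for _ in range(dim)] for _ in range(swarm_size)]
    v = [[0.0] * dim for _ in range(swarm_size)]
    p = [pos[:] for pos in u]              # best position found by each particle
    p_fit = [fitness(pos) for pos in u]    # Step 2: evaluate each candidate
    g = max(range(swarm_size), key=p_fit.__getitem__)  # index of the swarm best

    for _ in range(generations):
        # Step 3: stop as soon as the objective (fitness >= target) is met.
        if p_fit[g] >= target:
            break
        for i in range(swarm_size):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update, Equation (14).
                v[i][k] = (wf * v[i][k]
                           + q1 * r1 * (p[i][k] - u[i][k])
                           + q2 * r2 * (p[g][k] - u[i][k]))
                # Position update, Equation (15), clamped to [0, 1] (assumption).
                u[i][k] = min(1.0, max(0.0, u[i][k] + v[i][k]))
            fit = fitness(u[i])
            if fit > p_fit[i]:
                p[i], p_fit[i] = u[i][:], fit
                if fit > p_fit[g]:
                    g = i
    return p[g], p_fit[g]


def toy_fitness(params):
    """Hypothetical stand-in for the F-measure; peaks at an arbitrary point."""
    optimum = (0.7, 0.6, 0.8)
    return 1.0 - sum((a - b) ** 2 for a, b in zip(params, optimum))


# Search for (w1, w2, epsilon) with a preset target B = 0.9.
best, score = pso_maximize(toy_fitness, dim=3, target=0.9, generations=200)
print(best, score)
```

In the real framework the fitness call would run the committee machine on the validation data and return Fme; if the target is never reached, the loop simply returns the best candidate found within the preset number of generations.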

2.4. Performance Evaluation

The traditional performance evaluation of classifiers only considers exact matching of the decision vector y against the true vector t. This evaluation is however unsuitable for simultaneous-fault diagnosis where partial matching is preferred. F-measure is mostly used as a performance evaluation for information retrieval systems where a document may belong to a single or multiple tags simultaneously, which is very similar to the current study. By using F-measure, the evaluation of both single-fault and simultaneous-fault test cases can be fairly examined. The definition of F-measure is given in Equation (16). The larger the F-measure value, the higher the diagnostic accuracy is:
$$F_{me} = \frac{2 \sum_{j=1}^{d} \sum_{i=1}^{N_t} y_{ij}\, t_{ij}}{\sum_{j=1}^{d} \sum_{i=1}^{N_t} y_{ij} + \sum_{j=1}^{d} \sum_{i=1}^{N_t} t_{ij}} \in [0,\ 1] \tag{16}$$
where yi = [yi1, yi2, …, yid] and ti = [ti1, ti2, …, tid] are the predicted decision vector and the true decision vector respectively, for j = 1 to d and i = 1 to Nt and ∀yij, tij ∈ [0, 1]. Nt is the number of single-fault and simultaneous-fault test patterns. For optimization of the weights and decision threshold, Fme also serves as an important parameter in an objective function. In order to avoid over-fitting to the validation dataset and achieve high diagnostic accuracy, the objective function is specifically defined as:
$$F_{me} \ge B \tag{17}$$
where B is the preset optimal accuracy of F-measure and B lies between 0 and 1. In this study, B is set to be 0.9 as a trial. Figure 6 summarizes the evaluation process for the proposed diagnostic framework.
Figure 6. Evaluation of proposed framework.
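As a minimal sketch of Equation (16), the F-measure can be computed from lists of binary decision vectors as follows. The function name `f_measure` and the toy vectors are illustrative, and returning 0 when both the predictions and the targets contain no faults at all is an assumption, since the formula is undefined in that case.

```python
def f_measure(predicted, true):
    """F-measure of Equation (16) over N_t test patterns with d labels each."""
    if len(predicted) != len(true):
        raise ValueError("predicted and true must contain the same number of patterns")
    overlap = 0  # sum over i, j of y_ij * t_ij
    total = 0    # sum over i, j of y_ij plus sum over i, j of t_ij
    for y, t in zip(predicted, true):
        if len(y) != len(t):
            raise ValueError("each pattern must have the same number of labels")
        overlap += sum(a * b for a, b in zip(y, t))
        total += sum(y) + sum(t)
    if total == 0:
        return 0.0  # Assumption: score an all-zero prediction/target pair as 0.
    return 2 * overlap / total


y_pred = [[1, 0, 1, 1, 0], [0, 1, 0, 0, 0]]
y_true = [[1, 0, 0, 1, 0], [0, 1, 0, 0, 0]]
score = f_measure(y_pred, y_true)
print(round(score, 4))  # 0.8571, i.e. 6/7
print(score >= 0.9)     # False: the objective with B = 0.9 is not met
```

The first pattern is only a partial match (one extra fault predicted), yet it still contributes to the score, which is the property that makes the measure suitable for simultaneous-fault cases.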

3. Experimental Setup and Data Preprocessing

To verify the effectiveness of the proposed framework, experiments were carried out. The details of the experimental setup are presented in the following subsections. All the proposed methods were implemented in MATLAB R2008a and executed on a computer with a Core 2 Duo E6750 @ 2.13 GHz and 4 GB RAM.

3.1. Test Rig and Sample Data Acquisition

The experiments were performed on the test rig shown in Figure 7, which can simulate most of the faults in a gearbox. In this study, some common gearbox faults, including gear faults, bearing faults, and structural faults, are introduced. The gear faults include a broken tooth with whole tooth damage, a chipped tooth with 1/4 tooth damage, and a gear crack with a 5 mm crack on the tooth face, whereas the bearing faults include medium wear on the rolling elements and outer races. The structural faults comprise unbalance, looseness, and misalignment, which are simulated by adding one eccentric mass on the output shaft, unfastening some screws of the gearbox, and adjusting the height of the gearbox with shims, respectively. In the test rig, the signal acquisition module (NI 9234) with accelerometers and a microphone acquires the vibration and sound signals, respectively. The accelerometer is used to record the vibration signals along the vertical direction. A total of 12 cases, comprising the eight single-faults and four simultaneous-faults described in Table 4, are simulated in the test rig in order to generate the sample training and test datasets. According to practical experience, a machine cannot be operated if there are too many faults at the same time. Therefore, the simultaneous-faults considered in this case study are an experimental selection. In addition, the relationship between simulated faults and signal types is presented in Table 5, which shows that one kind of signal can only detect a limited number of faults. For example, previous experiments have found that the vertical vibration signal cannot be used to detect d4 and d5 because the loading on the tapered roller bearing along the vertical direction is insignificant. Moreover, the sound signal is relatively unaffected by structural resonance [38], so the structural failures (d1, d2 and d3) cannot be easily detected using the sound signal.
To extend the number of detectable faults and enhance the reliability of the fault diagnostic system, the vibration and sound signals are therefore simultaneously employed to diagnose the simultaneous-faults in the gearbox.
Figure 7. Collection of fault patterns from a rotating machinery.
Table 4. Description of single-faults and simultaneous-faults.
| Case No. | Single-Faults | Case No. | Simultaneous-Faults |
|---|---|---|---|
| d1 | Unbalance | si9 | Broken gear tooth & Chipped tooth |
| d2 | Looseness | | |
| d3 | Mechanical misalignment | si10 | Chipped tooth & Bearing with worn outer race |
| d4 | Bearing with worn rolling elements | | |
| d5 | Bearing with worn outer race | si11 | Broken gear tooth & Bearing with worn rolling elements |
| d6 | Broken gear tooth | | |
| d7 | Gear crack | si12 | Bearing with worn rolling elements & Bearing with worn outer race |
| d8 | Chipped tooth | | |
Table 5. Relationship of single-faults and signal types.
| Signal type | d1 | d2 | d3 | d4 | d5 | d6 | d7 | d8 |
|---|---|---|---|---|---|---|---|---|
| Vertical vibration | | | | | | | | |
| Sound | | | | | | | | |
To construct and test the proposed diagnostic framework, the samples for each single fault and simultaneous fault were repeated 200 times under two testing conditions (800 rpm and 1500 rpm). Each time, 1 s of raw signal, including the vibration and sound signals, was simultaneously recorded with a sampling rate of 25.6 kHz. In other words, one case of each type of signal has 25,600 sampling data points. For each type of signal xk (k = 1, 2), there are 1600 single-fault sample data (i.e., eight kinds of single faults × 200 samples) and 800 simultaneous-fault sample data (i.e., four kinds of simultaneous faults × 200 samples). In order to evaluate the diagnostic performance for both single faults and simultaneous faults, the sample data are divided into different subsets as shown in Table 6.
Table 6. Division of sample dataset into different subsets.
| | Type of Dataset | Single-Faults (1600) | Simultaneous-Faults (800) |
|---|---|---|---|
| Raw sample data (xk) | Validation dataset | Dk-Val (800) | Dk-Val (600) |
| | Training dataset | Dk-Tra (600) | |
| | Test dataset | Dk-Tes (200) | Dk-Tes (200) |
| After feature extraction | Validation dataset | Dk-PVal (800) | Dk-PVal (600) |
| | Training dataset | Dk-PTra (600) | |
| | Test dataset | Dk-PTes (200) | Dk-PTes (200) |

3.2. Data Processing and Signal De-Noising in Case Study

In order to obtain the feature vector, the IMF energy pattern based on HHT is calculated with the following steps: (1) signal de-noising; (2) IMF component selection; and (3) IMF energy pattern calculation.
(1) Signal de-noising. In the signal de-noising phase, the mother wavelet and the level of decomposition L are selected according to a trial-and-error method. In this case study, four Daubechies wavelets (Db3, Db4, Db5, and Db6) are tried and the range of L is set from 3 to 5. Moreover, the soft threshold T is equal to 4.476 according to the equation $T = \sqrt{2 \log(\mathrm{length}(x(t)))}$. The effectiveness of de-noising using Db wavelets is verified by using the signal-to-noise ratio (SNR), which is given as follows:
$$\mathrm{SNR} = 10 \times \log_{10}\left( \frac{S_\sigma}{N_\sigma} \right)$$
where $S_\sigma$ and $N_\sigma$ are the standard deviations of the de-noised signal and the noise signal, respectively. A large value of SNR means more noise is eliminated. Considering the sound signal of d6 as an example, the de-noised result is shown in Table 7. It demonstrates that the SNR of Db5 with Level 3 is the highest, so it is suitable to de-noise the signal.
Table 7. Signal to noise ratio under different combinations of Db wavelets.
| Wavelet | Level 3 | Level 4 | Level 5 |
|---|---|---|---|
| Db3 | 12.689 dB | 11.041 dB | 10.191 dB |
| Db4 | 12.690 dB | 11.090 dB | 10.207 dB |
| Db5 | 12.847 dB | 11.126 dB | 10.271 dB |
| Db6 | 12.720 dB | 11.118 dB | 10.272 dB |
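The two numerical ingredients of this step, soft thresholding of wavelet coefficients and the SNR measure, can be sketched in plain Python. The wavelet decomposition itself is omitted, the function names and sample values are hypothetical, and two assumptions are made: the population standard deviation is used, and the noise signal is whatever series the caller supplies (for instance, the raw signal minus the de-noised one).

```python
import math
import statistics


def soft_threshold(coefficients, threshold):
    """Shrink each coefficient toward zero by `threshold`; zero out small ones."""
    shrunk = []
    for c in coefficients:
        magnitude = abs(c) - threshold
        if magnitude <= 0.0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk


def snr_db(denoised, noise):
    """SNR = 10 * log10(S_sigma / N_sigma), using population standard deviations."""
    return 10.0 * math.log10(statistics.pstdev(denoised) / statistics.pstdev(noise))


# Hypothetical detail coefficients, shrunk with the threshold T = 4.476 from the text.
print(soft_threshold([6.0, -5.0, 1.0, -0.5], 4.476))

# Hypothetical de-noised and noise series whose standard deviations differ by 10x.
print(snr_db([2.0, -2.0, 2.0, -2.0], [0.2, -0.2, 0.2, -0.2]))
```

Repeating the SNR computation for each wavelet and decomposition level is what produces a comparison grid such as Table 7.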
(2) IMF component selection. After de-noising the signals, the IMFs of all de-noised signals are calculated by using EEMD, in which the ensemble number and white noise amplitude are set to 100 and 0.3 times the standard deviation of the investigated signal, respectively [33]. In this case study, EEMD decomposes the de-noised sound signal into ten IMFs and a residual signal. To select the proper number of IMFs, the correlation coefficient method [13] is used. The correlation coefficient between an IMF component Ii(t) and its de-noised signal x(t)’ can be defined as:
$$Coe_{x(t)',\, I_i(t)} = \frac{\sum_{i=1}^{M} \left( x(t)' - \bar{x} \right) \left( I_i(t) - \bar{I_i} \right)}{\sqrt{\sum_{i=1}^{M} \left( x(t)' - \bar{x} \right)^2 \sum_{i=1}^{M} \left( I_i(t) - \bar{I_i} \right)^2}}$$
where $\bar{x}$ and $\bar{I_i}$ are the mean values of x(t)’ and Ii(t), respectively, and M is the number of IMFs. A large $Coe_{x(t)',\, I_i(t)}$ value means a high correlation between Ii(t) and x(t)’, and also implies that Ii(t) contains more fault information. A set of correlation coefficients for the de-noised sound signal of d6 is presented in Table 8 as a demonstration, in which the correlation coefficient of IMF I10 is obviously smaller than the others. Thus, only the IMFs from levels 1–9 are considered to extract the energy pattern in this case study.
Table 8. Correlation coefficients of each IMF component for an example of de-noised signal of d6.
| IMF component (de-noised sound of d6) | I1 | I2 | I3 | I4 | I5 | I6 | I7 | I8 | I9 | I10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Correlation coefficient | 0.2054 | 0.2089 | 0.2132 | 0.2375 | 0.2489 | 0.3475 | 0.3134 | 0.2876 | 0.2273 | 0.0274 |
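A small Python sketch of the selection step is given below. It computes the usual Pearson correlation by summing over the samples of the two series, then keeps the IMFs whose coefficient passes a cutoff. The function names are hypothetical, and the cutoff of 0.1 is an assumption chosen only for illustration: the text identifies I10 as the outlier by inspection and states no numeric cutoff.

```python
import math


def correlation(x, imf):
    """Pearson correlation between a de-noised signal and one IMF."""
    if len(x) != len(imf):
        raise ValueError("signal and IMF must have the same length")
    x_mean = sum(x) / len(x)
    imf_mean = sum(imf) / len(imf)
    numerator = sum((a - x_mean) * (b - imf_mean) for a, b in zip(x, imf))
    denominator = math.sqrt(sum((a - x_mean) ** 2 for a in x)
                            * sum((b - imf_mean) ** 2 for b in imf))
    return numerator / denominator


def select_imfs(coefficients, cutoff):
    """Return the 1-based indices of IMFs whose coefficient is at least `cutoff`."""
    return [i for i, c in enumerate(coefficients, start=1) if c >= cutoff]


# Correlation coefficients of I1..I10 from Table 8.
table8 = [0.2054, 0.2089, 0.2132, 0.2375, 0.2489,
          0.3475, 0.3134, 0.2876, 0.2273, 0.0274]
print(select_imfs(table8, cutoff=0.1))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```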
(3) IMF energy pattern calculation. In this case study, the energy patterns of selected IMFs are considered to extract the fault features. The energy of the ith IMF, Ei, can be calculated by using the following equation:
$$E_i = \sum_{j=1}^{n} \left[ (j \cdot \Delta t)\, \left| I_i(j \cdot \Delta t) \right|^2 \right]$$
where $\Delta t$ is the time interval, n and j are the total number and index of data points respectively, and $I_i(j \cdot \Delta t)$ denotes the decomposition coefficient of the ith IMF at the moment $j \cdot \Delta t$. A nine-dimensional energy feature vector is extracted as E = [E1, E2, …, E9]. Furthermore, under different fault conditions, the HHT marginal spectra show various maximum values and corresponding frequencies in the patterns. To enrich the fault information, the maximum amplitude of a marginal spectrum of HHT, Am, and its corresponding frequency, fm, are added to the feature vector E. Therefore, the extracted feature vector is extended to an eleven-dimensional vector, which can be rewritten as E = [E1, E2, …, E9, Am, fm]. The procedure of data processing is illustrated in Figure 8.
Figure 8. Flowchart of proposed feature extraction approach.
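The energy pattern and the eleven-dimensional feature vector can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the IMFs are assumed to be available as lists of samples, and Am and fm are assumed to have been obtained separately from the HHT marginal spectrum. For a 25.6 kHz sampling rate the time interval would be Δt = 1/25600 s; the toy call below uses Δt = 0.5 so that the arithmetic is easy to follow.

```python
def imf_energy(imf, dt):
    """Time-weighted energy: sum over j = 1..n of (j * dt) * |I_i(j * dt)|^2."""
    return sum((j * dt) * abs(value) ** 2 for j, value in enumerate(imf, start=1))


def feature_vector(imfs, a_max, f_max, dt):
    """Build [E_1, ..., E_9, A_m, f_m] from the nine selected IMFs."""
    if len(imfs) != 9:
        raise ValueError("expected the nine selected IMFs")
    return [imf_energy(imf, dt) for imf in imfs] + [a_max, f_max]


# Toy IMF with two samples: 0.5 * 1^2 + 1.0 * 2^2 = 4.5
print(imf_energy([1.0, 2.0], dt=0.5))  # 4.5

# Nine identical toy IMFs plus hypothetical marginal-spectrum values.
features = feature_vector([[1.0, 2.0]] * 9, a_max=3.0, f_max=120.0, dt=0.5)
print(len(features))  # 11
```

The factor j·Δt weights later samples more heavily, which is how the pattern reflects the distribution of energy over time as well as its total amount.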

4. Experimental Results and Discussion

4.1. Performance of Various Combinations of Feature Extraction Techniques

In the experiments, two typical feature extraction methods, fast Fourier transform (FFT) and wavelet package transform with principal component analysis (WPT + PCA), are compared with HHT. These methods require some settings. For the wavelet package transform (WPT), the Daubechies wavelet is employed because it is the most popular one; Db4 with level 4 decomposition was selected after many trials. In addition, two classification techniques, PCPNN and PCRVM, are compared with the proposed PCM framework. PCPNN and PCRVM each require one hyper-parameter of the kernel function to be defined: the spread S* and the width W*, respectively. Because PCRVM is employed as a committee member of PCM, PCM and PCRVM share the same hyper-parameter width W*. By using a trial-and-error method, S* and W* are set to 0.3 and 0.64, respectively.
After determining the configurations of the feature extraction and classification techniques, the reasonable combinations of feature extraction techniques are tested as shown in Figure 9, in which the weight of each committee member and decision threshold are predefined as 1 (i.e., w1 = w2 = 1) and 0.5 respectively. Note that PCPNN and PCRVM determine their F-measures by combining all the features extracted from vibration and sound signals as their input vectors, whereas PCM employs two PCRVM committee members to analyze the respective extracted features.
Figure 9 illustrates that the feature extraction techniques are effective. Taking the proposed PCM framework as an example, the feature extraction techniques FFT, WPT + PCA, and Hilbert-Huang transform + energy pattern (HHT + E) give 14.12%, 18.18%, and 21.48% improvement, respectively, as compared with the method without any feature extraction. With PCPNN and PCRVM as classifiers, the feature extraction methods also improve the diagnostic accuracy by between 16.06% and 21.16% as compared with the method without feature extraction. Note that the classifiers are constructed using only a training set of single-fault patterns, while the performance is evaluated using simultaneous-fault test patterns. Figure 9 also indicates that HHT + E gives the best performance regardless of which classification technique is used. The reason is that extracting the energy from HHT can reflect not only the energy amount of each IMF, but also how the energy distribution of each IMF changes with time, which can provide more information on the faulted components. This result also verifies that the proposed feature extraction technique (HHT + E) is effective in extracting the features of single-faults from simultaneous-fault patterns of the gearbox.
Figure 9. Diagnostic accuracies of different combinations of feature extraction techniques.

4.2. Result and Discussion of Optimization Approach

After selecting HHT + E as the feature extraction technique, the extracted features are employed to construct and train the committee machine. Then, PSO and Equations (16) and (17) are employed to determine the best weight wopt for each committee member and the decision threshold εopt. The optimized weights and threshold, as well as their corresponding Fme, are shown in Table 9, in which the optimal weight for the first committee member w1 (0.7752) is higher than that of w2. In other words, the committee member trained by the vertical vibration signal has a greater impact on the simultaneous-fault diagnosis. The main reason is that the sound signal is easily corrupted by background noise, so PSO assigns a greater weight to the first committee member in order to make the output satisfy the objective function. Table 9 also illustrates that the proposed optimization framework can improve the diagnostic accuracy by 3.82% as compared with the empirical decision threshold of 0.5 and identical weights (w1 = w2 = 1) under the same feature extraction technique and simultaneous-fault test dataset. This means that the proposed optimization framework is effective.
Table 9. Selection of optimal weights and decision threshold using PSO.
| Classifier | No. of Features | Optimization Method | Decision Threshold | Weights | Fme Based on Simultaneous-Fault Test Dataset |
|---|---|---|---|---|---|
| PCM | Vibration = 11, Sound = 11 | - | 0.5 | w1 = 1, w2 = 1 | 0.7890 |
| PCM | Vibration = 11, Sound = 11 | PSO | 0.7583 | w1 = 0.7752, w2 = 0.6991 | 0.8272 |

Remark: Feature extraction method is based on HHT + E.

4.3. Overall Evaluation of Proposed Framework

To verify the effectiveness of the proposed PCM diagnostic framework, the aforesaid two single probabilistic classifiers are compared with the proposed framework based on the optimal weights and decision threshold obtained by PSO. The experimental result of F-measure is shown in Table 10. Compared with PCPNN and PCRVM, the training time and average fault detection time of PCM are the longest, 36.189 s and 17.8574 s, respectively, while the result shows the diagnostic accuracy of PCM outperforms PCPNN and PCRVM by 5.24% and 4.18% respectively under the same test dataset of simultaneous-faults. Note that the training time of PCM is only based on the training dataset of single fault patterns; the average fault detection time of PCM relies on calculating the average time of test datasets of single, simultaneous and overall faults. Table 10 also reveals the proposed framework achieves the best accuracy for single faults (94.60%) and overall faults (89.24%) which include both single and simultaneous fault patterns. The main reason is that the committee members in the proposed framework are trained with different types of signals. In this way, each committee member becomes different from each other, which can improve the classification accuracy of the ensemble. For example, considering an ensemble of k trained classifiers [C1, C2, ..., Ck], if the classifiers are trained using different subsets and their errors are uncorrelated, then even when Ci is wrong, most of the other classifiers Cj (where i ≠ j) may still be correct.
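The uncorrelated-error argument can be made concrete with a short calculation. The sketch below assumes k fully independent classifiers that are each correct with the same probability p and that are combined by simple majority vote; this is a textbook simplification for illustration, not the weighted probabilistic combination used by PCM, and the function name is hypothetical.

```python
from math import comb


def majority_correct_probability(k, p):
    """Probability that more than half of k independent classifiers are correct."""
    needed = k // 2 + 1
    return sum(comb(k, m) * p ** m * (1 - p) ** (k - m)
               for m in range(needed, k + 1))


# Three independent classifiers that are each right 80% of the time:
# 0.8^3 + 3 * 0.8^2 * 0.2 = 0.896
print(round(majority_correct_probability(3, 0.8), 3))  # 0.896
```

Under these idealized assumptions the ensemble is right more often than any single member, and the benefit shrinks as the members' errors become correlated.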
In a nutshell, the proposed framework is an effective approach to detect the simultaneous-faults without costly simultaneous-fault training patterns. Moreover, the proposed method employs vibration and sound signals to train the diverse committee members, which can ensure the diagnostic result to be more reliable and accurate. Therefore, it can be concluded that the proposed framework is an effective technique to overcome both challenges in fault diagnosis of the gearbox.
Table 10. Evaluation result of PCM, PCPNN and PCRVM.
| Classifier | Feature Number | Decision Threshold | Optimal Weight | Single-Faults (Fme) | Simultaneous-Faults (Fme) | Overall Faults (Fme) | Average Fault Detection Time (s) |
|---|---|---|---|---|---|---|---|
| PCPNN | 11 + 11 = 22 | 0.6830 | - | 0.9163 | 0.7717 | 0.8563 | 8.8014 |
| PCRVM | 11 + 11 = 22 | 0.6754 | - | 0.9141 | 0.7823 | 0.8642 | 9.7685 |
| PCM | Vibration = 11, Sound = 11 | 0.7583 | w1 = 0.7752, w2 = 0.6991 | 0.9460 | 0.8241 | 0.8924 | 17.8574 |

Remark: Feature extraction method is based on HHT + E.

5. Conclusions

In this paper, a new framework, which combines signal de-noising, feature extraction, a probabilistic committee machine, parameter optimization and F-measure, has been successfully developed to overcome the challenges of simultaneous-fault diagnosis and multiple signal analysis in a gearbox. In consideration of the features of vibration and sound signals in this application, DWT and HHT + E are used for signal de-noising and feature extraction, respectively, so that the diagnostic system can effectively capture the single-fault components from the noise-polluted simultaneous-fault patterns. This implies that the acquisition of a large amount of simultaneous-fault signals can be avoided. Moreover, PSO is effective for optimizing the weight of each committee member and the decision threshold in the PCM framework. To verify the effectiveness of the proposed probabilistic committee machine and make a comparison, the single probabilistic classifiers, PCPNN and PCRVM, are also employed to diagnose the simultaneous faults. Although the results show that those machine learning methods can diagnose the simultaneous faults in the gearbox, the proposed PCM framework is found to be superior to the single classifiers. Therefore, the proposed PCM framework is suitable for detecting simultaneous faults in the gearbox.
In practice, most mechanical faults can be diagnosed by analyzing vibrations, sounds, currents, oil debris and temperature signals. As the number and type of committee members in the proposed framework can be adjusted by the user, the proposed framework can be applied to other similar diagnostic applications.

Acknowledgments

The authors would like to thank the financial support from the University of Macau, grant Numbers: MYRG2014-00178-FST, MYRG079(Y1-L2)-FST13-YZX, and MYRG2015-00077-FST. The authors would also like to thank the support from Yueqiao Chen.

Author Contributions

Pak Kin Wong and Jian-Hua Zhong conceived and designed the experiments; Jian-Hua Zhong performed the experiments; Jian-Hua Zhong and Zhi-Xin Yang analyzed the data; Zhi-Xin Yang contributed reagents/materials/analysis tools; Pak Kin Wong and Jian-Hua Zhong wrote the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hu, Q.; He, Z.; Zhang, Z.; Zi, Y. Fault diagnosis of rotating machinery based on improved wavelet package transform and SVMs ensemble. Mech. Syst. Signal Process. 2007, 21, 688–705.
  2. Lei, Y.; He, Z.; Zi, Y.; Hu, Q. Fault diagnosis of rotating machinery based on multiple ANFIS combination with GAs. Mech. Syst. Signal Process. 2007, 21, 2280–2294.
  3. Sanz, J.; Perera, R.; Huerta, C. Fault diagnosis of rotating machinery based on auto-associative neural networks and wavelet transforms. J. Sound Vib. 2007, 302, 981–999.
  4. Widodo, A.; Yang, B.-S. Application of nonlinear feature extraction and support vector machines for fault diagnosis of induction motors. Expert Syst. Appl. 2007, 33, 241–250.
  5. Widodo, A.; Yang, B.S. Support vector machine in machine condition monitoring and fault diagnosis. Mech. Syst. Signal Process. 2007, 21, 2560–2574.
  6. Wong, P.K.; Yang, Z.; Vong, C.M.; Zhong, J. Real-time fault diagnosis for gas turbine generator systems using extreme learning machine. Neurocomputing 2014, 128, 249–257.
  7. Santos, P.; Villa, L.F.; Reñones, A.; Bustillo, A.; Maudes, J. An SVM-based solution for fault detection in wind turbines. Sensors 2015, 15, 5627–5648.
  8. Vong, C.M.; Wong, P.K.; Ip, W.F. A new framework of simultaneous-fault diagnosis using pairwise probabilistic multi-label classification for time-dependent patterns. IEEE Trans. Ind. Electron. 2013, 60, 3372–3385.
  9. Yang, Z.; Wong, P.K.; Vong, C.M.; Zhong, J.; Liang, J. Simultaneous-fault diagnosis of gas turbine generator systems using a pairwise-coupled probabilistic classifier. Math. Probl. Eng. 2013, 2013.
  10. Yélamos, I.; Escudero, G.; Graells, M.; Puigjaner, L. Simultaneous fault diagnosis in chemical plants using support vector machines. In Computer Aided Chemical Engineering; Valentin, P., Paul Şerban, A., Eds.; Elsevier: Philadelphia, PA, USA, 2007; Volume 24, pp. 1253–1258.
  11. Wang, Y.S.; Lee, C.M.; Kim, D.G.; Xu, Y. Sound-quality prediction for nonstationary vehicle interior noise based on wavelet pre-processing neural network model. J. Sound Vib. 2007, 299, 933–947.
  12. Ahn, J.H.; Kwak, D.H.; Koh, B.H. Fault detection of a roller-bearing system through the EMD of a wavelet denoised signal. Sensors 2014, 14, 15022–15038.
  13. Wang, Y.; Ma, Q.; Zhu, Q.; Liu, X.; Zhao, L. An intelligent approach for engine fault diagnosis based on Hilbert–Huang transform and support vector machine. Appl. Acoust. 2014, 75, 1–9.
  14. Soualhi, A.; Medjaher, K.; Zerhouni, N. Bearing health monitoring based on Hilbert–Huang transform, support vector machine, and regression. IEEE Trans. Instrum. Meas. 2015, 64, 52–62.
  15. Jiang, L.L.; Li, B.B.; Li, X.J. An improved HHT method and its application in fault diagnosis of roller bearing. Appl. Mech. Mater. 2013, 273, 264–268.
  16. Wu, J.D.; Liu, C.H. Investigation of engine fault diagnosis using discrete wavelet transform and neural network. Expert Syst. Appl. 2008, 35, 1200–1213.
  17. Wu, J.D.; Chan, J.J. Faulted gear identification of a rotating machinery based on wavelet transform and artificial neural network. Expert Syst. Appl. 2009, 36, 8862–8875.
  18. Loutas, T.H.; Sotiriades, G.; Kalaitzoglou, I.; Kostopoulos, V. Condition monitoring of a single-stage gearbox with artificially induced gear cracks utilizing on-line vibration and acoustic emission measurements. Appl. Acoust. 2009, 70, 1148–1159.
  19. Yang, Y.; Yu, D.; Cheng, J. A fault diagnosis approach for roller bearing based on IMF envelope spectrum and SVM. Measurement 2007, 40, 943–950.
  20. Cerrada, M.; Sánchez, R.V.; Cabrera, D.; Zurita, G.; Li, C. Multi-stage feature selection by using genetic algorithms for fault diagnosis in gearboxes based on vibration signal. Sensors 2015, 15, 23903–23926.
  21. Wong, P.K.; Zhong, J.; Yang, Z.; Vong, C.M. Sparse Bayesian extreme learning committee machine for engine simultaneous fault diagnosis. Neurocomputing 2016, 174, 331–343.
  22. Tresp, V. A Bayesian committee machine. Neural Comput. 2000, 12, 2719–2741.
  23. Chen, S.; Wang, W.; van Zuylen, H. Construct support vector machine ensemble to detect traffic incident. Expert Syst. Appl. 2009, 36, 10976–10986.
  24. Hansen, L.K.; Salamon, P. Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 993–1001.
  25. Wu, J.D.; Chiang, P.H.; Chang, Y.W.; Shiao, Y.J. An expert system for fault diagnosis in internal combustion engines using probability neural network. Expert Syst. Appl. 2008, 34, 2704–2713.
  26. Wang, C.; Zhou, J.; Qin, H.; Li, C.; Zhang, Y. Fault diagnosis based on pulse coupled neural network and probability neural network. Expert Syst. Appl. 2011, 38, 14307–14313.
  27. Wang, G.; Yang, Y.; Xie, Q.; Zhang, Y. Force based tool wear monitoring system for milling process based on relevance vector machine. Adv. Eng. Softw. 2014, 71, 46–51.
  28. Zio, E.; Di Maio, F. Fatigue crack growth estimation by relevance vector machine. Expert Syst. Appl. 2012, 39, 10681–10692.
  29. Wu, T.F.; Lin, C.J.; Weng, R.C. Probability estimates for multi-class classification by pairwise coupling. J. Mach. Learn. Res. 2004, 5, 975–1005.
  30. Robinson, J.; Rahmat-Samii, Y. Particle swarm optimization in electromagnetics. IEEE Trans. Antennas Propag. 2004, 52, 397–407.
  31. Trelea, I.C. The particle swarm optimization algorithm: Convergence analysis and parameter selection. Inf. Process. Lett. 2003, 85, 317–325.
  32. Wu, Z.; Huang, N.E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
  33. Lei, Y.; He, Z.; Zi, Y. Application of the EEMD method to rotor fault diagnosis of rotating machinery. Mech. Syst. Signal Process. 2009, 23, 1327–1338.
  34. Widodo, A.; Yang, B.-S. Application of relevance vector machine and survival probability to machine degradation assessment. Expert Syst. Appl. 2011, 38, 2592–2599.
  35. Hastie, T.; Tibshirani, R. Classification by pairwise coupling. Ann. Stat. 1998, 26, 451–471.
  36. Wong, P.K.; Tam, L.M.; Li, K.; Vong, C.M. Engine idle-speed system modelling and control optimization using artificial intelligence. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2010, 224, 55–72.
  37. Hripcsak, G.; Rothschild, A.S. Agreement, the F-measure, and reliability in information retrieval. J. Am. Med. Inform. Assoc. 2005, 12, 296–298.
  38. Qu, Y.; He, D.; Yoon, J.; van Hecke, B.; Bechhoefer, E.; Zhu, J. Gearbox tooth cut fault diagnostics using acoustic emission and vibration sensors—A comparative study. Sensors 2014, 14, 1372–1393.

Share and Cite

MDPI and ACS Style

Zhong, J.-H.; Wong, P.K.; Yang, Z.-X. Simultaneous-Fault Diagnosis of Gearboxes Using Probabilistic Committee Machine. Sensors 2016, 16, 185. https://doi.org/10.3390/s16020185
