Intelligent Gearbox Diagnosis Methods Based on SVM, Wavelet Lifting and RBR

Intelligent gearbox diagnosis is hampered by the difficulty of obtaining the desired information and a sufficiently large sample of fault data. We therefore propose a gearbox fault diagnosis method that combines wavelet lifting, a support vector machine (SVM) and rule-based reasoning (RBR). In a complex field environment, machines rarely exhibit the same fault, and the fault features also vary; an SVM is therefore used for the initial diagnosis. First, gearbox vibration signals are processed with wavelet packet decomposition, and the signal energy coefficients of each frequency band are extracted and used as input feature vectors to the SVM for recognition of normal and faulty patterns. Second, precise analysis using wavelet lifting filters out noise while preserving the impulse characteristics of the fault, and thus effectively extracts the fault frequency of the machine. Lastly, a knowledge base built from field rules summarized by experts is used to identify the detailed fault type. The results show that the SVM is a powerful tool for gearbox fault pattern recognition when the sample size is small, that the wavelet lifting scheme effectively extracts fault features, and that rule-based reasoning can identify the detailed fault type. A method that combines SVM, wavelet lifting and rule-based reasoning therefore provides effective gearbox fault diagnosis.

For example, the number of support vectors grows linearly with the number of training samples, which may lead to overfitting and time-consuming computation; probabilistic predictions cannot be obtained with an SVM; and users of an SVM must supply an error parameter, which significantly influences the results. Unfortunately, the choice of this parameter is highly subjective, and many candidate values have to be tried in order to find the best result. Moreover, the kernel function of an SVM must fulfill Mercer's condition [4].
It is well known that the bottleneck of fault diagnosis is a lack of fault samples, which gives the SVM a promising future in machine fault diagnosis. Jack used an SVM to detect the rolling bearing condition [5]; he also optimized the SVM parameters with a genetic algorithm and achieved good generalization [6]. Thukaram et al. compared neural networks and SVMs in recognizing faults, and demonstrated the advantages of the SVM in situations with a small sample size. Nonetheless, most studies are still limited to laboratory tests, and there are few practical applications of SVMs in intelligent fault diagnosis systems. More research and field tests are required before SVMs can be applied in practice, and this study investigates further in that direction.
The wavelet transform has been a breakthrough in signal processing technology over the past two decades [7]. Although the wavelet lifting algorithm was proposed only recently, it has already been applied successfully in many fields. Calderbank, Daubechies and Sweldens et al. applied wavelet lifting analysis to image compression and achieved better compression than first-generation wavelet analysis [8,9]. Howlet and Nguyen applied the wavelet lifting transform to audio signal analysis and reduced the Shannon entropy of the signal by approximately 6% [10]. MIT investigators Sudarshan et al. combined the wavelet lifting transform with finite element analysis and proposed a novel multiresolution finite element method. In fault diagnosis, application of the wavelet lifting transform has only just begun. Based on Claypoole's self-adaptive wavelet transformation, Samuel and Pines at the University of Maryland developed a new method that uses wavelet lifting combined with matching pursuit to extract gear fault features, which has led to satisfactory results in helicopter transmission fault diagnostics [11]. Zhengjia He and Chendong Duan et al. at Xi'an Jiaotong University have also done considerable research in this field. They have derived several construction methods for wavelet lifting and obtained excellent results in signal processing, time-frequency analysis and feature extraction when combining wavelet lifting with other methods [8,12].
Rule-based reasoning (RBR) is a traditional intelligent diagnosis method. Experience and knowledge are represented in the form of rules stored in a knowledge base, and a reasoning mechanism applies these rules to reach the diagnosis conclusions. Considering the engine wear process, Peilin Zhang, Bing Li and Shubao Liang et al. established fuzzy rules based on typical wear faults for certain engines. They introduced a symmetric fuzzy cross-entropy method for fault reasoning and established a model of engine fault diagnosis that combines symmetric fuzzy cross-entropy with rule-based reasoning [13]. In order to perform flexible, rapid and precise case adaptation in a case-based reasoning design system, Xin Song, Wei Guo and Zhiyong Wang proposed a case adaptation mechanism based on regression analysis and rule-based reasoning [14].
This study presents a method that combines wavelet lifting, an SVM and rule-based reasoning to diagnose gearbox faults. Gearbox vibration signals are first processed by wavelet packet decomposition. Then, the energy coefficients of each frequency band are calculated and used as input vectors to the SVM to recognize normal and faulty gearbox patterns. Precise analysis with the wavelet lifting scheme is then used to obtain the fault feature frequency of the machine. Finally, based on the fault feature frequency, existing diagnostic knowledge and rules are organized into a knowledge base and used for logical reasoning to identify fault types. The diagnosis scheme based on an SVM, wavelet lifting and rule-based reasoning is shown in Figure 1.

Principle of SVM
A support vector machine is based on structural risk minimization. Its algorithm was initially designed for two-class classification. In the field of machine faults, an SVM can therefore directly determine whether or not there is a fault.
The SVM method is developed from the optimal separating hyperplane for the linearly separable case. The optimal separating hyperplane not only classifies all training samples correctly, but also maximizes the distance between the hyperplane and the training points that are closest to it.
The fault training sample set is $(x_i, y_i)$, $i = 1, \ldots, n$, $x_i \in R^d$, $y_i \in \{+1, -1\}$. The optimal separating hyperplane $w \cdot x + b = 0$ is found by solving

$$\min_{w, b} \ \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i (w \cdot x_i + b) \ge 1, \quad i = 1, \ldots, n.$$

The solution of this optimization problem is the saddle point of the Lagrange function, and the optimal discriminant function is obtained as:

$$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{n} \alpha_i^{*} y_i (x_i \cdot x) + b^{*} \right)$$

where $\alpha_i^{*}$ are the optimal Lagrange multipliers and $b^{*}$ is the classification threshold. Nonlinear problems can be converted to high-dimensional linear problems with a nonlinear transformation. In the high-dimensional space, only the inner product needs to be computed, and it can be obtained with functions defined in the original low-dimensional space. According to the relevant principles of functional analysis, if a function $K(x_i, x)$ fulfills Mercer's condition, it corresponds to the inner product in some transformed space. Such functions are called kernel functions, and the optimal discriminant function in this situation becomes:

$$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{n} \alpha_i^{*} y_i K(x_i, x) + b^{*} \right)$$

The kernel functions commonly used are the RBF kernel, the MLP kernel and the polynomial kernel.
However, the RBF kernel, described as $K(x_i, x) = \exp\left(-\|x - x_i\|^2 / \sigma^2\right)$, is used most widely. It is described in detail in reference [16]. Because the RBF kernel performs better in recognition than the MLP kernel or the polynomial kernel, and because the SVM algorithm has higher recognition accuracy and is more suitable than a BP neural network for small sample data sets [15,16], this study employs the RBF kernel in the LS-SVM toolbox to diagnose the fault.
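As an illustration of how a kernel discriminant function of this kind is evaluated, the following minimal Python sketch computes sgn(Σ α_i y_i K(x_i, x) + b) with an RBF kernel. It is not the LS-SVM toolbox used in this study: the kernel form exp(-||x - x_i||^2 / σ^2), the function names, and the support vectors, multipliers, bias and σ are all hypothetical, hand-picked values chosen only to make the example runnable, and no training is performed.

```python
import math


def rbf_kernel(x, z, sigma):
    """RBF kernel, assumed form K(x, z) = exp(-||x - z||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sigma ** 2)


def svm_decision(x, support_vectors, labels, alphas, bias, sigma):
    """Evaluate sign(sum_i alpha_i * y_i * K(x_i, x) + b) for one sample x."""
    score = sum(
        a * y * rbf_kernel(sv, x, sigma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )
    score += bias
    return 1 if score >= 0 else -1


# Hypothetical, hand-picked model (not trained, not taken from the paper).
support_vectors = [(0.8, 0.2), (0.2, 0.8)]  # two-element feature vectors
labels = [+1, -1]                           # +1 = normal, -1 = fault (assumed coding)
alphas = [1.0, 1.0]                         # Lagrange multipliers
bias = 0.0
sigma = 0.5

normal_like = svm_decision((0.7, 0.3), support_vectors, labels, alphas, bias, sigma)
fault_like = svm_decision((0.3, 0.7), support_vectors, labels, alphas, bias, sigma)
```

In a real application the multipliers and bias would come from training on the energy feature vectors rather than being set by hand.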

Feature vector analysis of wavelet packets energy
Using multiresolution analysis and the wavelet packet technique, signals can be decomposed into different frequency bands. Analyzing signals in these frequency bands is called frequency band analysis. Usually, based on the frequency range in which the signals of interest are located, users can decompose signals to a certain scale and obtain information from the corresponding frequency bands. Additionally, the signals in the different frequency bands can be subjected to statistical analysis to obtain feature vectors that represent the signal characteristics. Analyzing the signal energy in different frequency bands is called frequency band energy analysis. It offers a wide-frequency-range response when processing nonstationary, transient signals, with higher frequency resolution at low frequencies and higher time resolution at high frequencies. Compared with the FFT, it retains a great deal of nonstationary and nonlinear diagnostic information.
The theoretical basis for wavelet frequency band energy analysis is Parseval's theorem. The energy of a signal $x(t)$ in the time domain is $\|x\|^2 = \int_{-\infty}^{+\infty} |x(t)|^2 \, dt$; the squared wavelet transform coefficients $C_{j,k}$ of $x(t)$ also have the dimension of energy; and these two are linked by Parseval's equation:

$$\int_{-\infty}^{+\infty} |x(t)|^2 \, dt = \sum_{j} \sum_{k} |C_{j,k}|^2$$

Thus, vibration signals are decomposed into independent frequency bands at different levels by using a conjugate quadrature filter. These decomposed signals are mutually orthogonal, in agreement with the law of conservation of energy, and they also contain a large quantity of nonstationary and nonlinear diagnostic information compared with an FFT. Therefore, the signal energy in every frequency band can be used as a feature vector to represent the operating condition of the machine and is useful for machine fault diagnosis.
The procedure for feature vector extraction using wavelets is the following:
Step 1: process the vibration signals with wavelet packet decomposition;
Step 2: reconstruct each wavelet packet coefficient, and extract the signals in the different frequency ranges;
Step 3: acquire $E_j$, the signal energy in each frequency band, and the total energy $E$:

$$E_j = \sum_{k=1}^{n} |x_{jk}|^2, \qquad E = \sum_{j} E_j$$

where $k = 1, 2, 3, \ldots, n$ indexes the discrete points in frequency band $j$; $j$ is the index of the frequency band; and $x_{jk}$ represents the amplitude of the discrete points.
Step 4: use the percentage ratio of the signal energy $E_j$ in each decomposed frequency band to the total energy $E$ as the elements of the feature vector.
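The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: it substitutes the orthonormal Haar wavelet for the "db10" wavelet so that it needs no external library, it computes each band energy directly from the packet coefficients (for an orthonormal transform this equals the energy of the reconstructed band signal), and the function names and test signal are hypothetical. The signal length is assumed to be divisible by 2 to the power of the number of levels.

```python
import math


def haar_split(x):
    """One orthonormal Haar step: returns the (low-pass, high-pass) halves."""
    r = math.sqrt(2.0)
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) / r for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) / r for i in range(half)]
    return low, high


def wavelet_packet_bands(x, levels):
    """Full wavelet packet tree: every band is split again at every level."""
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    # 2**levels bands in natural filter-bank order, which is not strictly
    # increasing frequency order; band 0 is always the lowest band.
    return bands


def energy_feature_vector(x, levels=3):
    """Return [E_1/E, ..., E_J/E] for the J = 2**levels packet bands."""
    bands = wavelet_packet_bands(x, levels)
    energies = [sum(c * c for c in band) for band in bands]  # E_j
    total = sum(energies)                                    # E
    return [e / total for e in energies]


# Hypothetical test signal: a slow sine sampled at 64 points.
signal = [math.sin(2 * math.pi * 3 * n / 64) for n in range(64)]
features = energy_feature_vector(signal, levels=3)
```

Because the ratios sum to one, the feature vector describes how the energy is distributed across bands rather than the absolute vibration level.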

Wavelet Lifting Scheme
The wavelet lifting transform includes two stages: decomposition and reconstruction. Decomposition consists of splitting, predicting and updating. As shown in Figure 2(a), given the data series $S = \{s(k), k \in Z\}$, the decomposition stage of the wavelet lifting transform based on the lifting scheme is as follows:
(1) Split: the data series $S$ is split into the even sample series $s_e(k) = s(2k)$ and the odd sample series $s_o(k) = s(2k+1)$, $k \in Z$.
(2) Predict: suppose $P(\cdot)$ is the predictor; then use $s_e(k)$ to predict $s_o(k)$, and define the predictive deviation as the detail signal $d(k)$: $d(k) = s_o(k) - P[s_e(k)]$. Then, the detail signal series is $D = \{d(k), k \in Z\}$.
(3) Update: suppose $U(\cdot)$ is the updater; $s_e(k)$ is updated based on the detail signal $d(k)$. Its result is defined as the approximation signal $c(k)$: $c(k) = s_e(k) + U[d(k)]$. Then, the approximation signal series is $C = \{c(k), k \in Z\}$.
Reconstruction of the wavelet lifting is the reverse process of decomposition, and is composed of recovery updating, recovery prediction and merging:
(1) Recovery updating: $s_e(k) = c(k) - U[d(k)]$.
(2) Recovery prediction: $s_o(k) = d(k) + P[s_e(k)]$.
(3) Merge: $s(2k) = s_e(k)$, $s(2k+1) = s_o(k)$.
The reconstruction signal $s$ is obtained by merging the odd and even sample series, as shown in Figure 2.
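A minimal Python sketch of one split/predict/update step and its inverse may help make the scheme concrete. The predictor and updater used here (predict each odd sample by its even neighbour, update by half the detail) are an assumed Haar-like choice for illustration only; the paper does not specify them at this point, and the function names are hypothetical. The input is assumed to have even length.

```python
def lifting_decompose(s):
    """One lifting step on an even-length series: split, predict, update."""
    even = s[0::2]  # s_e(k) = s(2k)
    odd = s[1::2]   # s_o(k) = s(2k + 1)
    # Predict (assumed predictor): P[s_e](k) = s_e(k), so d(k) = s_o(k) - s_e(k).
    d = [o - e for o, e in zip(odd, even)]
    # Update (assumed updater): U[d](k) = d(k) / 2, so c(k) is the pair mean.
    c = [e + dk / 2 for e, dk in zip(even, d)]
    return c, d


def lifting_reconstruct(c, d):
    """Inverse step: undo the update, undo the prediction, then merge."""
    even = [ck - dk / 2 for ck, dk in zip(c, d)]  # recovery updating
    odd = [dk + e for dk, e in zip(d, even)]      # recovery prediction
    s = [0.0] * (2 * len(c))
    s[0::2] = even                                # merge even samples
    s[1::2] = odd                                 # merge odd samples
    return s


samples = [2.0, 4.0, 6.0, 8.0, 5.0, 1.0]
approx, detail = lifting_decompose(samples)
restored = lifting_reconstruct(approx, detail)
```

Whatever predictor and updater are chosen, the inverse simply replays the same operators with the signs flipped, which is why the lifting scheme reconstructs the input exactly.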

Fuzzy reasoning mechanism of typical faults in RBR
Because machine working conditions differ, vibration signals can provide significant qualitative information. However, because of the complexity of the machines, there is no one-to-one correspondence between fault features and conclusions. Therefore, in the diagnosis system used in this study, a fuzzy reasoning strategy was used to improve the rule-based diagnosis method. The knowledge base is represented by production rules. The fundamental ideas of fuzzy reasoning are as follows. Suppose G is a set of fuzzy propositions, fuzzy characteristics and fuzzy relations. For simplicity, fuzzy propositions, fuzzy characteristics and fuzzy relations are together called fuzzy assertions. Then, a piece of factual information can be represented by a binary group (P, β), where P is the fuzzy assertion, $P \in G$, and β is the reliability of P. One fault symptom may correspond to multiple causes, while one fault cause may also correspond to multiple fault symptoms. Therefore, the relationship between cause and symptom is complicated. For proper diagnosis, the membership degree between fault causes and fault symptoms needs to be determined in advance. The value of this membership degree can be obtained from expert experience or theoretical research. Based on years of field diagnosis experience in our lab, we summarized the rules and established the knowledge base.
The fuzzy rules of the knowledge base were used for fault cause reasoning to determine the reason for the faults. According to the typical gearbox fault features, the rules of the knowledge base were constructed as shown in Table 1. In Table 1, $f_r$, $f_m$ and $x_q$ are the rotation frequency, the gear meshing frequency and the kurtosis, respectively.
For gearbox fault diagnosis, a fuzzy matrix R was established. In this matrix, rows represent the set of fault causes, columns represent the set of fault symptoms, and the values in the matrix represent the membership degree between fault symptoms and causes.
When implementing fault diagnosis with the fuzzy reasoning approach, the fuzzy matrix R is established first. Given a fault symptom vector A, if the fault conclusion is B, then the fuzzy reasoning formula can be written as $B = R \circ A$, where $\circ$ denotes the fuzzy composition operator. The final diagnosis result consists of the elements of B with relatively large values. If there are several relatively large values, the existing fault symptoms should be considered when drawing the final conclusion.
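The reasoning step described above can be sketched as follows. The sketch assumes max-min composition, which is one common choice of fuzzy composition operator; the paper does not state which operator was used. The cause labels, the membership values in the matrix and the symptom vector are hypothetical numbers invented for illustration and are not taken from Table 1.

```python
def max_min_compose(R, a):
    """Max-min composition: b_i = max_j min(R[i][j], a[j]).

    Rows of R are fault causes, columns are fault symptoms.
    """
    return [max(min(r_ij, a_j) for r_ij, a_j in zip(row, a)) for row in R]


# Hypothetical cause labels and membership degrees (not from Table 1).
causes = ["broken tooth", "gear wear", "shaft misalignment"]
R = [
    [0.9, 0.8, 0.3],
    [0.6, 0.2, 0.7],
    [0.2, 0.5, 0.4],
]
# Hypothetical symptom vector: degree to which each symptom is observed.
a = [0.8, 0.9, 0.1]

b = max_min_compose(R, a)               # one score per fault cause
diagnosis = causes[b.index(max(b))]     # cause with the largest score
```

When two causes receive similar scores, the text above suggests going back to the observed symptoms rather than relying on the maximum alone.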

Examples of Diagnosis
In this study, SVM, wavelet lifting and diagnosis rules were used to analyze the vibration acceleration signal recorded for a broken cog fault of the Z5 gear (tooth 31) in Shaft II of 22 gear-boxes of a high-speed wire rolling mill. The gearbox transmission chain of the high-speed wire rolling mill is shown in Figure 3.

As seen in Figure 4(b), the spectrum on the day of the fault shows the single frequency and the double frequency, and the amplitude of the single frequency is very high, which indicates a severe gear problem at that time. Figure 4(c) shows the time-domain waveform under normal conditions, and Figure 4(d) shows the spectrum under normal conditions.

SVM estimation
As mentioned in Section 2.2, the "db10" wavelet was used to decompose the signal into three layers, and the energy $E_j$ of each of the eight decomposed frequency bands and the total energy $E$ were acquired. The feature vector was established using the ratio between $E_j$ and $E$. In Figure 5, the horizontal axis represents the eight frequency bands obtained after the signals have been decomposed, and the vertical axis represents the energy ratio. As shown in Figure 5, wavelet energy was concentrated in frequency bands 1 and 2 under normal conditions, and tended to move to higher frequency bands when a fault occurred. The wavelet energy from the hourly data obtained in early June was assigned to class 1, which indicates normal conditions; the wavelet energy from data obtained when the rolling mill experienced a fault was assigned to class 2. Fifteen sets of data were used as SVM input for training. The test data comprised 15 sets from June and 15 sets from September. Because gears have different crack patterns, data from September were identified as fault class 2 by the SVM and were significantly different from data in the normal condition. Figure 6 shows the test results.

Wavelet lifting analysis
In order to verify the effectiveness of wavelet lifting for data analysis, the field data obtained 74 days before the machine malfunction were analyzed. Through wavelet lifting, the original signals, with a spectrum ranging from 0 to 2,500 Hz, were decomposed at two levels, as shown in Figure 7. Decomposition at level 1 yields two bands: $c_0$ covers 0~1,250 Hz and $d_0$ covers 1,250~2,500 Hz. Decomposition at level 2 yields four bands: $c_1$ covers 0~625 Hz, $d_1$ covers 625~1,250 Hz, $c_2$ covers 1,250~1,875 Hz, and $d_2$ covers 1,875~2,500 Hz. The level-2 approximation coefficients $c_1$ and $c_2$ and the level-2 detail coefficient $d_2$ are shown in Figure 8. The approximation coefficient of the wavelet lifting decomposition contains the low-frequency information of the signals, and the detail coefficient contains the high-frequency information.
From the rotational speed of the motor, the characteristic frequencies of all parts in the rolling mill can be calculated. Among them, $f_m$, the meshing frequency of the high-speed shaft gear pair Z5/Z6, is 1,140 Hz, and its double frequency is 2,280.4 Hz. These two frequencies fall in the reconstructed bands $d_1$ (625~1,250 Hz) and $d_2$ (1,875~2,500 Hz), respectively, after decomposition at levels one and two. Thus, the wavelet lifting decomposition is carried out only to level two in this paper. The spectrum obtained from autoregressive spectrum analysis of the signals in Figure 8, after wavelet lifting decomposition and reconstruction at level 2, is shown in Figure 9. The single frequency $f_m$ is 1,139.277 Hz, as shown in Figure 9(b), and the double frequency of the Z5/Z6 gear meshing frequency ($f_m$) can be found in Figure 9(d). The spectrum of the signal obtained by reconstructing the approximation signal after wavelet lifting decomposition at level 1 (the band $c_0$, 0~1,250 Hz) is shown in Figure 10. It contains the Z5/Z6 gear meshing frequency of 1,139.277 Hz and side frequencies at 37 Hz and 73 Hz around $f_m$. The side frequencies of 37 Hz and 73 Hz are close to the single and double frequency, respectively, of 36.751 Hz, which is the shaft frequency of Shaft II.
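The band bookkeeping in this paragraph can be checked with a short Python sketch. It assumes, as the band ranges quoted in the text imply, that every band is halved at each level so that level 2 gives four equal-width bands; the helper names are hypothetical, and the frequencies are the values quoted above.

```python
def lifting_bands(f_max, levels):
    """Equal-width bands obtained by halving every band at each level."""
    n = 2 ** levels
    width = f_max / n
    return [(i * width, (i + 1) * width) for i in range(n)]


def band_index(freq, bands):
    """Index of the band whose half-open range [low, high) contains freq."""
    for i, (low, high) in enumerate(bands):
        if low <= freq < high:
            return i
    raise ValueError("frequency outside the analysed range")


bands = lifting_bands(2500.0, levels=2)  # 0-625, 625-1250, 1250-1875, 1875-2500 Hz
names = ["c1", "d1", "c2", "d2"]         # band labels used in the text

f_m = 1140.0          # Z5/Z6 meshing frequency (Hz)
f_m_double = 2280.4   # double frequency quoted in the text (Hz)
band_fm = names[band_index(f_m, bands)]
band_fm_double = names[band_index(f_m_double, bands)]

f_r = 36.751          # shaft frequency of Shaft II (Hz)
# Distance of the observed 37 Hz and 73 Hz side frequencies from 1x and 2x f_r.
sideband_errors = [abs(37.0 - f_r), abs(73.0 - 2 * f_r)]
```

Both side frequencies lie within about half a hertz of the shaft frequency and its double, which supports attributing the modulation to Shaft II.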

Rule-based Reasoning analysis
Through wavelet lifting analysis, the gear meshing frequency ($f_m$) of Z5/Z6 and the shaft frequency ($f_r$) of Shaft II were obtained 74 days before the machine malfunction. The ratio of the feature frequency to the calculated frequency is shown in Table 2. The calculated frequency is obtained from the rotational speed of the motor in the field, while the feature frequency is obtained from the monitoring spectrum. For the shaft frequency $f_r$, the calculated frequency is 37.00 Hz and the feature frequency is 36.751 Hz. In Figure 10, there are many side frequencies spaced at 37 Hz around $f_m$, the gear meshing frequency of 1,139.277 Hz. In this case, the side frequency (37 Hz) is very close to the shaft frequency (36.751 Hz) of Shaft II.
The ratio is obtained by dividing the feature frequency by the calculated frequency. The closer this ratio is to 1, the more consistent the feature frequency of the monitoring spectrum is with the calculated frequency of the part, and the more likely it is that the part is faulty. The calculation is as follows: 1,139.277/1,140.00 = 0.999 and 36.751/37.00 = 0.993.
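For completeness, the two ratios can be reproduced with a few lines of Python. The function and dictionary names are hypothetical; the frequencies are the values quoted in the text.

```python
def match_ratio(feature_hz, calculated_hz):
    """Ratio of the monitored feature frequency to the calculated frequency."""
    return feature_hz / calculated_hz


ratios = {
    "f_m (Z5/Z6 meshing)": match_ratio(1139.277, 1140.00),
    "f_r (Shaft II)": match_ratio(36.751, 37.00),
}
# Rounded to three decimals, as in the text.
rounded = {name: round(value, 3) for name, value in ratios.items()}
```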
Analysis of the September data in Table 2 shows that the single and double frequencies of the Z5/Z6 meshing frequency, as well as the double shaft frequency of Shaft II, are prominent. The calculated kurtosis is greater than 6. In the final result of the fuzzy reasoning, the maximum value of the fault conclusion is 0.874; thus, the corresponding fault cause can be confirmed to be a broken gear tooth. A broken cog was found on gear Z5 when the machine was disassembled in the field, which is consistent with the diagnosis conclusion.
The fuzzy reasoning applied to the above fault shows that, in fault diagnosis, fuzzy logic can effectively represent fuzzy information in the form of a fuzzy matrix, and that the fault type can be effectively diagnosed with fuzzy reasoning.

Analysis of the fault diagnosis ability
In order to demonstrate the capability of the intelligent diagnosis method put forward in this paper, we examined two cases and compared the results with the traditional Fourier transform method.

Case 1: fault diagnosis for tooth collision of helical gear
At 14:00 on Nov. 30th, 2008, Fourier transform and wavelet lifting analysis of the original vibration signals both showed that the gear meshing frequency of Z5/Z6 in Shaft III in the sixth rack of a rolling mill in a factory was 45.9 Hz. Figures 11(a,b) show the original signal and the spectrum after the Fourier transform. Figures 11(c,d) show the spectrum after autocorrelation analysis of the approximation signals obtained by wavelet decomposition and reconstruction of the data at level three. It can be seen that the SNR of the wavelet transform is higher in the spectrum analysis. The device was dismantled four days later; the broken cog tooth is shown in Figure 12.

In the second case, at 4:00 on Jan. 25th, 2008, Fourier transform and wavelet lifting analysis of the original vibration signals at low frequency showed that the shaft frequency of Shaft II in the gearbox in the second rack of a rolling mill was 2.44 Hz. Figures 13(a,b) show the original signal and the spectrum after the Fourier transform. Figures 13(c,d) show the spectrum after autocorrelation analysis of the approximation signals obtained by wavelet decomposition and reconstruction of the data at level three. It can be seen that the wavelet transform gave more apparent features in the spectrum and a higher SNR. The device was opened and checked 18 days later. The broken cog tooth is shown in Figure 14.

The conclusions drawn from comparing the two cases are summarized in Table 3. It can be seen that the SNR of the wavelet transform is higher, and the features extracted by the wavelet transform are more apparent.

Conclusions
By using wavelet lifting together with support vector machines and rule-based reasoning, a real fault example of a broken cog in a gearbox was analyzed and the following conclusions were drawn:
(1) SVM is suitable for pattern recognition problems with small sample sizes. In this study, two-class pattern recognition of actual gearbox faults was accomplished using an SVM as the classifier.
(2) Based on the second generation wavelet packet feature extraction technology, and taking advantage of the fact that resonance occurs in the high frequency bands in the early stages of a fault, interference from noise in other frequency bands is effectively avoided through the decomposition and reconstruction of signals in the high frequency bands; thus, fault feature extraction was achieved.
(3) According to the features of gearbox faults, a fuzzy production approach was applied to express the fault rules, and rule-based reasoning was achieved through the fuzzy matrix. As demonstrated with actual data, this approach effectively overcomes the difficulty that some rules are hard to express precisely.
Integrating different diagnosis technologies has become popular in intelligent diagnosis research. The goal in designing intelligent diagnosis technology is to exploit the strengths of each method in diagnostic inference, so that the methods complement each other within a hybrid diagnosis system.