Fault Diagnosis of Roller Bearings Based on a Wavelet Neural Network and Manifold Learning

In order to improve the accuracy of the fault diagnosis of roller bearings, this paper proposes a fault diagnosis algorithm based on manifold learning combined with a wavelet neural network. First, a high-dimensional feature signal set is obtained using conventional feature extraction algorithms; second, an improved Laplacian Eigenmap algorithm is proposed to reduce the dimensionality of the features and obtain an effective characteristic signal. Finally, the processed characteristic signal is input into the constructed wavelet neural network, whose outputs are the fault types. Experiments on a roller bearing failure data set verified the validity and accuracy of the method for diagnosing faults.


Introduction
Roller bearings are critical components of machinery and directly affect the performance of the whole system [1]. However, bearings inevitably degenerate and break down due to repetitive overloading. According to statistics, 30 percent of faults in rotating machinery originate from bearing faults; therefore, bearing fault prognostics is an important area of research [2][3][4][5]. Recent studies on bearing fault prognostics have focused on fault signal feature extraction and fault classification. Commonly used methods for extracting the features of fault signals are Empirical Mode Decomposition (EMD), morphology, the wavelet transform, singular value decomposition, and principal component analysis [6][7][8][9][10]. However, features extracted using these methods are largely redundant, as bearing fault signals are unstable and nonlinear [11]. Manifold learning methods, such as Isomap [12], Locally Linear Embedding (LLE) [13], and Laplacian Eigenmaps (LE) [14], can be used to extract effective features. Bearing vibration signals contain a great deal of noise, and the first two methods are very sensitive to noise. LE has a powerful locality-preserving ability and is not sensitive to noise [15], so good results can be obtained when using it to extract features from roller bearings and reduce dimensionality. The selection of the local neighborhood is the key step in manifold learning and greatly affects performance [16]. Therefore, we propose an adaptive local neighborhood selection scheme based on Laplacian mapping to extract the features of bearing fault signals. For classifying fault signals, information fusion approaches such as firefly neural networks [17], recursive complex networks [18], and fuzzy C-means clustering [19] are commonly used. Neural networks have a great ability to fit nonlinear data and recognize bearing faults. Because its activation functions carry adjustable expansion (scale) and translation factors, the proposed wavelet neural network has more degrees of freedom and a more flexible function approximation ability than conventional neural networks. It also learns more quickly and can capture mutation functions and discontinuous signal details.
From the above analysis, it is clear that LE is suitable as a nonlinear dimensionality reduction method. The performance of LE depends greatly on the neighborhood size k, which is generally set according to experience. In this paper, we propose an improved LE in which the neighborhood adapts from any given initial k, producing good results and effective characteristic signals. Additionally, we propose a fault diagnosis method based on a wavelet neural network model that integrates the merits of the wavelet transform with those of an artificial neural network. The experimental results demonstrate the effectiveness of the proposed methods.

Prognostics Method Based on Manifold Learning and a Wavelet Neural Network
A flow chart of the proposed prognostics method based on manifold learning and a wavelet neural network is shown in Figure 1. First, common feature extraction methods were applied to the time-domain signals to construct high-dimensional feature signal sets. Next, manifold learning was adopted to reduce the dimensionality and obtain effective feature parameters. Lastly, wavelet neural networks were set up with the feature parameters as inputs and the fault types as outputs, trained until the required error was reached, and then tested.


High-Dimensional Feature Signal Extraction
The prognostics of bearing faults can be considered as a map between two spaces, where one is the fault feature set and the other is the state of the faults. It is crucial to accurately find the corresponding fault features, i.e., to perform feature extraction; too many or too few features will hamper prognostics. In order to obtain fault features from nonlinear and unstable roller bearing vibration signals, traditional methods such as the Hilbert-Huang transform, the Fast Fourier Transform (FFT), the signal envelope, Empirical Mode Decomposition (EMD), and the wavelet transform were applied to the monitored time-domain vibration signals. The feature signals extracted by these methods are redundant, nonlinear, and uncertain; therefore, manifold learning is utilized to reduce dimensionality. Improved Laplacian Eigenmaps (ILE) are proposed to handle the local neighborhood selection of manifold learning.
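As a concrete illustration, the sketch below computes a handful of common time-domain statistics for each vibration segment. The exact feature set is an assumption (the paper only states that conventional methods were used), and the segment data here is synthetic.

```python
import numpy as np

def time_domain_features(x):
    """Conventional time-domain statistics of one vibration segment.

    The particular features chosen here (RMS, peak, kurtosis, etc.)
    are illustrative, not the paper's exact feature list.
    """
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    mean_abs = np.mean(np.abs(x))
    std = np.std(x)
    kurtosis = np.mean((x - x.mean()) ** 4) / std ** 4  # 4th standardized moment
    skewness = np.mean((x - x.mean()) ** 3) / std ** 3  # 3rd standardized moment
    crest = peak / rms                                  # crest (peak) factor
    impulse = peak / mean_abs                           # impulse factor
    return np.array([rms, peak, mean_abs, std, kurtosis, skewness, crest, impulse])

# Stacking the feature vectors of many segments yields the
# high-dimensional feature signal set that ILE later compresses.
segments = np.random.default_rng(0).standard_normal((40, 2048))  # 40 synthetic segments
feature_set = np.vstack([time_domain_features(s) for s in segments])
```

Each row of `feature_set` is one sample point in the high-dimensional feature space on which dimensionality reduction is then performed.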

ILE Algorithm
The idea of graph theory is introduced into the LE algorithm in this paper. By constructing weighted neighborhood graphs, the relative positions of sample points within local neighborhoods are maintained between the high- and low-dimensional spaces [14]. The neighborhood size k is the key parameter of LE in reducing dimensions. Because different manifold surfaces have different characteristics, a traditional fixed k value can easily place points that do not belong to the same manifold region into the same neighborhood graph, resulting in the "short circuit phenomenon". To solve this problem, an improved Laplacian Eigenmap dimensionality reduction algorithm is proposed, as shown in Figure 2. The detailed steps are as follows:


Step 1: Construct weighted neighborhood graphs.
From the k nearest neighbors of each high-dimensional spatial data point X_i, the adjacency matrix of the data set is obtained.

1. Set the initial neighborhood value k and determine the neighborhood of each sample point according to the KNN principle.
2. For each resulting neighborhood, calculate the tangent-space coordinates corresponding to the neighborhood points [20]. The mapping from neighborhood points to local tangent-space coordinates can be expressed as Equation (2), based on a Taylor series expansion, where λ is the scale factor and Θ is a threshold, Θ ∈ (0, 0.1). When λ < Θ, the local neighborhood obtained from the k-value can be regarded as linearly represented by the tangent plane, which accords with the characteristics of an ideal neighborhood.

3. Calculate the difference function between the high- and low-dimensional distributions.
The difference function between the high- and low-dimensional distributions of all points in the i-th neighborhood is defined as in [21], where p_ij is the distribution function of the j-th neighborhood point of the i-th high-dimensional sample point, and q_ij is the distribution function of the Euclidean distance of the corresponding low-dimensional neighborhood point. The k-value adjustment strategy is shown in Equation (4). Assume that the neighborhood weight of each sample point is initialized to one by default and is updated by the weight equation, where γ is the adjustment factor and Ψ_i,C is the update operator. Points with small weights are removed as invalid neighborhood points, while the remaining ones are retained as effective neighborhood points.
Step 2: Determine the weight of the neighborhood graph edge.
Determine the weight of each edge using the thermal (heat) kernel function; in the simplified 0-1 scheme, the weight is one if the two nodes are adjacent and zero otherwise.
Step 3: Calculate the feature map.
Compute the eigenvectors and eigenvalues of the Laplacian matrix L = D − W by solving the generalized eigenproblem Lf = λDf, where W is the weight matrix and D is the diagonal degree matrix with D_ii = Σ_j W_ij. The eigenvectors corresponding to the smallest m nonzero eigenvalues are output as the result of the dimensionality reduction.
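For reference, a minimal fixed-k Laplacian Eigenmaps implementation following the three steps above (kNN graph, heat-kernel weights, generalized eigenproblem Lf = λDf) might look like this; the heat-kernel width sigma is an assumed parameter, and the generalized problem is solved via the equivalent normalized Laplacian.

```python
import numpy as np

def laplacian_eigenmaps(X, k=8, sigma=1.0, m=2):
    """Plain Laplacian Eigenmaps (fixed k) as a reference sketch."""
    n = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D2[i])[1:k + 1]                 # k nearest neighbors of i
        W[i, nbrs] = np.exp(-D2[i, nbrs] / (2 * sigma ** 2))  # heat-kernel weights
    W = np.maximum(W, W.T)                                # symmetrize the graph
    deg = W.sum(axis=1)                                   # degree D_ii
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # L f = lambda D f  is equivalent to  (I - D^-1/2 W D^-1/2) u = lambda u
    # with f = D^-1/2 u, so an ordinary symmetric eigensolver suffices.
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)                    # ascending eigenvalues
    F = d_inv_sqrt[:, None] * vecs                        # back to generalized vectors
    return F[:, 1:m + 1]                                  # skip the trivial zero mode

# Embedding a noisy circle from 3-D down to 2-D:
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
rng = np.random.default_rng(0)
X = np.c_[np.cos(t), np.sin(t), 0.01 * rng.standard_normal(60)]
Y = laplacian_eigenmaps(X, k=6, m=2)
```

The ILE variant would replace the fixed `k` loop with the adaptive neighborhood selection described in Step 1.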
Appl. Sci. 2017, 7, 158

Wavelet Neural Network
The wavelet neural network (WNN) is a feedforward network that uses a wavelet basis function as the neuron activation function [22]. It adaptively adjusts the scaling and translation factors of the wavelet function, the network connection weights, and the approximation function in batch mode. The fault diagnosis of the bearing based on the wavelet neural network is shown in Figure 3, where {x_1, x_2, ..., x_N} is the bearing fault feature vector, obtained by the proposed ILE method, and {y_1, y_2, y_3, y_4} is the bearing state: y_1 is the normal state; y_2 the ball failure; y_3 the inner ring failure; and y_4 the outer ring failure. An output of "1" indicates the corresponding state; "0" indicates otherwise.


Ψ is the wavelet base; its Fourier transform Ψ̂(ω) satisfies the admissibility condition C_Ψ = ∫ |Ψ̂(ω)|²/|ω| dω < ∞. By scaling and translating the wavelet base function Ψ(x), the family of wavelet functions is obtained: Ψ_a,b(x) = |a|^(−1/2) Ψ((x − b)/a), with a, b ∈ R, a ≠ 0, where a and b are the scale and translation factors, respectively. The wavelet transform is W_f(a, b) = |a|^(−1/2) ∫ f(x) Ψ*((x − b)/a) dx. Discretizing a and b with a = a_0^m and b = n b_0 a_0^m, n, m ∈ Z, yields the discrete wavelet transform of Equation (10). Select the scale parameters a_0 > 0 and b_0 > 0 to ensure that the wavelet system Ψ_n,m(t), (n, m) ∈ Z², satisfies the L²(R) frame condition A‖f‖² ≤ Σ_n,m |⟨f, Ψ_n,m⟩|² ≤ B‖f‖², where A and B are the frame bounds. If Ψ̃_n,m(t) is the dual function of Ψ_n,m(t), then any function f(t) ∈ L²(R) can be expressed as the wavelet series f(t) = Σ_n,m ⟨f, Ψ_n,m⟩ Ψ̃_n,m(t). A function can thus be expanded by wavelet basis functions into an approximate function, and this expansion can be realized by a neural network with one hidden layer by adjusting the weights and the parameters a_j and b_j. In the formulas, Ψ_j(t) is the wavelet basis function (j = 1, 2, ..., n); y_k (k = 1, 2, ..., q) is the output of the network; and w_ij and w_jk are the connection weights between the input and hidden layers and between the hidden and output layers, respectively.
The wavelet basis function Ψ(t) is chosen as the Mexican Hat wavelet: Ψ(t) = (1 − t²) e^(−t²/2). Take the energy function E = (1/2) Σ_p Σ_k (y_k^(p) − ŷ_k^(p))², where P is the number of samples, y is the actual value, and ŷ is the output of the WNN. The output-layer neurons use the Sigmoid function ∅(x) = 1/(1 + e^(−x)), so the network output is ŷ_k = ∅(Σ_j w_jk Ψ((Σ_i w_ij x_i − b_j)/a_j)), j = 1, 2, ..., n, k = 1, 2, ..., q. The parameters are trained by the gradient descent method with momentum: Δh(t + 1) = −η ∂E/∂h + α Δh(t), where h stands for w_ij, w_jk, a_j, or b_j; η is the learning rate; α is the momentum factor; and η, α ∈ (0, 1). The energy function is minimized by adjusting the parameters of the WNN.
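A minimal sketch of such a wavelet network, with the Mexican Hat activation and gradient descent training, is shown below. The layer sizes, initialization, and the finite-difference gradients are assumptions standing in for the paper's analytic updates with momentum.

```python
import numpy as np

def mexican_hat(t):
    """Mexican Hat wavelet, the hidden-layer activation used in the paper."""
    return (1 - t ** 2) * np.exp(-t ** 2 / 2)

def sigmoid(x):
    """Output-layer activation."""
    return 1.0 / (1.0 + np.exp(-x))

class TinyWNN:
    """Minimal wavelet neural network; sizes and initialization are assumptions."""

    def __init__(self, n_in=8, n_hidden=6, n_out=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w_ih = 0.5 * rng.standard_normal((n_in, n_hidden))   # input -> hidden
        self.w_ho = 0.5 * rng.standard_normal((n_hidden, n_out))  # hidden -> output
        self.a = np.ones(n_hidden)    # wavelet scale factors a_j
        self.b = np.zeros(n_hidden)   # wavelet translation factors b_j

    def forward(self, X):
        net = X @ self.w_ih                        # hidden pre-activations
        h = mexican_hat((net - self.b) / self.a)   # scaled/translated wavelet
        return sigmoid(h @ self.w_ho)

    def loss(self, X, Y):
        """Energy function E = 1/2 * sum of squared output errors."""
        return 0.5 * np.sum((Y - self.forward(X)) ** 2)

def train(net, X, Y, lr=0.05, epochs=80, eps=1e-5):
    """Plain gradient descent with finite-difference gradients -- a sketch
    stand-in for the paper's analytic updates with a momentum term."""
    for _ in range(epochs):
        for p in (net.w_ih, net.w_ho, net.a, net.b):
            grad = np.zeros_like(p)
            it = np.nditer(p, flags=["multi_index"])
            while not it.finished:
                idx = it.multi_index
                old = p[idx]
                p[idx] = old + eps
                up = net.loss(X, Y)
                p[idx] = old - eps
                down = net.loss(X, Y)
                p[idx] = old
                grad[idx] = (up - down) / (2 * eps)
                it.iternext()
            p -= lr * grad                         # in-place parameter update
```

Finite differences keep the sketch short; a production implementation would use the analytic gradients of Equation (16) or an autodiff framework.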

Experimental Subject
The bearing data was generated by the Industry/University Cooperative Research Centers Program (I/UCRC) for Intelligent Maintenance Systems. The test platform is shown in Figure 4. The rotation speed was kept constant at 2000 RPM by an alternating current motor under a radial load of 6000 lbs. Data collection was performed with a National Instruments Data Acquisition (NI DAQ) Card 6062E, with the sampling rate set at 20 kHz [23].

Roller Bearing Vibration Signals
The test was carried out for 35 days. Figure 5a-d shows the vibration signals for the normal state, a roller element failure, an inner race failure, and an outer race failure, respectively [23].


At first, as many features as possible were obtained using conventional methods. Next, the dimensionality was reduced using the ILE method described above. The result of the dimension reduction is shown in Table 1, where (1) is the normal state; (2) a roller element failure; (3) an inner race failure; and (4) an outer race failure. The characteristic signals found in all types of situations are also shown in Table 1.

Diagnosis of a Wavelet Neural Network
According to the features and states, the wavelet neural network was set with eight input nodes for the eight features and four output nodes indicating the four states. We selected five groups from every state, a total of 20 groups, as training samples, and 12 groups as test samples. The training error of the network was set to ±0.01. Figure 6 highlights the design of the wavelet neural network. Training stopped once the error requirement was met. The test data was then input into the wavelet neural network, and the test results are shown in Table 2. As Table 2 shows, the four states of the bearing are well separated, with a 100% recognition rate. This shows that: (1) the feature extraction is effective; (2) the wavelet neural network can be used for information fusion; and (3) good classification results can be achieved.
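The one-of-four output coding described above can be decoded and scored as in this small sketch; the label ordering follows the paper's state list, while the function names are illustrative.

```python
import numpy as np

# State order as defined in the paper: y1..y4.
FAULT_LABELS = ["normal", "ball", "inner race", "outer race"]

def decode_state(output):
    """Map a 4-dim network output to a fault label (winner-takes-all).

    The paper codes each state as a 1-of-4 vector; taking the argmax
    of the sigmoid outputs is the usual decoding and is assumed here.
    """
    return FAULT_LABELS[int(np.argmax(output))]

def recognition_rate(outputs, targets):
    """Fraction of test samples whose decoded state matches the target code."""
    pred = np.argmax(outputs, axis=1)
    true = np.argmax(targets, axis=1)
    return float(np.mean(pred == true))
```

With the 12 test samples of the experiment, `recognition_rate` would return 1.0 for the reported 100% result.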
In order to further verify the effectiveness of the proposed method, we compared it with the methods described in Reference [24] on the same bearing data. The authors of Reference [24] used principal component analysis (PCA), local and nonlocal preserving projection (LNPP), linear discriminant analysis (LDA), and supervised-learning-based local and nonlocal preserving projection (SLNPP), respectively, to extract features before using them as inputs to the k-nearest neighbor (KNN) algorithm. The results are shown in Table 3, where the proposed method has the highest recognition rate.

Conclusions
Roller bearings form the core of rotating machinery, and their monitoring has always been of significant research interest. As roller bearing fault signals are nonlinear and non-stationary, we have addressed the issues of fault identification and fault classification in two ways.
(1) The performance of LE depends greatly on the neighborhood size k, which is generally set according to experience. We proposed an improved LE algorithm that adapts the neighborhood from any given initial k. The experimental results demonstrated that the proposed method can effectively determine k and extract the feature signals of roller bearings. (2) Based on the extracted feature signals, and by integrating the merits of the wavelet transform with those of an artificial neural network, we constructed a wavelet neural network for fault identification and classification. The experimental results indicate that the proposed method has excellent practical value, with a classification accuracy of 100%.

Figure 1 .
Figure 1. The fault diagnosis method based on manifold learning and a neural network.


Figure 2 .
Figure 2. Diagram of the steps in the method.

Figure 3 .
Figure 3. Fault diagnosis model based on the wavelet neural network.


Figure 4 .
Figure 4. Illustration of the bearing test rig and sensor placement [23].


Figure 6 .
Figure 6. Diagnosis of the wavelet neural network.


Table 1 .
Fault data set.

Table 3 .
Recognition rate of each method.