Data Screening Based on Correlation Energy Fluctuation Coefficient and Deep Learning for Fault Diagnosis of Rolling Bearings

Qin, Bo; Luo, Quanyi; Li, Zixian; Zhang, Chongyuan; Wang, Huili; Liu, Wenguang

doi:10.3390/en15072707


by Bo Qin ¹,²,*, Quanyi Luo ¹, Zixian Li ³,*, Chongyuan Zhang ⁴, Huili Wang ⁴ and Wenguang Liu ¹

¹ School of Mechanical Engineering, Inner Mongolia University of Science and Technology, Baotou 014010, China

² Mining Research Institute, Inner Mongolia University of Science and Technology, Baotou 014010, China

³ State Key Laboratory of Mechanical Transmissions, Chongqing University, Chongqing 400044, China

⁴ Department of Mechanical Engineering, Baotou Vocational & Technical College, Baotou 014030, China

* Authors to whom correspondence should be addressed.

Energies 2022, 15(7), 2707; https://doi.org/10.3390/en15072707

Submission received: 15 February 2022 / Revised: 23 March 2022 / Accepted: 31 March 2022 / Published: 6 April 2022

(This article belongs to the Special Issue Intelligent Designing, Measuring and Control for Frontier Instrument and Equipment)


Abstract:

The accuracy of the intelligent diagnosis of rolling bearings depends on the quality of the vibration data and on the accuracy of the state identification model built from them. To address the poor quality of the data and the difficulty of selecting the structural parameters of the identification model, a method is proposed that integrates data cleaning, which selects effective learning samples, with optimized selection of the structural parameters of the deep belief network (DBN) model. First, the rolling bearing vibration data are decomposed by variational mode decomposition into a finite number of intrinsic mode function (IMF) components, and the correlation energy fluctuation value of each component is calculated to characterize the proportion of fault content in that component. High-quality learning samples are then obtained through screening and reconstruction, which achieves effective cleaning of the vibration data. Second, the improved particle swarm optimization (IPSO) algorithm is used to optimize the number of nodes in each hidden layer of the DBN model in order to obtain the optimal structural parameters of the intelligent diagnosis model. Finally, the high-quality learning samples obtained from data cleaning are used as input to construct an intelligent identification model for rolling bearing faults. The results show that the proposed method not only screens out the IMF components that contain the effective fault content of the rolling bearing vibration data, but also finds the optimal number of nodes in the DBN hidden layers, which improves bearing state identification accuracy by 3%.

Keywords:

learning sample screening; correlation energy fluctuation coefficient; rolling bearing; DBN optimized by IPSO; fault recognition rate

1. Introduction

As the “joint” of electromechanical equipment, rolling bearings are widely used in rail transit, wind turbines, weaponry, steel metallurgy, and other fields due to their low friction resistance and high rotation accuracy [1]. According to statistics, 30% of faults in rotating machinery are caused by bearing failures [2]. Once the mechanical equipment fails, it will cause downtime and even casualties. In order to monitor the status of machinery and equipment in real time and achieve early warning, a condition monitoring system (CMS) is widely used in various industries; at the same time, it provides “mass data” with a large capacity, diversity, and high-rate characteristics [3]. Therefore, how to gather the running status information of rolling bearings from the data in a timely and quantitatively accurate manner has important theoretical significance and engineering application value in the safe and efficient operation and in the health management of the transmission system.

Generally, there are two factors that affect the accuracy of the intelligent state recognition of rolling bearings: the quality of learning samples in the modeling process and the accuracy of the diagnostic model constructed based on the machine learning method. Because there are many application scenarios of rolling bearings and extremely complex operating conditions, the measured bearing vibration signals are transmitted through multiple interfaces, and the energy is attenuated. This causes the signals mentioned above to have strong background noise, strong nonlinearity, nonstationarity, and amplitude and frequency modulation characteristics. This often results in poor accuracy in subsequent intelligent diagnoses when those signals are used as learning samples. Therefore, how to accurately clean and screen out the key components related to the fault from the mass of rolling bearing vibration signals and use them as the learning sample for the intelligent diagnosis model is the key to improving the accuracy of the intelligent diagnosis.

So as to capture as much fault information in rolling bearing vibration signals as possible, extensive research has been carried out with regard to this problem. Huang et al. [4] proposed the empirical mode decomposition (EMD) method to decompose nonlinear non-stationary signals. Compared with traditional time–frequency analysis methods such as wavelet and wavelet packet, EMD avoids the selection of basis functions and reduces manual influence; however, EMD has the defects of modal aliasing and end effect. In order to suppress modal aliasing, Wu and Huang [5] proposed an ensemble empirical mode decomposition (EEMD) algorithm for the modal aliasing problem in EMD. It improves the distribution of the extreme points of the signal by adding Gaussian white noise to the signal; after multiple averaging, it achieves the purpose of reducing modal aliasing. Although EEMD reduces modal aliasing, the added noise is not completely removed, which increases the reconstruction error of the vibration signal.

Smith [6] proposed the local mean decomposition (LMD) method, which adaptively decomposes a signal whose frequency changes with time into the sum of a finite number of product functions (PFs), each of which has a well-defined instantaneous frequency, in order to reduce the modal aliasing and end effect of the EMD and EEMD algorithms. However, LMD cannot completely solve the problem of modal aliasing.

In 2014, Dragomiretskiy et al. [7] proposed variational mode decomposition (VMD). The difference from the method above is that VMD transfers the signal decomposition process to the variational framework and searches for the optimal solution for the constrained variational model through iterative methods. Its mathematical theoretical foundation is complete and it reduces the end effect and modal aliasing. It is widely used in signal decomposition in the field of fault diagnosis.

In the process of building a data-driven diagnostic model, the fault recognition rate of the model depends on the quality of the learning samples. The prerequisite for improving the fault recognition rate of the data-driven intelligent identification model for the rolling bearing state is to screen out the main intrinsic mode function (IMF) components. The IMF components contain the main component of the fault, which can be obtained from the decomposition of the rolling bearing fault signal of the electromechanical equipment above and then adopted in order to construct the learning samples. At present, many scholars have performed related work. Hua L et al. [8] proposed a fault feature extraction method for rolling bearings, combining EEMD and improved frequency band entropy. Li et al. [9] selected the IMF, which is closely correlated to the fault signal according to the resonance frequency. Zhao et al. [10] proposed an IMF selection method based on the correlation coefficient between the filtered signal and each IMF to select which contains fault feature information. Zhang et al. [11] calculated the kurtosis values of all IMF obtained by the VMD algorithm of the rolling bearing vibration signal and selected the two largest components as the effective fault components. Wang et al. [12] optimized the VMD algorithm based on the beetle antennae search algorithm and selected the fault component by calculating the kurtosis value of the IMF for a subsequent intelligent diagnosis. However, most of the studies above are limited to considering or selecting a certain main IMF component as the learning sample from a single perspective, and the number of effective fault components in the selected IMF components are not quantified. 
In addition, the displacement and contact stiffness excitations induced by a damage defect on the rolling element and raceway surfaces are slow and disorderly, and the impact component in the detected vibration signal is submerged in strong background noise. Because of these nonuniform and slowly varying shock characteristics, a low fault recognition rate is to be expected if the selected IMF component is used directly for fault identification. Therefore, a new and effective component screening method for bearing faults is needed in order to achieve accurate identification. This article studies the physical basis of rolling bearing fault diagnosis and the changes in the energy fluctuation of the vibration signal during the operation of the rolling bearing. A correlation energy fluctuation coefficient is proposed as a criterion for evaluating the fault sensitivity of the IMFs decomposed by the VMD algorithm, based on the correlation between each component of the vibration signal obtained by VMD and the original vibration signal. Finally, high-quality learning samples are built through reweighting in order to ensure the accuracy of the subsequent fault recognition model.

With the rapid development of technologies such as intelligent sensing and network transmission, massive data that represent the state of equipment can be collected by a CMS in real time. However, this also increases the volume of learning samples in the intelligent diagnosis process, which has become a major challenge for diagnostic accuracy. At present, the construction of data-driven intelligent diagnostic models for rolling bearings is a research hotspot in the field of state identification and online monitoring. Li et al. [13] proposed a deep stacking least squares support vector machine together with the concept of stacking-based representation learning for rolling bearing fault diagnosis. Zhou et al. [14] combined EEMD, weighted permutation entropy, and an improved support vector machine (SVM) ensemble classifier to realize the fault diagnosis of rolling bearings. Dong et al. [15] proposed an intelligent fault diagnosis method that used the adaptive whale optimization algorithm to optimize the extreme learning machine (ELM) algorithm. Luo et al. [16] combined the real-valued gravitational search algorithm and the binary gravitational search algorithm to optimize the ELM and then fed the preprocessed rolling bearing vibration signal to the diagnostic model to diagnose the rolling bearing fault. Li et al. [17] used a back propagation (BP) neural network to perform local feature learning on the sub-signals of the decomposed rolling bearing vibration signal for fault diagnosis. However, the research work above has some shortcomings: the algorithms used in model building are all shallow machine learning algorithms, which are weak in extracting effective fault features and have poor generalization ability. For the BP neural network, the structural parameters of the hidden layer are selected by experience, exhaustive trial and error is blind, and the solution is not unique.
SVM is suited to small-sample data; when it is applied to the massive bearing operating status data obtained by an online monitoring system, it occupies a large amount of computing resources, which inevitably increases the time cost. In multi-class settings, the models above have a low fault recognition rate. ELM requires more neuron nodes, and the random initialization of its internal weights and thresholds can easily reduce the fault recognition rate. To address these issues, a deep learning [18] algorithm with a strong feature extraction ability is needed to overcome the shortcomings of rolling bearing state identification models built on the shallow algorithms above. The DBN proposed by Hinton [19] activates its neural nodes by calculating the scalar energy of the data and has the advantages of strong expressive ability, easy inference, and fast calculation. It is suitable for the intelligent fault diagnosis of rolling bearings based on vibration signals, but its structural parameters have a great influence on the performance of the model. To solve this problem, Li et al. [20] proposed optimizing the DBN structural parameters with the particle swarm optimization (PSO) algorithm, in which the optimal DBN structural parameters are found by iteratively updating the positions and velocities of the particles within the range of the objective function.

However, although the PSO algorithm converges very quickly, it can fall into a local extremum depending on its calculation parameters, so that the final optimization result is not the global optimum. To address this issue, this paper uses the IPSO algorithm to optimize the number of nodes in multiple hidden layers and thereby ensure the accuracy of the rolling bearing defect state identification model.

The rest of the paper is organized as follows. Section 2 introduces the data cleaning method based on the correlation energy fluctuation coefficient and the process of rolling bearing defect identification based on the DBN optimized by IPSO. Section 3 verifies the effectiveness of the signal data cleaning method through numerical simulation, validates the method proposed in Section 2 through laboratory experiments, and summarizes and discusses the experimental results. Section 4 presents the conclusion.

2. Intelligent Identification Method

The rolling bearing vibration signal collected in real time by the condition monitoring system is nonlinear and non-stationary and contains strong background noise. To address the poor quality of learning samples for the state identification of rolling bearings and the difficulty of selecting parameters for state identification models, a signal cleaning method based on the correlation energy fluctuation coefficient and reweighting is proposed in combination with an intelligent identification method for rolling bearing defects. The specific algorithm flow and framework are shown in Figure 1. It comprises two parts: signal data cleaning based on the correlation energy fluctuation coefficient, and intelligent identification of rolling bearing defects based on the DBN model optimized by IPSO.

2.1. Data Cleaning

The quality of the learning samples is a prerequisite for improving the fault recognition rate of a data-driven rolling bearing state identification model. To improve diagnostic accuracy, the vibration signals picked up by the sensor must be cleaned in order to screen out high-quality learning samples. However, the rolling bearing vibration signal has time-varying modulation characteristics. VMD yields a limited number of IMF components that represent different frequency components of the original signal, but it does not indicate whether a given modal component contains rolling bearing fault content, or how much. A criterion is therefore needed to quantify the fault-sensitive information contained in each modal component.

2.1.1. Principle of Evaluation

Based on the shock energy fluctuation of rolling bearing vibration signals and its correlation characteristics, the degree of fault information contained in each decomposed IMF component is evaluated, and a criterion is proposed for quantifying and screening the relationship between IMF components and fault-sensitive components. The specific calculation is as follows:

p_{m} = \frac{\frac{1}{N (N - 1)} \sum_{l \neq o} | e_{l} - e_{o} |}{1 - \rho_{x (t), \mathrm{IMF}_{m}}}, \quad l, o = 1, 2, \dots, N; \qquad e = \left( \frac{1}{K} \sum_{k = 1}^{K} \sqrt{| u (k) |} \right)^{2}

(1)

where N is the number of segments into which a single IMF component is divided (the value of N should be determined through experiments in combination with the application scenario of the rolling bearing); e_{l} and e_{o} are the lth and oth square root amplitudes among those calculated from the segments of the current IMF component, with l \neq o; e is the square root amplitude; m denotes the mth IMF component obtained by decomposing the original signal; u(k) is one of the segments obtained after the IMF is evenly divided into N segments; k is the kth sample point in the segment u(k), k = 1, 2, ..., K; and ρ is the Pearson correlation coefficient between the original signal x(t) and IMF_{m}.

The idea behind the criterion is that, among the dimensional indicators that characterize the vibration signal of rolling bearings, the square root amplitude highlights the energy of the discretized vibration signal, while the correlation coefficient measures how similar the fault impacts in the IMF component are to those in the original signal. When the fault impact component in the vibration signal of the rolling bearing is stronger, and the number and trend of the impacts are similar to those of the original signal, the correlation energy fluctuation value p_{m} is larger, and vice versa.
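Equation (1) can be illustrated with a short NumPy sketch. This is a minimal reading of the formula, not the authors' implementation: the function name `correlation_energy_fluctuation` is hypothetical, the sum is taken over all ordered pairs l ≠ o, and any samples left over after dividing the IMF into N equal segments are dropped (an assumption).

```python
import numpy as np

def correlation_energy_fluctuation(x, imf, n_segments):
    """Evaluate p_m of Equation (1) for one IMF (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    imf = np.asarray(imf, dtype=float)
    if n_segments < 2:
        raise ValueError("at least two segments are needed")
    k = len(imf) // n_segments                      # samples per segment (K)
    # Assumption: trailing samples that do not fill a segment are dropped.
    segments = imf[: k * n_segments].reshape(n_segments, k)
    # Square root amplitude of every segment: e = (mean(sqrt(|u(k)|)))^2
    e = np.mean(np.sqrt(np.abs(segments)), axis=1) ** 2
    # Mean absolute difference over all ordered pairs l != o
    # (the diagonal l == o contributes zero to the sum).
    pairwise = np.abs(e[:, None] - e[None, :])
    fluctuation = pairwise.sum() / (n_segments * (n_segments - 1))
    # Pearson correlation between the original signal and the IMF.
    rho = np.corrcoef(x, imf)[0, 1]
    # Note: rho == 1 makes the denominator zero; a real pipeline would guard this.
    return fluctuation / (1.0 - rho)
```

An IMF whose segment energies vary strongly and whose waveform follows the original signal closely receives a large value, whereas an IMF with uniform segment energy receives a value near zero.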

2.1.2. Screening and Reconstruction of IMF Components

According to the proposed evaluation criteria, the IMF components obtained by decomposing the vibration signal of the rolling bearing by VMD are evaluated so as to remove the component with a high proportion of noise and retain the component with a large amount of fault information.

The original vibration signal x(t) of the rolling bearing is decomposed by VMD to obtain M IMF components:

x (t) = \mathrm{IMF}_{1} + \mathrm{IMF}_{2} + \dots + \mathrm{IMF}_{M}

(2)

where x(t) is the measured original vibration signal of the rolling bearing, and M is the number of IMF components obtained by VMD.

According to the proposed correlation energy fluctuation criterion, each of the M IMF components above is evaluated: Equation (1) is used to calculate the correlation energy fluctuation coefficient p of each component, with the aim of reducing the influence of modal aliasing in the VMD result. Each p is then compared with the mean of the evaluation values, p_{mean}. A component whose p is less than p_{mean} is regarded as redundant and is discarded. A component whose p is greater than p_{mean} is regarded as an effective component of fault information, and its coefficient p is used as its weight. Equation (3) is then used to perform weighted reconstruction to obtain a new signal x'(t).

x^{\prime} (t) = p_{s 1} \mathrm{IMF}_{s 1} + \dots + p_{s f} \mathrm{IMF}_{s f}

(3)

where x'(t) is the signal after screening and reconstruction, IMF_{s1} to IMF_{sf} are the f effective components of fault information retained after screening, and p_{s1} to p_{sf} are the correlation energy fluctuation coefficients of those components.
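The screening rule and Equation (3) can be sketched as follows. The function name `screen_and_reconstruct` is hypothetical, and the coefficients are assumed to have been computed beforehand with Equation (1); components whose coefficient equals the mean are discarded here, which the text does not specify.

```python
import numpy as np

def screen_and_reconstruct(imfs, p):
    """Keep IMFs with p above the mean and sum them weighted by p (sketch)."""
    imfs = np.asarray(imfs, dtype=float)   # shape (M, number of samples)
    p = np.asarray(p, dtype=float)         # shape (M,), one coefficient per IMF
    keep = p > p.mean()                    # components below the mean are discarded
    # Weighted reconstruction: x'(t) = sum over kept s of p_s * IMF_s
    reconstructed = (p[keep, None] * imfs[keep]).sum(axis=0)
    return keep, reconstructed

# Example with three toy components: only the third exceeds the mean of p.
mask, x_new = screen_and_reconstruct([[1, 1], [2, 2], [3, 3]], [0.1, 0.2, 0.9])
```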

2.2. Intelligent Diagnosis Model

The selection of structural parameters for the intelligent diagnosis model determines its accuracy. The number of nodes in each hidden layer is a very important structural parameter of the DBN-based intelligent diagnosis model. At present, its selection relies on manual experience and exhaustive trial, and the resulting accuracy is inconsistent. To overcome this defect and avoid falling into a local optimum, an improved particle swarm optimization algorithm with a better global search ability is introduced to select the number of hidden layer nodes, and the resulting structural parameters are used to build a DBN-based rolling bearing state identification model with the best recognition rate.

2.2.1. DBN Network Model

DBN is a deep neural network algorithm proposed by Hinton in 2006. Its network structure is composed of a visible layer V, hidden layers H (formed by multiple restricted Boltzmann machines (RBMs)), and an output layer. Each layer is composed of several neurons with activation functions (such as the sigmoid), and the layers are connected bidirectionally through connection weights w, visible layer bias a, and hidden layer bias b. The network structure is shown in Figure 2.

The rolling bearing fault vibration signal acquired by the CMS has the following characteristics: it is large in volume, nonlinear, and non-stationary, and it contains periodic shocks. Fault data in the rolling bearing vibration signal arise from energy dissipation and therefore show obvious energy changes. At the same time, the proposed data cleaning algorithm enhances local energy. Therefore, a DBN fault identification model built from RBMs, whose neuron nodes are activated according to energy, is suitable for processing nonlinear rolling bearing fault data in a data-driven way. The specific steps are as follows:

(1) Data feature learning. The rolling bearing vibration data are input from the visible layer V_{1} to the hidden layer H_{1} of RBM_{1}. Because the impact energy contained in the data is greater than that of the other components, the energy calculation of the neuron nodes in the hidden layer H_{1} of RBM_{1} is activated. As each neuron node in H_{1} processes the vibration data, the weights w_{1}, the bias a_{V1} of V_{1}, and the bias b_{H1} of H_{1} are updated, and the intrinsic characteristics of the vibration data are learned.

(2) Weight update. The hidden layer H_{1} of RBM_{1} is used as the visible layer V_{2} of RBM_{2}, and the principle of (1) is applied to the vibration data of the rolling bearing in order to update the connection weights w_{2}, the bias a_{V2} of V_{2}, and the bias b_{H2} of H_{2}.

(3) Result output. The result of the final operation on the vibration data in the hidden layer H_{n} of the last RBM, RBM_{n}, is output via the output layer O.

(4) Reverse update to improve the recognition rate. The result of (3) is backpropagated. From back to front, starting at the output layer O, the connection weight w of each neuron, the bias a of the visible layer V, and the bias b of the hidden layer H are updated according to the feedback information of the previous layer in order to further improve the fault identification rate of the rolling bearing fault identification model.
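The energy-based update in steps (1) and (2) can be illustrated with a single RBM. The sketch below uses the standard RBM energy E(v, h) = -a·v - b·h - vᵀwh and one step of contrastive divergence (CD-1). The text does not state which training rule the authors used, so CD-1, the function names, and the learning rate are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rbm_energy(v, h, w, a, b):
    """Scalar energy of a visible/hidden configuration of one RBM."""
    return -(a @ v) - (b @ h) - (v @ w @ h)

def hidden_probabilities(v, w, b):
    """Activation probability of every hidden node given the visible layer."""
    return sigmoid(v @ w + b)

def cd1_update(v0, w, a, b, lr, rng):
    """One CD-1 step for a single sample (assumed rule, not from the paper)."""
    ph0 = hidden_probabilities(v0, w, b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)   # sample hidden states
    pv1 = sigmoid(h0 @ w.T + a)                        # reconstruct visible layer
    ph1 = hidden_probabilities(pv1, w, b)
    w = w + lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    a = a + lr * (v0 - pv1)                            # visible bias update
    b = b + lr * (ph0 - ph1)                           # hidden bias update
    return w, a, b
```

Stacking follows step (2): the hidden probabilities of one trained RBM serve as the visible input of the next.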

2.2.2. IPSO Optimizes the Structure Parameters of DBN

As a deep learning method, DBN has the advantages of high accuracy and low computational cost, and its initialized weight parameters do not easily fall into a local optimum. DBN training is a process of continuously changing the scalar energy, which is well suited to the intelligent state identification of rolling bearing vibration signals with a large number of shocks and strong energy. However, since the number of hidden layers and the number of nodes in each layer are set manually, the final recognition accuracy cannot be guaranteed. Therefore, an optimization algorithm is needed to select the number of nodes in each layer.

PSO is a simple optimization algorithm that is easy to implement, has few parameters, and converges quickly. However, the original velocity update equation and the inertia weight of PSO can easily cause the algorithm to fall into a local optimum. The author proposed the IPSO algorithm in 2017 to improve PSO and described it in detail in [21]; it has since been applied on many occasions. The algorithm is not described in detail here since it is not the focus of this article.

This paper uses the IPSO algorithm to find the optimal number of hidden layer nodes in the DBN, with each particle in IPSO representing the number of neurons in the hidden layers. The error rate of the DBN trained with the current hidden layer configuration is used as the objective function, and the number of hidden layer nodes that gives the lowest error rate is searched for adaptively through iteration. The process is as follows:

Step 1 Initialize the particle swarm. Initialize each particle's velocity v_{0} and position x_{0}.

Step 2 DBN training. Use the high-quality training samples selected by the data cleaning algorithm proposed in this paper to train the DBN constructed according to the particle parameters.

Step 3 Fitness calculation. Calculate the fitness of each particle and find each particle's individual optimal value p_{best} and the swarm's global optimal value g_{best} in this round.

Step 4 Use the IPSO method to update each particle's velocity v_{i} and position x_{i}.

Step 5 If the number of iterations is reached, the IPSO optimization ends; otherwise, repeat Step 3–Step 5 until the iterative conditions are satisfied.

Step 6 Use the trained DBN network to classify the test data, and then output the bearing classification result.
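The search loop in Steps 1 to 5 can be sketched with a plain PSO. The IPSO update rule of [21] is not reproduced in this article, so the sketch substitutes the standard PSO velocity update with a constant inertia weight. `error_rate` is a hypothetical callback that would train a DBN with the given hidden layer sizes and return its error rate, and all default parameter values are assumptions.

```python
import numpy as np

def pso_search(error_rate, bounds, n_particles=10, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search integer hidden-layer sizes with standard PSO (stand-in for IPSO)."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T

    def to_nodes(position):
        # Node counts must be integers, so positions are rounded before evaluation.
        return tuple(int(n) for n in np.rint(position))

    x = rng.uniform(low, high, size=(n_particles, len(low)))   # Step 1: positions
    v = np.zeros_like(x)                                        # Step 1: velocities
    p_best = x.copy()
    p_best_err = np.array([error_rate(to_nodes(p)) for p in x])  # Steps 2 and 3
    g = int(p_best_err.argmin())
    g_best, g_best_err = p_best[g].copy(), p_best_err[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # Step 4
        x = np.clip(x + v, low, high)
        err = np.array([error_rate(to_nodes(p)) for p in x])
        improved = err < p_best_err
        p_best[improved] = x[improved]
        p_best_err[improved] = err[improved]
        g = int(p_best_err.argmin())
        if p_best_err[g] < g_best_err:
            g_best, g_best_err = p_best[g].copy(), p_best_err[g]
    return to_nodes(g_best), float(g_best_err)
```

In practice each call to `error_rate` is expensive because it involves training a DBN, so the swarm size and iteration count trade search quality against training time.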

The specific optimization process is shown in Figure 3.

3. Experimental Analysis and Validation

In this section, the data cleaning algorithm and the IPSO-optimized DBN fault identification model trained on the cleaned data are verified through a simulated signal test and a bench test.

3.1. Validation by Simulated Signal

In order to verify the effectiveness of the proposed rolling bearing signal cleaning algorithm, MATLAB was used to construct a fault vibration model of the rolling bearing that simulates an outer ring fault. The simulated signal is expressed by Equation (4), and the parameter settings are shown in Table 1.

Y (t) = r (t) + n (t) = \sum_{z} A e^{- \varepsilon (t - z T_{0})} \sin \left( 2 \pi f_{n} (t - z T_{0}) \right) + n (t)

(4)

where r(t) is the periodic shock component, n(t) is white Gaussian noise, t is time, z indexes the impulses generated in the intercepted time period, A is the signal amplitude, ε is the attenuation coefficient, T_{0} is the fault impulse period, f_{0} = 1/T_{0} is the fault characteristic frequency, and f_{n} is the natural frequency.
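Equation (4) can be sketched in NumPy as follows. The values of Table 1 are not reproduced in this text, so every default below is a hypothetical placeholder: the sampling rate and fault frequency are chosen only so that one impulse period spans 100 samples, and the natural frequency, decay, amplitude, and noise level are arbitrary. Each impulse is also assumed to be zero before its own onset time z·T₀, which Equation (4) leaves implicit.

```python
import numpy as np

def simulate_outer_ring_fault(n_samples=4096, fs=10_000.0, f0=100.0,
                              fn=2_000.0, amplitude=1.0, decay=800.0,
                              noise_std=0.5, seed=0):
    """Return time axis, clean impulse train r(t), and noisy signal Y(t)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    t0 = 1.0 / f0                                  # fault impulse period T_0
    r = np.zeros(n_samples)
    for z in range(int(np.ceil(t[-1] / t0)) + 1):
        tau = t - z * t0
        active = tau >= 0                          # impulse z starts at z * T_0
        r[active] += (amplitude * np.exp(-decay * tau[active])
                      * np.sin(2 * np.pi * fn * tau[active]))
    noise = rng.normal(0.0, noise_std, n_samples)  # white Gaussian noise n(t)
    return t, r, r + noise

t, r, y = simulate_outer_ring_fault()
```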

The simulated signal Y(t) is constructed according to the parameters given in Table 1, and its time domain waveform is shown in Figure 4. Because the shocks in the original signal are submerged in Gaussian white noise and the signal period cannot be clearly identified, it is difficult to judge the fault type of the rolling bearing from the time domain waveform of Y(t) by manual inspection. Therefore, the data cleaning algorithm proposed in this paper is used to process and analyze the simulated signal Y(t). The specific cleaning process is as follows:

(1) VMD.

First, Y(t) is decomposed using the VMD algorithm. The decomposition bandwidth is set to 3500, and the six IMF components shown in Figure 5 are obtained. As shown in Figure 5, the shock period and amplitude are most obvious in the waveform of IMF_{6}. Judged by human experience, IMF_{6} is the most suitable component for identifying the fault type.

(2) Calculating the correlation energy fluctuation coefficient p_{m}.

Equation (1) is then used to calculate the correlation energy fluctuation coefficients of the six IMF components of the simulated outer ring fault signal Y(t) shown in Figure 5. To facilitate the calculation, the number of segments for each signal is set to 1024; that is, each segment contains four sample points. The calculation results are shown in Table 2. As the table shows, the components in descending order of correlation energy fluctuation coefficient are IMF_{6}, IMF_{1}, IMF_{5}, IMF_{4}, IMF_{2}, and IMF_{3}.

(3) Evaluation and screening.

Next, following the process shown in the left half of Figure 1 in Section 2, the six IMF components shown in Figure 5 are evaluated and weighted. The IMF evaluation results are shown in Figure 6. Among the six IMF components, only the correlation energy fluctuation coefficient of IMF_{6} is higher than the threshold indicated by the red line. Therefore, IMF_{6} is taken as the fault-sensitive component that the signal cleaning algorithm extracts from the simulated outer ring fault signal Y(t). This result is consistent with the empirical analysis of Figure 5, which supports the use of the correlation energy fluctuation coefficient for evaluating individual modal components.

(4) Weighted reconstruction.

Each component is then weighted and reconstructed according to different thresholds, and the reconstructed time domain waveform x'(t) is shown in Figure 7. Compared with the original waveform in Figure 4, the fault impact of the signal is clearly more prominent, and the fault impact period ΔT can be read as 100 sample points, which is consistent with the set fault frequency.

In order to judge the fault type more intuitively and verify the effect of the proposed signal cleaning method, the reconstructed time domain signal x'(t) is subjected to envelope analysis; the envelope spectrum of the signal is shown in Figure 8. Three obvious peaks can be observed, with amplitudes of 0.4916, 0.2765, and 0.1122, and the frequency corresponding to the maximum peak of 0.4916 is 99.61 Hz. This is very close to the set bearing outer ring fault frequency f_{c} = 100 Hz, with an error of only 0.39 Hz, that is, 0.39%. The peak can therefore be attributed to the set bearing outer ring fault frequency f_{c}. The other two obvious peaks are the second and third harmonics of f_{c}, and the peak values decrease in turn. The phenomenon shown in Figure 8 is consistent with the frequency domain characteristics of an actual bearing outer ring fault signal; hence, the method of signal cleaning and reconstruction proposed in Section 2 is effective.
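Envelope analysis of this kind is commonly carried out with the Hilbert transform. The sketch below is one such approach, assumed rather than taken from the paper: it builds the analytic signal with an FFT, takes its magnitude as the envelope, and returns the single-sided amplitude spectrum of the mean-removed envelope. The function names are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal (negative frequencies set to zero)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def envelope_spectrum(x, fs):
    """Frequencies and amplitudes of the envelope spectrum of x."""
    x = np.asarray(x, dtype=float)
    envelope = np.abs(analytic_signal(x))
    envelope = envelope - envelope.mean()          # remove the DC component
    amplitude = 2.0 * np.abs(np.fft.rfft(envelope)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, amplitude

# Example: a 2 kHz carrier amplitude-modulated at 100 Hz, sampled at 10 kHz.
fs = 10_000.0
t = np.arange(4096) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 100.0 * t)) * np.sin(2 * np.pi * 2000.0 * t)
freqs, amp = envelope_spectrum(x, fs)
peak_frequency = freqs[np.argmax(amp)]
```

With a finite record the peak falls on the nearest frequency bin rather than exactly on the modulation frequency, which is one reason a small offset such as the one reported above can appear.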

3.2. Validation by Bench Test

3.2.1. Test System Construction and Signal Acquisition

To verify the effectiveness of the proposed data cleaning algorithm and the IPSO-optimized DBN fault diagnosis method for rolling bearings, a Siemens multi-channel data acquisition instrument and the comprehensive machinery fault simulator-lite test bench made by SpectraQuest, Inc. (Richmond, VA, USA) were used to build the bearing vibration signal test system shown in Figure 9. The bearing is an American ER-12K deep groove ball bearing, and the basic parameters of the rolling bearing are shown in Table 3.

In the test, normal bearings were installed in the bearing housing 1 at the motor end, and bearings in four different states—normal, outer ring failure, inner ring failure, and rolling element failure—were installed in the bearing housing 2; then, the PCB608A11 accelerometer was used to collect the vibration signals of the above bearing in the horizontal and vertical directions. A total of 300 sets of data were collected in each of the four states of the bearing and 4096 sample points were collected for each set of data. One group of time domain signals corresponding to each of the four bearing states is shown in Figure 10.

3.2.2. Data Cleaning of Bench Test

First, the VMD algorithm is used to decompose the vibration signals of the four states of the ER-12K rolling bearings. The decomposition bandwidth is set to 3500, and the number of decomposition components is set to six. The decomposition results for one set of data in each state are shown in Figure 11.

Equation (1) is subsequently used to calculate the correlation energy fluctuation coefficients of the IMFs corresponding to each group of vibration signals in the four states of the bearing. To facilitate the calculation, the number of segments for each signal is set to 1024; that is, every four sample points form one segment. Then, according to the process shown in Figure 1, the IMF components of each group in the four states of the bearing are evaluated. The IMF evaluation results for one group in each of the four states are given in Figure 12. According to these results, for the inner race fault vibration signal of the ER-12K rolling bearing, the correlation energy fluctuation coefficients of IMF$_{3}$, IMF$_{4}$, and IMF$_{6}$ are higher than the threshold, and the same holds for the rolling element fault signal. For the outer race fault signal and the normal-state signal, the correlation energy fluctuation coefficients of IMF$_{3}$, IMF$_{4}$, and IMF$_{5}$ are higher than the threshold.
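The segmentation step can be made concrete. With 4096 sample points split into 1024 segments, each segment holds 4096/1024 = 4 points. The sketch below computes the energy of each segment for a single IMF. It deliberately stops short of Equation (1), which is defined in Section 2; the function name and the random stand-in signal are assumptions for illustration only.

```python
import numpy as np


def segment_energies(imf, n_segments=1024):
    """Split one IMF into equal-length segments and return the energy of each."""
    imf = np.asarray(imf, dtype=float)
    if imf.size % n_segments != 0:
        raise ValueError("signal length must be divisible by n_segments")
    segments = imf.reshape(n_segments, -1)   # 4096 points -> 1024 rows of 4 points
    return np.sum(segments ** 2, axis=1)     # sum of squares per segment


# Hypothetical stand-in for one IMF of a 4096-point record.
imf = np.random.default_rng(0).standard_normal(4096)
energies = segment_energies(imf)
print(energies.shape, imf.size // energies.size)
```

The per-segment energies are the raw material for a fluctuation measure; how they are combined into the coefficient follows Equation (1) and is not reproduced here.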

Finally, the IMF components of each group of signals in the four states of the bearing are weighted and reconstructed according to different thresholds, and the reconstructed time-domain waveforms in each state are shown in Figure 13. In these reconstructed waveforms, the fault impacts are more prominent than in the original signals shown in Figure 10.
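As a rough illustration of screening followed by weighted reconstruction, the sketch below keeps only the IMFs whose coefficient exceeds a threshold and sums them with weights proportional to those coefficients. This weighting rule, the threshold value, and the function name are assumptions; the rule actually used is the one defined in Section 2.

```python
import numpy as np


def reconstruct(imfs, coefficients, threshold):
    """Weighted sum of the IMFs whose coefficient exceeds the threshold (assumed rule)."""
    imfs = np.asarray(imfs, dtype=float)                  # shape (n_imfs, n_samples)
    coefficients = np.asarray(coefficients, dtype=float)  # one coefficient per IMF
    keep = coefficients > threshold
    if not keep.any():
        raise ValueError("no IMF exceeds the threshold")
    weights = np.where(keep, coefficients, 0.0)           # screened-out IMFs get weight 0
    weights /= weights.sum()                              # assumed normalization
    return weights @ imfs, keep


# Hypothetical example: three short IMFs, only the second passes the threshold.
demo_imfs = np.arange(12, dtype=float).reshape(3, 4)
demo_signal, demo_keep = reconstruct(demo_imfs, [0.2, 0.6, 0.2], threshold=0.5)
print(demo_keep, demo_signal)
```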

3.2.3. Intelligent State Identification of Bearing Based on DBN Optimized by IPSO

First, the 1200 sets of data collected in the four states of the ER-12K rolling bearings form the data set T. For each state, 200 randomly selected sets are used for training and the remaining 100 sets are used to test the accuracy of the model; that is, the training set contains 800 sets of data in total and the test set contains 400. In addition, because the DBN has a low recognition rate for time-domain waveforms and its input data must lie in [0, 1], the time-domain vibration signals in the four states of the bearing are transformed by the fast Fourier transform (FFT), and the result is normalized. The processed data set T’ is shown in Table 4.
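A minimal sketch of this preprocessing and split is given below. Keeping the first 2048 FFT magnitudes of a 4096-point record matches the 2048 columns of Table 4. The normalization scheme is not specified in the text, so dividing each spectrum by its own maximum is an assumption, as are the function names and the random stand-in signal.

```python
import numpy as np


def spectrum_features(signal):
    """Single-sided FFT magnitude scaled into [0, 1] (the scaling rule is an assumption)."""
    signal = np.asarray(signal, dtype=float)
    magnitude = np.abs(np.fft.fft(signal))[: signal.size // 2]  # 4096 -> 2048 bins
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else magnitude


def split_per_state(n_per_state=300, n_train=200, n_states=4, seed=0):
    """Randomly pick n_train training indices per state; the rest go to the test set."""
    rng = np.random.default_rng(seed)
    train, test = [], []
    for state in range(n_states):
        order = rng.permutation(n_per_state) + state * n_per_state
        train.extend(order[:n_train])
        test.extend(order[n_train:])
    return np.array(train), np.array(test)


features = spectrum_features(np.random.default_rng(1).standard_normal(4096))
train_idx, test_idx = split_per_state()
print(features.shape, len(train_idx), len(test_idx))
```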

Then, according to the size of the constructed bearing data set T’, IPSO is used to seek the optimal number of neurons in each DBN hidden layer. The learning rate is set to c = 1.7, the population size to m = 50, and the number of iterations to t = 50. As a rule of thumb, the number of hidden layers is set to two; that is, the DBN has two RBMs, and the objective function is the error rate of each RBM. The preprocessed bearing data set T’ is passed to IPSO twice, once per RBM, in order to obtain the optimal number of nodes in each layer. The calculation results corresponding to each RBM are shown in Figure 14.
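The search loop can be sketched with a plain particle swarm using the stated settings c = 1.7, m = 50, and t = 50. This is a standard PSO, not the improved variant (IPSO) described in Section 2.2.2; the inertia weight `w`, the search range, and the toy objective that stands in for the RBM error rate are all assumptions made so that the example runs on its own.

```python
import numpy as np


def pso_node_search(objective, low=10, high=500, c=1.7, m=50, t=50, w=0.5, seed=0):
    """Minimize objective(n_nodes) over integer node counts with a basic PSO."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(low, high, m)   # one candidate node count per particle
    vel = np.zeros(m)

    def evaluate(p):
        return np.array([objective(int(round(v))) for v in p], dtype=float)

    pbest, pbest_err = pos.copy(), evaluate(pos)
    best = int(np.argmin(pbest_err))
    gbest, gbest_err = pbest[best], pbest_err[best]
    history = [gbest_err]             # best objective value after each iteration
    for _ in range(t):
        r1, r2 = rng.random(m), rng.random(m)
        vel = w * vel + c * r1 * (pbest - pos) + c * r2 * (gbest - pos)
        pos = np.clip(pos + vel, low, high)
        err = evaluate(pos)
        improved = err < pbest_err
        pbest[improved], pbest_err[improved] = pos[improved], err[improved]
        best = int(np.argmin(pbest_err))
        gbest, gbest_err = pbest[best], pbest_err[best]
        history.append(gbest_err)
    return int(round(gbest)), history


def toy_error(n_nodes):
    """Hypothetical stand-in for 'train an RBM with n_nodes and return its error rate'."""
    return (n_nodes - 200) ** 2 / 1000.0


best_nodes, history = pso_node_search(toy_error)
print(best_nodes, history[-1])
```

In a real run, `objective` would train one RBM with the candidate node count and return its error rate, and the search would be repeated for the second RBM.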

As shown in Figure 14a, the value of the objective function stabilizes at about 17 in the fourth iteration; that is, in the fourth iteration, the optimal number of hidden layer nodes for the first RBM in the DBN structure is found, and the corresponding model error rate after training is 17%. Combined with Figure 14b, it can be seen that the optimal number of hidden layer nodes for the first RBM is 200. In the same way, Figure 14c shows that the value of the objective function stabilizes at about 6.9 in the fifth iteration; that is, in the fifth iteration, the optimal number of hidden layer nodes for the second RBM is found, and the corresponding model error rate after training is 6.9%. Combined with Figure 14d, the optimal number of hidden layer nodes for the second RBM is also 200.

Finally, the weight parameters of each node in the DBN are initialized; the learning rate parameters are selected empirically as α = 0.1 and β = 0.5; and the number of hidden layer nodes is set according to the results in Figure 14, that is, 200 nodes in each of the first and second layers. Then, the preprocessed data set T’ is fed into the DBN model optimized by IPSO for fault state identification, with the number of training iterations set to 200, and the test set is used to test the accuracy of the model. The result is shown in Figure 15.
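For readers unfamiliar with stacked RBMs, the sketch below pretrains two RBMs layer by layer with one-step contrastive divergence (CD-1). Treating α = 0.1 as the learning rate and β = 0.5 as a momentum term is an assumption, since the text does not define the two symbols separately. The class and function names, the weight initialization, and the small demo dimensions are also assumptions, and the supervised fine-tuning and classification layer of the DBN are omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class RBM:
    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.dW = np.zeros_like(self.W)   # previous weight update, used for momentum

    def hidden_prob(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def cd1_step(self, v0, lr=0.1, momentum=0.5):
        """One CD-1 update on a batch; returns the mean squared reconstruction error."""
        h0 = self.hidden_prob(v0)
        h_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_sample @ self.W.T + self.b_v)      # reconstruction
        h1 = self.hidden_prob(v1)
        grad = (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
        self.dW = momentum * self.dW + lr * grad
        self.W += self.dW
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))


def pretrain_stack(data, layer_sizes=(200, 200), epochs=5):
    """Greedy layer-wise pretraining: each RBM is trained on the previous layer's output."""
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_prob(x)
    return rbms, x


# Small hypothetical batch in [0, 1]; the real inputs would be the 2048-point spectra.
demo = np.random.default_rng(1).random((16, 64))
demo_rbms, demo_feats = pretrain_stack(demo, layer_sizes=(20, 20), epochs=2)
print(len(demo_rbms), demo_feats.shape)
```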

As shown in Figure 15, the total classification accuracy for the four-state test samples of the ER-12K rolling bearing after signal cleaning is 100%; the test samples of all state types are classified correctly.

To further verify the effectiveness of the proposed signal cleaning method, the original vibration signal of the bearing that has not undergone signal cleaning is preprocessed and used as a data set for fault status identification. The preprocessing method is consistent with the preprocessing method of the data set T cleaned by the signal cleaning method mentioned earlier in this article. Finally, the fault identification of the original vibration signal of the bearing, which has not been cleaned, is also tested for accuracy. The result is shown in Figure 16.

The classification results in Figure 16 show that the total classification accuracy for the four-state test samples of the rolling bearings without signal cleaning is 97.25%. More specifically, the classification accuracy of the inner race fault is 99%, with one sample classified as a rolling element fault. The classification accuracy of the outer race fault is 91%, with nine samples classified as normal. The classification accuracy of the rolling element fault is 99%, with one sample classified as an inner race fault. The classification accuracy of the normal state is 100%, with every sample classified correctly. Compared with the result for the cleaned data set shown in Figure 15, the classification accuracy for the data set without signal cleaning is 2.75 percentage points lower (97.25% versus 100%). Therefore, the signal cleaning method based on the correlation energy fluctuation coefficient proposed in this paper is effective.
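The reported figures can be cross-checked by writing the misclassifications as a confusion matrix with 100 test samples per state. The matrix below is built only from the counts stated in the text; the variable names are illustrative.

```python
import numpy as np

STATES = ["inner", "outer", "ball", "normal"]

# Rows: true state, columns: predicted state (100 test samples per state).
confusion = np.array([
    [99,  0,  1,   0],   # inner race fault: 1 sample predicted as rolling element fault
    [ 0, 91,  0,   9],   # outer race fault: 9 samples predicted as normal
    [ 1,  0, 99,   0],   # rolling element fault: 1 sample predicted as inner race fault
    [ 0,  0,  0, 100],   # normal: all correct
])

per_class = confusion.diagonal() / confusion.sum(axis=1)
overall = confusion.trace() / confusion.sum()

for state, acc in zip(STATES, per_class):
    print(f"{state}: {acc:.2%}")
print(f"overall: {overall:.2%}")
```

The overall value is 389 correct samples out of 400, which gives the stated 97.25% and a gap of 2.75 percentage points to the 100% obtained after signal cleaning.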

Similarly, to further verify the effectiveness of the proposed IPSO-optimized DBN model, three additional combinations of hidden layer node numbers are used to identify the fault state: [100, 100], [150, 150], and [250, 250]. To highlight the experimental effect more clearly, the data are preprocessed without signal cleaning. Finally, the accuracy of the rolling bearing fault identification model under these three configurations is tested, and the test results are shown in Figure 17.

As shown in Figure 17, when the number of nodes in the two hidden layers of the DBN model is [100, 100], the corresponding model classification accuracy is 95.5%. When the number of nodes is [150, 150], the classification accuracy is 97%, and when the number of nodes is [250, 250], the classification accuracy is also 97%. Compared with the model accuracy shown in Figure 16, the model with a DBN node combination of [200, 200] has the highest accuracy. When the numbers of nodes are set to [100, 100] and [150, 150], the model training is insufficient, and the accuracy of the model without signal cleaning decreases by 1.75% and 0.25%, respectively. When the number of nodes in the two hidden layers is [250, 250], the model overfits, which also reduces the accuracy of the model without signal cleaning by 0.25%. The features of the signal-cleaned sample data are already obvious, which is why the comparison uses the data without signal cleaning; the results indicate that the method proposed in this paper, in which IPSO optimizes the number of neurons in the hidden layers of the DBN model, is effective.

To avoid errors in the test results caused by accidental factors, all the tests above were performed 10 times, and the average classification accuracy of the model over these 10 tests is shown in Table 5. (After preprocessing, the accuracy of the node combinations [100, 100], [150, 150], [200, 200], and [250, 250] is the same; to avoid repetition, the other three results were not added to the table.) According to the results shown in Table 5, the classification accuracy of the rolling bearing DBN intelligent identification model with signal cleaning preprocessing is 2.75% higher than that without it. In addition, the classification accuracy of the model whose number of hidden layer nodes is optimized by IPSO is higher than that of the unoptimized models. Therefore, both the signal cleaning method based on the correlation energy fluctuation coefficient and the IPSO-based optimization of the DBN network model parameters proposed in this paper are effective. We will further verify the effectiveness of the proposed algorithm in practical engineering.

4. Conclusions

This paper proposed a signal cleaning method based on a correlation energy fluctuation coefficient and an intelligent identification model of bearing defects based on a DBN optimized by IPSO. Through an analysis of the digitally simulated signal and the actual experimental signal, the feasibility of the proposed signal cleaning method and of the IPSO-optimized DBN intelligent identification model for bearings was verified. Compared with the intelligent identification model constructed without signal preprocessing, the original fault characteristics of the signal were significantly highlighted after the denoising processing, the accuracy of fault status identification was increased by 2.75%, and the accuracy of the DBN bearing intelligent state identification model optimized by IPSO was also significantly improved.

The signal cleaning method proposed in this paper is mainly based on the fault vibration signal characteristics of key components of the transmission system, such as bearings. Bearing failure is caused by damage to the contact surface, which changes the contact stiffness or meshing stiffness; periodic abnormal shock excitation then occurs during operation, resulting in obvious periodic energy fluctuations in the time domain. However, the vibration signals of other components, such as housings, may not have similar energy variation characteristics, so the proposed cleaning method may not be applicable to other components or fields.

Author Contributions

B.Q., Q.L. and Z.L. were mainly responsible for the vibration data cleaning algorithm, network structure optimization algorithm, program debugging, experimental scheme formulation, and test data analysis. C.Z., H.W. and W.L. were mainly responsible for building the test system and data collection. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 51865045 and No.52065054), and Inner Mongolia Scientific Research Projects of Colleges and Universities (No. NJZY21085 and No. NJZY21090).

Conflicts of Interest

The authors declare no conflict of interest.

References

Liu, J.; Xu, Z.; Zhou, L.; Yu, W.; Shao, Y. A statistical feature investigation of the spalling propagation assessment for a ball bearing. Mech. Mach. Theory 2019, 131, 336–350.
Rubini, R. Application of the envelope and wavelet transform analyses for the diagnosis of incipient faults in ball bearing. Mech. Syst. Signal Process. 2001, 15, 287–302.
Qin, Y.; Li, W.T.; Yuen, C.; Tushar, W.; Saha, T. IIoT-enabled health monitoring for integrated heat pump system using Mixture Slow Feature Analysis. IEEE Trans. Ind. Inform. 2021.
Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.-C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis. Proc. Math. Phys. Eng. Sci. 1998, 454, 903–995.
Wu, Z.; Huang, N.E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
Smith, J.S. The local mean decomposition and its application to EEG perception data. J. R. Soc. Interface 2005, 2, 443–454.
Dragomiretskiy, K.; Zosso, D. Variational mode decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544.
Li, H.; Liu, T.; Wu, X.; Chen, Q. Application of EEMD and improved frequency band entropy in bearing fault feature extraction. ISA Trans. 2019, 88, 170–185.
Li, H.; Liu, T.; Wu, X.; Chen, Q. Application of optimized variational mode decomposition based on kurtosis and resonance frequency in bearing fault feature extraction. Trans. Inst. Meas. Control 2019, 42, 518–527.
Zhao, L.; Yu, W.; Yan, R. Rolling Bearing Fault Diagnosis Based on CEEMD and Time Series Modeling. Math. Probl. Eng. 2014, 2014, 101867.
Zhang, C.; Yao, W.; Deng, W. Fault Diagnosis for Rolling Bearings Using Optimized Variational Mode Decomposition and Resonance Demodulation. Entropy 2020, 22, 739.
Wang, H.D.; Deng, S.E.; Yang, J.X.; Liao, H.; Li, W.B. Parameter-Adaptive VMD Method Based on BAS Optimization Algorithm for Incipient Bearing Fault Diagnosis. Math. Probl. Eng. 2020, 2020, 5659618.
Li, X.; Yang, Y.; Pan, H.; Cheng, J.; Cheng, J. A novel deep stacking least squares support vector machine for rolling bearing fault diagnosis. Comput. Ind. 2019, 110, 36–47.
Zhou, S.; Qian, S.; Chang, W.; Xiao, Y.; Cheng, Y. A Novel Bearing Multi-Fault Diagnosis Approach Based on Weighted Permutation Entropy and an Improved SVM Ensemble Classifier. Sensors 2018, 18, 1934.
Dong, W.; Zhang, S.; Jiang, A.; Jiang, W.; Zhang, L.; Hu, M. Intelligent Fault Diagnosis of Rolling Bearings Based on Refined Composite Multi-Scale Dispersion q-Complexity and Adaptive Whale Algorithm-Extreme Learning Machine. Measurement 2021, 176, 108977.
Luo, M.; Li, C.; Zhang, X.; Li, R.; An, X. Compound feature selection and parameter optimization of ELM for fault diagnosis of rolling element bearings. ISA Trans. 2016, 65, 556–566.
Li, J.; Yao, X.; Wang, X.; Yu, Q.; Zhang, Y. Multiscale local features learning based on BP neural network for rolling bearing intelligent fault diagnosis. Measurement 2019, 153, 107419.
Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507.
Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554.
Li, Y.; Wang, L.; Jiang, L. Rolling bearing fault diagnosis based on DBN algorithm improved with PSO. J. Vib. Shock 2020, 39, 89–96.
Qin, B.; Sun, G.D.; Zhang, L.Q.; Liu, Y.L.; Zhang, C.; Wang, J.G. Study on the Rolling Bearing Fault Diagnosis based on the Hilbert Envelope Spectrum Singular Value and IPSO-SVM. J. Mech. Transm. 2017, 41, 166–171.

Figure 1. Flow chart of bearing intelligent identification method.

Figure 2. DBN network model structure of n stacked RBMs.

Figure 3. Flow chart of the IPSO algorithm for optimizing the number of nodes in the DBN hidden layers.

Figure 4. Bearing outer ring fault—simulated signal Y(t).

Figure 5. IMF decomposed from VMD.

Figure 6. Evaluation results of IMF components.

Figure 7. Time domain waveform $x'(t)$ after the weighted reconstruction of the simulated signal of the rolling bearing outer ring fault.

Figure 8. Envelope spectrum of signal $x'(t)$.

Figure 9. Test system based on the Simcenter Testlab and MFS-LT.

Figure 10. Time domain waveforms of ER–12K rolling bearings in four states: (a) Inner race fault, (b) Outer race fault, (c) Rolling element fault, (d) Normal state.

Figure 11. IMF of vibration signals of the ER–12K rolling bearing in 4 states after VMD. (a) Inner race fault, (b) Outer race fault, (c) Rolling element fault, (d) Normal state.

Figure 12. Evaluation results of IMF components corresponding to vibration signals of the ER-12K rolling bearing in 4 states: (a) Inner race fault, (b) Outer race fault, (c) Rolling element fault, (d) Normal state.

Figure 13. Reconstructed signal time domain waveform of the ER–12K rolling bearing in 4 states: (a) Inner race fault, (b) Outer race fault, (c) Rolling element fault, (d) Normal state.

Figure 14. IPSO calculates the number of nodes in each hidden layer of DBN. (a) Change curve of the first layer’s objective function value; (b) Change curve of the node number in the first layer; (c) Change curve of the second layer’s objective function value; (d) Change curve of the node number in the second layer.

Figure 15. Test results of the DBN model classification accuracy after signal cleaning.

Figure 16. Classification accuracy test results of the DBN model without signal cleaning.

Figure 17. Accuracy of the DBN model after combining the different number of nodes in the hidden layers. (a) [100, 100], (b) [150, 150], (c) [250, 250].

Table 1. Parameters of the rolling bearing outer ring fault’s simulated signal.

Parameter	Value
Amplitude of shock at bearing fault location, $A$	5
Amplitude of Gaussian noise, $n(t)$	2
Natural frequency of bearing vibration, $f_{n}$ (Hz)	4500
Attenuation coefficient, $ϵ$	800
Characteristic frequency, $f_{o}$ (Hz)	100
Sampling frequency, $f_{s}$ (Hz)	12,000
Sampling points	4096

Table 2. Results for the correlation energy fluctuation coefficient calculated by the six IMFs.

IMF $_{1}$	0.4140
IMF $_{2}$	0.3281
IMF $_{3}$	0.3211
IMF $_{4}$	0.3884
IMF $_{5}$	0.3889
IMF $_{6}$	0.9251

Table 3. Detailed parameters of the damaged bearings.

Bearing inner diameter (mm)	19
Bearing outer diameter (mm)	47
Number of rolling elements	8
Rolling element diameter (mm)	7.9
Pitch circle diameter (mm)	33.5
Contact angle (deg.)	0

Table 4. Preprocessed data set T’ for 4 states of the ER-12K rolling bearing.

State Type	Sample Number	Sample Data Volume
		1	2	3	···	2046	2047	2048
Inner	1	2.939 × $10^{-6}$	5.375 × $10^{-6}$	7.977 × $10^{-6}$	···	1.800 × $10^{-3}$	1.700 × $10^{-3}$	1.800 × $10^{-3}$
	···
	300	1.479 × $10^{-5}$	4.698 × $10^{-6}$	4.898 × $10^{-6}$	···	1.637 × $10^{-4}$	1.102 × $10^{-4}$	5.767 × $10^{-5}$
Outer	1	3.167 × $10^{-6}$	6.216 × $10^{-6}$	1.001 × $10^{-5}$	···	2.743 × $10^{-4}$	4.496 × $10^{-4}$	5.377 × $10^{-4}$
	···
	300	1.123 × $10^{-6}$	1.681 × $10^{-6}$	2.521 × $10^{-6}$	···	7.721 × $10^{-4}$	6.835 × $10^{-4}$	6.333 × $10^{-4}$
Ball	1	1.001 × $10^{-6}$	2.001 × $10^{-6}$	2.921 × $10^{-6}$	···	2.335 × $10^{-4}$	8.175 × $10^{-5}$	1.297 × $10^{-4}$
	···
	300	2.881 × $10^{-6}$	5.758 × $10^{-6}$	8.707 × $10^{-6}$	···	1.600 × $10^{-3}$	1.500 × $10^{-3}$	1.400 × $10^{-3}$
Normal	1	1.042 × $10^{-6}$	2.093 × $10^{-6}$	3.048 × $10^{-6}$	···	7.573 × $10^{-5}$	7.707 × $10^{-5}$	7.095 × $10^{-5}$
	···
	300	3.142 × $10^{-6}$	6.281 × $10^{-6}$	9.567 × $10^{-6}$	···	1.200 × $10^{-3}$	1.100 × $10^{-3}$	1.100 × $10^{-3}$

Table 5. Average classification accuracy of the DBN model under various conditions.

Method	Inner	Outer	Ball	Normal	Average
After preprocessing and the node combination is [200, 200]	100%	100%	100%	100%	100%
Without preprocessing and the node combination is [200, 200]	99%	91%	99%	100%	97.25%
Without preprocessing and the node combination is [100, 100]	99%	89%	94%	100%	95.5%
Without preprocessing and the node combination is [150, 150]	100%	92%	96%	100%	97%
Without preprocessing and the node combination is [250, 250]	100%	97%	91%	100%	97%

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
