Data Screening Based on Correlation Energy Fluctuation Coefficient and Deep Learning for Fault Diagnosis of Rolling Bearings

Abstract: The accuracy of the intelligent diagnosis of rolling bearings depends on the quality of the vibration data and the accuracy of the state identification model constructed from them. To address the "poor quality" of the data and the "difficult to select" structural parameters of the identification model, a method is proposed that integrates data cleaning, in order to select effective learning samples, with optimized selection of the structural parameters of the deep belief network (DBN) model. First, the rolling bearing vibration data are decomposed by variational mode decomposition into a finite number of intrinsic mode function components, and the relative energy fluctuation value of each component is calculated to characterize the proportion of fault content in that component. Then, high-quality learning samples are obtained through screening and reconstruction to achieve the effective cleaning of vibration data. Second, the improved particle swarm optimization (IPSO) algorithm is used to optimize the number of nodes in each hidden layer of the DBN model in order to obtain the optimal structural parameters of the intelligent diagnosis model. Finally, the high-quality learning samples obtained from data cleaning are used as input to construct an intelligent identification model for rolling bearing faults. The results show that the proposed method not only screens out the intrinsic mode function components that contain the effective fault components in the rolling bearing vibration data, but also finds the optimal solution for the number of nodes in the DBN hidden layers, which improves bearing state identification accuracy by 3%.


Introduction
As the "joint" of electromechanical equipment, rolling bearings are widely used in rail transit, wind turbines, weaponry, steel metallurgy, and other fields due to their low friction resistance and high rotation accuracy [1]. According to statistics, 30% of faults in rotating machinery are caused by bearing failures [2]. Once the mechanical equipment fails, it will cause downtime and even casualties. In order to monitor the status of machinery and equipment in real time and achieve early warning, a condition monitoring system (CMS) is widely used in various industries; at the same time, it provides "mass data" with a large capacity, diversity, and high-rate characteristics [3]. Therefore, how to gather the running status information of rolling bearings from the data in a timely and quantitatively accurate manner has important theoretical significance and engineering application value in the safe and efficient operation and in the health management of the transmission system. Generally, there are two factors that affect the accuracy of the intelligent state recognition of rolling bearings: the quality of learning samples in the modeling process and the accuracy of the diagnostic model constructed based on the machine learning method. Because there are many application scenarios of rolling bearings and extremely complex operating conditions, the measured bearing vibration signals are transmitted through multiple interfaces, and the energy is attenuated. This causes the signals mentioned above to have strong background noise, strong nonlinearity, nonstationarity, and amplitude and frequency modulation characteristics. This often results in poor accuracy in subsequent intelligent diagnoses when those signals are used as learning samples. 
Therefore, how to accurately clean and screen out the key components related to the fault from the mass of rolling bearing vibration signals and use them as the learning sample for the intelligent diagnosis model is the key to improving the accuracy of the intelligent diagnosis.
So as to capture as much fault information in rolling bearing vibration signals as possible, extensive research has been carried out with regard to this problem. Huang et al. [4] proposed the empirical mode decomposition (EMD) method to decompose nonlinear nonstationary signals. Compared with traditional time-frequency analysis methods such as wavelet and wavelet packet, EMD avoids the selection of basis functions and reduces manual influence; however, EMD has the defects of modal aliasing and end effect. In order to suppress modal aliasing, Wu and Huang [5] proposed an ensemble empirical mode decomposition (EEMD) algorithm for the modal aliasing problem in EMD. It improves the distribution of the extreme points of the signal by adding Gaussian white noise to the signal; after multiple averaging, it achieves the purpose of reducing modal aliasing. Although EEMD reduces modal aliasing, the added noise is not completely removed, which increases the reconstruction error of the vibration signal.
Jonathan [6] proposed the local mean decomposition (LMD) method, which adaptively decomposes the signal whose frequency changes with time into the sum of a finite number of instantaneous frequency product functions (PF) to reduce the modal aliasing and end effect of the EMD and EEMD algorithms. However, it cannot completely solve the problem of modal aliasing.
In 2014, Dragomiretskiy et al. [7] proposed variational mode decomposition (VMD). The difference from the method above is that VMD transfers the signal decomposition process to the variational framework and searches for the optimal solution for the constrained variational model through iterative methods. Its mathematical theoretical foundation is complete and it reduces the end effect and modal aliasing. It is widely used in signal decomposition in the field of fault diagnosis.
In the process of building a data-driven diagnostic model, the fault recognition rate of the model depends on the quality of the learning samples. The prerequisite for improving the fault recognition rate of the data-driven intelligent identification model for the rolling bearing state is to screen out the main intrinsic mode function (IMF) components. The IMF components contain the main component of the fault, which can be obtained from the decomposition of the rolling bearing fault signal of the electromechanical equipment above and then adopted in order to construct the learning samples. At present, many scholars have performed related work. Hua L et al. [8] proposed a fault feature extraction method for rolling bearings, combining EEMD and improved frequency band entropy. Li et al. [9] selected the IMF, which is closely correlated to the fault signal according to the resonance frequency. Zhao et al. [10] proposed an IMF selection method based on the correlation coefficient between the filtered signal and each IMF to select which contains fault feature information. Zhang et al. [11] calculated the kurtosis values of all IMF obtained by the VMD algorithm of the rolling bearing vibration signal and selected the two largest components as the effective fault components. Wang et al. [12] optimized the VMD algorithm based on the beetle antennae search algorithm and selected the fault component by calculating the kurtosis value of the IMF for a subsequent intelligent diagnosis. However, most of the studies above are limited to considering or selecting a certain main IMF component as the learning sample from a single perspective, and the number of effective fault components in the selected IMF components are not quantified. 
In addition, the displacement and contact stiffness excitations induced by damage defects on the rolling element and raceway surfaces are slow and disorderly, and the impact component in the detected vibration signal is submerged by strong background noise. Because of these nonuniform and slowly varying shock characteristics, a low fault recognition rate is to be expected if the selected IMF component is used directly for fault identification. Therefore, there is an urgent need for a new and effective component screening method for bearing faults in order to realize accurate identification. This article studies the physical essence of rolling bearing fault diagnosis and the changes in the energy fluctuation of the vibration signal during the operation of the rolling bearing. A correlation energy fluctuation coefficient is proposed as a criterion for evaluating the fault sensitivity of the IMFs decomposed by the VMD algorithm, based on the correlation between each component of the vibration signal obtained by VMD and the original vibration signal. Finally, high-quality learning samples are built through reweighting to ensure the accuracy of the subsequent fault recognition model.
With the rapid development of technologies such as intelligent sensing and network transmission, massive data that can represent the state of equipment can be collected by CMS in real time. However, it also increases the capacity of learning samples in the process of intelligent diagnosis, which has become a huge challenge that affects the accuracy of intelligent diagnosis. At present, the construction of a data-driven intelligent diagnostic model for rolling bearings has become a research hotspot in the field of state identification and online monitoring. Li et al. [13] proposed a deep stacking least squares support vector machine together with the concept of stacking-based representation learning for rolling bearing fault diagnosis. Zhou et al. [14] combined EEMD, weighted permutation entropy, and an improved support vector machine (SVM) ensemble classifier to realize the fault diagnosis of rolling bearings. W Dong et al. [15] proposed an intelligent fault diagnosis method that used the adaptive whale optimization algorithm to optimize the extreme learning machine (ELM) algorithm. Luo et al. [16] combined the real-valued gravitational search algorithm and binary gravity search algorithm to optimize the ELM and then sent the preprocessed rolling bearing vibration signal to the diagnostic model to diagnose the rolling bearing fault. Li et al. [17] used a back propagation (BP) neural network to perform local feature learning on the sub-signals of the decomposed rolling bearing vibration signal for fault diagnosis. However, the research work above has some shortcomings: the algorithms used in model building are all shallow machine learning algorithms, which are weak in extracting effective fault features and have poor generalization ability. Among them, the selection of the structural parameters of the hidden layer of the BP neural network is based on experience, the exhaustive trial and error is blind, and the solution is not unique. 
SVM is suitable for small sample data as it will occupy a lot of computing resources if it deals with the massive bearing operating status data obtained by the online monitoring system, which will inevitably increase the time cost. For the multi-classification situation, the models above have low fault recognition rate and poor effect. ELM requires more neuron nodes, and the initial randomization of its internal weights and thresholds can easily lead to a decrease in the failure recognition rate. To address these issues, a deep learning [18] algorithm with a strong data feature extraction ability is needed to resolve the shortcomings of the rolling bearing state identification model, constructed based on the above-mentioned shallow neural network algorithm. Among them, the DBN proposed by Hinton [19] activated the operation of neural nodes by calculating the scalar energy of the data, which has the advantages of strong expressive ability, easy reasoning, and fast calculation speed; it is suitable for the intelligent fault diagnosis of rolling bearings based on vibration signals, but its own structural parameters have a great influence on the performance of the model. Therefore, to solve this problem, Li et al. [20] proposed to optimize the DBN structure parameters by using the particle swarm optimization (PSO) algorithm. The optimal DBN structure parameters are found by continuously and iteratively updating the position and velocity of the particles in the range of the objective function.
However, although the PSO algorithm has an extremely fast convergence speed, it will fall into local extreme value due to the influence of its calculation parameters, producing a final optimization result that is not the global optimal value. To address this issue, this paper proposes to use the IPSO algorithm to optimize the number of nodes in multiple hidden layers, in turn to ensure the accuracy of the rolling bearing defect state identification model.
The rest of the paper is organized as follows. Section 2 introduces the data cleaning method based on the correlation energy fluctuation coefficient and the process of rolling bearing defect identification based on the DBN optimized by IPSO. Section 3 verifies the effectiveness of the signal data cleaning method through numerical simulation and carries out laboratory experiments to verify the effectiveness of the method proposed in Section 2; the experimental results and the proposed method are then summarized and discussed. Section 4 presents the conclusion.

Intelligent Identification Method
The rolling bearing vibration signal collected in real time by the condition monitoring system has the characteristics of nonlinearity, non-stationarity, strong background noise, etc. To address the issues of poor quality of learning samples for the state identification of rolling bearings and the difficult selection of parameters for state identification models, a signal cleaning method based on the correlation energy fluctuation coefficient and reweighting in combination with an intelligent identification method for rolling bearing defects are proposed. The specific algorithm flow and framework are shown in Figure 1, which mainly comprises two parts: signal data cleaning based on the correlation energy fluctuation coefficient and the intelligent identification of rolling bearing defects based on the DBN optimized by IPSO network model.

Data Cleaning
The quality of the learning samples is a prerequisite for improving the fault recognition rate of the data-driven rolling bearing state identification model. To improve the diagnostic accuracy, the vibration signals of the rolling bearings picked up by the sensor must be cleaned in order to screen out high-quality learning samples. However, the rolling bearing vibration signal has time-varying modulation characteristics. After VMD decomposition, a limited number of IMF components representing different frequency components of the original signal are obtained, but it is not yet established whether these modal components contain rolling bearing fault components, or how much of them they contain. A standard is therefore needed to quantify the degree of fault-sensitive information contained in each modal component.

Principle of Evaluation
According to the shock energy fluctuation of rolling bearing vibration signals and its correlation, the degree of fault information contained in the decomposed IMF components is evaluated, and a criterion for quantifying and screening the relationship between IMF components and sensitive fault components is proposed. The specific calculation is expressed as follows:
where N is the number of segments of a single IMF component, and the value of N should be determined through experiments in combination with the application scenario of the rolling bearings; l and o index the lth and oth square root amplitudes calculated from the current IMF component, with l = 1; e is the square root amplitude; m denotes the mth IMF component obtained by decomposing the original signal; u(k) is one of the segments obtained after the IMF is evenly divided into N segments; k is the kth sample point in the segment u(k), k = 1, 2, . . . , K; and ρ is the Pearson correlation coefficient. The idea behind the evaluation is to exploit fully, among the dimensional indicators that characterize rolling bearing vibration signals, the ability of the square root amplitude to highlight the energy of the discretized vibration signal, together with the similarity of the fault impacts in the IMF component to those in the original signal. When the fault impact component in the vibration signal of the rolling bearing is stronger, and the number and trend of the impacts are similar to those of the original signal, the correlation energy fluctuation evaluation value p_m will be larger, and vice versa.
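Because Equation (1) itself is not reproduced in the text, the following Python sketch is only one plausible reading of the definitions above and should not be taken as the authors' exact formula. It assumes that the square root amplitude of a segment is (mean of sqrt|u(k)|) squared, substitutes the coefficient of variation of the N segment-wise square root amplitudes as the fluctuation measure, and scales it by the absolute Pearson correlation between the IMF and the original signal. The function names are hypothetical.

```python
import numpy as np

def square_root_amplitude(segment):
    """Square root amplitude of a segment: (mean of sqrt(|u(k)|)) squared."""
    return np.mean(np.sqrt(np.abs(segment))) ** 2

def correlation_energy_fluctuation(imf, original, n_segments):
    """Hypothetical score: fluctuation of segment energies scaled by |Pearson rho|.

    ASSUMPTION: the fluctuation is measured as std/mean of the segment-wise
    square root amplitudes; the paper's Equation (1) may differ.
    """
    imf = np.asarray(imf, dtype=float)
    original = np.asarray(original, dtype=float)
    segments = np.array_split(imf, n_segments)
    e = np.array([square_root_amplitude(s) for s in segments])
    fluctuation = e.std() / (e.mean() + 1e-12)
    rho = np.corrcoef(imf, original)[0, 1]
    return abs(rho) * fluctuation
```

Under this reading, a component with periodic impacts that resembles the original signal scores higher than a noise-like component whose segment energies are nearly uniform.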

Screening and Reconstruction of IMF Components
According to the proposed evaluation criteria, the IMF components obtained by decomposing the vibration signal of the rolling bearing by VMD are evaluated so as to remove the component with a high proportion of noise and retain the component with a large amount of fault information.
The original vibration signal x(t) of the rolling bearing is decomposed by VMD to obtain M IMF components; the specific equation is as follows:
where x(t) is the measured original vibration signal of the rolling bearing, and M is the number of IMF components obtained by VMD. The energy of the M IMF components is then assessed according to the proposed correlation energy fluctuation evaluation criterion: Equation (1) is used to calculate the correlation energy fluctuation evaluation value p of each component, which also helps to reduce the influence of modal aliasing on the VMD result. Each p is compared with the mean evaluation value p_mean; a modal component whose value is less than p_mean is defined as redundant and is discarded. A modal component whose value is greater than p_mean is regarded as an effective component of fault information, and its correlation energy fluctuation evaluation coefficient p is defined as its weight. Equation (3) is then used to perform weighted reconstruction to obtain a new signal x'(t).
where x'(t) is the signal after screening and reconstruction, IMF_s1 ∼ IMF_sf are the f effective components of fault information after screening, and p_s1 ∼ p_sf are the correlation energy fluctuation evaluation coefficients corresponding to the effective IMF components of the fault.
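The screening and reconstruction rule described above can be sketched in a few lines of Python. The sketch assumes that the reconstruction is the plain weighted sum of the retained components, each multiplied by its own coefficient; whether the weights are additionally normalized is not stated in the text, so no normalization is applied here. The function name and argument layout are hypothetical.

```python
import numpy as np

def screen_and_reconstruct(imfs, scores):
    """Keep IMFs whose score exceeds the mean score and sum them, weighted by score.

    imfs   : array of shape (M, n_samples), one IMF per row
    scores : array of shape (M,), evaluation value p of each IMF
    ASSUMPTION: weights are used as-is (not normalized to sum to one).
    """
    imfs = np.asarray(imfs, dtype=float)
    scores = np.asarray(scores, dtype=float)
    keep = scores > scores.mean()                       # threshold p_mean
    reconstructed = (scores[keep, None] * imfs[keep]).sum(axis=0)
    return reconstructed, np.flatnonzero(keep)
```

Returning the indices of the retained components alongside the reconstructed signal makes it easy to check which IMFs passed the threshold for a given sample.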

Intelligent Diagnosis Model
The selection of structural parameters for the intelligent diagnosis model determines its accuracy. The number of nodes in each hidden layer of the DBN-based intelligent diagnosis model is a very important structural parameter. At present, its selection relies on manual experience and exhaustive trial, and the resulting accuracy is inconsistent. To overcome this defect and avoid falling into a local optimal solution, an improved particle swarm optimization algorithm with a better global search ability is introduced to select the number of hidden layer nodes, and the resulting optimal structural parameters are used to build a DBN-based rolling bearing state identification model with the best recognition rate.

DBN Network Model
DBN is a deep neural network algorithm proposed by Hinton in 2006. Its network structure is composed of a visible layer V, hidden layers H (formed by multiple stacked restricted Boltzmann machines (RBMs)), and an output layer O; each layer is composed of several neurons with activation functions (such as sigmoid), and the layers are connected bidirectionally through connection weights w, visible layer biases a, and hidden layer biases b. The network structure is shown in Figure 2. The rolling bearing fault vibration signal acquired by the CMS has the following characteristics: it has a large amount of data, it is nonlinear and nonstationary, and it contains periodic shocks. The acquisition of fault data from the rolling bearing vibration signal sample is based on energy dissipation, which is characterized by an obvious energy change. At the same time, the proposed data cleaning algorithm has the characteristic of local energy enhancement. Therefore, the DBN fault identification model, whose neuron nodes are activated by energy based on the RBM structure, is suitable for processing data-driven nonlinear rolling bearing fault data. The specific steps are as follows:
(1) Data feature learning. Input the rolling bearing vibration data from the visible layer V_1 to the hidden layer H_1 of RBM_1. Because the impact energy contained in these data is greater than that of the other components, the energy calculation of the neuron nodes in the hidden layer H_1 of RBM_1 is activated. The intrinsic characteristics of the vibration data are learned by updating parameters such as the weights w_1, the offset a_V1 of V_1, and the offset b_H1 of H_1 after each neuron node in the hidden layer H_1 of RBM_1 has operated on the rolling bearing vibration data.
(2) Weight update. The hidden layer H_1 of RBM_1 is used as the visible layer V_2 of RBM_2, and the principle of (1) is used to perform self-learning on the vibration data of the rolling bearing in order to update the connection weights w_2, the offset a_V2 of V_2, and the offset b_H2 of H_2.
(3) Result output. The result of the final operation on the vibration data of the rolling bearing in the hidden layer H_n of the last RBM_n is output via the output layer O.
(4) Backpropagation. The result of (3) is propagated backward. From back to front, starting at the output layer O, the connection weight w of each neuron, the bias a of the visible layer V, and the bias b of the hidden layer H are updated according to the feedback information of the previous layer in order to further improve the fault identification rate of the rolling bearing fault identification model.
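The layer-wise learning in steps (1) and (2) can be illustrated with a minimal NumPy sketch. The text does not state the update rule, so the sketch assumes Bernoulli RBMs trained with one-step contrastive divergence (CD-1), the procedure commonly used for DBN pretraining; the supervised fine-tuning of the final step is omitted. Function names and default values are hypothetical, while w, a, and b follow the notation of this section.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, w, a, b, lr, rng):
    """One CD-1 update for a Bernoulli RBM; returns the updated (w, a, b)."""
    # Positive phase: hidden activation probabilities given the data.
    h0_prob = sigmoid(v0 @ w + b)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: reconstruct the visible layer, then re-infer the hidden layer.
    v1_prob = sigmoid(h0 @ w.T + a)
    h1_prob = sigmoid(v1_prob @ w + b)
    n = v0.shape[0]
    w = w + lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
    a = a + lr * (v0 - v1_prob).mean(axis=0)        # visible layer bias
    b = b + lr * (h0_prob - h1_prob).mean(axis=0)   # hidden layer bias
    return w, a, b

def pretrain_stack(data, hidden_sizes, epochs=10, lr=0.1, seed=0):
    """Greedy layer-wise pretraining: the hidden output of each RBM feeds the next."""
    rng = np.random.default_rng(seed)
    layers, x = [], np.asarray(data, dtype=float)
    for n_hidden in hidden_sizes:
        w = 0.01 * rng.standard_normal((x.shape[1], n_hidden))
        a, b = np.zeros(x.shape[1]), np.zeros(n_hidden)
        for _ in range(epochs):
            w, a, b = cd1_step(x, w, a, b, lr, rng)
        layers.append((w, a, b))
        x = sigmoid(x @ w + b)   # hidden layer H_i becomes visible layer V_(i+1)
    return layers, x
```

For example, `pretrain_stack(features, [200, 200])` would pretrain two stacked RBMs on input rows scaled to [0, 1] and return the learned parameters together with the top-layer features.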

IPSO Optimizes the Structure Parameters of DBN
As an excellent deep learning method, DBN has the advantages of high accuracy and low computational cost, and its initialized weight parameters do not easily fall into a local optimum. The DBN model training process is a process of continuously changing the scalar energy, which is well suited to the intelligent state identification of rolling bearing vibration signals with a large number of shocks and strong energy. However, since the number of hidden layers and the number of nodes in each layer are set manually, the final recognition accuracy cannot be guaranteed. Therefore, an optimization algorithm is needed to optimize the number of nodes in each layer.
PSO is a simple and easily implemented optimization algorithm with few parameters and fast convergence. However, the original velocity update equation and the inertia weight of PSO can easily cause the algorithm to fall into a local optimum. The authors proposed the IPSO algorithm to improve PSO in 2017 and described it in detail in [21]; it has since been applied on many occasions. Because it is not the focus of this article, the algorithm is not described in detail here. This paper uses the IPSO algorithm to find the optimal number of hidden layer nodes in the DBN: each particle in IPSO represents the number of neurons in a hidden layer, the error rate of the trained current hidden layer is set as the objective function, and the number of hidden layer nodes corresponding to the lowest error rate is searched for adaptively through iteration. The process is as follows:
Step 1 Initialize the particle swarm. Initialize the particle's velocity v_0 and position x_0.
Step 2 DBN training. Use the high-quality training samples selected by the data cleaning algorithm proposed in this paper to train the DBN constructed according to the particle parameters.
Step 3 Fitness calculation. Calculate the fitness of each particle and find the individual optimal value p_best of each particle and the global optimal value g_best in this round.
Step 4 Use the IPSO method to update the particle's velocity v_i and position x_i.
Step 5 If the number of iterations is reached, the IPSO optimization ends; otherwise, repeat Step 3-Step 5 until the iterative conditions are satisfied.
Step 6 Use the trained DBN network to classify the test data, and then output the bearing classification result.
The specific optimization process is shown in Figure 3.
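The search loop in Steps 1 to 5 can be sketched as follows. Since the IPSO update rule of [21] is not reproduced in this paper, the sketch substitutes a standard PSO with a linearly decreasing inertia weight; it is a stand-in, not the authors' IPSO. The `objective` callable is hypothetical: in practice it would train the current hidden layer with the given number of nodes and return its error rate.

```python
import numpy as np

def pso_search(objective, lower, upper, n_particles=20, n_iter=30, c=1.7, seed=0):
    """Minimise objective(n_nodes) over integers in [lower, upper] with basic PSO.

    ASSUMPTION: standard PSO with linearly decreasing inertia, used here as a
    stand-in for the IPSO algorithm referenced in the text.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(lower, upper, n_particles)   # positions = candidate node counts
    v = np.zeros(n_particles)                    # velocities
    cost = np.array([objective(int(round(xi))) for xi in x])
    p_best, p_cost = x.copy(), cost.copy()       # per-particle best position and cost
    g_best = p_best[p_cost.argmin()]             # global best position
    for it in range(n_iter):
        inertia = 0.9 - 0.5 * it / max(n_iter - 1, 1)
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = inertia * v + c * r1 * (p_best - x) + c * r2 * (g_best - x)
        x = np.clip(x + v, lower, upper)
        cost = np.array([objective(int(round(xi))) for xi in x])
        improved = cost < p_cost
        p_best[improved], p_cost[improved] = x[improved], cost[improved]
        g_best = p_best[p_cost.argmin()]
    return int(round(g_best)), float(p_cost.min())
```

Each fitness evaluation is expensive when it involves training a network, so the population size and iteration count directly bound the number of training runs (here at most `n_particles * (n_iter + 1)`).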

Experimental Analysis and Validation
In this section, the data cleaning algorithm and the DBN optimized by the IPSO fault identification model after data cleaning will be verified through the simulated signal test and the bench test.

Validation by Simulated Signal
In order to verify the effectiveness of the proposed rolling bearing signal cleaning algorithm, we used Matlab to construct a fault vibration model of the rolling bearing to simulate its outer ring fault. The simulated signal is expressed by Equation (4), and the parameter settings are shown in Table 1.
where r(t) is the cyclic periodic shock, n(t) is the amplitude of white Gaussian noise, t is time, z is the number of impulses generated in the intercepted time period, A is the signal amplitude, is the attenuation coefficient, T 0 is the fault impulse period, f 0 = 1/T 0 is the fault characteristic frequency, and f n is the natural frequency.
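A signal of the kind described above can be generated with the following Python sketch (the paper itself uses Matlab). It assumes the common outer ring fault model in which each impulse is an exponentially decaying sinusoid at the natural frequency, repeated every T_0. Equation (4) and Table 1 are not reproduced here, so all parameter values are hypothetical placeholders; only the 100 Hz fault frequency and the 100-sample impulse period (which together imply a 10 kHz sampling rate) are taken from the text.

```python
import numpy as np

def simulate_outer_ring_fault(n_samples=4096, fs=10_000.0, f0=100.0, fn=3000.0,
                              amplitude=1.0, decay=1000.0, noise_std=0.5, seed=0):
    """Periodic decaying impulses r(t) plus white Gaussian noise n(t).

    ASSUMPTION: fn, amplitude, decay and noise_std are illustrative values,
    not the entries of Table 1.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    period_samples = int(round(fs / f0))                 # samples per impulse period T0
    tau = (np.arange(n_samples) % period_samples) / fs   # time since the latest impulse
    r = amplitude * np.exp(-decay * tau) * np.sin(2 * np.pi * fn * tau)
    noise = noise_std * rng.standard_normal(n_samples)
    return t, r + noise, r
```

The function returns the noise-free impulse train alongside the noisy signal so that the effect of a cleaning step can be compared against the known ground truth.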
The simulated signal Y(t) is constructed according to the parameters given in Table 1, and its time domain waveform is shown in Figure 4. Because the shocks of the original signal are submerged by Gaussian white noise and the signal period cannot be clearly identified, it is difficult to judge the fault type of the rolling bearing from the time domain waveform of Y(t) by experience alone. Therefore, the data cleaning algorithm proposed in this paper is used to process and analyze the simulated signal Y(t). The specific cleaning process is as follows:
(1) VMD. Y(t) is first decomposed using the VMD algorithm. The decomposition bandwidth is set to 3500, and the six IMF components shown in Figure 5 are obtained. As shown in Figure 5, the shock period and amplitude are most obvious in the waveform of IMF_6. Judging by experience, IMF_6 is the most suitable component for identifying the fault type.
(2) Coefficient calculation. Equation (1) is used to calculate the correlation energy fluctuation coefficients of the six IMF components shown in Figure 5. To facilitate the calculation, the number of segments of each signal is set to 1024; that is, every four sample points form one segment. The calculation results are shown in Table 2. As shown in the table, the correlation energy fluctuation coefficients, from largest to smallest, are those of IMF_6, IMF_1, IMF_5, IMF_4, IMF_2, and IMF_3.
(3) Evaluation and screening. Next, according to the process shown in the left half of Figure 1 in Section 2, the six IMF components shown in Figure 5 are evaluated and weighted. The IMF evaluation results are shown in Figure 6. Among the six IMF components, only the correlation energy fluctuation coefficient of IMF_6 is higher than the threshold indicated by the red line. Therefore, IMF_6 is taken as the fault-sensitive component obtained by the signal cleaning algorithm from the simulated outer ring fault signal Y(t) of the rolling bearing. This result is consistent with the empirical analysis of Figure 5, which indicates that it is reasonable to use the correlation energy fluctuation coefficient to evaluate the individual modal components.
(4) Weighted reconstruction. The components are then weighted and reconstructed according to the threshold, and the reconstructed time domain waveform x'(t) is shown in Figure 7. The fault impact of the signal is clearly more prominent than in Figure 4, and the fault impact period ∆T is 100 sample points, which is consistent with the set fault frequency.
In order to judge the fault type more intuitively and verify the effect of the proposed signal cleaning method, the reconstructed time domain signal x'(t) is subjected to envelope analysis; the envelope spectrum of the signal is shown in Figure 8. There are three obvious peaks, 0.4916, 0.2765, and 0.1122, and the frequency corresponding to the maximum peak of 0.4916 is 99.61 Hz. This is very close to the set bearing outer ring fault frequency f_c = 100 Hz, with an error of only 0.39 Hz, that is, 0.39%. It can be determined that this peak corresponds to the set bearing outer ring fault frequency f_c. The other two obvious peaks are the second and third harmonics of the outer ring fault frequency f_c, and the peak values decrease in turn. The phenomenon shown in Figure 8 is consistent with the frequency domain characteristics of an actual bearing outer ring fault signal; hence, the method of signal cleaning and reconstruction proposed in Section 2 is effective.
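The envelope analysis used for this check can be sketched in Python as follows. The paper does not state how the envelope is computed, so the sketch assumes the usual approach: the envelope is the magnitude of the analytic signal (Hilbert transform, built here directly from the FFT), and its spectrum is obtained with a second FFT after removing the mean. The function names and the amplitude-modulated test signal in the usage lines are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive frequencies, zero the rest."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def envelope_spectrum(x, fs):
    """Return (frequencies, amplitude spectrum) of the envelope of x."""
    envelope = np.abs(analytic_signal(x))
    envelope = envelope - envelope.mean()          # remove the DC component
    spectrum = np.abs(np.fft.rfft(envelope)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, spectrum

# Usage on a carrier at 3 kHz whose amplitude is modulated at 100 Hz.
fs = 10_000.0
t = np.arange(5000) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 100.0 * t)) * np.sin(2 * np.pi * 3000.0 * t)
freqs, spec = envelope_spectrum(x, fs)
peak_frequency = freqs[spec.argmax()]
```

With a finite record, the located peak can only fall on the frequency grid fs/N, which is one reason a detected fault frequency may differ slightly from its nominal value.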

Test System Construction and Signal Acquisition
To verify the effectiveness of the proposed data cleaning algorithm and of the rolling bearing fault diagnosis method based on the DBN optimized by IPSO, we used a multi-channel data acquisition instrument from Siemens and the comprehensive machinery fault simulator-lite test bench made by Spectra Quest, Inc. (Richmond, VA, USA) to build the bearing vibration signal test system shown in Figure 9. The bearing is an American ER-12K deep groove ball bearing, and the basic parameters of the rolling bearing are shown in Table 3.
In the test, normal bearings were installed in bearing housing 1 at the motor end, and bearings in four different states (normal, outer ring failure, inner ring failure, and rolling element failure) were installed in bearing housing 2; then, the PCB608A11 accelerometer was used to collect the vibration signals of these bearings in the horizontal and vertical directions. A total of 300 sets of data were collected in each of the four states of the bearing, and 4096 sample points were collected for each set of data. One group of time domain signals corresponding to each of the four bearing states is shown in Figure 10.

Data Cleaning of Bench Test
First, the VMD algorithm is used to decompose the vibration signals of the four states of the ER-12K rolling bearings. The decomposition bandwidth is set to 3500 and the number of decomposition components is set to six; the decomposition results for one set of data in each state are shown in Figure 11. Equation (1) is subsequently used to calculate the correlation energy fluctuation coefficients of the IMFs corresponding to each group of vibration signals in the four states of the bearing. To facilitate the calculation, the number of segments for each signal is set to 1024; that is, every four sample points correspond to one segment. Then, according to the process shown in Figure 1, the IMF components of each group in the four states of the bearing are evaluated. The IMF evaluation results for one group in each of the four states are given in Figure 12. According to these results, for the inner race fault vibration signal of the ER-12K rolling bearing, the correlation energy fluctuation coefficients of IMF_3, IMF_4, and IMF_6 are higher than the threshold, and the same holds for the rolling element fault; the result for the outer race fault is consistent with that for the normal state, in which the correlation energy fluctuation coefficients of IMF_3, IMF_4, and IMF_5 are higher than the threshold. Finally, the IMF components of each group of signals in the four states of the bearing are weighted and reconstructed according to their respective thresholds, and the reconstructed time-domain waveforms in each state are shown in Figure 13. In these reconstructed waveforms, the fault impact is more prominent than in the original signals shown in Figure 10.

Intelligent State Identification of Bearing Based on DBN Optimized by IPSO
First, the 1200 sets of data collected in the four states of the ER-12K rolling bearings are used as the data set T. In each state, 200 randomly selected sets are used as the training set, and the remaining 100 sets are used as the test set to verify the accuracy of the model; that is, a total of 800 sets of data are used for training and 400 sets for testing. In addition, because the DBN achieves a low recognition rate on time-domain waveforms and its input data must lie in the range [0, 1], the data sets constructed from the time-domain vibration signals in the four bearing states are subjected to the fast Fourier transform (FFT), and the result is normalized. The processed data set T' is shown in Table 4.
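The preprocessing and the split described above can be sketched in Python as follows. The text does not say how the FFT result is normalized, so the sketch assumes min-max scaling of the single-sided magnitude spectrum of each sample; scaling over the whole data set would be an equally plausible reading. Function names are hypothetical.

```python
import numpy as np

def fft_minmax_features(signals):
    """Single-sided FFT magnitude of each row, min-max scaled to [0, 1] per sample.

    ASSUMPTION: normalization is applied per sample, not over the whole data set.
    """
    spectra = np.abs(np.fft.rfft(np.asarray(signals, dtype=float), axis=1))
    lo = spectra.min(axis=1, keepdims=True)
    hi = spectra.max(axis=1, keepdims=True)
    return (spectra - lo) / np.maximum(hi - lo, 1e-12)

def split_per_class(labels, n_train, seed=0):
    """Randomly pick n_train indices per class for training; the rest form the test set."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train = []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        train.extend(idx[:n_train])
    train = np.sort(np.array(train))
    test = np.setdiff1d(np.arange(len(labels)), train)
    return train, test
```

With four classes of 300 samples each and `n_train=200`, the split yields 800 training indices and 400 test indices, matching the proportions used in this section.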
Then, according to the size of the constructed bearing data set T', IPSO is used to seek the optimal number of neurons in each DBN hidden layer. The learning rate is set to c = 1.7, the population size to m = 50, and the number of iterations to t = 50. As a rule of thumb, the number of hidden layers is set to two; that is, the DBN has two RBMs, and the objective function is set to the error rate of each RBM. The preprocessed bearing data set T' is sent to the IPSO for two calculations, one per RBM, in order to obtain the optimal number of nodes in each layer. The calculation results corresponding to each RBM are shown in Figure 14. As shown in Figure 14a, the value of the objective function stabilizes at about 17 in the fourth iteration; that is, the optimal number of hidden layer nodes for the first RBM in the DBN structure is found in the fourth iteration, and the corresponding model error rate after training is 17%. Combined with Figure 14b, it can be seen that the optimal number of hidden layer nodes for the first RBM is 200. In the same way, Figure 14c shows that the value of the objective function stabilizes at about 6.9 in the fifth iteration; that is, the optimal number of hidden layer nodes for the second RBM is found in the fifth iteration, and the corresponding model error rate after training is 6.9%. Combined with Figure 14d, the optimal number of hidden layer nodes for the second RBM found during iteration is 200.
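To make the node-count search concrete, the sketch below runs a standard particle swarm over a single integer parameter using the settings quoted above (c = 1.7, m = 50, t = 50). It is plain PSO, not the improved variant (IPSO) used in the paper. The inertia weight, the search bounds, and `toy_error_rate` are assumptions; the latter is a hypothetical stand-in for the RBM error rate, with its minimum placed at 200 purely for illustration.

```python
import random
from typing import Callable, Tuple


def pso_minimize(objective: Callable[[int], float], lower: int, upper: int,
                 n_particles: int = 50, n_iter: int = 50, c: float = 1.7,
                 inertia: float = 0.5, seed: int = 0) -> Tuple[int, float]:
    """Minimize objective(n_nodes) over integers in [lower, upper] with standard PSO."""
    rng = random.Random(seed)
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                                  # personal best positions
    best_val = [objective(round(p)) for p in pos]      # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g], best_val[g]            # global best so far

    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (inertia * vel[i]
                      + c * r1 * (best_pos[i] - pos[i])
                      + c * r2 * (g_pos - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], lower), upper)  # clamp to the bounds
            val = objective(round(pos[i]))
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < g_val:
                    g_pos, g_val = pos[i], val
    return round(g_pos), g_val


def toy_error_rate(n_nodes: int) -> float:
    """Hypothetical stand-in for the RBM error rate; its minimum is placed at 200."""
    return 17.0 + 0.001 * (n_nodes - 200) ** 2


best_nodes, best_error = pso_minimize(toy_error_rate, lower=10, upper=500)
```

In the setting of the paper, the objective would train one RBM with the candidate node count and return its error rate, and the search would be run once per RBM.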
Finally, the weight parameters of each node in the DBN are initialized. Based on experience, the learning rate is selected as α = 0.1 and β = 0.5, and the number of hidden layer nodes is set according to the results in Figure 14; that is, the first and second hidden layers are each set to 200 nodes. Then, the preprocessed data set T' is fed into the DBN model optimized by IPSO for fault state identification, with the training number set to 200, and the test set is used to test the accuracy of the model; the result is shown in Figure 15. As Figure 15 shows, the total classification accuracy of the four-state test samples of the ER-12K rolling bearing after signal cleaning is 100%, and the test samples of all state types are classified correctly.
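The layer-wise structure described above can be sketched as two stacked RBMs with 200 hidden nodes each. This is a minimal illustration under several assumptions: α = 0.1 is treated as the step size and β = 0.5 as a momentum term (the text does not define β), the "training number" of 200 is read as 200 full-batch epochs, training uses one-step contrastive divergence, and the input is random placeholder data. The supervised fine-tuning and the classifier layer are omitted.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible: int, n_hidden: int, rng: np.random.Generator):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.dW = np.zeros_like(self.W)  # momentum buffer for the weights

    def hidden_probs(self, v: np.ndarray) -> np.ndarray:
        return sigmoid(v @ self.W + self.b_h)

    def train(self, data: np.ndarray, epochs: int,
              lr: float = 0.1, momentum: float = 0.5) -> None:
        n = data.shape[0]
        for _ in range(epochs):
            h0 = self.hidden_probs(data)                        # positive phase
            h_sample = (self.rng.random(h0.shape) < h0).astype(float)
            v1 = sigmoid(h_sample @ self.W.T + self.b_v)        # reconstruction
            h1 = self.hidden_probs(v1)                          # negative phase
            grad = (data.T @ h0 - v1.T @ h1) / n
            self.dW = momentum * self.dW + lr * grad
            self.W += self.dW
            self.b_v += lr * (data - v1).mean(axis=0)
            self.b_h += lr * (h0 - h1).mean(axis=0)


rng = np.random.default_rng(0)
x = rng.random((64, 128))      # placeholder for normalized spectra in [0, 1]
layer_sizes = [200, 200]       # node counts reported for the two hidden layers

rbms, layer_input = [], x
for n_hidden in layer_sizes:
    rbm = RBM(layer_input.shape[1], n_hidden, rng)
    rbm.train(layer_input, epochs=200)
    rbms.append(rbm)
    layer_input = rbm.hidden_probs(layer_input)  # feed activations to the next RBM
```

After this greedy pretraining, `layer_input` holds the top-layer features that a classifier would consume.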
To further verify the effectiveness of the proposed signal cleaning method, the original vibration signal of the bearing that has not undergone signal cleaning is preprocessed and used as a data set for fault state identification. The preprocessing is the same as that applied to the data set T cleaned by the signal cleaning method described earlier in this article. Finally, the fault identification model built on the original, uncleaned vibration signal of the bearing is also tested for accuracy. The result is shown in Figure 16.
The classification results in Figure 16 show that the total classification accuracy of the four-state test samples of the rolling bearings without signal cleaning is 97.25%. More specifically, the classification accuracy of the inner race fault is 99%, with one sample classified as a rolling element fault. The classification accuracy of the outer race fault is 91%, with nine samples classified as normal. The classification accuracy of the rolling element fault is 99%, with one sample classified as an inner race fault. The classification accuracy of the normal state is 100%; every sample is classified correctly. Compared with the classification accuracy obtained with signal cleaning shown in Figure 15, the classification accuracy of the data set without signal cleaning preprocessing is 2.75 percentage points lower than that of the cleaned data set. Therefore, the signal cleaning method based on the correlation energy fluctuation coefficient proposed in this paper is effective.
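As a quick check of the figures above, the per-class and overall accuracies can be recomputed from the misclassification counts stated in the text (100 test samples per state). The matrix layout, with rows as true states and columns as predicted states, is a presentation choice made here.

```python
states = ["inner race fault", "outer race fault", "rolling element fault", "normal"]

# Rows: true state; columns: predicted state (same order as `states`).
confusion = [
    [99, 0, 1, 0],    # one inner race sample predicted as rolling element fault
    [0, 91, 0, 9],    # nine outer race samples predicted as normal
    [1, 0, 99, 0],    # one rolling element sample predicted as inner race fault
    [0, 0, 0, 100],   # all normal samples correct
]

per_class = {
    state: 100.0 * row[i] / sum(row)
    for i, (state, row) in enumerate(zip(states, confusion))
}
correct = sum(confusion[i][i] for i in range(len(states)))
total = sum(sum(row) for row in confusion)
overall = 100.0 * correct / total   # accuracy without signal cleaning, in percent
gap = 100.0 - overall               # difference from the 100% result with cleaning

print(per_class)
print(f"overall = {overall:.2f}%, gap = {gap:.2f} percentage points")
```

The 389 correct samples out of 400 reproduce the reported 97.25%, and the gap to the cleaned-data result is the 2.75 percentage points quoted in the text.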
Similarly, to further verify the effectiveness of the proposed IPSO-optimized DBN model, three additional combinations of hidden layer node numbers are used to identify the fault state: [100, 100], [150, 150], and [250, 250]. To highlight the experimental effect more clearly, signal cleaning is not performed during data preprocessing. The accuracy of the rolling bearing fault identification model under these three settings is then tested, and the test results are shown in Figure 17. As shown in Figure 17, when the number of nodes in the two hidden layers of the DBN model is [100, 100], the corresponding model classification accuracy is 95.5%. When the number of nodes is [150, 150], the corresponding model classification accuracy is 97%. When the number of nodes is [250, 250], the corresponding model classification accuracy is also 97%. Compared with the model accuracy shown in Figure 16, the model with a DBN node combination of [200, 200] has the highest accuracy. When the numbers of nodes are set to [100, 100] and [150, 150], the model training is insufficient, and the accuracy of the model without signal cleaning decreases by 1.75 and 0.25 percentage points, respectively. When the number of nodes in the two hidden layers of the DBN is [250, 250], the model overfits, which also reduces the accuracy of the model without signal cleaning by 0.25 percentage points. Because the sample data after signal cleaning have obvious features, the comparison is made on data without signal cleaning; the results show that the method proposed in this paper, in which IPSO optimizes the number of neurons in the hidden layers of the DBN model, is effective. To avoid errors in the test results caused by accidental factors, all the tests above were performed 10 times, and the average classification accuracy of the model over these 10 tests is shown in Table 5.
(The accuracy of the node combinations [100, 100], [150, 150], [200, 200], and [250, 250] after preprocessing is the same. To avoid repetition, the other three results were not added to the table.) According to the results shown in Table 5, the classification accuracy of the rolling bearing DBN intelligent identification model with signal cleaning preprocessing is 2.75 percentage points higher than that without signal cleaning preprocessing. In addition, the classification accuracy of the rolling bearing DBN intelligent identification model whose hidden layer node numbers are optimized by IPSO is higher than that of the unoptimized models. Therefore, the signal cleaning method based on the correlation energy fluctuation coefficient and the IPSO-based optimization of the DBN network model parameters proposed in this paper are both effective. We will further verify the effectiveness of the proposed algorithm in practical engineering.