A Novel Fault Diagnosis Method for Rolling Bearing Based on Hierarchical Refined Composite Multiscale Fluctuation-Based Dispersion Entropy and PSO-ELM

Chen, Yinsheng; Yuan, Zichen; Chen, Jiahui; Sun, Kun

doi:10.3390/e24111517

by Yinsheng Chen ¹,²,*, Zichen Yuan ¹,*, Jiahui Chen ² and Kun Sun ²

¹ School of Measurement and Communication Engineering, Harbin University of Science and Technology, Harbin 150080, China

² National Experimental Teaching Demonstration Center of Measurement and Control Technology and Instrumentation, Harbin University of Science and Technology, Harbin 150080, China

* Authors to whom correspondence should be addressed.

Entropy 2022, 24(11), 1517; https://doi.org/10.3390/e24111517

Submission received: 20 August 2022 / Revised: 9 October 2022 / Accepted: 10 October 2022 / Published: 24 October 2022

Abstract

This paper proposes a novel fault diagnosis method for rolling bearing based on hierarchical refined composite multiscale fluctuation-based dispersion entropy (HRCMFDE) and particle swarm optimization-based extreme learning machine (PSO-ELM). First, HRCMFDE is used to extract fault features in the vibration signal at different time scales. By introducing the hierarchical theory algorithm into the vibration signal decomposition process, the problem of missing high-frequency signals in the coarse-grained process is solved. Fluctuation-based dispersion entropy (FDE) has the characteristics of insensitivity to noise interference and high computational efficiency based on the consideration of nonlinear time series fluctuations, which makes the extracted feature vectors more effective in describing the fault information embedded in each frequency band of the vibration signal. Then, PSO is used to optimize the input weights and hidden layer neuron thresholds of the ELM model to improve the fault identification capability of the ELM classifier. Finally, the performance of the proposed rolling bearing fault diagnosis method is verified and analyzed by using the CWRU dataset and MFPT dataset as experimental cases, respectively. The results show that the proposed method has high identification accuracy for the fault diagnosis of rolling bearings with varying loads and has a good load migration effect.

Keywords:

rolling bearing fault diagnosis; feature extraction; hierarchical refined composite multiscale fluctuation-based dispersion entropy (HRCMFDE); particle swarm optimization-based extreme learning machine (PSO-ELM); load migration

1. Introduction

As highly standardized precision mechanical components, rolling bearings are widely used in aerospace, wind power generation, automobiles, railways, and industrial production [1]. They are among the most common components in rotating machinery, where they mainly serve as connections and supports [2]. However, some studies have shown that 45% to 55% of rotating machinery faults are caused by rolling bearings [3]. Such faults can cause catastrophic emergencies and severe loss of life and property [4]. Early identification and severity estimation of rolling bearing faults is therefore both the focus and the main difficulty of condition monitoring for rotating machinery. It is necessary to conduct in-depth research on the fault generation mechanism and fault diagnosis methods, and to select automatic, systematic, and intelligent signal processing and feature extraction methods that improve the operational reliability and health management of rolling bearings. The study of rolling bearing fault diagnosis helps to solve equipment maintenance and economic efficiency problems in industrial applications, and it is of great significance for ensuring the regular operation of rotating machinery.

The essence of rolling bearing fault diagnosis based on vibration signal features is a pattern identification process, which mainly includes feature extraction and fault classification [5]. As the vibration signal characteristics are closely related to the physical structure of the bearing, the vibration signal often contains rich information about the fault state of the rolling bearing [6,7]. At present, vibration signal analysis is one of the most widely used techniques for rolling bearing condition monitoring and fault diagnosis.

As one of the critical technologies for rolling bearing fault diagnosis, the result of fault feature extraction directly affects the final fault identification accuracy. Therefore, selecting a method that can maximize the useful information without loss and have an excellent clustering of the extracted fault features is one of the crucial challenges faced by research in this field in recent years. Additionally, for nonlinear time series, the feature extraction method based on signal complexity is very important. There are two typical time series complexity indicators, one is Lempel-Ziv complexity (LZC), and the other is entropy. LZC [8] is typically used to evaluate the disorder of a given length sequence and is essentially based on coarse-grained processes, the simplest method being conversion to binary sequences (zeros and ones). However, LZC is susceptible to human factors. Therefore, in recent years, many experts and scholars have improved the traditional LZC algorithm [9]. Mao et al. [10] combined dispersion entropy (DE) with LZC to propose a new metric, dispersion Lempel-Ziv complexity (DLZC), and applied it to the biomedical field with good results. Li et al. [11] firstly added fluctuation information to DLZC. They proposed fluctuation-based DLZC (FDLZC), and then introduced an improved coarse-grained operation to propose refined composite multiscale FDLZC (RCMFDLZC), which was applied to both bearing fault diagnosis and ship signal classification with a high recognition rate.

Entropy, as a measure of time series uncertainty and irregularity [12], is widely used to extract the dynamic characteristics of rotating machinery because of its advantages, such as good clustering capability and classification accuracy [13]. Ali et al. [14] used an empirical mode decomposition (EMD) combined with energy entropy for feature vector extraction to automatically monitor bearing fault severity. However, EMD is prone to modal mixing and endpoint effect, which will affect the result of signal decomposition to a certain extent. Li et al. [15] proposed a new method for rolling bearing fault feature extraction, which is composed of two simple methods: ensemble empirical mode decomposition (EEMD) and improved frequency band entropy (IFBE), which can extract the early weak fault features of rolling bearing effectively. It can achieve accurate diagnosis of rolling bearing. Although EEMD solves the modal mixing problem of EMD by adding white Gaussian noise (WGN), the computation time increases as the number of iterations increases. Liu et al. [16] proposed the method of extracting feature vectors using local mean decomposition (LMD) and multiscale entropy (MSE) to identify different bearing operating conditions. Although LMD has improved the proper method to make the envelope function at the endpoints closer to the actual situation, the computational efficiency is lower. Wavelet packet transform (WPT) is also a typical time-frequency analysis method for rolling bearing vibration signals. Chen et al. [17] generated feature vectors describing different fault types of rolling bearings by calculating the energy entropy values of varying frequency band sub-signals obtained from WPT decomposition. However, WPT decomposition needs to determine the number of wavelet basis functions and the number of WPT decomposition layers before use, which affects the adaptive capability of the method. 
Variational mode decomposition (VMD) is a new multi-component signal decomposition method based on Wiener filtering and Hilbert transform. Ding et al. [18] applied VMD to the feature extraction of gear fault vibration signals by using the sample entropy (SE) values of several intrinsic mode functions (IMFs) decomposed by VMD as fault features. However, the decomposition effect of VMD is also primarily influenced by parameter selection. With the successive introduction of various improved time-frequency analysis methods, the issue of practical efficiency and application scope cannot be ignored. Therefore, the operability and accuracy of feature extraction methods have also become a particular focus and research direction for many scholars, and some of them began trying to optimize the decomposition step and improve it by incorporating it into the process of entropy calculation [19,20].

The above common entropy measures still have some limitations. The approximate entropy (ApEn) proposed by Pincus [21] leads to less stable results due to the excessive dependence on the length of the time series. The SE proposed by Richman et al. [22] is computationally inefficient and unsuitable for analyzing long-time series. Feature extraction based on permutation entropy (PE) [23] ignores the influence of the amplitude of the elements in the time series on the entropy value, making the extracted fault features more random, which in turn, affects the accuracy of subsequent fault identification. To solve the problems of PE, Hamed et al. [24] proposed amplitude-aware permutation entropy (AAPE) and effectively improved the amplitude and frequency sensitivity of PE to time series. Given this, Chen et al. [25] combined AAPE and multiscale entropy algorithm, and proposed multiscale amplitude aware permutation entropy (MAAPE) to extract the fault characteristics of rolling bearings, but when the scale factor in MAAPE is high, the number of elements in the coarse-grained time series decreases, which will seriously affect the stability of the entropy measure. To solve the above problem, Rostaghi and Azami [26] proposed a new, more stable, and computationally efficient entropy measure called dispersion entropy (DE) in 2016. Based on DE, further considering the fluctuation of the signal, they proposed fluctuation-based dispersion entropy (FDE) in 2018 [27]. FDE generates different fluctuation dispersion patterns by mapping each element of the measurement sequence to another class, which not only gives FDE a strong anti-noise capability but also avoids the defect of losing amplitude information in PE.

However, the FDE only describes the complexity of nonlinear time series at a single scale, resulting in the loss of a large amount of important information. Azami et al. [28] further carried out a multiscale expansion of the FDE and proposed the multiscale fluctuation-based dispersion entropy (MFDE). However, traditional coarse-grained methods do not consider the relationship between different coarse-grained time series, and the entropy value will become unstable as the scale factor increases [29]. Gan et al. [30] proposed the composite multiscale fluctuation-based dispersion entropy (CMFDE) and demonstrated that the coarse-grained sliding algorithm could improve the stability of MFDE. Subsequently, Zhou et al. [31] utilized the refined composite multiscale fluctuation-based dispersion entropy (RCMFDE), which is the same as the CMFDE coarse-grained method and has similar principles. Unlike CMFDE, which calculates the mean entropy values of different coarse-grained time series, RCMFDE adopts the refined composite method. However, both MFDE and RCMFDE focus only on the low-frequency features of the time series but ignore the equally crucial high-frequency information [32]. Ke et al. [33] proposed hierarchical fluctuation-based dispersion entropy (HFDE). Although HFDE can accurately describe the low and high frequency components of the signal by constructing its characteristics in different frequency bands to achieve accurate and comprehensive measurements, it has low utilization of the elements in the original time series and is easy to cause information loss. At the same time, the stability of HFDE will be affected by the increase in the number of layers, which is not conducive to the extraction of bearing fault features. Thus, this paper combines the hierarchical theory algorithm with RCMFDE and proposes the hierarchical refined composite multiscale fluctuation-based dispersion entropy (HRCMFDE). 
This feature extraction method combines the advantages of HFDE and RCMFDE, avoids their disadvantages, and not only solves the problem of missing high-frequency signals but also has better stability. Applying it to the feature extraction part of rolling bearing fault diagnosis can describe the fault features more accurately and help to improve the accuracy of subsequent fault identification.

Fault type identification is another important part of rolling bearing fault diagnosis. It is crucial to choose a suitable high-performance classifier to ensure identification accuracy. Commonly used methods for fault type identification include the k-nearest neighbor (k-NN) classification algorithm, support vector machines (SVM), artificial neural networks (ANN), and recurrent neural networks (RNN). The k-NN algorithm performs pattern recognition on the testing samples by referring to the classes of the nearest samples. Tian et al. [34] used spectral kurtosis to extract fault features and combined principal component analysis (PCA) with the k-NN classification algorithm for fault diagnosis of motor bearings. Because the k-NN classification algorithm requires a distance calculation against all samples at each prediction, it has the disadvantages of large computational effort and slow classification speed. Zheng et al. [35] used generalized refined composite multiscale fuzzy entropy (GRCMFE) for feature extraction, multi-cluster feature selection for supervised learning, and a support vector machine optimized by the gravitational search algorithm (GSA-SVM) for fault pattern recognition. However, the fault diagnosis results are greatly influenced by the choice of kernel parameters in the classifier. Wen et al. [36] proposed a hierarchical convolutional neural network (HCNN) structure containing two classifiers; this two-level hierarchical diagnostic network can simultaneously estimate the fault mode and severity to diagnose bearing faults. However, the method is limited by slow gradient-based learning and by the need to preset parameters.

Extreme learning machine (ELM) [37], as a novel classification and identification method, has the advantages of fast learning speed and good generalization performance compared with traditional classification methods. Ye et al. [38] introduced singular value vectors as fault feature vectors into ELM for the identification and classification of bearing faults, to reduce manual intervention and shorten fault diagnosis time. Mao et al. [39] proposed an online timing prediction diagnosis method for unbalanced faults based on ELM for the problem that the actual bearing fault data are unbalanced and much less in number than normal data. However, there are uncertainties in the input weights and hidden layer neuron thresholds predetermined in the ELM classifier, which require further optimization to improve the classification accuracy. Particle swarm optimization (PSO) [40] has been widely used as an effective technique to search for global minima. The PSO does not have complex evolutionary operators and requires fewer parameters to be tuned [41]. Therefore, the mixture of PSO and linear mapping has a promising application in the training of feedforward neural networks [42]. In the field of rolling bearing fault diagnosis, the PSO algorithm has been applied in both feature extraction and fault identification. He et al. [43] used the PSO algorithm to optimize the number of components and the penalty factor in VMD, so that the decomposed IMF contains richer fault information. Chen et al. [44] used the PSO algorithm to find the most suitable input weights and hidden layer neuron thresholds for ELM to obtain more accurate fault identification results. In this paper, the PSO algorithm is utilized in the latter application.

In this paper, a rolling bearing fault diagnosis method combining HRCMFDE and PSO-ELM is proposed, and its main contributions are summarized as follows:

A novel fault feature extraction method based on HRCMFDE is proposed. The method quantifies the high-frequency and low-frequency features of the measured time series by introducing the hierarchical theory algorithm, which effectively overcomes the problem of high-frequency information loss caused by the coarse-grained process. Meanwhile, HRCMFDE maps each element of the time series to different classes and generates different fluctuation dispersion patterns, which not only has strong anti-noise capability but also avoids the defect of losing amplitude information.
The performance of the proposed rolling bearing fault diagnosis method is verified using two typical rotating machinery fault datasets. The experimental results show that the proposed method not only accurately identifies fault types under varying loads, but also maintains high identification accuracy under load migration.

The rest of this paper is organized as follows. In Section 2, the basic principles of the proposed HRCMFDE rolling bearing fault feature extraction method and the PSO-optimized ELM fault identification method are briefly introduced. Section 3 details the algorithm flow of the proposed HRCMFDE and PSO-ELM rolling bearing fault diagnosis method. In Section 4, the proposed method is experimentally verified using two typical rotating machinery fault datasets, and the experimental results are analyzed and discussed. Finally, Section 5 summarizes the research content and results of this paper and outlines future research directions.

2. Methodologies

2.1. Feature Extraction

2.1.1. Multiscale Fluctuation-Based Dispersion Entropy (MFDE)

FDE is a method for calculating the complexity of nonlinear time series based on Shannon’s entropy and fluctuation-based dispersion patterns. In the calculation, FDE considers the differences between adjacent elements of the dispersion patterns, and it can easily distinguish deterministic signals from random signals by comparing fluctuation-based dispersion patterns, so it is widely used for feature extraction of nonlinear and non-stationary signals. However, FDE can only quantify the dynamic features of the measured series on a single time scale; therefore, it cannot capture long-range correlations. To solve this problem, Azami et al. proposed MFDE, which extends FDE with the multiscale approach of multiscale entropy (MSE). MFDE has higher computational efficiency and better fault feature description capability than MSE and MFE. The basic principle of MFDE is as follows:

For a given univariate signal $u = \{u_{1}, u_{2}, \dots, u_{L}\}$ of length $L$, the signal $u$ is divided into a set of non-overlapping segments of length $τ$, where $τ$ is the scale factor. Afterward, the average of each non-overlapping segment is calculated to derive a coarse-grained time series as follows:

u_{j}^{τ} = \frac{1}{τ} \sum_{b = (j - 1) τ + 1}^{j τ} u_{b}, \quad 1 \leq j \leq \lfloor L / τ \rfloor = N

(1)

Finally, the FDE of each coarse-grained signal $u_{j}^{τ}$ is calculated. The FDE calculation procedure for time series $u_{j}^{τ}$ is as follows:

Mapping the coarse-grained time series $u_{j}^{τ} = \{x_{1}, x_{2}, \dots, x_{N}\}$ of length $N$ to the time series $y = \{y_{1}, y_{2}, \dots, y_{N}\}$ through the normal cumulative distribution function (NCDF) in Equation (2):

y_{j} = \frac{1}{σ \sqrt{2 π}} \int_{- \infty}^{x_{j}} e^{- \frac{{(t - μ)}^{2}}{2 σ^{2}}} d t

(2)

where $σ$ and $μ$ are the standard deviation (SD) and mean of the time series, respectively. Then, each $y_{j}$ is linearly assigned to an integer from 1 to $c$. To this end, for each member of the mapped signal $y$, Equation (3) is used to map $y_{j}$ to $z_{j}^{c}$:

z_{j}^{c} = \mathrm{round} (c \cdot y_{j} + 0.5)

(3)

where $z_{j}^{c}$ denotes the $j$th member of the classified time series $z_{i}^{m, c}$, $c$ is an integer, and $\mathrm{round} (\cdot)$ is a rounding function. Time series $z_{i}^{m, c}$ are defined with respect to embedding dimension $m - 1$ and time delay $d$ according to $z_{i}^{m, c} = \{z_{i}^{c}, z_{i + d}^{c}, \dots, z_{i + (m - 1) d}^{c}\}$, $i = 1, 2, \dots, N - (m - 1) d$.

Each time series $z_{i}^{m, c}$ is mapped to a fluctuation-based dispersion pattern $π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}$, where $z_{i}^{c} = ν_{0}, z_{i + d}^{c} = ν_{1}, \dots, z_{i + (m - 1) d}^{c} = ν_{m - 1}$. The number of possible fluctuation-based dispersion patterns that can be assigned to each time series $z_{i}^{m, c}$ is equal to ${(2 c - 1)}^{m - 1}$. For each of the ${(2 c - 1)}^{m - 1}$ potential dispersion patterns $π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}$, the relative frequency is obtained as follows:

p (π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}) = \frac{\mathrm{num} \{i \mid i \leq N - (m - 1) d, \; z_{i}^{m, c} \text{ has type } π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}\}}{N - (m - 1) d}

(4)

where $\mathrm{num} (\cdot)$ means cardinality. In fact, $p (π_{ν_{0} ν_{1} \dots ν_{(m - 1)}})$ is the number of time series $z_{i}^{m, c}$ that are assigned to the dispersion pattern $π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}$, divided by the total number of embedded signals with embedding dimension $m$.

Finally, based on Shannon’s definition of entropy, the FDE value is calculated as follows:

\mathrm{FDE} (x, m, d, c) = - \sum_{π = 1}^{{(2 c - 1)}^{m - 1}} p (π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}) \cdot \ln p (π_{ν_{0} ν_{1} \dots ν_{(m - 1)}})

(5)
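The steps of Equations (1)–(5) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `coarse_grain`, `fde`, and `mfde` are hypothetical, the mean and population SD are taken from the series passed to `fde`, and the fluctuation-based pattern is formed from the differences between adjacent classes of each embedding vector, which is how the FDE definition cited in [27] arrives at ${(2 c - 1)}^{m - 1}$ possible patterns.

```python
import math
from collections import Counter
from statistics import fmean, pstdev


def coarse_grain(u, tau):
    """Equation (1): average non-overlapping segments of length tau."""
    n = len(u) // tau
    return [fmean(u[j * tau:(j + 1) * tau]) for j in range(n)]


def fde(x, m=2, c=6, d=1):
    """Equations (2)-(5): fluctuation-based dispersion entropy in nats."""
    mu, sigma = fmean(x), pstdev(x)  # assumption: population SD of the input series
    if sigma == 0:
        y = [0.5] * len(x)  # constant series: every sample lands in the same class
    else:
        # Equation (2): normal CDF evaluated through the error function.
        y = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in x]
    # Equation (3): assign each y_j to a class 1..c (clamped to guard edge cases).
    z = [min(c, max(1, round(c * v + 0.5))) for v in y]
    count = len(z) - (m - 1) * d
    if count < 1:
        raise ValueError("series too short for the chosen m and d")
    # Embedding vectors with delay d, reduced to differences of adjacent classes.
    patterns = Counter(
        tuple(z[i + (k + 1) * d] - z[i + k * d] for k in range(m - 1))
        for i in range(count)
    )
    # Equations (4)-(5): relative frequencies and Shannon entropy.
    return -sum((n / count) * math.log(n / count) for n in patterns.values())


def mfde(u, m=2, c=6, d=1, tau_max=20):
    """FDE of the coarse-grained series for every scale 1..tau_max."""
    return [fde(coarse_grain(u, tau), m, c, d) for tau in range(1, tau_max + 1)]
```

Calling `mfde(signal, tau_max=20)` on a list of samples would return one entropy value per scale factor; the values are bounded above by the logarithm of the number of possible patterns.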

2.1.2. Refined Composite Multiscale Fluctuation-Based Dispersion Entropy (RCMFDE)

The multiscale algorithm obtains a new coarse-grained sequence by cutting the original sequence into equal-length segments and averaging each segment. However, the entropy value fluctuates to some extent with the starting point of the segmentation, and the relationship between the elements of the resulting coarse-grained sequences cannot be neglected, so MFDE needs to be further refined by a refined composite algorithm. RCMFDE replaces the traditional coarse-graining with a sliding coarse-graining procedure, which effectively avoids the instability of entropy values caused by increasing scale factors and the resulting shorter coarse-grained time series.

The mean probability of the fluctuation-based dispersion pattern $π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}$ over the coarse-grained series $y_{j}^{s}$ $(1 \leq s \leq τ)$ of the original time series $x = \{x_{1}, x_{2}, \dots, x_{N}\}$ can be calculated as:

\bar{p} (π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}) = \frac{1}{τ} \sum_{s = 1}^{τ} p_{s}^{τ}

(6)

where $p_{s}^{τ}$ is the relative frequency of the dispersion pattern $π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}$ in the coarse-grained sequence $y_{j}^{s}$.

For scale factors up to the maximum scale factor $τ_{\max}$, the RCMFDE value is defined as the Shannon entropy of the averaged dispersion pattern probabilities obtained from the shifted coarse-grained time series:

\mathrm{RCMFDE} (x, m, d, c, τ_{\max}) = - \sum_{π = 1}^{{(2 c - 1)}^{m - 1}} \bar{p} (π_{ν_{0} ν_{1} \dots ν_{(m - 1)}}) \cdot \ln \bar{p} (π_{ν_{0} ν_{1} \dots ν_{(m - 1)}})

(7)
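A minimal sketch of Equations (6) and (7) at a single scale factor is given below. It rests on several assumptions that are not spelled out in the text: the helper names `pattern_probabilities` and `rcmfde` are hypothetical, the $s$th coarse-grained series is taken to start at sample $s$ of the original signal, and the mean and population SD used in the normal CDF are computed from each coarse-grained series.

```python
import math
from collections import Counter, defaultdict
from statistics import fmean, pstdev


def pattern_probabilities(x, m, c, d):
    """Relative frequency of each fluctuation-based dispersion pattern in x."""
    mu, sigma = fmean(x), pstdev(x)
    scale = sigma * math.sqrt(2.0)
    y = [0.5 * (1.0 + math.erf((v - mu) / scale)) if sigma > 0 else 0.5 for v in x]
    z = [min(c, max(1, round(c * v + 0.5))) for v in y]
    count = len(z) - (m - 1) * d
    if count < 1:
        raise ValueError("series too short for the chosen m and d")
    patterns = Counter(
        tuple(z[i + (k + 1) * d] - z[i + k * d] for k in range(m - 1))
        for i in range(count)
    )
    return {p: n / count for p, n in patterns.items()}


def rcmfde(x, m=2, c=6, d=1, tau=1):
    """Entropy of the pattern probabilities averaged over tau shifted series."""
    mean_prob = defaultdict(float)
    for s in range(tau):  # s is the starting offset of the sliding coarse-graining
        n = (len(x) - s) // tau
        coarse = [fmean(x[s + j * tau:s + (j + 1) * tau]) for j in range(n)]
        for pattern, prob in pattern_probabilities(coarse, m, c, d).items():
            mean_prob[pattern] += prob / tau  # Equation (6)
    # Equation (7): Shannon entropy of the averaged probabilities.
    return -sum(q * math.log(q) for q in mean_prob.values() if q > 0)
```

Evaluating `rcmfde(signal, tau=t)` for `t` from 1 to the maximum scale factor would give one entropy value per scale.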

2.1.3. Hierarchical Refined Composite Multiscale Fluctuation-Based Dispersion Entropy (HRCMFDE)

Basic Principle

The multiscale analysis only captures the low-frequency information of the measured sequence. In contrast, part of the high-frequency information is discarded, and there is a loss of amplitude information, which will have a certain impact on entropy. Therefore, this paper introduces the hierarchical theory into RCMFDE, and the principle of hierarchical decomposition is shown in Figure 1. By constructing high-frequency and low-frequency operators, the high-frequency and low-frequency features of the measured sequence are quantified simultaneously. The HRCMFDE proposed in this paper evaluates the dynamic features hidden in the low-frequency and high-frequency components of the vibration signal, thus overcoming the problem of amplitude information loss due to the coarse-grained process.

Define two operators $Q_{0}$ and $Q_{1}$, which represent the averaging operator and the difference operator, respectively. For a time series $x = \{x_{1}, x_{2}, \dots, x_{N}\}$ of length $N = 2^{n}$, the low-frequency averaging operator $Q_{0} (x)$ and the high-frequency difference operator $Q_{1} (x)$ are:

Q_{0} (x) = \frac{x_{2 j - 1} + x_{2 j}}{2}, \quad j = 1, 2, \dots, 2^{n - 1}

(8)

Q_{1} (x) = \frac{x_{2 j - 1} - x_{2 j}}{2}, \quad j = 1, 2, \dots, 2^{n - 1}

(9)

The operator $Q_{j}$ can be expressed in matrix form as:

Q_{j} = \begin{bmatrix} \frac{1}{2} & \frac{{(- 1)}^{j}}{2} & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \frac{1}{2} & \frac{{(- 1)}^{j}}{2} & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & \frac{1}{2} & \frac{{(- 1)}^{j}}{2} \end{bmatrix}_{2^{n - 1} \times 2^{n}}

(10)

where $j \in \{0, 1\}$. When the layer number is $n$, construct the vector $γ = [γ_{1}, γ_{2}, \dots, γ_{n}]$ with $γ_{j} \in \{0, 1\}$. The hierarchical node number $e$ can be calculated by the following equation:

e = \sum_{j = 1}^{n} γ_{j} 2^{n - j}

(11)

Based on the vector $γ$, Equations (8) and (9) are repeated to obtain the hierarchical component $x_{n, e}$ corresponding to the $e$th node of the $n$th layer.

x_{n, e} = Q_{γ_{n}}^{n} \cdot Q_{γ_{n - 1}}^{n - 1} \cdot \dots \cdot Q_{γ_{1}}^{1} \cdot x

(12)
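The repeated application of the two operators can be illustrated with a short Python sketch. The function name `hierarchical_components` is hypothetical, and the code pairs consecutive samples as in the matrix form of Equation (10); the returned list is ordered by the node number $e$ of Equation (11), because the operator applied first contributes the most significant bit.

```python
def hierarchical_components(x, n):
    """Return the 2**n hierarchical components of layer n, ordered by node number e."""
    nodes = [list(x)]
    for _ in range(n):
        next_nodes = []
        for comp in nodes:
            pairs = list(zip(comp[0::2], comp[1::2]))
            next_nodes.append([(a + b) / 2 for a, b in pairs])  # Q_0: averaging
            next_nodes.append([(a - b) / 2 for a, b in pairs])  # Q_1: difference
        nodes = next_nodes
    return nodes
```

For a signal of length $2^{n}$, each of the $2^{n}$ components in layer $n$ contains a single sample; in practice the signal is much longer than $2^{n}$, so each component remains a time series whose entropy can be computed.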

Then, the HRCMFDE of the original signal $x$ can be obtained from the hierarchical components by the following equation:

\mathrm{HRCMFDE} (x, m, d, c, τ_{\max}, n) = \mathrm{Average} (\mathrm{RCMFDE} (x_{n, e}, m, d, c, τ_{\max}))

(13)

where $\mathrm{Average} (\cdot)$ denotes the calculation of the arithmetic mean.

The calculation flowchart of HRCMFDE is shown in Figure 2.

Parameter Selection

The performance of HRCMFDE is determined by five parameters: embedding dimension $m$, the number of classes $c$, time delay $d$, maximum scale factor $τ_{\max}$, and number of hierarchical layers $n$. The choice of parameters directly affects the effectiveness of HRCMFDE in measuring time series.

1. Embedding dimension $m$.

If $m$ is too small, HRCMFDE cannot accurately observe the dynamic changes of the nonlinear time series. Conversely, if $m$ is too large, HRCMFDE cannot detect small changes. According to reference [26], the value range of $m$ is 2–5.

2. The number of classes $c$.

If $c$ is too small, two different amplitudes may be assigned to the same class, and if $c$ is too large, HRCMFDE becomes sensitive to noise. According to reference [27], $c$ is an integer in the range 3–9.

3. Time delay $d$.

The effect of $d$ on HRCMFDE is small, and some frequency information may be lost when $d > 1$, so $d$ is usually set to the smallest positive integer, 1.

4. Maximum scale factor $τ_{\max}$.

If $τ_{\max}$ is too small, HRCMFDE cannot fully extract the features of the nonlinear time series. If $τ_{\max}$ is too large, it is easy to produce unstable and unreliable entropy values. In addition, a larger $τ_{\max}$ may reduce the computational efficiency of HRCMFDE. Therefore, to obtain reliable results, according to reference [28], the maximum scale factor $τ_{\max}$ is usually chosen to be 20, which is sufficient to analyze the time series efficiently.

5. Number of hierarchical layers $n$.

The larger $n$ is, the shorter the time series of the corresponding layers, which will lead to insufficient extraction of fault information. However, if $n$ is too small, the decomposition of the original time series is incomplete, and the dimensionality of the extracted features is low and insufficient. According to references [45,46], $n$ is generally set to 3.
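The five parameters and their recommended ranges can be collected in a small configuration object, sketched below. The class name `HRCMFDEParams` and its helper methods are hypothetical; the range checks mirror the ranges quoted above, and the defaults $m = 2$ and $c = 6$ are the values that the simulation analysis in this paper settles on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HRCMFDEParams:
    m: int = 2         # embedding dimension, suggested range 2-5
    c: int = 6         # number of classes, suggested range 3-9
    d: int = 1         # time delay
    tau_max: int = 20  # maximum scale factor
    n: int = 3         # number of hierarchical layers

    def __post_init__(self):
        if not 2 <= self.m <= 5:
            raise ValueError("m is expected to lie in 2-5")
        if not 3 <= self.c <= 9:
            raise ValueError("c is expected to lie in 3-9")
        if self.d < 1 or self.tau_max < 1 or self.n < 1:
            raise ValueError("d, tau_max and n must be positive integers")

    @property
    def n_patterns(self):
        """Number of possible fluctuation-based dispersion patterns."""
        return (2 * self.c - 1) ** (self.m - 1)

    @property
    def n_nodes(self):
        """Number of hierarchical nodes in layer n."""
        return 2 ** self.n

    def node_length(self, signal_length):
        """Length of each hierarchical component in layer n."""
        return signal_length // 2 ** self.n
```

With the defaults, a 2048-point signal yields 8 hierarchical components of 256 samples each, which shows why a large $n$ quickly shortens the series available for the entropy estimate.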

Simulation Signal Analysis

In this section, to discuss the sensitivity of HRCMFDE to the embedding dimension $m$ and the number of classes $c$, WGN and 1/f noise were used as simulation signals for parametric analysis. For each noise type, 20 signals were randomly generated, each with mean 0, variance 1, and data length 3000. The time domain waveform and frequency spectrum of one set of random signals are shown in Figure 3. As can be seen from the figure, 1/f noise is less smooth than WGN, and WGN has higher uncertainty than 1/f noise.
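Simulation signals of this kind could be generated as in the following NumPy sketch. The paper does not state how its 1/f noise was produced, so the spectral shaping used here (scaling the spectrum of white noise by $1 / \sqrt{f}$) is an assumption, as are the function names, the fixed seed, and the normalization of the 1/f noise to zero mean and unit variance.

```python
import numpy as np


def white_noise(n, rng):
    """White Gaussian noise drawn from a standard normal distribution."""
    return rng.standard_normal(n)


def pink_noise(n, rng):
    """Approximate 1/f noise obtained by shaping the spectrum of white noise."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]            # avoid division by zero at the DC bin
    spectrum /= np.sqrt(freqs)     # power ~ 1/f means amplitude ~ 1/sqrt(f)
    x = np.fft.irfft(spectrum, n=n)
    return (x - x.mean()) / x.std()  # normalize to zero mean, unit variance


rng = np.random.default_rng(0)
wgn_set = np.stack([white_noise(3000, rng) for _ in range(20)])
pink_set = np.stack([pink_noise(3000, rng) for _ in range(20)])
```

Each row of `wgn_set` and `pink_set` is one realization of length 3000, so per-node entropy statistics can be computed across the 20 rows.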

To explore the effect of embedding dimension $m$ and the number of classes $c$ on HRCMFDE, the two types of noise were analyzed with different parameter values. The mean and SD of each node were calculated while varying $m$ and $c$ one at a time. Following the previous section, the values of $m$ range over 2–4, and the three cases $m = 2$, $m = 3$, and $m = 4$ are chosen; the values of $c$ range over 3–9, and the three cases $c = 3$, $c = 6$, and $c = 9$ are chosen.

The effect of the embedding dimension $m$ is shown in Figure 4. For both WGN and 1/f noise, the mean curve at $m = 2$ is smoother, and the SD value is the smallest. To account for the degree of dispersion of the 20 sets of noisy data, the coefficients of variation (CVs) were also compared for the three values of $m$. The CV is the ratio of SD to mean, and a smaller CV indicates higher computational stability. The CVs in Table 1 and Table 2 indicate more objectively the superiority of $m = 2$ in HRCMFDE.
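The coefficient of variation used in this comparison is simple to compute. The sketch below assumes the population SD (the paper does not say whether the sample or population form was used), and the function names are hypothetical.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(values):
    """Ratio of SD to mean for one node across repeated noise realizations."""
    return pstdev(values) / fmean(values)


def cv_per_node(trials):
    """trials[k][e] is the entropy of node e in the k-th noise realization."""
    return [coefficient_of_variation(node) for node in zip(*trials)]
```

Passing the 20 feature vectors of one noise type to `cv_per_node` would return one CV per hierarchical node, and smaller values correspond to more stable entropy estimates.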

As shown in Figure 5, for both WGN and 1/f noise, the mean fluctuation is the smallest when $c = 6$, whereas the SD curve is the most cluttered when $c = 3$, and the SD curves are closer when $c = 6$ and $c = 9$. Further, the calculated CVs are shown in Table 3 and Table 4, respectively. When $c$ is taken as 6 or 9, it is not apparent which value is superior, and the influence of this choice on the stability of the entropy is not significant. To ensure consistency, $c = 6$ is used in all subsequent experiments of this paper.

To verify the superiority of the proposed HRCMFDE algorithm over the pre-improvement algorithm in terms of stability, four different derivative algorithms for the fluctuation dispersion entropy were also compared in this paper, and the parameter selection in the experiment is shown in Table 5.

Due to the different principles of each algorithm, the extracted feature dimensions also differ, so the experiments are divided into two groups for comparison, one group is HFDE and HRCMFDE, as shown in subplot (a) and subplot (b) in Figure 6, and the feature dimension of the horizontal coordinate is the number of hierarchical nodes; the other group is MFDE and RCMFDE, as shown in subplot (c) and subplot (d) in Figure 6, and the feature dimension of the horizontal coordinate is the scale factor size.

As shown in Figure 6, the trends of the curves for the four different entropies are roughly similar for WGN and 1/f noise, which shows that the type of noise signal has little effect on the stability of the entropy value calculation. In subplot (c) and subplot (d), the curves both show a decreasing trend, because the single coarse-graining process in MFDE and RCMFDE leads to incomplete information, which makes the entropy of each scale significantly different. Then, by comparing the volatility of the curves in subplot (a) and subplot (b), it can be found that the stability of HFDE is not as good as that of HRCMFDE for both noises. This is due to the introduction of sliding coarse-graining, which makes the information considered by HRCMFDE more complete and the computation more stable compared to HFDE. In addition, comparing subplots (a), (c), and subplots (b), (d), respectively, it is found that the skewness of RCMFDE is significantly enhanced compared with that of MFDE, reflecting that the stability of entropy is improved after the refined composite algorithm improvement of MFDE. However, there are still some slight deficiencies compared with the HRCMFDE proposed in this paper. This affirms, to some extent, the introduction of hierarchical theory in solving the problem of information loss caused by the coarse-grained process. In summary, it can be concluded that HRCMFDE has the lowest error rate in measuring nonlinear dynamic changes and is superior to MFDE, HFDE, and RCMFDE.

2.2. Fault Identification

2.2.1. Extreme Learning Machine (ELM)

The ELM algorithm is a learning algorithm for single-hidden layer feedforward neural networks (SLFN) proposed by Huang et al. Its distinctive feature is that the input weights and biases of the hidden layer nodes are generated randomly, so the only quantity that needs to be computed is the output weight, which is obtained by least squares. As a result, it can learn faster than traditional intelligent algorithms and generalize better [47].

Generally, an ELM network includes three parts: an input layer, a hidden layer, and an output layer. Let $m$, $l$, and $n$ be the number of nodes in the input, hidden, and output layers of the network, respectively, and let $g(x)$ be the activation function of the neurons in the hidden layer. Suppose there are $N$ different samples $(x_{i}, t_{i})$, $1 \leq i \leq N$, where:

$$x_{i} = [x_{i1}, x_{i2}, \dots, x_{im}]^{T} \in \mathbb{R}^{m} \tag{14}$$

$$t_{i} = [t_{i1}, t_{i2}, \dots, t_{in}]^{T} \in \mathbb{R}^{n} \tag{15}$$

The network structure of ELM is shown in Figure 7. The network model of ELM can be expressed mathematically as follows:

$$\sum_{j=1}^{l} \beta_{j} \, g(\omega_{j} \cdot x_{i} + b_{j}) = o_{i} \tag{16}$$

where $i = 1, \dots, N$; $\omega_{j} = [\omega_{j1}, \omega_{j2}, \dots, \omega_{jm}]^{T}$ is the input weight vector that connects the $j$th hidden node with the $m$ input nodes; $\beta_{j} = [\beta_{j1}, \beta_{j2}, \dots, \beta_{jn}]^{T}$ is the output weight vector that connects the $j$th hidden node with the $n$ output nodes; $b_{j}$ is the bias of the $j$th hidden node; and $\omega_{j} \cdot x_{i}$ is the inner product of $\omega_{j}$ and $x_{i}$.

When the above standard SLFN can approximate the $N$ samples with zero error, i.e., $\sum_{i=1}^{N} \| o_{i} - t_{i} \| = 0$, there exist $\beta_{j}$, $\omega_{j}$, and $b_{j}$ such that:

$$\sum_{j=1}^{l} \beta_{j} \, g(\omega_{j} \cdot x_{i} + b_{j}) = t_{i} \tag{17}$$

Equation (17) can be written compactly as:

$$H \beta = T \tag{18}$$

where:

$$H(\omega_{1}, \dots, \omega_{l}, b_{1}, \dots, b_{l}, x_{1}, \dots, x_{N}) = \begin{bmatrix} g(\omega_{1} \cdot x_{1} + b_{1}) & \cdots & g(\omega_{l} \cdot x_{1} + b_{l}) \\ \vdots & \ddots & \vdots \\ g(\omega_{1} \cdot x_{N} + b_{1}) & \cdots & g(\omega_{l} \cdot x_{N} + b_{l}) \end{bmatrix}_{N \times l} \tag{19}$$

$$\beta = \begin{bmatrix} \beta_{1}^{T} \\ \vdots \\ \beta_{l}^{T} \end{bmatrix}_{l \times n} \tag{20}$$

$$T = \begin{bmatrix} t_{1}^{T} \\ \vdots \\ t_{N}^{T} \end{bmatrix}_{N \times n} \tag{21}$$

Here $H$ is the output matrix of the hidden layer in ELM, and $n$ is the number of output nodes, as in Equation (15).

The difference between the ELM algorithm and a general SLFN training algorithm is that, in ELM, the input weights $\omega$ and hidden layer biases $b$ are assigned randomly at the beginning of the algorithm, and the output matrix $H$ is calculated from them. The only remaining task is to determine the parameter $\beta$. Training the feedforward neural network can therefore be regarded as finding the least-squares solution of $H \beta = T$. The output weight matrix $\beta$ is obtained as:

$$\beta = H^{+} T \tag{22}$$

where $H^{+}$ is the Moore–Penrose generalized inverse of $H$.
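The training procedure in Equations (16) to (22) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the sigmoid activation, the uniform range [-1, 1] for the random weights and biases, the hidden layer size, and the toy data are all assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    """Activation g(x); the sigmoid is an assumed choice."""
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, n_hidden, rng):
    """Fit an ELM. X has shape (N, m), T has shape (N, n)."""
    m = X.shape[1]
    # Random input weights (one column per hidden node) and hidden biases.
    W = rng.uniform(-1.0, 1.0, size=(m, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)            # hidden-layer output matrix, shape (N, l)
    beta = np.linalg.pinv(H) @ T      # beta = H^+ T, shape (l, n)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output o_i for each row of X."""
    return sigmoid(X @ W + b) @ beta

# Toy two-class problem (hypothetical data, not bearing signals).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(2)[labels]                 # one-hot targets t_i

W, b, beta = elm_fit(X, T, n_hidden=20, rng=rng)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
train_acc = (pred == labels).mean()
```

No iterative training loop appears anywhere: once `W` and `b` are drawn, the output weights follow from a single pseudoinverse.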

2.2.2. Particle Swarm Optimization-Based Extreme Learning Machine (PSO-ELM)

Although ELM has the advantages of easy parameter selection, fast learning speed, and good generalization performance, and does not fall into local optima, its classification performance is affected by the fact that the input weights and hidden layer neuron thresholds are randomly generated during training. Therefore, in this paper, the PSO algorithm is used to search for the input weights and hidden layer thresholds of ELM, so that the classification accuracy of ELM can be higher. The flow chart of the PSO-ELM algorithm is shown in Figure 8.
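One way to picture the idea is the sketch below, in which each particle encodes a full set of input weights and hidden thresholds, and the output weights are still obtained by the pseudoinverse. It is an illustrative sketch under stated assumptions, not the procedure of Figure 8: the fitness function (training mean squared error), the PSO constants `w`, `c1`, `c2`, the swarm size, the iteration count, and the search range [-1, 1] are all assumed values.

```python
import numpy as np

def hidden(X, W, b):
    """Hidden-layer output matrix H with an assumed sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def decode(p, m, l):
    """Split a particle into input weights (m, l) and thresholds (l,)."""
    return p[: m * l].reshape(m, l), p[m * l:]

def fitness(p, X, T, m, l):
    """Assumed objective: training MSE after solving beta = H^+ T."""
    W, b = decode(p, m, l)
    H = hidden(X, W, b)
    beta = np.linalg.pinv(H) @ T
    return np.mean((H @ beta - T) ** 2)

def pso_elm(X, T, l=10, n_particles=15, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    pos = rng.uniform(-1, 1, size=(n_particles, m * l + l))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([fitness(p, X, T, m, l) for p in pos])
    g = pbest_f.argmin()
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    history = [gbest_f]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -1, 1)
        f = np.array([fitness(p, X, T, m, l) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        g = pbest_f.argmin()
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
        history.append(gbest_f)
    W, b = decode(gbest, m, l)
    beta = np.linalg.pinv(hidden(X, W, b)) @ T
    return W, b, beta, history

# Toy two-class data (hypothetical, not bearing features).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
labels = (X[:, 0] > 0).astype(int)
T = np.eye(2)[labels]
W, b, beta, history = pso_elm(X, T)
```

Because the global best is only replaced by a strictly better particle, `history` never increases. In practice a held-out validation error may be a more appropriate fitness than the training error used here.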

3. Proposed Method

In this paper, a novel rolling bearing fault diagnosis method based on the combination of HRCMFDE and PSO-ELM is proposed, and the flowchart of the proposed method is shown in Figure 9. The main processes of the rolling bearing fault diagnosis method are described below.

3.1. Data Preprocessing

Firstly, the data acquisition system is used to obtain sufficient raw vibration signals of rolling bearings in different fault states. Then, each original vibration signal is segmented by length $N$ to obtain a set of equal-length samples for that fault state. Finally, a sample set is constructed using samples from different fault states, and this sample set is divided into a training sample set and a test sample set.

3.2. Training Process

Firstly, feature extraction is performed for each sample in the training set. HRCMFDE performs an $n$-layer hierarchical decomposition of the sample signal to obtain $2^{n}$ hierarchical decomposition signals. The RCMFDE value is obtained for each hierarchical decomposition signal, and then the signal features of that frequency band are extracted by calculating the arithmetic mean of RCMFDE. The HRCMFDE values of all hierarchical decomposition signals are calculated to form a feature vector, and the feature vectors of all samples in the training set are formed into a training sample feature set. At the same time, to ensure that the fault features extracted by HRCMFDE have better noise suppression capability, the parameters of HRCMFDE are selected using simulated interference signals. Then, the ELM classifier is trained by using the training sample feature set. The input weights and hidden layer neuron thresholds in the ELM classifier are optimized by PSO to obtain optimal fault identification accuracy. Finally, rolling bearing fault diagnosis can be achieved with the obtained PSO-ELM classifier.
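The structure of the hierarchical step can be sketched as follows. The sketch assumes the commonly used averaging and difference operators for hierarchical decomposition (pairwise half-sum for the low-frequency branch and pairwise half-difference for the high-frequency branch); the operators actually used are the ones defined earlier in the paper. The function `node_feature` is a hypothetical placeholder and is not RCMFDE; it only marks where the per-node entropy would be computed.

```python
import numpy as np

def hierarchical_nodes(x, n_layers):
    """Return the 2**n_layers node signals of an n-layer decomposition."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(n_layers):
        next_nodes = []
        for s in nodes:
            pairs = s[: len(s) // 2 * 2].reshape(-1, 2)
            # Assumed averaging operator: low-frequency component.
            next_nodes.append((pairs[:, 0] + pairs[:, 1]) / 2.0)
            # Assumed difference operator: high-frequency component.
            next_nodes.append((pairs[:, 0] - pairs[:, 1]) / 2.0)
        nodes = next_nodes
    return nodes

def node_feature(s):
    """Placeholder statistic standing in for the mean RCMFDE of a node."""
    return float(np.std(s))

# Hypothetical test signal with the sub-sample length used in the experiments.
x = np.sin(np.linspace(0.0, 20.0, 2400))
nodes = hierarchical_nodes(x, n_layers=3)
features = np.array([node_feature(s) for s in nodes])  # one value per node
```

With three layers the feature vector has 2^3 = 8 entries, one per hierarchical node, and each node signal is one eighth of the original length.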

3.3. Testing Process

Firstly, the HRCMFDE is used to extract the features of the test signal and obtain the feature vector of the test signal. Then, the feature vectors are input to the PSO-ELM classifier. Finally, the results of the rolling bearing fault diagnosis are output.

4. Experiments

4.1. Case 1: CWRU Dataset

4.1.1. Experiment Setup

In this case, the vibration signals collected at the driving end in the Case Western Reserve University (CWRU) [48] bearing database were selected as experimental data. The experimental platform mainly consists of a load motor, accelerometer, torque sensor, and dynamometer. The experimental data set includes vibration signals collected for four fault types: normal (N), ball fault (B), outer race fault (O), and inner race fault (I). The signals collected for each type are classified according to the fault severity and the size of the load. The fault severity is 0.007, 0.014, 0.021, or 0.028 inches, and the load is 0 hp, 1 hp, 2 hp, or 3 hp (1 hp = 746 W).

In this paper, samples of various bearing states with a sampling frequency of 12 kHz at the driving end were selected for the rolling bearing fault diagnosis experiments, giving a total of 12 rolling bearing health states without differentiating the load. The samples included in the CWRU experimental dataset are shown in Table 6, where “√” means the sample exists and is selected, and “*” means the sample does not exist.

The data length of each sub-sample is set to $N = 2400$. The first 120,000 points of the bearing vibration signal collected in each state (a sampling time of 10 s) are divided, without overlap, into 50 groups of 2400 points (a sampling time of 0.2 s each). Each group is a sub-sample, so each state yields a 50 × 2400 data matrix containing 50 sub-samples.
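The segmentation arithmetic can be checked with a short NumPy sketch. The array `signal` is a hypothetical stand-in for one recorded vibration signal; only the sampling frequency, the sub-sample length, and the number of points used are taken from the text.

```python
import numpy as np

FS = 12_000        # sampling frequency in Hz
N = 2_400          # points per sub-sample
N_USED = 120_000   # points taken from the start of each record

# Stand-in for one raw record (real records come from the CWRU files).
signal = np.arange(121_000, dtype=float)

# Non-overlapping segmentation: each row is one sub-sample.
segments = signal[:N_USED].reshape(-1, N)

record_seconds = N_USED / FS    # duration of the part that is used
segment_seconds = N / FS        # duration of one sub-sample
```

Reshaping works here only because 120,000 is an exact multiple of 2400; for other lengths the record would need to be truncated to a multiple of `N` first.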

The preprocessed data were named by convention, for example “N” for normal status and “B0070” for ball fault (B) with a fault severity of 0.007 inch. Each load condition experiment contains a total of 600 samples. The training set and testing set were divided according to the ratio of 2:3, giving 240 training samples and 360 testing samples. The four bearing fault types were further divided into 12 health states according to the different fault severities (the outer race fault has no samples of 0.028-inch severity). The specific division is shown in Table 7.

4.1.2. Feature Extraction

Figure 10 shows the time-domain waveforms and spectra of the original signals for three fault types (ball fault, outer race fault, and inner race fault) with a fault severity of 0.007 inch. It can be observed from the figure that the original signals are nonlinear and non-stationary, contain many impulses and much noise interference, and carry fault information in every frequency band, so it is difficult to extract the fault features of rolling bearings by observation alone.

4.1.3. Fault Identification

In the fault identification section, a total of 12 health states were distinguished for different fault types and different fault severities at the same time. Experimentally, the feature vectors with varying loads were extracted and composed into feature sample sets according to the above feature extraction method. Then, the training set and the testing set were randomly divided in the ratio of 2:3. The number of samples in the training set is 240, containing 20 of each state; the number of samples in the testing set is 360, containing 30 of each state. Finally, the PSO-ELM classifier was used for classification. The above process was repeated ten times for independent experiments, and the average identification accuracy was used as the evaluation index for the performance of the proposed fault diagnosis algorithm. The identification results with load 2 hp are shown in Figure 11, and the average identification accuracy can reach 100%. Therefore, the rolling bearing fault diagnosis method proposed in this paper has a satisfactory fault identification result.
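The sample counts in this protocol follow from a per-state random split, which can be sketched as below. The helper `stratified_split` and the seed are hypothetical; the sketch only reproduces the bookkeeping (12 states, 50 sub-samples each, 20 for training and 30 for testing, ten repetitions) and does not run any classifier.

```python
import numpy as np

def stratified_split(labels, n_train_per_class, rng):
    """Randomly pick n_train_per_class indices per class for training."""
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        train_idx.extend(idx[:n_train_per_class])
        test_idx.extend(idx[n_train_per_class:])
    return np.array(train_idx), np.array(test_idx)

labels = np.repeat(np.arange(12), 50)   # 12 health states x 50 sub-samples
rng = np.random.default_rng(0)

# Ten independent random splits, one per repetition of the experiment.
splits = [stratified_split(labels, 20, rng) for _ in range(10)]
train_idx, test_idx = splits[0]
```

The reported accuracy would then be the mean over the ten splits of the test accuracy obtained on each `test_idx`.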

4.1.4. Performance Comparison

To illustrate the effectiveness of the proposed HRCMFDE fault feature extraction method, the features extracted by the HFDE, MFDE, RCMFDE, and HRCMFDE algorithms were identified using the PSO-ELM classifier, respectively, as a way to compare the fault identification results of different feature extraction methods. A set of the training set and testing set was arbitrarily selected with load 0 hp. The fault identification results of different feature extraction methods and PSO-ELM are shown in Figure 12. It can be seen that the accuracy of the HRCMFDE algorithm is the highest among the four feature extraction methods, which can reach 100%. At the same time, the features extracted using the RCMFDE algorithm show a significant improvement in fault identification accuracy over the MFDE algorithm.

Further, a set of the training set and testing set was arbitrarily selected with load 0 hp. The features extracted by the HRCMFDE algorithm were classified using ELM, kernel extreme learning machine (KELM) [49], genetic algorithm optimized extreme learning machine (GA-ELM) [50], and PSO-ELM classifier to compare the effects of different classifiers on the fault identification results when the feature vectors are the same. It is obvious from the figure that determining the kernel function and incorporating the parameter optimization algorithm both help to improve the identification accuracy of the ELM classifier, whereas using the GA and PSO algorithms to optimize the parameters in the ELM classifier can lead to more accurate fault identification results. Among them, the PSO-ELM fault identification method used in this paper has the highest accuracy, and all testing samples can be guaranteed to be correctly classified with load 0 hp.

To verify the generalization of the proposed method in this paper, the feature vectors extracted with varying loads were classified using the PSO-ELM classifier, and each group of experiments was repeated ten times. The average identification accuracy was taken as the fault diagnosis results and recorded. The fault classification results are shown in Table 8, from which it can be seen that the rolling bearing fault diagnosis method based on HRCMFDE and PSO-ELM achieves a relatively high identification accuracy, and its effectiveness and generalization are verified through several experiments.

The experiments also utilize the HFDE, MFDE, RCMFDE, and HRCMFDE algorithms to extract feature vectors with varying loads and input them into the PSO-ELM classifier for fault identification, respectively, to further compare the final result of different feature extraction algorithms for bearing fault diagnosis. The samples were selected with four varying loads, and the experiment was repeated ten times to calculate the average value of the fault identification accuracy. The identification accuracies of different feature extraction methods with four varying loads are shown in Table 9. It can be concluded that the feature vectors extracted using the HRCMFDE algorithm, followed by the PSO-ELM classifier for fault identification, have the desired fault identification accuracy with varying loads.

To further illustrate the performance of the proposed rolling bearing fault diagnosis method, the experiments also compared the average identification accuracy and fault identification time of different classification algorithms with four varying loads, and the results are shown in Table 10. From the experimental results, it can be seen that the KELM algorithm has the shortest identification time, but the lowest identification accuracy. The optimization prolongs the identification time of both GA-ELM and PSO-ELM classifiers, but the identification accuracy is somewhat improved. Among them, the identification time of the PSO-ELM algorithm has a significant advantage over GA-ELM, and the average fault identification accuracy is the highest, reaching 99.91%. The standard deviation is also the smallest, which is only 0.11%.

To verify the superiority of the proposed fault diagnosis method, this paper also compared its identification accuracy with that of methods in other references, and the experimental results are shown in Table 11. Compared with other methods that also use the CWRU dataset, the method in this paper achieves good diagnostic results even when the data are divided into 12 fault states. Because the dataset used in reference [11] is different from the one used in this paper, its method was partially reproduced to keep the comparison objective: the feature selection part was omitted, and several experiments were conducted with the parameters set by its authors to obtain the corresponding average fault identification accuracy.

4.1.5. Load Migration

To avoid algorithmic limitations caused by selecting the training set and testing set with the same load, the experiments combined varying loads to form the training sets and testing sets to achieve the effect of load migration. Figure 13 shows the fault identification accuracy when one load is used as the training set and a different load as the testing set. With load 1 hp as the training set and load 2 hp as the testing set, the fault identification accuracy reaches 100%. With load 0 hp as the training set and load 3 hp as the testing set, the fault identification accuracy is the worst, but it still reaches 70.67%.

Figure 14 shows the fault identification accuracy when two loads are used as the training set and a different load as the testing set. With load 1 hp and load 3 hp as the training set and load 2 hp as the testing set, the fault identification accuracy reaches 100%. With load 0 hp and load 1 hp as the training set and load 3 hp as the testing set, the fault identification accuracy is the worst, but it still reaches 77.50%. Thus, increasing the number of training set samples can effectively improve fault identification in the testing set.

Figure 15 shows the fault identification accuracy when two loads are used as the training set and the other two loads as the testing set. From the figure, it can be seen that the fault identification accuracy exceeds 80% for every pairing of training set and testing set. Among them, the fault identification accuracy reaches 100% with load 0 hp and load 1 hp as the training set and load 2 hp and load 3 hp as the testing set. The worst fault identification accuracy occurs in two cases: with load 1 hp and load 2 hp as the training set and load 0 hp and load 3 hp as the testing set, and with load 1 hp and load 3 hp as the training set and load 0 hp and load 2 hp as the testing set. Even then it reaches 83.33%. Further, it can be concluded that as the number of samples in the training sets and testing sets increases, the rolling bearing fault identification accuracy also improves significantly. This also shows that the method proposed in this paper has good generalizability.
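The three load migration settings differ only in how the four loads are assigned to the training and testing sides. The following sketch enumerates the possible pairings for each setting; it is a bookkeeping illustration only, and whether every enumerated pairing appears in Figures 13 to 15 is not stated in the text.

```python
from itertools import combinations

loads = [0, 1, 2, 3]  # motor loads in hp

# One load for training, a different load for testing.
one_to_one = [((a,), (b,)) for a in loads for b in loads if a != b]

# Two loads for training, one of the remaining loads for testing.
two_to_one = [(train, (test,))
              for train in combinations(loads, 2)
              for test in loads if test not in train]

# Two loads for training, the other two loads for testing.
two_to_two = [(train, tuple(x for x in loads if x not in train))
              for train in combinations(loads, 2)]
```

Each entry is a `(training loads, testing loads)` pair, so the pairing quoted above with loads 0 hp and 1 hp for training and loads 2 hp and 3 hp for testing corresponds to `((0, 1), (2, 3))`.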

4.2. Case 2: MFPT Fault Dataset

4.2.1. Experiment Setup

The MFPT (Machinery Fault Prevention Technology) [58] bearing fault dataset provided by the American Society for Machinery Fault Prevention Technology is another typical dataset for rolling bearing fault diagnosis and is one of the essential references to verify the performance of the proposed method. The MFPT dataset consists of three sets of experimental bearing vibration data and three actual fault data. Among them, the three sets of experimental bearing vibration data include baseline bearing data, outer race fault data with various loads, and inner race fault data with various loads.

The baseline dataset contains three files, and the data in each file are obtained by sampling at 97,656 sps (samples per second) for 6 s with a load of 270 pounds. There are two outer race fault data sets: one contains three files with the same load, sampling rate, and sampling time as the baseline dataset; the other contains seven files, each obtained by sampling at 48,828 sps for 3 s under one of seven load conditions (25, 50, 100, 150, 200, 250, and 300 pounds). The inner race fault dataset contains seven files, each obtained by sampling at 48,828 sps for 3 s under one of seven load conditions (0, 50, 100, 150, 200, 250, and 300 pounds).

The samples included in the MFPT experimental dataset are shown in Table 12. In this paper, all the vibration data of the bearing test stand are selected and divided into three classes according to the fault states and locations. Class 1 is normal (N), which contains three baseline files; Class 2 is outer race fault (O), which contains three outer race files sampled at 97,656 sps and seven outer race files sampled at 48,828 sps; and Class 3 is inner race fault (I), which contains seven inner race files sampled at 48,828 sps.

To match the other fault data, the baseline data and outer race fault data by sampling at 97,656 sps need to be resampled to 48,828 sps, thus changing the number of data points per file from 585,936 to 292,968. Therefore, the experimental data set consists of the following data points: normal has 878,904 data points, outer race fault has 1,904,292 data points, and inner race fault has 1,025,388 data points.

The data length of each sub-sample is set to $N = 2400$. The first 864,000 sample points in the normal state, the first 1,896,000 sample points in the outer race fault state, and the first 1,020,000 sample points in the inner race fault state were taken, respectively. The bearing vibration signals of the different states were partitioned into non-overlapping segments of length 2400. Each segment is a sub-sample, which gives 360 sub-samples in the normal state, 790 sub-samples in the outer race fault state, and 425 sub-samples in the inner race fault state; these constitute the data matrix of each state.
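The point counts and sub-sample counts quoted for the MFPT data can be verified with a few lines of arithmetic. The sketch uses only figures stated in the text; the variable names are illustrative.

```python
N = 2400                           # points per sub-sample

points_fast = 97_656 * 6           # one file sampled at 97,656 sps for 6 s
points_resampled = points_fast // 2  # same file after resampling to 48,828 sps
points_slow = 48_828 * 3           # one file sampled at 48,828 sps for 3 s

# Available points per class after resampling.
totals = {
    "normal": 3 * points_resampled,
    "outer": 3 * points_resampled + 7 * points_slow,
    "inner": 7 * points_slow,
}

# Points actually used per class, and the resulting sub-sample counts.
used = {"normal": 864_000, "outer": 1_896_000, "inner": 1_020_000}
groups = {name: count // N for name, count in used.items()}
leftover = {name: totals[name] - used[name] for name in used}
```

The `leftover` values are the trailing points that are discarded so that each class length is an exact multiple of 2400.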

The bearing data in each state were randomly divided into the training set and testing set in the ratio of 2:3, and the three classes were labeled with three tags (1,2,3) to represent the different states of the bearing. The specific division is shown in Table 13 below.

4.2.2. Feature Extraction

A set of raw signals for the three bearing states (normal, inner race fault, and outer race fault) was selected arbitrarily. Their time-domain waveforms and corresponding spectra are shown in Figure 16.

4.2.3. Fault Identification

In the fault identification section, the feature vectors of each bearing state extracted in the previous section were formed into a feature sample set. Then, the training set and the testing set were randomly divided in the ratio of 2:3. The PSO-optimized ELM classifier was used to classify the bearing states, and the process was repeated ten times to obtain the average identification accuracy. The identification results are shown in Figure 17, and the identification accuracy can reach 100%. This indicates that the proposed method achieves the expected classification results.

4.2.4. Performance Comparison

As shown in Figure 18, an arbitrary set of the training set and testing set were selected, and the PSO-ELM classifier was used to identify the features extracted by the HFDE, MFDE, RCMFDE, and HRCMFDE algorithms, respectively, as a way to compare the fault identification results of different feature extraction algorithms. As seen from the figure, the accuracy of the HRCMFDE algorithm is the highest among the four feature extraction methods, which can also reach 100%.

Further, to compare the effects of different classifiers on the fault identification results when the feature vectors are the same, a training set and testing set were arbitrarily selected, and the features extracted by the HRCMFDE algorithm were classified using the ELM, KELM, GA-ELM, and PSO-ELM classifiers, respectively. As can be seen from the figure, the classification results of the optimized ELM classifiers are all improved, and the PSO-ELM fault identification algorithm used in this paper has the highest accuracy, with all testing samples correctly classified.

Since the difference in load in the MFPT dataset was not considered, this paper only focused on the diagnosis results of different classification algorithms. The experiments were repeated ten times to take the average identification accuracy and fault identification time for comparison. The results are shown in Table 14. From the table, it can be seen that both GA-ELM and PSO-ELM classifiers can achieve a fault identification accuracy of more than 99%. Unlike the CWRU dataset, the identification accuracy of the PSO-ELM classifier is 99.43%, which is slightly lower than that of the GA-ELM classifier by 0.06%. In a comprehensive analysis, the PSO-ELM classifier still has some advantages in fault identification.

In addition, this paper also compared the identification accuracy of other rolling bearing diagnosis methods using the MFPT dataset for experiments, and the experimental results are shown in Table 15.

Taking the experimental results on the two datasets together, the rolling bearing fault diagnosis method based on the HRCMFDE feature extraction algorithm and the PSO-ELM classification algorithm proposed in this paper has achieved satisfactory results. Both its fault identification accuracy and its fault identification time indicate good application prospects.

5. Conclusions

In this paper, a rolling bearing fault diagnosis method based on HRCMFDE and PSO-ELM is proposed. The HRCMFDE algorithm is proposed for the first time to extract the features of rolling bearing vibration signals. Based on the study of FDE and its variant algorithms, the introduced hierarchical theory algorithm quantifies the high-frequency and low-frequency features of the measured sequence, effectively overcoming the problem of amplitude information loss caused by the coarse-grained process. After extracting the fault features of the vibration signals using the HRCMFDE algorithm, they are inputted to the PSO-optimized ELM classifier for fault identification, which can accurately subdivide different fault types and fault conditions. In this paper, two typical rolling bearing fault datasets are used to verify the effectiveness of the proposed method, and the experimental results show that HRCMFDE has a good descriptive capability for rolling bearing fault features and the proposed fault diagnosis method has satisfactory performance. In addition, this paper further discusses the effectiveness of the proposed method for load migration diagnosis when varying loads are combined to form training sets and testing sets. The results show that the rolling bearing fault diagnosis method proposed in this paper can effectively avoid the algorithm limitations caused by selecting the training set and testing set with the same load, and has a more considerable generalization capability in the application scenarios of the method.

In summary, the rolling bearing fault diagnosis method proposed in this paper has good results. However, the following problems need to be solved in future research. Firstly, it remains a challenge to make the algorithm faster and more efficient by reducing its computational complexity, so that it is more practical. Secondly, the extracted fault features can be filtered by introducing methods such as feature selection, to improve the quality of the features and reduce the computational workload. Finally, the paper only focused on the effect of using the entropy algorithm for rolling bearing fault diagnosis. In the future, we will try to select more novel complexity indexes for comparison and research, such as the optimization method based on the LZC algorithm. Meanwhile, we expect that the HRCMFDE feature extraction method proposed in this paper can be applied not only in the field of rolling bearing fault diagnosis, but also in other fields involving nonlinear and non-stationary feature extraction.

Author Contributions

Y.C. and Z.Y. implemented the algorithm, analyzed the data, and wrote the manuscript. J.C. and K.S. investigated the project and conceived and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work received financial support from the National Natural Science Foundation of China (No. 61803128), Heilongjiang Postdoctoral Foundation (LBH-Z19167), China Postdoctoral Science Foundation (2020M670920) and University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (2020188).

Data Availability Statement

The data presented in this study are openly available in reference number [48,58]. The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

The authors thank the Bearing Data Center of Case Western Reserve University and the American Society for Machinery Fault Prevention Technology for supplying the rolling bearing data sets.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

AAPE	Amplitude aware permutation entropy
ANN	Artificial neural network
ApEn	Approximate entropy
CMFDE	Composite multiscale fluctuation-based dispersion entropy
CNN	Convolutional neural network
CVs	Coefficients of variation
CWRU	Case Western Reserve University
CWT	Continuous wavelet transform
DE	Dispersion entropy
DLZC	Dispersion Lempel-Ziv complexity
EEMD	Ensemble empirical mode decomposition
ELM	Extreme learning machine
EMD	Empirical mode decomposition
FDE	Fluctuation-based dispersion entropy
FDLZC	Fluctuation-based DLZC
GA-ELM	Genetic algorithm optimized extreme learning machine
GRCMFE	Generalized refined composite multiscale fuzzy entropy
GSA-SVM	Gravitational search algorithm optimized support vector machine
HCNN	Hierarchical convolutional neural network
HFDE	Hierarchical fluctuation-based dispersion entropy
HRCMFDE	Hierarchical refined composite multiscale fluctuation-based dispersion entropy
IFBE	Improved frequency band entropy
IMFs	Intrinsic mode functions
KELM	Kernel extreme learning machine
k-NN	K-nearest neighbor
LMD	Local mean decomposition
LSVM	Linear support vector machine
LZC	Lempel-Ziv complexity
MAAPE	Multiscale amplitude aware permutation entropy
MFDE	Multiscale fluctuation-based dispersion entropy
MFPT	Machinery Fault Prevention Technology
M-RVM	Multiclass relevance vector machine
MSE	Multiscale entropy
MSST	Multi-synchro-squeezing transform
NCDF	Normal cumulative distribution function
NPLSSMM	Non-parallel least squares support matrix machine
PCA	Principal component analysis
PE	Permutation entropy
PSO	Particle swarm optimization
PSO-ELM	Particle swarm optimization-based extreme learning machine
RCMFDE	Refined composite multiscale fluctuation-based dispersion entropy
RCMFDLZC	Refined composite multiscale FDLZC
RNN	Recurrent neural network
SD	Standard deviation
SE	Sample entropy
SFC-DL	Sparse feature coding based on dictionary learning
SLFN	Single-hidden layer feedforward neural network
sps	Samples per second
STMSST	Second-order time-reassigned multi-synchro-squeezing transform
SVM	Support vector machines
VMD	Variational mode decomposition
WGN	White Gaussian noise
WPT	Wavelet packet transform
WT	Wavelet transform
WTFD	Wavelet time-frequency diagram

References

Yan, X.; Jia, M. A novel optimized SVM classification algorithm with multi-domain feature and its application to fault diagnosis of rolling bearing. Neurocomputing 2018, 313, 47–64.
Chen, Y.; Zhang, T.; Zhao, W.; Luo, Z.; Lin, H. Rotating machinery fault diagnosis based on improved multiscale amplitude-aware permutation entropy and multiclass relevance vector machine. Sensors 2019, 19, 4542.
Rai, A.; Upadhyay, S. A review on signal processing techniques utilized in the fault diagnosis of rolling element bearings. Tribol. Int. 2016, 96, 289–306.
Liu, R.; Yang, B.; Zio, E.; Chen, X. Artificial intelligence for fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. 2018, 108, 33–47.
Yang, L.; Chen, H. Fault diagnosis of gearbox based on RBF-PF and particle swarm optimization wavelet neural network. Neural Comput. Appl. 2018, 31, 4463–4478.
Li, C.; Zheng, J.; Pan, H.; Tong, J.; Zhang, Y. Refined Composite Multivariate Multiscale Dispersion Entropy and Its Application to Fault Diagnosis of Rolling Bearing. IEEE Access 2019, 7, 47663–47673.
Gao, Y.; Yu, D. Total variation on horizontal visibility graph and its application to rolling bearing fault diagnosis. Mech. Mach. Theory 2020, 147, 103768.
Lempel, A.; Ziv, J. On the Complexity of Finite Sequences. IEEE Trans. Inf. Theory 1976, 22, 75–81.
Jiao, S.; Geng, B.; Li, Y.; Zhang, Q.; Wang, Q.; Li, Y.; Wang, W. Comparative study of feature extraction and classification based on dispersion Lempel-Ziv complexity. Int. J. Numer. Model. Electron. Netw. Devices Fields 2021, 35, e2949.
Mao, X.; Shang, P.; Xu, M.; Peng, C.-K. Measuring time series based on multiscale dispersion Lempel–Ziv complexity and dispersion entropy plane. Chaos Solit. Fract. 2020, 137, 109868.
Li, Y.; Jiao, S.; Geng, B. Refined composite multiscale fluctuation-based dispersion Lempel–Ziv complexity for signal analysis. ISA Trans. 2022, in press.
Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
Wang, S.; Xiang, J.; Tang, H.; Liu, X.; Zhong, Y. Minimum entropy deconvolution based on simulation-determined band pass filter to detect faults in axial piston pump bearings. ISA Trans. 2018, 88, 186–198.
Ali, J.B.; Fnaiech, N.; Saidi, L.; Chebel-Morello, B.; Fnaiech, F. Application of empirical mode decomposition and artificial neural network for automatic bearing fault diagnosis based on vibration signals. Appl. Acoust. 2015, 89, 16–27.
Li, H.; Liu, T.; Wu, X.; Chen, Q. Application of EEMD and improved frequency band entropy in bearing fault feature extraction. ISA Trans. 2019, 88, 170–185.
Liu, H.; Han, M. A fault diagnosis method based on local mean decomposition and multi-scale entropy for roller bearings. Mech. Mach. Theory 2014, 75, 67–78.
Chen, Y.; Zhang, T.; Luo, Z.; Sun, K. A Novel Rolling Bearing Fault Diagnosis and Severity Analysis Method. Appl. Sci. 2019, 9, 2356.
Ding, J.; Xiao, D.; Li, X. Gear Fault Diagnosis Based on Genetic Mutation Particle Swarm Optimization VMD and Probabilistic Neural Network Algorithm. IEEE Access 2020, 8, 18456–18474.
Li, Y.; Wang, X.; Liu, Z.; Liang, X.; Si, S. The Entropy Algorithm and Its Variants in the Fault Diagnosis of Rotating Machinery: A Review. IEEE Access 2018, 6, 66723–66741.
Wang, Z.; Yao, L.; Chen, G.; Ding, J. Modified multiscale weighted permutation entropy and optimized support vector machine method for rolling bearing fault diagnosis with complex signals. ISA Trans. 2021, 114, 470–484.
Pincus, S.M. Approximate entropy (ApEn) as a complexity measure. Chaos Interdiscip. J. Nonlinear Sci. 1995, 5, 110–117.
Richman, J.S.; Moorman, J.R. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. 2000, 278, H2039–H2049.
Bandt, C.; Pompe, B. Permutation Entropy: A Natural Complexity Measure for Time Series. Phys. Rev. Lett. 2002, 88, 174102.
Azami, H.; Escudero, J. Amplitude-aware permutation entropy: Illustration in spike detection and signal segmentation. Comput. Methods Programs Biomed. 2016, 128, 40–51.
Chen, Y.; Zhang, T.; Zhao, W.; Luo, Z.; Sun, K. Fault Diagnosis of Rolling Bearing Using Multiscale Amplitude-Aware Permutation Entropy and Random Forest. Algorithms 2019, 12, 184.
Rostaghi, M.; Azami, H. Dispersion Entropy: A Measure for Time-Series Analysis. IEEE Signal Process. Lett. 2016, 23, 610–614.
Azami, H.; Escudero, J. Amplitude- and Fluctuation-Based Dispersion Entropy. Entropy 2018, 20, 210. [Google Scholar] [CrossRef]
Azami, H.; Arnold, S.E.; Sanei, S.; Chang, Z.; Sapiro, G.; Escudero, J.; Gupta, A.S. Multiscale fluctuation-based dispersion entropy and its applications to neurological dis-eases. IEEE Access 2019, 7, 68718–68733. [Google Scholar] [CrossRef]
Wu, S.D.; Wu, C.W.; Lee, K.Y.; Lin, S.G. Modified multiscale entropy for short-term time series analysis. Phys. A Stat. Mech. Its Appl. 2013, 392, 5865–5873. [Google Scholar] [CrossRef]
Gan, X.; Lu, H.; Yang, G. Fault Diagnosis Method for Rolling Bearings Based on Composite Multiscale Fluctuation Dispersion Entropy. Entropy 2019, 21, 290. [Google Scholar] [CrossRef]
Zhou, F.; Yang, X.; Shen, J.; Liu, W. Fault Diagnosis of Hydraulic Pumps Using PSO-VMD and Refined Composite Multiscale Fluctuation Dispersion Entropy. Shock Vib. 2020, 2020, 8840676. [Google Scholar] [CrossRef]
Jiang, Y.; Peng, C.-K.; Xu, Y. Hierarchical entropy analysis for biological signals. J. Comput. Appl. Math. 2011, 236, 728–742. [Google Scholar] [CrossRef]
Ke, Y.; Yao, C.; Song, E.; Dong, Q.; Yang, L. An early fault diagnosis method of common-rail injector based on improved CYCBD and hierarchical fluctuation dispersion entropy. Digit. Signal Process. 2021, 114, 103049. [Google Scholar] [CrossRef]
Tian, J.; Morillo, C.; Azarian, M.H.; Pecht, M. Motor Bearing Fault Detection Using Spectral Kurtosis-Based Feature Extraction Coupled With K-Nearest Neighbor Distance Analysis. IEEE Trans. Ind. Electron. 2015, 63, 1793–1803. [Google Scholar] [CrossRef]
Zheng, J.; Pan, H.; Tong, J.; Liu, Q. Generalized refined composite multiscale fuzzy entropy and multi-cluster feature selection based intelligent fault diagnosis of rolling bearing. ISA Trans. 2021, 123, 136–151. [Google Scholar] [CrossRef]
Wen, L.; Li, X.; Gao, L. A New Two-Level Hierarchical Diagnosis Network Based on Convolutional Neural Network. IEEE Trans. Instrum. Meas. 2019, 69, 330–338. [Google Scholar] [CrossRef]
Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501. [Google Scholar] [CrossRef]
Tian, Y.; Ma, J.; Lu, C.; Wang, Z. Rolling bearing fault diagnosis under variable conditions using LMD-SVD and extreme learning machine. Mech. Mach. Theory 2015, 90, 175–186. [Google Scholar] [CrossRef]
Mao, W.; He, L.; Yan, Y.; Wang, J. Online sequential prediction of bearings imbalanced fault diagnosis by extreme learning machine. Mech. Syst. Signal Process. 2017, 83, 450–473. [Google Scholar] [CrossRef]
Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Net-works, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
Langdon, W.B.; Poli, R. Evolving Problems to Learn About Particle Swarm Optimizers and Other Search Algorithms. IEEE Trans. Evol. Comput. 2007, 11, 561–578. [Google Scholar] [CrossRef]
Han, F.; Yao, H.-F.; Ling, Q.-H. An improved evolutionary extreme learning machine based on particle swarm optimization. Neurocomputing 2013, 116, 87–93. [Google Scholar] [CrossRef]
Chen, S.; Shang, Y.; Wu, M. Application of PSO-ELM in electronic system fault diagnosis. In Proceedings of the 2016 IEEE International Con-ference on Prognostics and Health Management (ICPHM), Ottawa, ON, Canada, 20–22 June 2016; pp. 1–5. [Google Scholar]
He, C.; Wu, T.; Gu, R.; Jin, Z.; Ma, R.; Qu, H. Rolling bearing fault diagnosis based on composite multiscale permutation entropy and reverse cog-nitive fruit fly optimization algorithm–extreme learning machine. Measurement 2021, 173, 108636. [Google Scholar] [CrossRef]
Yan, X.; Liu, Y.; Huang, D.; Jia, M. A new approach to health condition identification of rolling bearing using hierarchical dispersion entropy and improved Laplacian score. Struct. Health Monit. 2020, 20, 1169–1195. [Google Scholar] [CrossRef]
Zhou, F.; Han, J.; Yang, X. Multivariate hierarchical multiscale fluctuation dispersion entropy: Applications to fault diagnosis of rotating machinery. Appl. Acoust. 2021, 182, 108271. [Google Scholar] [CrossRef]
Rao, D.; Shi, X.; Zhou, J.; Yu, Z.; Gou, Y.; Dong, Z.; Zhang, J. An Expert Artificial Intelligence Model for Discriminating Microseismic Events and Mine Blasts. Appl. Sci. 2021, 11, 6474. [Google Scholar] [CrossRef]
The Case Western Reserve University Bearing Data Center. Available online: https://csegroups.case.edu/bearingdatacenter/pages/download-data-file (accessed on 17 June 2021).
Huang, G.B.; Zhou, H.; Ding, X.; Zhang, R. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2011, 42, 513–529. [Google Scholar] [CrossRef]
Whitley, D. A genetic algorithm tutorial. Stat. Comput. 1994, 4, 65–85. [Google Scholar] [CrossRef]
Zheng, J.; Dong, Z.; Pan, H.; Ni, Q.; Liu, T.; Zhang, J. Composite multi-scale weighted permutation entropy and extreme learning machine based intelligent fault diagnosis for rolling bearing. Measurement 2019, 143, 69–80. [Google Scholar] [CrossRef]
Yang, Z.; Ce, L.; Lian, L. Electricity price forecasting by a hybrid model, combining wavelet transform, ARMA and kernel-based extreme learning machine methods. Appl. Energy 2017, 190, 291–305. [Google Scholar] [CrossRef]
Meng, L.; Liu, M.; Wei, P.; Qin, H. Rolling Bearing Fault Diagnosis Based on Improved VMD And GA-ELM. In Proceedings of the 2021 40th Chinese Control Conference (CCC), Shanghai, China, 26–28 July 2021; pp. 4414–4419. [Google Scholar]
Li, X.; Yang, Y.; Pan, H.; Cheng, J.; Cheng, J. Non-parallel least squares support matrix machine for rolling bearing fault diagnosis. Mech. Mach. Theory 2019, 145, 103676. [Google Scholar] [CrossRef]
Chen, P.; Zhao, X.; Zhu, Q. A novel classification method based on ICGOA-KELM for fault diagnosis of rolling bearing. Appl. Intell. 2020, 50, 2833–2847. [Google Scholar] [CrossRef]
Yuan, L.; Lian, D.; Kang, X.; Chen, Y.; Zhai, K. Rolling Bearing Fault Diagnosis Based on Convolutional Neural Network and Support Vector Machine. IEEE Access 2020, 8, 137395–137406. [Google Scholar] [CrossRef]
Liu, X.; Chen, Y.; Yang, J. A novel fault diagnosis method for rolling bearing based on EEMD-PE and multiclass relevance vector machine. In Proceedings of the 2017 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Turin, Italy, 22–25 May 2017; pp. 1–6. [Google Scholar]
Machinery Failure Prevention Technology (MFPT). Available online: https://mfpt.org/fault-data-sets (accessed on 26 October 2021).
Zuo, L.; Zhang, L.; Zhang, Z.-H.; Luo, X.-L.; Liu, Y. A spiking neural network-based approach to bearing fault diagnosis. J. Manuf. Syst. 2020, 61, 714–724. [Google Scholar] [CrossRef]
Wang, Z.; Shangguan, W.; Peng, C.; Cai, B. A fault diagnosis method based on data feature reconstruction and deep transfer learning. In Proceedings of the 2022 IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Dalian, China, 14–16 April 2022; pp. 1–5. [Google Scholar]
Sun, G.; Gao, Y.; Lin, K.; Hu, Y. Fine-Grained Fault Diagnosis Method of Rolling Bearing Combining Multisynchrosqueezing Transform and Sparse Feature Coding Based on Dictionary Learning. Shock Vib. 2019, 2019, 1531079. [Google Scholar] [CrossRef]
Sun, G.; Gao, Y.; Xu, Y.; Feng, W. Data-Driven Fault Diagnosis Method Based on Second-Order Time-Reassigned Multisynchrosqueezing Transform and Evenly Mini-Batch Training. IEEE Access 2020, 8, 120859–120869. [Google Scholar] [CrossRef]

Figure 1. The schematic diagram of hierarchical decomposition at layer n = 3.

Figure 2. The calculation flowchart of HRCMFDE.

Figure 3. Time domain waveforms and spectrograms of WGN and 1/f noise.

Figure 4. Mean and SD of HRCMFDE for WGN and 1/f noise at different m.

Figure 5. Mean and SD of HRCMFDE for WGN and 1/f noise at different c.

Figure 6. Four different entropy values for WGN and 1/f noise: (a,b) HFDE and HRCMFDE; (c,d) MFDE and RCMFDE.

Figure 7. The network structure of ELM.

Figure 8. The algorithm flow chart of PSO-ELM.

Figure 9. The flowchart of the rolling bearing fault diagnosis method based on HRCMFDE and PSO-ELM.

Figure 10. The time domain and spectral diagram of rolling bearing vibration signals with different types in the CWRU dataset.

Figure 11. The fault identification results of the proposed method with load 2 hp in the CWRU dataset.

Figure 12. The fault identification results of different feature extraction methods and PSO-ELM with load 0 hp in the CWRU dataset.

Figure 13. The fault identification accuracy of one load as the training set and another different load as the testing set in the CWRU dataset.

Figure 14. The fault identification accuracy of two loads as the training set and another different load as the testing set in the CWRU dataset.

Figure 15. The fault identification accuracy of two loads as the training set and another two varying loads as the testing set in the CWRU dataset.

Figure 16. The time domain and spectral diagram of rolling bearing vibration signals with different types in the MFPT dataset.

Figure 17. The fault identification results of the proposed method in the MFPT dataset.

Figure 18. The fault identification results of different feature extraction methods and PSO-ELM in the MFPT dataset.

Table 1. CVs of each node of WGN for m at the three cases.

n	1	2	3	4	5	6	7	8
m = 2	0.0211	0.0042	0.0066	0.0051	0.0140	0.0134	0.0105	0.0036
m = 3	0.0217	0.0045	0.0070	0.0057	0.0156	0.0132	0.0096	0.0038
m = 4	0.0211	0.0037	0.0067	0.0065	0.0169	0.0137	0.0101	0.0042

Table 2. CVs of each node of 1/f noise for m at the three cases.

n	1	2	3	4	5	6	7	8
m = 2	0.0951	0.0061	0.0045	0.0048	0.0112	0.0162	0.0116	0.0112
m = 3	0.0972	0.0063	0.0044	0.0051	0.0115	0.0175	0.0134	0.0121
m = 4	0.0970	0.0052	0.0045	0.0055	0.0115	0.0180	0.0140	0.0129

Table 3. CVs of each node of WGN for c at the three cases.

n	1	2	3	4	5	6	7	8
c = 3	0.0301	0.0061	0.0102	0.0074	0.0950	0.1109	0.4438	0.4776
c = 6	0.0211	0.0042	0.0066	0.0051	0.0140	0.0134	0.0105	0.0036
c = 9	0.0176	0.0038	0.0061	0.0047	0.0108	0.0115	0.0127	0.0123

Table 4. CVs of each node of 1/f noise for c at the three cases.

n	1	2	3	4	5	6	7	8
c = 3	0.1147	0.0085	0.0096	0.0062	0.0259	0.0533	0.1814	0.3336
c = 6	0.0951	0.0061	0.0045	0.0048	0.0112	0.0162	0.0116	0.0112
c = 9	0.0492	0.0046	0.0056	0.0037	0.0131	0.0208	0.0123	0.0119
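
Tables 1 to 4 report the coefficient of variation (CV) of the entropy value at each hierarchical node. The following is a minimal sketch, assuming the usual definition CV = SD / mean; the use of the population SD and the helper name `coefficient_of_variation` are assumptions made for illustration and are not taken from the paper. It also summarizes the worst-case CV per row of Table 3.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return SD / mean; using the population SD is an assumption."""
    return pstdev(values) / mean(values)

# Illustrative numbers only (not from the paper): mean 5, population SD 2.
demo_cv = coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# CVs copied from Table 3 (WGN, nodes 1..8) for each number of classes c.
table3 = {
    3: [0.0301, 0.0061, 0.0102, 0.0074, 0.0950, 0.1109, 0.4438, 0.4776],
    6: [0.0211, 0.0042, 0.0066, 0.0051, 0.0140, 0.0134, 0.0105, 0.0036],
    9: [0.0176, 0.0038, 0.0061, 0.0047, 0.0108, 0.0115, 0.0127, 0.0123],
}

# Largest CV over the eight nodes for each c.
worst_cv = {c: max(cvs) for c, cvs in table3.items()}
```

Read this way, c = 3 stands out with a worst-case CV of 0.4776 at node 8, whereas c = 6 and c = 9 stay below 0.03 at every node.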

Table 5. Parameter selection for HFDE, MFDE, RCMFDE, and HRCMFDE.

Entropy	Embedding Dimension	Number of Classes	Time Delay	Number of Hierarchical Layers	Maximum Scale Factor
HFDE	m = 2	c = 6	d = 1	n = 3	\
MFDE	m = 2	c = 6	d = 1	\	τ_max = 20
RCMFDE	m = 2	c = 6	d = 1	\	τ_max = 20
HRCMFDE	m = 2	c = 6	d = 1	n = 3	τ_max = 20

Table 6. Samples included in the CWRU experimental dataset.

Fault Types	Severity (Inch)	Load 0 hp	Load 1 hp	Load 2 hp	Load 3 hp
Normal	-	√	√	√	√
Ball fault	0.007	√	√	√	√
	0.014	√	√	√	√
	0.021	√	√	√	√
	0.028	√	√	√	√
Inner race fault	0.007	√	√	√	√
	0.014	√	√	√	√
	0.021	√	√	√	√
	0.028	√	√	√	√
Outer race fault	0.007	√	√	√	√
	0.014	√	√	√	√
	0.021	√	√	√	√
	0.028	*	*	*	*

Table 7. Sample division and label details of the training set and testing set in the CWRU dataset.

Labels	Fault Types	Abbreviations	Severity (inch)	Number of Training/Testing Samples
1	Normal	N	-	20/30
2	Ball fault	B007	0.007	20/30
3		B014	0.014	20/30
4		B021	0.021	20/30
5		B028	0.028	20/30
6	Inner race fault	I007	0.007	20/30
7		I014	0.014	20/30
8		I021	0.021	20/30
9		I028	0.028	20/30
10	Outer race fault	O007	0.007	20/30
11		O014	0.014	20/30
12		O021	0.021	20/30

Table 8. The accuracy of bearing fault classification with varying loads in the CWRU dataset.

Load	Fault Types	Number of Training Samples	Number of Testing Samples	Average Accuracy
0 hp	N / B / IR / OR	20 / 80 / 80 / 60	30 / 120 / 120 / 90	99.86%
1 hp	N / B / IR / OR	20 / 80 / 80 / 60	30 / 120 / 120 / 90	99.78%
2 hp	N / B / IR / OR	20 / 80 / 80 / 60	30 / 120 / 120 / 90	100%
3 hp	N / B / IR / OR	20 / 80 / 80 / 60	30 / 120 / 120 / 90	100%

Table 9. The identification accuracy of different feature extraction algorithms with varying loads in the CWRU dataset.

Feature Extraction	Fault Identification	Load 0 hp	Load 1 hp	Load 2 hp	Load 3 hp	Average Accuracy
HFDE [33]	PSO-ELM	99.72%	100%	100%	99.81%	99.88%
MFDE [28]	PSO-ELM	94.03%	96.69%	99.61%	99.42%	97.44%
RCMFDE [31]	PSO-ELM	98.36%	99.39%	100%	99.92%	99.42%
HRCMFDE	PSO-ELM	99.86%	99.78%	100%	100%	99.91%
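
The "Average Accuracy" column of Table 9 can be checked against the per-load columns. The sketch below assumes the average is the unweighted mean over the four loads; that averaging rule is an assumption, although it reproduces the reported values to two decimals.

```python
# Per-load accuracies (%) copied from Table 9, loads 0 to 3 hp.
per_load = {
    "HFDE": [99.72, 100.0, 100.0, 99.81],
    "MFDE": [94.03, 96.69, 99.61, 99.42],
    "RCMFDE": [98.36, 99.39, 100.0, 99.92],
    "HRCMFDE": [99.86, 99.78, 100.0, 100.0],
}
# "Average Accuracy" column as reported in Table 9.
reported = {"HFDE": 99.88, "MFDE": 97.44, "RCMFDE": 99.42, "HRCMFDE": 99.91}

# Unweighted mean over the four loads (assumed averaging rule).
averages = {name: sum(acc) / len(acc) for name, acc in per_load.items()}

for name, avg in averages.items():
    # Each recomputed mean agrees with the table to within rounding.
    assert abs(avg - reported[name]) < 0.005, name

best = max(averages, key=averages.get)
```

Under this assumption `best` is HRCMFDE, with a margin of only about 0.03 percentage points over HFDE.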

Table 10. The accuracy and identification time for different fault identification algorithms with varying loads in the CWRU dataset.

Num.	Feature Extraction	Fault Identification	Accuracy (%)	Training Time (s)	Testing Time (s)
1	HRCMFDE	ELM [51]	99.56 ± 0.53	0.0285	0.0085
2	HRCMFDE	KELM [52]	97.94 ± 2.87	0.0046	0.0013
3	HRCMFDE	GA-ELM [53]	99.86 ± 0.16	231.3569	28.6741
4	HRCMFDE	PSO-ELM	99.91 ± 0.11	11.3320	1.2427
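
The accuracy and timing trade-off in Table 10 is easier to read as ratios. The sketch below only divides the values listed in the table; the dictionary layout and the helper name `ratio` are illustrative choices.

```python
# (mean accuracy %, training time s, testing time s) copied from Table 10.
table10 = {
    "ELM": (99.56, 0.0285, 0.0085),
    "KELM": (97.94, 0.0046, 0.0013),
    "GA-ELM": (99.86, 231.3569, 28.6741),
    "PSO-ELM": (99.91, 11.3320, 1.2427),
}

def ratio(a, b, column):
    """Ratio of one table column between classifiers a and b."""
    return table10[a][column] / table10[b][column]

train_speedup = ratio("GA-ELM", "PSO-ELM", 1)   # GA-ELM vs PSO-ELM training time
test_speedup = ratio("GA-ELM", "PSO-ELM", 2)    # GA-ELM vs PSO-ELM testing time
train_overhead = ratio("PSO-ELM", "ELM", 1)     # PSO-ELM vs plain ELM training time
accuracy_gain = table10["PSO-ELM"][0] - table10["ELM"][0]
```

On these numbers PSO-ELM trains roughly 20 times faster than GA-ELM, but roughly 400 times slower than plain ELM, for a gain of about 0.35 percentage points in mean accuracy over ELM.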

Table 11. The identification accuracy of other rolling bearing diagnosis methods in the CWRU dataset.

Literature	Feature Extraction	Fault Identification	Number of Classes	Average Accuracy (%)
[54]	WTFD	NPLSSMM	10	99.64
[55]	VMD+MPE	KPCA+CGOA-KELM	4	99.67
[56]	CWT	CNN-SVM	12	98.75
[25]	MAAPE	RF	10	96.00
[57]	EEMD+PE	M-RVM	4	99.58
[11]	RCMFDLZC	DAC	12	96.08
This paper	HRCMFDE	PSO-ELM	12	99.91

Table 12. Samples included in the MFPT experimental dataset.

Fault Classes	Fault Types	Load (lb)	Sample Rate (sps)	Sample Time (s)	Data Points
Normal	Baseline	270	97,656	6	585,936
Outer Race Fault	Outer Race Fault	270	97,656	6	585,936
	More Outer Race Fault	25	48,828	3	146,484
		50	48,828	3	146,484
		100	48,828	3	146,484
		150	48,828	3	146,484
		200	48,828	3	146,484
		250	48,828	3	146,484
		300	48,828	3	146,484
Inner Race Fault	Inner Race Fault	0	48,828	3	146,484
		50	48,828	3	146,484
		100	48,828	3	146,484
		150	48,828	3	146,484
		200	48,828	3	146,484
		250	48,828	3	146,484
		300	48,828	3	146,484
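
The "Data Points" column of Table 12 follows from the sample rate and the recording time. A short consistency check, using only the values in the table:

```python
# (sample rate in samples/s, duration in s, data points) as listed in Table 12.
recordings = [
    (97_656, 6, 585_936),  # baseline and 270 lb outer race fault recordings
    (48_828, 3, 146_484),  # remaining outer and inner race fault recordings
]

# Data points should equal sample rate times recording time.
consistent = all(rate * seconds == points for rate, seconds, points in recordings)
```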

Table 13. Sample division and label details of the training set and testing set in the MFPT dataset.

Labels	Fault Classes	Number of Total Samples	Number of Training Samples	Number of Testing Samples
1	Normal	360	144	216
2	Outer Race Fault	790	316	474
3	Inner Race Fault	425	170	255
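
The sample counts in Table 13 correspond to a 40/60 split of each class into training and testing sets. The sketch below reproduces those counts; the random shuffle, the fixed seed, and the helper name `split_indices` are assumptions, since this part of the paper does not state how individual samples were assigned.

```python
import random

TRAIN_NUM, TRAIN_DEN = 2, 5  # 40% training share, inferred from Table 13

def split_indices(total, seed=0):
    """Shuffle sample indices and cut them 40/60; the shuffle is an assumption."""
    indices = list(range(total))
    random.Random(seed).shuffle(indices)
    n_train = total * TRAIN_NUM // TRAIN_DEN  # integer arithmetic avoids float rounding
    return indices[:n_train], indices[n_train:]

# Total samples per class, copied from Table 13.
totals = {"Normal": 360, "Outer Race Fault": 790, "Inner Race Fault": 425}

# (training size, testing size) per class.
sizes = {
    name: tuple(len(part) for part in split_indices(n))
    for name, n in totals.items()
}
```

The resulting sizes match the training and testing columns of Table 13, and the same 40/60 ratio appears in the 20/30 division of Table 7.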

Table 14. The accuracy and identification time for different fault identification algorithms in the MFPT dataset.

Num.	Feature Extraction	Fault Identification	Accuracy (%)	Training Time (s)	Testing Time (s)
1	HRCMFDE	ELM	97.90 ± 0.49	0.0318	0.0198
2	HRCMFDE	KELM	97.71 ± 0.49	0.0371	0.0293
3	HRCMFDE	GA-ELM	99.49 ± 0.48	318.0357	150.4896
4	HRCMFDE	PSO-ELM	99.43 ± 0.38	17.6277	5.1201

Table 15. The identification accuracy of other rolling bearing diagnosis methods in the MFPT dataset.

Literature	Feature Extraction	Fault Identification	Number of Classes	Average Accuracy (%)
[56]	CWT	CNN-SVM	3	98.89
[59]	LMD	SNN	3	99.31
[60]	WT	IGoogLeNet	3	99.40
[61]	MSST+SFC-DL	LSVM	3	95.83
[62]	STMSST	CNN	3	98.67
[11]	RCMFDLZC	DAC	3	96.05
This paper	HRCMFDE	PSO-ELM	3	99.43

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, Y.; Yuan, Z.; Chen, J.; Sun, K. A Novel Fault Diagnosis Method for Rolling Bearing Based on Hierarchical Refined Composite Multiscale Fluctuation-Based Dispersion Entropy and PSO-ELM. Entropy 2022, 24, 1517. https://doi.org/10.3390/e24111517

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
