Gearbox Fault Diagnosis Based on Hierarchical Instantaneous Energy Density Dispersion Entropy and Dynamic Time Warping

The accurate fault diagnosis of gearboxes is of great significance for ensuring safe and efficient operation of rotating machinery. This paper develops a novel fault diagnosis method based on hierarchical instantaneous energy density dispersion entropy (HIEDDE) and dynamic time warping (DTW). Specifically, the instantaneous energy density (IED) analysis based on singular spectrum decomposition (SSD) and Hilbert transform (HT) is first applied to the vibration signal of gearbox to acquire the IED signal, which is designed to reinforce the fault feature of the signal. Then, the hierarchical dispersion entropy (HDE) algorithm developed in this paper is used to quantify the complexity of the IED signal to obtain the HIEDDE as fault features. Finally, the DTW algorithm is employed to recognize the fault types automatically. The validity of the two parts that make up the HIEDDE algorithm, i.e., the IED analysis for fault features enhancement and the HDE algorithm for quantifying the information of signals, is numerically verified. The proposed method recognizes the fault patterns of the experimental data of gearbox accurately and exhibits advantages over the existing methods such as multi-scale dispersion entropy (MDE) and refined composite MDE (RCMDE).


Introduction
As an important component for transmitting energy and adjusting speed, the gearbox is used in a wide variety of industrial machinery, such as wind turbines, aircraft engines, and automobiles [1,2]. Its failure is one of the main causes of machinery downtime [3]. Thus, accurate fault diagnosis of gearboxes is essential.
Generally, the existing fault diagnosis approaches can be classified into two types: model-based and data-driven [4][5][6]. The model-based approach depends on extensive knowledge of the system and its dynamics [6]. The data-driven approach does not require such exhaustive knowledge, which makes it more conducive to intelligent diagnosis. Thus, data-driven approaches have been widely used in gearbox fault diagnosis. Fault feature extraction is a key step in data-driven fault diagnosis methods and has a decisive impact on the final result [7]. At present, the typical and mature fault feature extraction algorithms are mainly based on processing measurements such as vibration signals [8][9][10], acoustic signals [11,12], and temperature [13]. Among them, the algorithms based on vibration signal processing are favored by researchers because of their obvious advantages, such as low sensor price, a simple signal acquisition process, and the rich information contained in the signal. The complicated transmission path and running environment contribute to the nonlinear-nonstationary property and noise interference of the gearbox vibration signal, which increases the difficulty of fault required to form the template signal [45]. A novel gearbox fault diagnosis framework based on HIEDDE and DTW is formulated. Finally, the proposed fault diagnosis framework is tested by the experimental data of gearbox.
The outline of this study is as follows. In Section 2, the details of the proposed HIEDDE algorithm based on the IED analysis and the HDE algorithm are introduced. Section 3 briefly illustrates the principle of DTW. Section 4 introduces the specific implementation procedures of the proposed method. Section 5 presents the experimental test. Finally, a few conclusions are presented in Section 6.

IED Analysis
The vibration signal of gearbox is usually accompanied by interferences. Thus, a novel IED analysis algorithm based on SSD and HT is introduced to reinforce the fault feature signal. The SSD algorithm is used to obtain the useful potential components. Then, HT is employed to detect the time-frequency information of these components. Finally, the IED signal is obtained based on the time-frequency information.

SSD
SSD is a parameter-less signal-processing algorithm that decomposes a signal adaptively. It is regarded as a modified version of singular spectrum analysis (SSA), as it selects the embedding dimension adaptively and groups the principal components automatically. The SSD decomposition of a composite signal is mainly composed of four steps as follows [29].
Step 1: Construction of the trajectory matrix.
For a discrete signal a(n) that has N samples, $a(n) = (a_1, a_2, \ldots, a_N)$, the form of its trajectory matrix A is closely related to the selection of the embedding dimension. If the embedding dimension is selected as K, A will be a K × N matrix, whose k-th row is $a_k = (a_k, \ldots, a_N, a_1, \ldots, a_{k-1})$. That is, $A = [a_1^T, a_2^T, \cdots, a_K^T]^T$. How to choose the embedding dimension reasonably is a difficult problem of the original SSA algorithm. The following procedures are adopted in SSD to adaptively choose the embedding dimension at iteration j: First, calculate the power spectral density (PSD) of the residual component $v_j(n)$ for iteration j, $v_j(n) = a(n) - \sum_{k=1}^{j-1} v_k(n)$ ($v_0(n) = a(n)$). Then, the frequency $f_{max}$ at the maximum peak of the calculated PSD is identified. When j = 1, the residual component is regarded as a trend item if $f_{max}$ is smaller than a given threshold, which is set as $0.01F_s$ in this paper ($F_s$ represents the sampling frequency). In this situation, K is recommended as N/3. In other situations, $K = 1.2F_s/f_{max}$.
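The embedding-dimension rule above can be illustrated with a minimal sketch. It assumes a plain FFT periodogram as the PSD estimate and ignores the DC bin when searching for the peak; the function name `embedding_dimension` and both of these choices are illustrative assumptions, not details given in the paper.

```python
import numpy as np

def embedding_dimension(v, fs, is_first_iteration, trend_ratio=0.01):
    """Pick the SSD embedding dimension K from the dominant PSD peak of v.

    Assumptions (not from the paper): the PSD is a raw FFT periodogram
    and the DC bin is skipped so that f_max is always positive.
    """
    v = np.asarray(v, dtype=float)
    n = len(v)
    psd = np.abs(np.fft.rfft(v)) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    peak_bin = 1 + int(np.argmax(psd[1:]))      # skip the DC bin
    f_max = freqs[peak_bin]
    # Trend item: only tested at the first iteration, threshold 0.01 * Fs.
    if is_first_iteration and f_max < trend_ratio * fs:
        return n // 3, f_max
    return int(round(1.2 * fs / f_max)), f_max

fs = 1000
t = np.arange(1000) / fs
k_osc, f_osc = embedding_dimension(np.sin(2 * np.pi * 50 * t), fs, True)
k_trend, f_trend = embedding_dimension(np.sin(2 * np.pi * 2 * t), fs, True)
k_late, _ = embedding_dimension(np.sin(2 * np.pi * 2 * t), fs, False)
```

In this toy case the 50 Hz component gives K = 1.2 × 1000/50 = 24, whereas the 2 Hz component is treated as a trend at the first iteration and gives K = N/3.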
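Step 2: SVD of the trajectory matrix.
The trajectory matrix A obtained in Step 1 carries the mixed information of all component signals. Thus, it is subjected to singular value decomposition (SVD):
$$A = P \Sigma Q^{T} = \sum_{i=1}^{K} \alpha_i p_i q_i^{T} \quad (1)$$
where $P \in R^{K \times K}$, $\Sigma \in R^{K \times N}$, $Q \in R^{N \times N}$, $\{\alpha_1, \alpha_2, \ldots, \alpha_K\}$ are the singular values in matrix $\Sigma$, $p_i$ and $q_i$ are the i-th columns of P and Q, and $A_i = \alpha_i p_i q_i^{T}$ represents the i-th principal component of A.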
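As a small illustration of Steps 1 and 2, the sketch below builds the wrapped K × N trajectory matrix and splits it into rank-one principal components with NumPy. The helper names are hypothetical, and the test signal (a sinusoid with a whole number of periods) is chosen only so that the result is easy to check.

```python
import numpy as np

def trajectory_matrix(a, K):
    """K x N wrapped trajectory matrix: row k is a circularly shifted by k samples."""
    a = np.asarray(a, dtype=float)
    return np.stack([np.roll(a, -k) for k in range(K)])

def principal_components(A):
    """Singular values and rank-one matrices A_i = alpha_i * p_i * q_i^T."""
    P, alpha, Qt = np.linalg.svd(A, full_matrices=False)
    comps = [alpha[i] * np.outer(P[:, i], Qt[i]) for i in range(len(alpha))]
    return alpha, comps

a = np.sin(2 * np.pi * 0.05 * np.arange(40))   # exactly two periods
A = trajectory_matrix(a, K=8)
alpha, comps = principal_components(A)
reconstructed = sum(comps)                      # the components sum back to A
```

Because every row is a shifted copy of the same sinusoid, only two singular values are significantly non-zero here, which is why grouping principal components by frequency can isolate a single oscillatory component.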
Step 3: Principal components grouping and reconstruction of the j-th SSC.
As represented in Equation (1), the trajectory matrix is divided into K principal components by SVD. The key to the successful separation of a potential component is to sort these principal components reasonably. The following criterion is used in SSD for selecting principal components to reconstruct the SSC at the j-th iteration: For j = 1, if a trend item is identified, only the first principal component A 1 is chosen as the useful principal component, and the first SSC is retrieved through the diagonal averaging of A 1 .
Otherwise, and for j > 1, a subset of principal components, whose left eigenvectors show a primary frequency in $[f_{max} - f_1, f_{max} + f_1]$ (where $f_1$ is estimated through the Gaussian interpolation of the PSD of $v_j(n)$), are combined to form a matrix A'. The j-th SSC is retrieved through the diagonal averaging of A'.
Step 4: Stopping criterion.
Once a component $SSC_j(n)$ is retrieved, the corresponding residual signal can be obtained as:
$$v_{j+1}(n) = v_j(n) - SSC_j(n)$$
The residual $v_{j+1}(n)$ is then employed as the input to retrieve the (j + 1)-th SSC. If the result of the equation below is smaller than a given threshold, the iteration will be stopped.

IED Calculation Based on SSD and HT
After the SSD process, the composite signal a(t) can be expressed as the sum of k SSCs and a residual signal:
$$a(t) = \sum_{j=1}^{k} SSC_j(t) + r(t)$$
where $SSC_j(t)$ represents the j-th SSC and r(t) is the residual signal. Previous studies have indicated that SSD has stronger signal decomposition capability than the traditional signal decomposition approaches such as EMD and EEMD [30][31][32]. Thus, it is integrated with HT to perform the time-frequency analysis. HT is used to estimate the instantaneous amplitude (IA) and instantaneous frequency (IF) of the decomposed components by referring to Reference [18]. The SSD-HT spectrogram can be constructed by lumping the IAs and IFs of all SSCs:
$$H(\omega, t) = \mathrm{Re} \sum_{j=1}^{k} a_j(t) \exp\left(i \int \omega_j(t)\, dt\right)$$
where k represents the number of the decomposed components, and $a_j(t)$ and $\omega_j(t)$ are the IA and IF of the j-th SSC, respectively. We can further obtain the IED signal with the following form:
$$IED(t) = \int_{\omega} H^{2}(\omega, t)\, d\omega$$
The IED reflects the energy fluctuation of the signal over time. However, some of the decomposed SSCs may be false components because the vibration signal of gearbox contains interferences. In this paper, the correlation coefficient between the original signal a(t) and the decomposed components is used to discard the false components [46]:
$$\rho_j = \frac{COV(SSC_j(t), a(t))}{\sigma_j \sigma_a}$$
where $\rho_j$ is the correlation coefficient between a(t) and the j-th SSC, $COV(SSC_j(t), a(t))$ represents the covariance of $SSC_j(t)$ and a(t), and $\sigma_j$ and $\sigma_a$, respectively, correspond to the standard deviations of $SSC_j(t)$ and a(t).
If $\rho_j$ is larger than a threshold (which is set to $0.2 \max(\rho_1, \rho_2, \ldots, \rho_k)$ in this paper), $SSC_j(t)$ is seen as a real component. All the real SSCs are selected to calculate the IED to enhance the fault signatures of gearbox.
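The component selection and IED steps can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the analytic signal is computed with a standard FFT-based Hilbert transform, and the frequency integral of the squared spectrogram is approximated at each instant by the sum of squared instantaneous amplitudes (which assumes the components occupy distinct frequency bins). All function names are hypothetical.

```python
import numpy as np

def select_real_components(a, sscs, ratio=0.2):
    """Flag SSCs whose correlation with a exceeds ratio * max(rho)."""
    rho = np.array([np.corrcoef(a, s)[0, 1] for s in sscs])
    return rho, rho > ratio * rho.max()

def analytic_signal(x):
    """FFT-based analytic signal (the usual discrete Hilbert transform construction)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * h)

def instantaneous_amplitude_frequency(x, fs):
    """Instantaneous amplitude (length N) and frequency in Hz (length N - 1)."""
    z = analytic_signal(x)
    phase = np.unwrap(np.angle(z))
    return np.abs(z), np.diff(phase) * fs / (2 * np.pi)

def ied_signal(components, fs):
    """Assumed discretisation: IED(t) ~ sum_j a_j(t)^2 over the retained SSCs."""
    return sum(instantaneous_amplitude_frequency(c, fs)[0] ** 2 for c in components)

fs = 1000
t = np.arange(1000) / fs
s1 = np.cos(2 * np.pi * 50 * t)
s2 = 0.5 * np.cos(2 * np.pi * 100 * t)
spurious = 0.01 * np.cos(2 * np.pi * 200 * t)     # not part of the signal
a = s1 + s2
rho, keep = select_real_components(a, [s1, spurious, s2])
real = [c for c, k in zip([s1, spurious, s2], keep) if k]
ia, inst_freq = instantaneous_amplitude_frequency(s1, fs)
ied = ied_signal(real, fs)
```

In this toy example the spurious component has a near-zero correlation with the signal and is discarded, and the IED of the two stationary tones is constant at 1 + 0.25 = 1.25; for a modulated gear signal the IED would instead fluctuate at the modulation (rotating) frequency.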

Verification of the Fault Feature Enhancement Capability of IED
A mixed signal containing the distributed fault signal of a gear and noise is tested to examine the fault feature enhancement capability of the IED analysis. The fault signal is generated using the model proposed by McFadden [47]:
$$x(t) = \sum_{k=1}^{L} S_k \left[1 + a_k(t)\right] \cos\left(2\pi k z f_r t + \phi_k + b_k(t)\right)$$
where L is the number of meshing harmonics, $S_k$ and $\phi_k$ are the amplitude and phase of the k-th meshing harmonic, z and $f_r$ represent the number of teeth and rotating frequency of the faulty gear, $a_k(t)$ is the amplitude modulation (AM) item and $b_k(t)$ represents the phase modulation (PM) item. The AM and PM items are, respectively, defined as:
$$a_k(t) = \sum_{m=1}^{M} A_{km} \cos(2\pi m f_r t + \alpha_{km}), \qquad b_k(t) = \sum_{m=1}^{M} B_{km} \cos(2\pi m f_r t + \beta_{km})$$
where M is the number of modulation orders of the k-th meshing harmonic, $A_{km}$ and $\alpha_{km}$ denote the amplitude and phase of AM, and $B_{km}$ and $\beta_{km}$ represent the amplitude and phase of PM. The other parameters are adopted as: L = 3, $S_k$ = 1, z = 30, $f_r$ = 12 Hz, $\phi_k$ = 0, $A_{km}$ = 0.9, $\alpha_{km}$ = 0, $B_{km}$ = 0.1, $\beta_{km}$ = 0. The meshing frequency can be calculated as: $f_m$ = 30 × 12 = 360 Hz. In order to meet the requirement of frequency spectrum analysis and the sampling theorem, the sampling frequency should be at least 3-5 times the meshing frequency; we select the sampling frequency as 10,000 Hz. Moreover, the signal length is set as 8192 to ensure a higher accuracy of the Fourier transform. Equation (11) is adopted to produce the noise: where N is the signal length. By adding the noise to the fault signal, the tested signal is yielded as shown in Figure 1a. Its amplitude spectrum, represented in Figure 1b, shows three harmonics of the meshing frequency. However, the rotating frequency of the faulty gear cannot be located.
Then, SSD is used to decompose the tested signal, and four SSCs are obtained as shown in Figure 2a. Table 1 shows the correlation coefficients between the tested signal and the SSCs. Based on the selection criterion as introduced in Section 2.2, SSC 2 is determined as a false signal. The other three SSCs are demodulated by HT to obtain the SSD-HT spectrogram, as shown in Figure 2b. Obviously, the SSD-HT spectrogram reflects three frequency components, i.e., f m , 2f m and 3f m . Figure 2c represents the IED signal. From its amplitude spectrum as shown in Figure 2d, the first and second harmonics of the rotating frequency of the faulty gear and the meshing frequency are clearly located. The fault feature enhancement capability of the IED analysis is verified.
For comparison, the simulated gear fault signal is also analyzed by the IED analysis based on EMD and HT. Twelve components are generated after the EMD process, as shown in Figure 3a. From the waveforms of these components, it can be seen that the fifth to twelfth components are false components. Thus, the first four components retrieved by EMD are analyzed by HT to acquire the EMD-HT spectrogram, as depicted in Figure 3b. Figure 3c,d represent the corresponding IED signal and its amplitude spectrum, respectively. Only the rotating frequency of the faulty gear appears in Figure 3d, while its harmonics and the meshing frequency are invisible. The superiority of the IED analysis based on SSD is demonstrated.
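For readers who want to reproduce a signal of the kind used in this verification, the following sketch generates an amplitude- and phase-modulated gear signal with the stated parameters (L = 3, z = 30, f r = 12 Hz, AM amplitude 0.9, PM amplitude 0.1, sampling frequency 10,000 Hz, 8192 samples). Two things are assumptions, because the text does not give them: a single modulation order (M = 1) and unit-variance Gaussian noise as a stand-in for the noise model.

```python
import numpy as np

def gear_fault_signal(t, L=3, S=1.0, z=30, fr=12.0, A=0.9, B=0.1, M=1):
    """Meshing harmonics with AM and PM at the rotating frequency.

    M = 1 is an assumption; all phases are zero as stated in the text.
    """
    x = np.zeros_like(t)
    for k in range(1, L + 1):
        am = sum(A * np.cos(2 * np.pi * m * fr * t) for m in range(1, M + 1))
        pm = sum(B * np.cos(2 * np.pi * m * fr * t) for m in range(1, M + 1))
        x += S * (1.0 + am) * np.cos(2 * np.pi * k * z * fr * t + pm)
    return x

fs, n = 10_000, 8192
t = np.arange(n) / fs
clean = gear_fault_signal(t)

rng = np.random.default_rng(0)
noisy = clean + rng.standard_normal(n)      # assumed noise model, for illustration

freqs = np.fft.rfftfreq(n, d=1.0 / fs)
spectrum = np.abs(np.fft.rfft(clean)) / n
peak_freq = freqs[int(np.argmax(spectrum))]
meshing_harmonics = [k * 30 * 12.0 for k in (1, 2, 3)]   # 360, 720, 1080 Hz
```

The strongest spectral line of the clean signal sits at a meshing harmonic, while the 12 Hz rotating frequency only appears as sidebands around those harmonics; this is why demodulation, here via the IED, is needed to expose it.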

Dispersion Entropy
DE is a novel entropy index for quantifying the regularity of signals. Its advantages over the traditional entropy methods have been investigated by many scholars [38][39][40]. For a signal that has N sampling points, $a(n) = (a_1, a_2, \ldots, a_N)$, its DE can be calculated through the following procedures [37]: First, a new signal $y(n) = (y_1, y_2, \ldots, y_N)$, with each $y_i$ in (0, 1), is obtained by applying the normal cumulative distribution function (NCDF) to the original signal a(n):
$$y_i = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{a_i} \exp\left(-\frac{(t-u)^2}{2\sigma^2}\right) dt$$
where $\sigma^2$ and u indicate the variance and mean of a(n), respectively.
Previous studies [36][37][38][39][40] recommend the following parameters for the DE algorithm: embedding dimension m = 2, number of classes c = 6, and time delay τ = 1. We adopt these values in all DE calculations in this paper.
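Since only the NCDF mapping step is spelled out above, the following minimal sketch fills in the remaining steps as they are usually defined for dispersion entropy (mapping to c classes, forming dispersion patterns of length m with delay τ, and taking the Shannon entropy of the pattern frequencies). The normalisation by ln(c^m) and the handling of a constant signal are assumptions made for this illustration.

```python
import math
import random
from collections import Counter

def dispersion_entropy(a, m=2, c=6, tau=1):
    """Normalised dispersion entropy in [0, 1] (sketch of the usual definition)."""
    n = len(a)
    mean = sum(a) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in a) / n)
    if sd == 0.0:
        return 0.0                      # assumption: a constant signal has zero entropy
    # Step 1: NCDF maps each sample into (0, 1).
    y = [0.5 * (1.0 + math.erf((v - mean) / (sd * math.sqrt(2.0)))) for v in a]
    # Step 2: linear mapping to integer classes 1..c.
    z = [min(c, max(1, round(c * v + 0.5))) for v in y]
    # Step 3: dispersion patterns of length m with delay tau.
    count = n - (m - 1) * tau
    patterns = Counter(tuple(z[i + j * tau] for j in range(m)) for i in range(count))
    # Step 4: Shannon entropy of the pattern probabilities, normalised by ln(c^m).
    h = -sum((k / count) * math.log(k / count) for k in patterns.values())
    return h / math.log(c ** m)

rng = random.Random(0)
de_noise = dispersion_entropy([rng.gauss(0.0, 1.0) for _ in range(2000)])
de_periodic = dispersion_entropy([0.0, 1.0] * 100)
de_constant = dispersion_entropy([3.0] * 50)
```

With m = 2 and c = 6 there are 36 possible patterns; white noise visits nearly all of them evenly and scores close to 1, whereas a strictly alternating signal uses only two patterns and scores much lower.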

Hierarchical Dispersion Entropy
In order to better detect the regularity of the complex signals, some multi-scale forms of DE, such as MDE and RCMDE, have been developed. However, previous studies reveal that the multi-scale analysis algorithms only consider the low-frequency components of the signal and cannot completely detect the hidden fault features [41][42][43]. Thus, a hierarchical DE (HDE) algorithm is proposed by integrating the hierarchical decomposition and DE to quantify the information of the signal more comprehensively.
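To make the low-frequency limitation concrete, the sketch below implements the coarse-graining step that multi-scale entropy methods conventionally use (non-overlapping window averages); the function name is illustrative. Averaging acts as a low-pass operation, so a purely high-frequency pattern disappears after coarse-graining.

```python
def coarse_grain(x, scale):
    """Non-overlapping window averages, as used in multi-scale entropy analysis."""
    n = len(x) // scale
    return [sum(x[j * scale:(j + 1) * scale]) / scale for j in range(n)]

ramp = [1, 2, 3, 4, 5, 6, 7, 8]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]

ramp_scale2 = coarse_grain(ramp, 2)          # [1.5, 3.5, 5.5, 7.5]
alt_scale2 = coarse_grain(alternating, 2)    # the alternation is averaged away
```

The alternating sequence, which carries only the highest frequency, becomes all zeros at scale 2, so any entropy computed on the coarse-grained series no longer sees it. A hierarchical analysis additionally keeps the pairwise differences, which retain exactly this information.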
For a one-dimensional signal $A = (a_1, a_2, \ldots, a_N)$ of length N ($N = 2^n$, where n is a positive integer), the calculation procedures of HDE include:
(1) An average operator $q_0$ and a difference operator $q_1$ are respectively constructed as follows:
$$q_0(A) = \left(\frac{a_{2j-1} + a_{2j}}{2}\right), \qquad q_1(A) = \left(\frac{a_{2j-1} - a_{2j}}{2}\right), \qquad j = 1, 2, \ldots, 2^{n-1}$$
where $q_0(A)$ and $q_1(A)$ carry the low-frequency and high-frequency features of A at scale 2, respectively.
(2) The matrix expression of the operators $q_j$ (j = 0 or 1) is obtained as the following form:
$$Q_j = \begin{bmatrix} \frac{1}{2} & \frac{(-1)^j}{2} & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \frac{1}{2} & \frac{(-1)^j}{2} & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & \frac{1}{2} & \frac{(-1)^j}{2} \end{bmatrix}_{2^{n-1} \times 2^{n}}$$
(3) In order to perform the hierarchical analysis on the signal A, the above operators have to be employed iteratively. Let p ∈ N; a vector $[\beta_1, \beta_2, \ldots, \beta_p]$ with $\beta_i \in \{0, 1\}$ can be designed to describe the integer d:
$$d = \sum_{i=1}^{p} \beta_i 2^{p-i}$$
It can be inferred that for a given integer d, there is a unique vector $[\beta_1, \beta_2, \ldots, \beta_p]$ corresponding to it.
(4) The hierarchical component $A_{k,d}$ (where k and d are the layer number and node number, respectively) is expressed as:
$$A_{k,d} = q_{\beta_k} \circ q_{\beta_{k-1}} \circ \cdots \circ q_{\beta_1}(A)$$
where $[\beta_1, \beta_2, \ldots, \beta_k]$ is the vector corresponding to d with p = k.
(5) The DE of the hierarchical component at node d and layer k is calculated to obtain the HDE:
$$HDE(A, k, d, m, c, \tau) = DE(A_{k,d}, m, c, \tau)$$
The hierarchical decomposition of signal A with three layers is displayed in Figure 4 to better illustrate its principle. It can be found that there are $2^k$ hierarchical components at layer k. Figure 5 represents the calculation flowchart of HDE. In this paper, the HDE in layer 3 (which contains 8 nodes) is used to characterize the information of the signal.
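The hierarchical decomposition can be sketched in a few lines. The code assumes the pairwise average and difference operators act on consecutive sample pairs and that the node number is the binary number formed by the operator indices, with the first-applied operator as the most significant bit; the function names are illustrative.

```python
def q_average(a):
    """q0: pairwise averages (low-frequency part)."""
    return [(a[2 * j] + a[2 * j + 1]) / 2 for j in range(len(a) // 2)]

def q_difference(a):
    """q1: pairwise half-differences (high-frequency part)."""
    return [(a[2 * j] - a[2 * j + 1]) / 2 for j in range(len(a) // 2)]

def hierarchical_component(a, k, d):
    """Node d (0 <= d < 2**k) at layer k; d = sum(beta_i * 2**(k - i))."""
    betas = [(d >> (k - i)) & 1 for i in range(1, k + 1)]   # beta_1 ... beta_k
    comp = list(a)
    for beta in betas:                                      # q_{beta_1} is applied first
        comp = q_difference(comp) if beta else q_average(comp)
    return comp

def hierarchical_decomposition(a, k):
    """All 2**k components at layer k, ordered by node number."""
    return [hierarchical_component(a, k, d) for d in range(2 ** k)]

signal = [1, 2, 3, 4, 5, 6, 7, 8]
layer1 = hierarchical_decomposition(signal, 1)
layer2 = hierarchical_decomposition(signal, 2)
layer3 = hierarchical_decomposition(signal, 3)
```

Each layer halves the component length, so a 4096-sample signal yields eight components of 512 samples at layer 3. The HDE feature vector is then obtained by applying a dispersion entropy routine to each of the eight components.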

Effectiveness Evaluation of HDE
A Gaussian white noise (GWN) with 4096 sampling points, as depicted in Figure 6a, is studied to compare the effectiveness of HDE and MDE. From its magnitude spectrum, as shown in Figure 6b, we can deduce that the complexity of GWN in different frequency bands is almost the same. The HDE (with three layers) and MDE (with eight scales) of the tested GWN are computed, as displayed in Figure 6c. The HDE values of the eight hierarchical components are almost equal, whereas the MDE value gradually decreases as the scale increases. In theory, the DE values that correspond to different frequency bands should not change much. The HDE therefore reflects the complexity of the studied GWN signal more realistically than the MDE.
In addition, 100 independent sets of GWN (each set comprises 4096 samples) are tested to reveal the advantage of HDE over MDE. The median and standard deviation (SD) of HDE and MDE are calculated to obtain the error bars as shown in Figure 7. The SD of HDE in different nodes is very small, which illustrates that the HDE algorithm has strong stability. However, the SD of MDE increases monotonically with scale. Starting from sequence number 2, the SD value of MDE is significantly larger than the SD value of HDE, which implies that the HDE algorithm is more stable.

Hierarchical Instantaneous Energy Density Dispersion Entropy
Integrating the merits of IED and HDE, a novel fault feature extraction algorithm named hierarchical instantaneous energy density dispersion entropy (HIEDDE) is put forward. In this algorithm, the IED analysis is first used for fault feature enhancement. The HDE process is then employed to quantify the information of the IED signal to get the HIEDDE. Figure 8 shows the flowchart of the HIEDDE algorithm. Five procedures are needed to implement the HIEDDE algorithm:
(1) Apply the SSD algorithm to the vibration signal to generate k SSCs.
(2) Calculate the correlation coefficient between the original signal and the j-th SSC, ρ j (j = 1, 2, . . . , k). If ρ j is larger than a threshold, as given in Section 2.2, SSC j (t) is seen as a real component. Otherwise, SSC j (t) is excluded as a false component.
(3) Apply HT to all real SSCs identified in step (2) to obtain the SSD-HT spectrogram.
(4) Calculate the IED signal from the SSD-HT spectrogram.
(5) Apply the HDE algorithm to the IED signal to obtain the HIEDDE.

Dynamic Time Warping
DTW was originally developed for speech classification [44]. In recent years, it has been introduced to the field of mechanical fault diagnosis [45]. The advantage of DTW over existing classifiers is that it requires only a small number of samples to create the template signal while ensuring high classification accuracy. It is used to quantify the similarity between two time sequences based on the DTW distance, which is calculated by finding the optimal alignment. A smaller DTW distance between the two signals represents a higher similarity. The following gives the specific description of DTW: Given two time series $P = \{p_i\}$, i = 1, . . . , m and $Q = \{q_j\}$, j = 1, . . . , n, where m and n represent the lengths of P and Q, respectively. An m × n distance matrix D is first constructed with the elements $d(p_i, q_j)$, which are the Euclidean distances between the points $p_i$ and $q_j$. Then, a path through the matrix D that minimizes the cumulative distance between the two sequences can be found. The DTW distance corresponds to the path with minimal warping cost:
$$DTW(P, Q) = \min_{W} \sum_{k=1}^{K} w_k$$
where $w_k$ is the matrix element that forms the k-th element of a warping path $W = (w_1, w_2, \ldots, w_K)$. The warping path needs to satisfy the following three constraints:
(a) Boundary condition: the starting point is $w_1$ = (1, 1) and the ending point is $w_K$ = (m, n).
(b) Continuity: consecutive path elements are adjacent cells of D (including diagonal neighbours).
(c) Monotonicity: the row and column indices along the path never decrease.
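The warping path can be determined by using dynamic programming as follows:
$$\gamma(i, j) = d(p_i, q_j) + \min\{\gamma(i-1, j-1),\ \gamma(i-1, j),\ \gamma(i, j-1)\}$$
where γ(i, j) is the cumulative distance, i.e., the sum of $d(p_i, q_j)$ and the minimum cumulative distance of the three adjacent elements.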
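The dynamic-programming recursion can be written out directly. The sketch below assumes scalar samples with the absolute difference as the point-wise distance and returns the unnormalised cumulative cost; other conventions (squared differences, or normalising by path length) exist, so this is one possible instance rather than the exact variant used in the paper.

```python
def dtw_distance(p, q):
    """Cumulative DTW cost between sequences p and q.

    gamma[i][j] = d(p_i, q_j) + min(gamma[i-1][j-1], gamma[i-1][j], gamma[i][j-1])
    with d taken as the absolute difference (an assumption for scalar samples).
    """
    m, n = len(p), len(q)
    inf = float("inf")
    gamma = [[inf] * (n + 1) for _ in range(m + 1)]
    gamma[0][0] = 0.0                   # enforces the boundary condition w_1 = (1, 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = abs(p[i - 1] - q[j - 1])
            gamma[i][j] = d + min(gamma[i - 1][j - 1], gamma[i - 1][j], gamma[i][j - 1])
    return gamma[m][n]                  # end point w_K = (m, n)

same = dtw_distance([1, 2, 3], [1, 2, 3])
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

The second example shows the property that makes DTW attractive for template matching: a sequence and a locally stretched copy of it still have zero distance, because the warping path may repeat an element.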

The Proposed Fault Diagnosis Method
A novel fault diagnosis method of gearbox based on the HIEDDE and DTW is proposed in this study. Figure 9 represents its implementation process, which is mainly composed of three steps:
(1) Collect the vibration data of gearbox under different health conditions. The vibration data of each condition is divided into non-overlapping data groups. For each state, one data group is used as the template sample and the other groups are used as testing samples. The number of template samples is equal to the number of fault types.
(2) Calculate the HIEDDE of the vibration data as the fault features. The hierarchical layer is set to three so that the HIEDDE of each data sample contains the information of eight hierarchical components.
(3) Employ DTW for fault pattern recognition. Based on the HIEDDE fault features, the DTW distances between the testing samples and template samples are calculated, and the fault type of a testing sample is recognized as the label of the template sample with the smallest distance.
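Step (3) amounts to nearest-template classification under the DTW distance. The sketch below shows the idea on eight-element feature vectors; the template values are made up for illustration and are not the paper's HIEDDE values, and the DTW routine uses the absolute difference as the point-wise distance (an assumption).

```python
def dtw_distance(p, q):
    """Cumulative DTW cost with absolute difference as the local distance."""
    m, n = len(p), len(q)
    inf = float("inf")
    gamma = [[inf] * (n + 1) for _ in range(m + 1)]
    gamma[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = abs(p[i - 1] - q[j - 1])
            gamma[i][j] = d + min(gamma[i - 1][j - 1], gamma[i - 1][j], gamma[i][j - 1])
    return gamma[m][n]

def classify(test_feature, templates):
    """Return the label of the closest template and all distances."""
    distances = {label: dtw_distance(test_feature, tpl) for label, tpl in templates.items()}
    return min(distances, key=distances.get), distances

# Hypothetical eight-node feature vectors, one template per health condition.
templates = {
    "Normal": [0.92, 0.90, 0.91, 0.93, 0.90, 0.92, 0.91, 0.90],
    "PT":     [0.70, 0.75, 0.72, 0.78, 0.71, 0.74, 0.73, 0.76],
    "BT":     [0.55, 0.60, 0.58, 0.62, 0.57, 0.59, 0.56, 0.61],
}
test_feature = [0.71, 0.74, 0.73, 0.77, 0.70, 0.75, 0.72, 0.77]
label, distances = classify(test_feature, templates)
```

Only one template per class is needed, which matches the small-sample motivation given for choosing DTW over classifiers that require training.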

Experimental Analysis
In this section, the presented fault diagnosis approach is employed to analyze the experimental data of gearbox under different states to test its validity. The data is obtained through the QPZZ-II rotating machinery fault simulation experiment platform. From the schematic diagram of the experiment platform, as depicted in Figure 10a, it can be found that the platform for gearbox fault simulation mainly consists of an electric motor, belt drive, shaft support, gearbox, and power twister. The gearbox contains a small gear with 55 teeth installed on the input shaft and a big gear with 75 teeth mounted on the output shaft. Table 2 lists the parameters of the two gears. Four accelerometers were mounted on the housing of the gearbox to measure the vertical vibration, and two eddy current sensors monitored the radial vibration of the input bearing. The installation positions of the sensors are marked in Figure 10a. Figure 10b shows the sensor distribution from two perspectives. As displayed, the four accelerometers are numbered ① to ④. The rotating speed of the input shaft in the experiment is 880 r/min. The meshing frequency of the small gear and big gear is 807 Hz. The sampling frequency is 5120 Hz.
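The stated meshing frequency follows directly from the input speed and the tooth count of the input gear. The short calculation below is a worked check using only the numbers given above; the output shaft frequency is derived from them and is not stated in the text.

```latex
\begin{align*}
f_{\mathrm{in}}  &= \frac{880\ \mathrm{r/min}}{60\ \mathrm{s/min}} \approx 14.67\ \mathrm{Hz}
  && \text{(input shaft rotating frequency)} \\
f_{m}            &= 55 \times f_{\mathrm{in}} \approx 806.7\ \mathrm{Hz} \approx 807\ \mathrm{Hz}
  && \text{(meshing frequency)} \\
f_{\mathrm{out}} &= \frac{f_{m}}{75} \approx 10.76\ \mathrm{Hz}
  && \text{(output shaft rotating frequency, derived)}
\end{align*}
```

With a sampling frequency of 5120 Hz, the usable band extends to 2560 Hz, which covers the first three meshing harmonics (807, 1613, and 2420 Hz).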
(2) a pitting tooth (PT) on the output gear; (3) a broken tooth (BT) on the output gear; (4) some wear teeth (WT) on the input gear; and (5) a compound fault condition with a BT on the output gear and a WT on the input gear (BT-WT). Figure 11 represents the damaged gears with different fault types employed in the test. For each state, the experimental signal is segmented into 41 data groups with the length of 4096. In order to ensure the independence of each group and test the effectiveness of the proposed method more reasonably and strictly, the data groups are nonoverlapping. One data group of each state is selected as the template sample and the other 40 groups are regarded as the testing samples. Table 3 represents the details of the experimental signals.  Experimental signals were collected under four health conditions including: (1) Normal condition; (2) a pitting tooth (PT) on the output gear; (3) a broken tooth (BT) on the output gear; (4) some wear teeth (WT) on the input gear; and (5) a compound fault condition with a BT on the output gear and a WT on the input gear (BT-WT). Figure 11 represents the damaged gears with different fault types employed in the test. For each state, the experimental signal is segmented into 41 data groups with the length of 4096. In order to ensure the independence of each group and test the effectiveness of the proposed method more reasonably and strictly, the data groups are non-overlapping. One data group of each state is selected as the template sample and the other 40 groups are regarded as the testing samples. Table 3 represents the details of the experimental signals.    One testing sample of each health condition is used to illustrate the process of the proposed fault diagnosis approach. Figure 12 shows these samples under different health conditions. 
Despite the impacts of the data samples under the four fault states (PT, BT, WT, and BT-WT) being more prominent than the normal state, the fault states cannot be recognized because their impact features are somewhat similar.   Normal  1  1  40  PT  2  1  40  BT  3  1  40  WT  4  1  40  BT-WT  5  1  40 One testing sample of each health condition is used to illustrate the process of the proposed fault diagnosis approach. Figure 12 shows these samples under different health conditions. Despite the impacts of the data samples under the four fault states (PT, BT, WT, and BT-WT) being more prominent than the normal state, the fault states cannot be recognized because their impact features are somewhat similar. Then, the IED analysis is applied to the five selected data samples and the acquired corresponding IED signals are shown in Figure 13. Compared to the original signals, the impacts of Then, the IED analysis is applied to the five selected data samples and the acquired corresponding IED signals are shown in Figure 13. Compared to the original signals, the impacts of the IED signal are more obvious as the noise is inhibited by the IED analysis. The HDE of the IED signals are calculated to obtain the HIEDDE fault features, as illustrated in Figure 14. The HIEDDE fault features under the four health conditions can be distinguished easily. The HIEDDE values of the eight nodes under health state are bigger than the three fault states. This implies that the signals under fault states are more regular. Based on the HIEDDE fault features, the DTW distances between the testing samples and the template samples are calculated and the results are represented in Figure 15. The labels of the five template samples correspond to the normal condition, PT fault, BT fault, WT and BT-WT faults are 1, 2, 3, 4, and 5, respectively. 
Figure 15a shows that the DTW distances between the testing sample under the normal condition and the five template samples have the minimum value at template label 1, which indicates that the health condition type of the normal state is accurately recognized. At the same time, we can find that the fault types of the testing samples under the four fault states are all identified based on the results shown in Figure 15b-e. Figure 16 shows the classification rate of the total 200 groups of testing data samples under the four health conditions by using the proposed method (HIEDDE-DTW). The classification rate is 100%.

Fault Type Label Number of Template Signal Number of Testing Samples
fault features under the four health conditions can be distinguished easily. The HIEDDE values of the eight nodes under health state are bigger than the three fault states. This implies that the signals under fault states are more regular. Based on the HIEDDE fault features, the DTW distances between the testing samples and the template samples are calculated and the results are represented in Figure  15. The labels of the five template samples correspond to the normal condition, PT fault, BT fault, WT and BT-WT faults are 1, 2, 3, 4, and 5, respectively. Figure 15a shows that the DTW distances between the testing sample under the normal condition and the five template samples have the minimum value at template label 1, which indicates that the health condition type of the normal state is accurately recognized. At the same time, we can find that the fault types of the testing samples under the four fault states are all identified based on the results shown in Figure 15b-e. Figure 16 shows the classification rate of the total 200 groups of testing data samples under the four health conditions by using the proposed method (HIEDDE-DTW). The classification rate is 100%.    HIEDDE is the combination of these two algorithms, i.e., the IED analysis and the proposed HDE algorithm. In order to illustrate the significance of the combination of the two algorithms, the HDE algorithm is directly applied to the five original data samples, as shown in Figure 12, and the corresponding HDE fault features are displayed in Figure 17. The HDE values of some nodes under different health conditions are almost coincidental. The difference of the HDE fault feature under the four health conditions shown in Figure 17 is not as high as that of the HIEDDE fault features under the five health conditions displayed in Figure 14, which illustrates that the IED analysis can help to enhance the fault features. 
Figure 18 shows the classification rate of all testing samples obtained with the HDE fault features. Five testing samples of the PT fault are judged as the BT fault, and five testing samples of the BT-WT fault are misclassified, which gives a classification rate of 95%.
Figure 17. The HDE fault features of the four data samples shown in Figure 11.
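To make the node-wise HDE features discussed above more concrete, the sketch below computes a dispersion entropy value for each node of a hierarchical decomposition. It is an illustration under stated assumptions rather than the paper's definition: it assumes the commonly used averaging and differencing operators for the low-frequency and high-frequency branches, three layers (yielding eight nodes), and the standard dispersion entropy with normal-CDF mapping. The function names and the parameter values `m=2`, `c=6`, `d=1` are hypothetical choices.

```python
import math
from collections import Counter
import numpy as np

def dispersion_entropy(x, m=2, c=6, d=1):
    """Normalized dispersion entropy (assumed standard NCDF-based variant)."""
    x = np.asarray(x, dtype=float)
    sigma = x.std()
    if sigma == 0:
        return 0.0  # a constant series has a single pattern, hence zero entropy
    # Map samples to (0, 1) with the normal CDF, then to classes 1..c
    y = np.array([0.5 * (1.0 + math.erf((v - x.mean()) / (sigma * math.sqrt(2.0))))
                  for v in x])
    z = np.clip(np.round(c * y + 0.5), 1, c).astype(int)
    # Count dispersion patterns of length m with delay d
    n_patterns = len(z) - (m - 1) * d
    counts = Counter(tuple(z[i:i + (m - 1) * d + 1:d]) for i in range(n_patterns))
    p = np.array(list(counts.values()), dtype=float) / n_patterns
    return float(-(p * np.log(p)).sum() / math.log(c ** m))

def hierarchical_nodes(x, layers=3):
    """Split a series into 2**layers nodes by repeated averaging/differencing."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(layers):
        next_nodes = []
        for node in nodes:
            node = node[: len(node) // 2 * 2]      # drop a trailing odd sample
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / 2.0)  # low-frequency component
            next_nodes.append((even - odd) / 2.0)  # high-frequency component
        nodes = next_nodes
    return nodes

def hde_features(x, layers=3, **kwargs):
    """One dispersion entropy value per hierarchical node."""
    return [dispersion_entropy(node, **kwargs) for node in hierarchical_nodes(x, layers)]

# Toy input: white noise standing in for a vibration or IED signal
rng = np.random.default_rng(0)
signal = rng.standard_normal(2048)
nodes = hierarchical_nodes(signal)
features = hde_features(signal)
print(len(features), "node features computed")
```

Applying `hde_features` to the IED signal instead of the raw signal would correspond to the HIEDDE pipeline; the IED step (SSD followed by the Hilbert transform) is not reproduced here.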
To demonstrate the superiority of the proposed method, the MDE and RCMDE algorithms are each combined with the IED analysis and applied to the experimental signals. Figures 19 and 20 show the DTW classification results of the IED-MDE and IED-RCMDE approaches, respectively. Their classification rates are 90.5% and 93%, respectively, both lower than the 100% achieved by HIEDDE-DTW, which confirms the advantage of the proposed method.

Conclusions
To achieve accurate fault diagnosis of gearboxes, a novel fault diagnosis framework is put forward by combining HIEDDE and DTW. The major innovations can be summarized as follows: (1) the IED analysis based on SSD and HT is introduced to enhance the fault features of gearbox signals; (2) a novel entropy algorithm named HDE, which is able to quantify the information of both the low-frequency and high-frequency components of signals, is presented; and (3) the advantages of IED and HDE are integrated to develop the novel fault feature extraction algorithm, i.e., HIEDDE.
The analysis of a simulated gear fault signal illustrates the fault feature enhancement ability of the IED analysis. The analysis of the GWN signal demonstrates that the HDE algorithm is more stable and more accurate than the MDE algorithm. The proposed method is then used to identify the experimental fault patterns of a gearbox. The results indicate that the proposed method recognizes the fault patterns of the gearbox accurately and has advantages over the existing methods.
It should be noted that the validity of the proposed method has only been studied with experimental data obtained on a laboratory test bench. While satisfactory results have been obtained, the validity of the proposed method for practical engineering data needs to be studied further. The degree of difference between fault samples and the complexity of the fault may affect the accuracy of the proposed method. Moreover, the number of fault types that the proposed method can diagnose depends on the number of fault templates in the sample library. Only by establishing a rich sample library can the proposed method better solve the problem of gearbox fault diagnosis in engineering equipment.
