Multiscale Entropy Feature Extraction Method of Running Power Equipment Sound

Equipment condition monitoring based on computer hearing is a new pattern recognition approach, and systems built on it offer the advantages of non-contact operation and strong early-warning capability. Extracting effective features from the sound of running power equipment helps to improve the accuracy of equipment monitoring. However, the sound of running equipment is often heavily contaminated by noise, non-linear, and non-stationary, which makes feature extraction difficult. To solve this problem, a feature extraction method based on the improved complementary ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) and multiscale improved permutation entropy (MIPE) is proposed. Firstly, ICEEMDAN is used to obtain a group of intrinsic mode functions (IMFs) from the sound of running power equipment. The noise IMFs are then identified and eliminated through the mutual information (MI) and mean mutual information (meanMI) of the IMFs. Next, the normalized mutual information (norMI) and MIPE are calculated for each remaining IMF, and norMI is used to weight the corresponding MIPE result. Finally, based on the separability criterion, the dimension of the weighted MIPE results is reduced to obtain the multiscale entropy feature of the sound. The experimental results show that the classification accuracies of the method under noise-free and 5 dB conditions reach 96.7% and 89.9%, respectively. In practice, the proposed method offers higher reliability and stability for extracting sound features of running power equipment.


Introduction
As the basic unit of a power system, the power equipment is of great importance and its operation status affects the safety, stability, and efficiency of the power grid. With the increase of power equipment, the possibility of equipment failure and the resulting loss rises accordingly. Therefore, it has become an important research issue to study safer and more effective state monitoring methods, so as to improve the monitoring sensitivity, realize fault warning, and improve the generalization of fault diagnosis systems [1][2][3].
In recent years, state monitoring and fault diagnosis technology for power equipment based on modern information technology has been vigorously promoted, and many kinds of equipment operation information have been applied in various fields [4]. Researchers have studied different forms of equipment condition monitoring and fault warning from the aspects of equipment images [5][6][7][8], electromagnetic waves [9], ultrasonic waves [10,11], temperature [12,13], pressure [14] and vibration [...] operator to symbolize the signal, the algorithm not only improves the measurement accuracy of signal complexity but is also more robust to noise interference. The signals generated by complex systems usually exhibit multiscale structures [49], but the entropy algorithms above are all single-scale. For this reason, Costa et al. [49] proposed a coarse-graining process that can be combined with an arbitrary entropy estimation algorithm for multiscale analysis. Combining the coarse-graining process with IPE yields the multiscale improved permutation entropy (MIPE) [48].
Since mode decomposition-based and entropy-based techniques have many advantages in processing complex time series, this paper proposes a feature extraction method for the sound of running power equipment that builds on ICEEMDAN and MIPE and combines them with normalized mutual information (norMI) and the separability criterion. The proposed method first eliminates the noise in the acoustic signals. Unlike existing feature extraction technologies, it mines the multiscale characteristics of the sound based on ICEEMDAN and MIPE. Moreover, norMI is employed to weight the MIPE results, which takes the importance of each signal IMF into account, and the separability criterion is used to extract effective features from the weighted MIPE results and reduce their dimension. In this way, the inherent nonlinear features of the sound can be revealed with little interference from noise.
The rest of the paper is organized as follows: the proposed feature extraction theory is described in Section 2; the simulation and experimental results are provided in Sections 3 and 4 respectively; and the paper is concluded in Section 5.

ICEEMDAN
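The flowchart of the ICEEMDAN algorithm is shown in Figure 1. Let $x(t)$ be a composite signal, where $t$ is the sampling index. Let $E_k(\cdot)$ be the operator that produces the $k$th IMF obtained by EMD, let $M(\cdot)$ be the operator that calculates the local mean of a signal, and let $\langle\cdot\rangle$ denote averaging over the realizations. The ICEEMDAN algorithm is described as follows:
(1) Calculate the first residue as the average of the local means of $I$ noisy realizations of the signal:
$$x^{i}(t) = x(t) + \beta_0 E_1\big(\omega^{(i)}(t)\big), \qquad r_1 = \big\langle M\big(x^{i}(t)\big) \big\rangle, \qquad (1)$$
where $\omega^{(i)}$ denotes the $i$th added white Gaussian noise with zero mean and unit variance, $I$ is the predefined ensemble size, that is, the number of added $\omega^{(i)}$, and $\langle M(x^{i}(t)) \rangle$ denotes the average of $M(x^{i}(t))$ over the realizations $i = 1, 2, \ldots, I$.
(2) The first mode can be written as $\mathrm{IMF}_1 = x - r_1$.
(3) Calculate the second residue by $r_2 = \langle M(r_1 + \beta_1 E_2(\omega^{(i)})) \rangle$; the second mode can then be expressed as $\mathrm{IMF}_2 = r_1 - r_2$.
(4) For $j \geq 3$, calculate the $j$th residue by $r_j = \langle M(r_{j-1} + \beta_{j-1} E_j(\omega^{(i)})) \rangle$ and the $j$th mode by $\mathrm{IMF}_j = r_{j-1} - r_j$.
The constant $\beta_{j-1}$ is selected to adjust the signal-to-noise ratio (SNR) between the residue and the added noise. For $j = 1$, $\beta_0 = \varepsilon_0\,\mathrm{std}(x)/\mathrm{std}(E_1(\omega^{(i)}))$, where $\mathrm{std}(\cdot)$ is the standard deviation (SD) operator and $\varepsilon_0$ is the reciprocal of the expected SNR between the input signal $x(t)$ and the first added noise [43]. For $j \geq 2$, $\beta_{j-1} = \varepsilon_0\,\mathrm{std}(r_{j-1})$.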

Mutual Information, Mean Mutual Information and Normalized Mutual Information
The mutual information (MI) between two discrete random variables $X$ and $Y$ can be expressed as:
$$\mathrm{MI}(X, Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}, \qquad (2)$$
where $p(x)$ and $p(y)$ are the marginal probability distributions of $X$ and $Y$, respectively, and $p(x, y)$ is their joint probability distribution. The mutual information is a measure of the correlation between two random variables [50]. If $X$ and $Y$ are independent, $\mathrm{MI}(X, Y) = 0$. Generally, the signal IMFs are more correlated with the original signal than the noise IMFs. Therefore, the mutual information can be used as an indicator to distinguish the signal IMFs from the noise IMFs. The proposed method adopts the mean mutual information (meanMI) as the threshold for the correlation between an IMF and the original signal, which can be calculated as:
$$\mathrm{meanMI} = \frac{1}{J} \sum_{j=1}^{J} \mathrm{MI}_j, \qquad (3)$$
where $\mathrm{MI}_j$ is the MI between the $j$th IMF and the original signal and $J$ denotes the number of IMFs decomposed from the original signal. If the MI of an IMF is larger than meanMI, the IMF is regarded as a signal IMF. In order to calculate the weighted MIPE (refer to Section 2.5), the norMI of each signal IMF is defined as:
$$\mathrm{norMI}_p = \frac{\mathrm{MI}_p}{\sum_{q=1}^{P} \mathrm{MI}_q}, \qquad (4)$$
where $P$ denotes the number of signal IMFs.
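The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MI is estimated with a plain two-dimensional histogram (the paper does not say which estimator was used), the bin count of 32 is arbitrary, norMI is taken as each signal IMF's MI divided by the sum over the signal IMFs, and the function names `mutual_information` and `select_signal_imfs` are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=32):
    """Histogram (plug-in) estimate of MI(X, Y) in nats. Assumed estimator."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)      # marginal p(x), shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal p(y), shape (1, bins)
    nz = pxy > 0                             # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def select_signal_imfs(imfs, signal, bins=32):
    """Return MI per IMF, meanMI, the signal-IMF mask, and norMI weights."""
    mi = np.array([mutual_information(imf, signal, bins) for imf in imfs])
    mean_mi = mi.mean()
    keep = mi > mean_mi                      # signal IMFs: MI > meanMI
    nor_mi = mi[keep] / mi[keep].sum()       # assumed sum normalization
    return mi, mean_mi, keep, nor_mi
```

Because the weights are normalized over the retained IMFs only, they sum to one, so a weighted combination of per-IMF entropies stays on the same scale as a single entropy value.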

MIPE
Given a time series $x(i)$, $1 \leq i \leq N$, and the time scale factor $s$, the coarse-grained series $y^s_j$ can be calculated by the coarse-graining process, expressed as:
$$y^s_j = \frac{1}{s} \sum_{i=(j-1)s+1}^{js} x(i), \qquad 1 \leq j \leq N/s. \qquad (5)$$
Given the embedding dimension $m$ and the time delay $t$, the embedding vectors $z^s$ of $y^s$ are defined as $z^s(j) = [y^s_j, y^s_{j+t}, \ldots, y^s_{j+(m-1)t}]$, where $j = 1, 2, \ldots, N/s - (m-1)t$. Then, the embedded vectors are symbolized based on the uniform quantization (UQ) operator in Reference [51]. The UQ operator can be represented as:
$$\mathrm{UQ}(x) = \begin{cases} 0, & x_{\min} \leq x < x_{\min} + \Delta \\ 1, & x_{\min} + \Delta \leq x < x_{\min} + 2\Delta \\ \;\vdots & \\ L-1, & x_{\min} + (L-1)\Delta \leq x \leq x_{\max} \end{cases} \qquad (6)$$
where $x$ is the input series, $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of $x$, respectively, $L$ is the predefined discretization level, and the step size is $\Delta = (x_{\max} - x_{\min})/L$. For the input series $x$, the UQ operator therefore produces an integer sequence ranging from 0 to $L - 1$.
Based on Equation (6), the first column of the embedded matrix $z^s$ is symbolized to obtain the symbol sequence $\mathrm{sym}^s(:, 1) = \mathrm{UQ}(z^s(:, 1))$, and the remaining columns of $z^s$ are then symbolized based on the following equation:
$$\mathrm{sym}^s(:, k) = \mathrm{sym}^s(:, 1) + \left\lfloor \frac{z^s(:, k) - z^s(:, 1)}{\Delta} \right\rfloor, \qquad 2 \leq k \leq m. \qquad (7)$$
Next, similar to PE, each symbol vector corresponds to a pattern $\pi_l$ ($1 \leq l \leq L^m$), and the probability distribution $p_l$ of $\pi_l$ can be calculated. Here $\pi_l$ denotes the $l$th of all possible patterns. The normalized IPE on scale factor $s$ is written as:
$$H^s_{\mathrm{IPE}} = -\frac{1}{\ln(L^m)} \sum_{l=1}^{L^m} p_l \ln p_l, \qquad (8)$$
where $\ln(L^m)$ is the maximum value of the unnormalized entropy, which is reached only when the patterns have a uniform distribution.
The MIPE vector can be computed by traversing all scale factors from 1 to s. The pseudo-code of MIPE is given in Algorithm 1.

Algorithm 1: Extract the multiscale improved permutation entropy (MIPE) vector from a time series.
Input: X: time series; m: embedding dimension; t: time delay; L: discretization level; s: scale factor
Output: MIPE of X
1 Initialize N ← the sample number of X(N, 1);
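The MIPE procedure described in this section can be sketched in Python as follows. This is a minimal sketch rather than the authors' implementation: it assumes that the quantization step is computed from the minimum and maximum of the coarse-grained series, that the first column is quantized by flooring and clipping to L − 1, and that the other columns are symbolized by adding the floored difference from the first column. The names `coarse_grain`, `ipe`, and `mipe` are hypothetical.

```python
import numpy as np

def coarse_grain(x, s):
    """Average consecutive, non-overlapping windows of length s."""
    x = np.asarray(x, dtype=float)
    n = len(x) // s
    return x[:n * s].reshape(n, s).mean(axis=1)

def ipe(y, m=4, t=1, L=4):
    """Improved permutation entropy of y, divided by ln(L**m)."""
    y = np.asarray(y, dtype=float)
    n_vec = len(y) - (m - 1) * t
    if n_vec < 1:
        raise ValueError("series too short for the chosen m and t")
    delta = (y.max() - y.min()) / L          # quantization step
    if delta == 0:
        return 0.0                           # constant series: one pattern
    # Embedding matrix: row j is [y_j, y_{j+t}, ..., y_{j+(m-1)t}].
    idx = np.arange(n_vec)[:, None] + t * np.arange(m)[None, :]
    z = y[idx]
    # Uniform quantization of the first column into levels 0 .. L-1.
    first = np.minimum(np.floor((z[:, 0] - y.min()) / delta), L - 1)
    # Other columns: first symbol plus floored difference from column 1.
    sym = first[:, None] + np.floor((z - z[:, [0]]) / delta)
    # Relative frequency of each distinct symbol pattern.
    _, counts = np.unique(sym.astype(np.int64), axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)) / (m * np.log(L)))

def mipe(x, m=4, t=1, L=4, s_max=10):
    """IPE of the coarse-grained series for scale factors 1 .. s_max."""
    return np.array([ipe(coarse_grain(x, s), m, t, L)
                     for s in range(1, s_max + 1)])
```

For example, `mipe(x, m=4, t=1, L=4, s_max=10)` returns a vector of ten entropy values, one per scale factor. The coarse-grained series shrinks to `len(x) // s` samples, so large scale factors on short recordings leave few embedding vectors for the pattern statistics.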

Separability Criterion
Suppose the feature set contains $N$ sample feature vectors belonging to $C$ classes, and the number of samples in the $i$-th class is $N_i$, $1 \leq i \leq C$. The inter-class scatter matrix $S_B$ and the intra-class scatter matrix $S_W$ can be represented by Equations (9) and (10), respectively, and the separability criterion value (SC) can be expressed by Equation (11):
$$S_B = \sum_{i=1}^{C} \frac{N_i}{N} (m_i - m)(m_i - m)^{T}, \qquad (9)$$
$$S_W = \sum_{i=1}^{C} \frac{N_i}{N} E_i, \qquad (10)$$
$$\mathrm{SC} = \frac{\mathrm{tr}(S_B)}{\mathrm{tr}(S_W)}, \qquad (11)$$
where $m_i$ is the mean of the $i$-th-class sample feature vectors, $m$ is the mean of all sample feature vectors, $E_i$ is the covariance matrix of the $i$-th-class sample feature vectors, and $\mathrm{tr}(\cdot)$ computes the trace of a matrix. $S_B$ and $S_W$ represent the inter-class distance and the intra-class distance of the sample feature vectors, respectively. The larger the inter-class distance and the smaller the intra-class distance, the larger SC is; that is, the classes are more separable. By quantifying the separability criterion, the effective information in the feature vectors can be identified and the feature dimension reduced [52].
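As an illustration of how the criterion could be evaluated, the sketch below computes SC as the ratio of the traces of the inter-class and intra-class scatter matrices, with class weights N_i/N and population (divide-by-N_i) covariances; both choices are assumptions, as are the function names. The second function applies the criterion to each feature dimension separately and keeps the dimensions whose SC exceeds the mean SC, which is the selection rule used later in the experiments.

```python
import numpy as np

def separability_criterion(features, labels):
    """SC = tr(S_B) / tr(S_W) for features of shape (n_samples, n_dims)."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]         # treat a vector as one dimension
    labels = np.asarray(labels)
    n, d = features.shape
    overall_mean = features.mean(axis=0)
    s_b = np.zeros((d, d))                   # inter-class scatter
    s_w = np.zeros((d, d))                   # intra-class scatter
    for c in np.unique(labels):
        xc = features[labels == c]
        prior = len(xc) / n                  # class weight N_i / N (assumed)
        diff = (xc.mean(axis=0) - overall_mean)[:, None]
        s_b += prior * (diff @ diff.T)
        centered = xc - xc.mean(axis=0)
        s_w += prior * (centered.T @ centered) / len(xc)
    return float(np.trace(s_b) / np.trace(s_w))

def select_effective_dims(features, labels):
    """Per-dimension SC and a mask of dimensions with SC above the mean SC."""
    features = np.asarray(features, dtype=float)
    sc = np.array([separability_criterion(features[:, j], labels)
                   for j in range(features.shape[1])])
    return sc, sc > sc.mean()
```

With two classes `[0, 2]` and `[10, 12]` in a single dimension, the class means are 1 and 11, the overall mean is 6, and each class has variance 1, so SC = 25 / 1 = 25.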

Proposed Feature Extraction Method
The detailed flowchart of the proposed feature extraction method is shown in Figure 2. The specific steps of the algorithm are as follows:
(1) Decompose the input signal by the ICEEMDAN algorithm to obtain a group of IMFs.
(2) Calculate the MI of each IMF and the meanMI of the IMFs, and identify the signal IMFs by comparing the MI of each IMF with meanMI (MI > meanMI).
(3) Calculate the norMI and MIPE of each signal IMF.
(4) Use norMI as the weight coefficient to weight the corresponding MIPE results; the weighted MIPE result is defined as the weighted sum.
(5) Extract the multiscale entropy feature vector by applying the separability criterion for dimension reduction to the weighted MIPE results.
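Steps (2) to (4) amount to a small piece of array arithmetic once the MI and MIPE of every IMF are available. The sketch below assumes that norMI is the MI of each signal IMF divided by the sum over the signal IMFs and that the weighted MIPE is the norMI-weighted sum of the per-IMF MIPE vectors; the function name `weighted_mipe` and the array layout are hypothetical.

```python
import numpy as np

def weighted_mipe(mi, mipe_per_imf):
    """Combine per-IMF MIPE vectors into one weighted MIPE vector.

    mi           : shape (J,), MI between each IMF and the original signal
    mipe_per_imf : shape (J, S), MIPE of each IMF over S scale factors
    """
    mi = np.asarray(mi, dtype=float)
    mipe_per_imf = np.asarray(mipe_per_imf, dtype=float)
    keep = mi > mi.mean()                    # step (2): signal IMFs
    nor_mi = mi[keep] / mi[keep].sum()       # step (3): assumed normalization
    return nor_mi @ mipe_per_imf[keep]       # step (4): weighted sum, shape (S,)
```

For MI values `[0.9, 0.6, 0.1, 0.05]` the mean is 0.4125, so only the first two IMFs are kept, with weights 0.6 and 0.4.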

Analysis on Artificial Signal Based on ICEEMDAN
In this section, the performance of ICEEMDAN on an artificial signal is analyzed. The signal was generated by Equation (12), where the number of sampling points of each signal was 1000 ($1 \leq n \leq 1000$). $s_1(n)$ and $s_2(n)$ are the two source signals to be decomposed, $s_3(n)$ is the cosine noise, $S(n)$ is the mixed artificial signal, and $\omega(n)$ denotes white Gaussian noise with zero mean and unit variance.
The source signal waveform of the artificial signal is shown in Figure 3a. The analysis results of ICEEMDAN, CEEMDAN, EEMD, and EMD are shown in Figure 3b-e, respectively. To evaluate the algorithm, the added noise amplitude $\varepsilon_0$ and the ensemble size $I$ were selected as 0.2 and 100, respectively; the maximum number of iterations was 10,000. The following studies use the same parameters. Due to the limitations of EMD [34], it can be seen from Figure 3e that mode mixing occurred in EMD: from IMF3 to IMF8, no IMF had a fixed central frequency. Comparing Figure 3d with Figure 3e, EEMD alleviates the mode mixing problem, but mode mixing still exists at some signal mutation positions (n = 200, 350, 500, and 750). It can be seen from Figure 3c that although the IMFs obtained by CEEMDAN (IMF1 and IMF4) were similar to the source signals, CEEMDAN generated spurious modes such as IMF2 and IMF3. The decomposition results of ICEEMDAN were consistent with the actual situation, which shows the superiority of this algorithm. Therefore, ICEEMDAN is used to decompose the acoustic signal of the power equipment in this paper.

Analysis on Chaotic Signals Based on MIPE
In order to verify the effectiveness of the method described in Section 2.3, entropy feature extraction was carried out on four typical chaotic signals, namely the Lorenz chaotic system, the Rössler chaotic system, the Duffing chaotic system, and the Chen chaotic system. With appropriate parameters, these four systems exhibit chaotic behavior. The Lorenz system can be expressed as:
$$\dot{x} = a(y - x), \qquad \dot{y} = x(c - z) - y, \qquad \dot{z} = xy - bz,$$
where a = 10, b = 8/3, and c = 28.
The Rössler system can be expressed as:
$$\dot{x} = -y - z, \qquad \dot{y} = x + ay, \qquad \dot{z} = b + z(x - c),$$
where a = 0.2, b = 0.2, and c = 5.7.
The Duffing system was used with a = 0.82 and b = −0.5. The Chen system can be expressed as:
$$\dot{x} = a(y - x), \qquad \dot{y} = (c - a)x - xz + cy, \qquad \dot{z} = xy - bz,$$
where a = 35, b = 3, and c = 28. A random number generator was used to set the initial values of each chaotic system, and the equations were integrated using a fourth-order Runge-Kutta method with a fixed step size of 0.01. The X component with a length of 2048 points was selected as the chaotic signal. The time-domain waveforms of the Lorenz, Rössler, Duffing, and Chen systems are shown in Figure 4.
For comparison, the multiscale sample entropy (MSE) [49], multiscale fuzzy entropy (MFE) [53], multiscale permutation entropy (MPE) [54] and MIPE were selected to analyze these four chaotic systems. According to [48], the parameters were set as follows: the embedding dimension m = 4, the time delay t = 1, the discretization level L = 4, and the time scale s = 10. The following studies use the same parameters. The analysis results are shown in Figure 5. The entropy results tended to increase with the time scale, which is reasonable: as the coarse-graining proceeds, the change per unit data length grows, which increases the randomness of the data. The comparison in Figure 5 shows that MSE has difficulty distinguishing the Lorenz, Rössler, and Duffing systems. For small time scales, MFE and MPE cannot distinguish the Rössler system from the Duffing system, while MIPE can distinguish all four chaotic signals. MIPE therefore has a strong ability to classify signals by their multiscale complexity, and it was used to extract the entropy characteristics of running power equipment sound in this paper.
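The signal generation described above can be sketched for the Lorenz case. The sketch assumes the standard Lorenz equations, dx/dt = a(y − x), dy/dt = x(c − z) − y, dz/dt = xy − bz, with a = 10, b = 8/3, and c = 28; the uniform range of the random initial state and the seed are illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def lorenz(state, a=10.0, b=8.0 / 3.0, c=28.0):
    """Right-hand side of the (assumed standard) Lorenz equations."""
    x, y, z = state
    return np.array([a * (y - x), x * (c - z) - y, x * y - b * z])

def rk4(f, state0, h, n_points):
    """Classical fourth-order Runge-Kutta with fixed step h.

    Returns n_points states; the first row is the initial state itself.
    """
    state = np.asarray(state0, dtype=float)
    traj = np.empty((n_points, state.size))
    for i in range(n_points):
        traj[i] = state
        k1 = f(state)
        k2 = f(state + 0.5 * h * k1)
        k3 = f(state + 0.5 * h * k2)
        k4 = f(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj

# Random initial state (range and seed are illustrative), step 0.01,
# 2048 points; the x component is the chaotic test signal.
rng = np.random.default_rng(0)
trajectory = rk4(lorenz, rng.uniform(-1.0, 1.0, size=3), h=0.01, n_points=2048)
x_signal = trajectory[:, 0]
```

The other systems can be generated the same way by swapping the right-hand-side function; `rk4` only requires a callable that maps a state vector to its time derivative.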

Feature Extraction of Power Equipment Sound
In this section, the proposed feature extraction algorithm is used to analyze the sound of actual power equipment. The audio data used in this paper came from four types of key power equipment measured in a power plant under the same conditions. Types A, B, C, and D represent the feeder connecting shaft, induced fan blade, coal grinder, and circulating water pump, respectively. For each type of equipment, we acquired 50 samples of 1.5 s duration at a sampling frequency of 16 kHz. The sound waveforms of the four types of power equipment are shown in Figure 6.

Feature Extraction Based on ICEEMDAN and MIPE
The effectiveness of the proposed feature extraction method is verified and demonstrated in this subsection. As shown in Figure 7, ICEEMDAN was used to decompose randomly selected experimental data to produce a group of IMFs with different center frequencies. For clarity, only the first ten IMFs are shown. Figure 8 shows the MI between each IMF and the original signal, where the dashed line of the corresponding color represents the corresponding meanMI used to identify the signal IMFs.
After eliminating the noise IMFs, the norMI and MIPE of each signal IMF were calculated. Then norMI was used to weight the corresponding MIPE results, so that the weighted MIPE results take the importance of each signal IMF into account. The weighted MIPE results of the four types of power equipment, averaged over 50 sets of experimental data, are shown in Figure 9. The scale factor was selected as s = 1-30, and the error bars represent the SD of the weighted MIPE values. The separability criterion was used to analyze each dimension of the weighted MIPE results. The distribution of SC is shown in Figure 10, where the black dashed line represents the mean value of the 30 SCs. A feature whose SC is larger than the mean SC is defined as an effective feature. Comparing Figure 9 with Figure 10, it can be seen that the SC distribution is positively correlated with the degree of dispersion of the weighted MIPE results: the SC is large where the weighted MIPE results of the four types differ significantly. Hence, the dimension reduction based on the separability criterion is effective.
The multiscale entropy feature vector obtained after the feature dimension reduction is shown in Figure 11. For different types of running power equipment sound, the entropy feature vectors are clearly different, indicating that the extracted features are effective for power equipment classification. Moreover, the error bar at each scale factor is small, which supports the reliability of the proposed method.
For comparison, the same experimental data were analyzed using the feature extraction methods in [48,55]. The feature extraction results of MIPE [48] are shown in Figure 12a, where the mean IPE value and its standard deviation are drawn. In contrast to Figure 11, the IPE values of types B and C overlap. In particular, the IPE values of all four types of signals overlap and are difficult to distinguish when s > 6. The reason is that no noise elimination is performed before the MIPE calculation, so the results are affected by noise, whose influence grows with the coarse-graining process. The analysis results of CEEMDAN-PE [55] are shown in Figure 12b, where the abscissa represents the different types of samples and the ordinate represents the PE value. The parameters of the algorithm were set the same as those in [55]. Because CEEMDAN-PE is based on a single scale, the relationship between entropy and time scale is discarded. It can be seen from Figure 12b that the PE values of type B and type C are close to each other, and the results of type A overlap with those of the other types, making them difficult to distinguish.
To study the effect of the feature extraction methods above under noisy conditions, white Gaussian noise was added to the running power equipment sound to produce different SNR conditions. As shown in Figure 13, the methods mentioned above were applied to extract features from the sound signal at an SNR of 0 dB. Compared with the results shown in Figures 11 and 12, CEEMDAN-PE, which is based on a single scale, becomes invalid due to noise interference: the results of type A, type B, and type C overlap more, which greatly increases the difficulty of classification and reduces its accuracy. Similarly, due to the lack of an effective de-noising process, the performance of MIPE declined rapidly, and the differences among the MIPE results of the four types decreased. In particular, when s = 1-5, the SD of the sample data increased significantly, which reduces the robustness of the classification. In contrast, the added noise has little influence on the results obtained by the proposed method, which supports its credibility.
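The noisy test conditions can be reproduced with a short helper that scales white Gaussian noise to a target SNR. The sketch assumes the SNR is defined as the ratio of mean signal power to mean noise power in decibels, which the paper does not state explicitly; the function name `add_white_noise` and the 50 Hz test tone are hypothetical.

```python
import numpy as np

def add_white_noise(signal, snr_db, rng=None):
    """Add zero-mean white Gaussian noise so that the SNR equals snr_db.

    Assumes SNR = 10 * log10(mean signal power / mean noise power).
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise

# Example: a 1.5 s test tone sampled at 16 kHz, corrupted at 5 dB and 0 dB.
fs = 16000
t = np.arange(int(1.5 * fs)) / fs
clean = np.sin(2 * np.pi * 50.0 * t)
noisy_5db = add_white_noise(clean, 5.0, np.random.default_rng(0))
noisy_0db = add_white_noise(clean, 0.0, np.random.default_rng(1))
```

At 0 dB the noise carries as much power as the signal itself, which is why the single-scale and non-denoised methods degrade so sharply in that condition.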

Classification of Power Equipment
In order to verify the feature extraction method, a widely applied classifier, the support vector machine (SVM), was used to process the features extracted in Section 4.1. For each type of power equipment, 25 noise-free samples were randomly selected for training and the remaining 25 samples were used for validation. This procedure was repeated ten times, and the average of the results was taken as the classification result, as listed in Table 1. To study the role of the separability criterion in the proposed algorithm, the classification accuracy of the weighted MIPE feature vectors without feature dimensionality reduction was also calculated. As shown in Table 1, the classification accuracy of the weighted MIPE feature vector and of the proposed method was higher than that of the other feature extraction methods, whether or not noise was added. Moreover, the separability criterion can significantly improve the classification accuracy and enhance the robustness of the algorithm under severe noise conditions.
As can be seen from Table 1, the classification accuracy is in good agreement with the feature extraction results in Section 4.1. For a clean signal, the proposed method achieved an accuracy of 96.7%, which is 12.1 and 27.8 percentage points higher than that of MIPE and CEEMDAN-PE, respectively. When the SNR decreased to 5 dB, the classification accuracy of the proposed method declined to 89.9%, while the classification accuracy of MIPE and CEEMDAN-PE dropped to 52.2% and 65.8%, respectively. As the SNR dropped to 0 dB, the classification accuracy of the MIPE and CEEMDAN-PE algorithms dropped to 46.6% and 64.9%, respectively. In contrast, the proposed method still had a classification accuracy of 92.7%.
Comparing the classification results of MIPE and CEEMDAN-PE shows that the performance of MIPE declined rapidly under high-noise conditions because it lacks a de-noising process. Similarly, for a clean signal, an effective entropy algorithm helps to extract the nonlinear features of the signal and significantly improves the classification accuracy. These results support the validity of the algorithm from another angle.

Table 1. Classification accuracy under different noise conditions.

| Method | Clean | 5 dB | 0 dB |
|---|---|---|---|
| Proposed method | 96.7% | 89.9% | 92.7% |
| MIPE [48] | 84.6% | 52.2% | 46.6% |
| CEEMDAN-PE [55] | 68.9% | 65.8% | 64.9% |

Conclusions
A feature extraction method based on ICEEMDAN, MIPE, and the separability criterion is proposed to extract effective features from the sound of running power equipment. Compared with existing feature extraction technologies, it works on multiple scales and removes noise before calculating MIPE. Moreover, the proposed method uses norMI to weight the MIPE results, which takes the importance of each signal IMF into account, and uses the separability criterion to extract effective features from the weighted MIPE results and reduce their dimension. In this way, the robustness and accuracy of the equipment classification are improved significantly. The validity of the proposed algorithm is verified by analyzing acoustic signals measured from four types of power equipment, for which it achieves more precise identification and higher sensitivity than MIPE and CEEMDAN-PE. Therefore, the proposed method is more reliable and better suited to entropy feature extraction from the sound of running power equipment in practice, and it provides a supplement to existing monitoring techniques for power equipment.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: