Partial Discharge Fault Diagnosis Based on Multi-Scale Dispersion Entropy and a Hypersphere Multiclass Support Vector Machine

Partial discharge (PD) fault analysis is an important tool for diagnosing the insulation condition of electrical equipment. To overcome the limitations of traditional PD fault diagnosis, a novel feature extraction approach based on variational mode decomposition (VMD) and multi-scale dispersion entropy (MDE) is proposed. In addition, a hypersphere multiclass support vector machine (HMSVM) is used for PD pattern recognition with the extracted PD features. Firstly, the original PD signal is decomposed with VMD to obtain intrinsic mode functions (IMFs). Secondly, proper IMFs are selected according to central frequency observation, and the MDE values of each IMF are calculated. Then principal component analysis (PCA) is introduced to extract the effective principal components of the MDE values. Finally, the extracted principal components are used as PD features and sent to the HMSVM classifier. Experimental results demonstrate that the PD feature extraction method based on VMD-MDE can extract effective characteristic parameters that represent the dominant PD features. Recognition results verify the effectiveness and superiority of the proposed PD fault diagnosis method.


Introduction
Partial discharge (PD) is an important symptom of insulation degradation in electrical equipment. PD fault diagnosis plays an irreplaceable role in the evaluation of insulation condition [1]. PD feature extraction is an important step in insulation fault diagnosis. The common methods include statistical atlas (SA) [2], wave analysis (WA) [3] and wavelet transform (WT) [4]. However, SA requires a high sampling rate, produces large amounts of data and processes them slowly, which makes it unsuitable for on-line monitoring. Besides, it is difficult to extract PD phase information during statistical atlas construction. WA is easily influenced by electromagnetic interference. WT has some inherent limitations, such as the difficulty of selecting the wavelet basis, the wavelet thresholds and the number of decomposition levels [5].
Empirical mode decomposition (EMD) is an adaptive signal processing method that decomposes a time series into a finite number of intrinsic mode functions (IMFs). It is widely used in the areas of fault detection, signal processing and data compression [6][7][8]. However, due to the problems of end effects and mode mixing in non-stationary signal decomposition, EMD is limited in practical applications. Variational mode decomposition (VMD) is a new signal decomposition method, which is widely applied in electrical fault feature extraction [9]. It is a non-recursive variational decomposition model. In VMD, the central frequency and bandwidth of each mode are determined by searching [...] the experimental setup used to generate PD signals. In Section 4 we show the results with their validation. The paper ends with conclusions in Section 5.

VMD Algorithm
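VMD decomposes a real signal $f$ into $K$ independent sub-signals (modes) $u_k$, each of which has a specific sparsity property. The procedure seeks the modes whose estimated bandwidths are minimal [31]. Signal decomposition therefore amounts to solving a variational problem. The variational model with its constraint is as follows:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

where $\{u_k\} = \{u_1, u_2, \cdots, u_K\}$ denotes the modal components, $\{\omega_k\} = \{\omega_1, \omega_2, \cdots, \omega_K\}$ are the center frequencies of the modal components, $\delta(t)$ represents the impulse function, $\partial_t$ denotes the partial derivative with respect to $t$, $*$ denotes convolution, and $f$ is the original signal.
In order to obtain the optimal solution of this constrained variational problem, the Lagrangian multiplier $\lambda(t)$ is introduced. The constrained variational problem is transformed into an unconstrained problem:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$

where $\alpha$ is the quadratic penalty factor. The alternating direction method of multipliers (ADMM) is introduced to obtain the saddle point of this Lagrangian function, which is the optimal solution. The procedure of VMD can be summarized in the following steps:
(1) Initialize each modal component $\hat{u}_k^1$, center frequency $\omega_k^1$ and multiplier $\hat{\lambda}^1$. Set $n = 0$.
(2) Update $u_k$ in non-negative frequency intervals:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i<k} \hat{u}_i^{n+1}(\omega) - \sum_{i>k} \hat{u}_i^{n}(\omega) + \hat{\lambda}^{n}(\omega)/2}{1 + 2\alpha \left(\omega - \omega_k^{n}\right)^2}$$

(3) Update $\omega_k$:

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega}{\int_0^{\infty} \left|\hat{u}_k^{n+1}(\omega)\right|^2 d\omega}$$

(4) Update $\lambda$ in non-negative frequency intervals, where $\tau$ is the update step of the multiplier:

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)$$

(5) If

$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon,$$

then stop the iteration. Otherwise, set $n = n + 1$ and return to (2).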
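The iteration in steps (1) to (5) can be illustrated with a short NumPy sketch. It is a simplified illustration under stated assumptions, not the implementation used in this paper: the function name `vmd_sketch` is hypothetical, the signal is not mirror-extended at its ends, frequencies are normalized to cycles per sample, and the center frequencies are initialized uniformly.

```python
import numpy as np

def vmd_sketch(f, K, alpha=2000.0, tau=0.1, eps=1e-7, max_iter=500):
    """Simplified VMD on the one-sided spectrum; returns (modes, center_freqs).

    ASSUMPTIONS (not from the paper): no mirror extension of the signal,
    uniform initialization of the center frequencies, normalized frequencies.
    """
    f = np.asarray(f, dtype=float)
    N = f.size
    freqs = np.fft.rfftfreq(N)                    # non-negative frequencies in [0, 0.5]
    f_hat = np.fft.rfft(f)
    u_hat = np.zeros((K, freqs.size), dtype=complex)
    omega = np.linspace(0.0, 0.5, K + 2)[1:-1]    # step (1): initial center frequencies
    lam_hat = np.zeros(freqs.size, dtype=complex)
    tiny = np.finfo(float).eps                    # guards against division by zero

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Step (2): Wiener-filter-like update of mode k around omega[k].
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam_hat / 2) / (
                1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Step (3): center frequency = power-weighted mean frequency.
            power = np.abs(u_hat[k]) ** 2
            omega[k] = freqs @ power / (power.sum() + tiny)
        # Step (4): dual ascent on the Lagrangian multiplier.
        lam_hat = lam_hat + tau * (f_hat - u_hat.sum(axis=0))
        # Step (5): relative change of all modes as stopping criterion.
        num = np.sum(np.abs(u_hat - u_prev) ** 2, axis=1)
        den = np.sum(np.abs(u_prev) ** 2, axis=1) + tiny
        if np.sum(num / den) < eps:
            break

    modes = np.fft.irfft(u_hat, n=N, axis=1)      # back to the time domain
    return modes, omega
```

Calling `vmd_sketch(signal, K=4)` returns an array with one row per mode and the estimated center frequencies. The defaults for `alpha` and `tau` mirror the values quoted later in this paper; `eps` and `max_iter` are illustrative choices.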

Dispersion Entropy
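For a univariate signal $x = \{x_1, x_2, \cdots, x_N\}$, the dispersion entropy method can be described in the following steps [32]:
(1) Map $x_j$ ($j = 1, 2, \cdots, N$) to $y = \{y_1, y_2, \cdots, y_N\}$, with values between 0 and 1, using the normal cumulative distribution function:

$$y_j = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x_j} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\, dt$$

where $\sigma$ and $\mu$ represent the standard deviation and mean of $x$, respectively.
(2) Assign each $y_j$ to an integer label from 1 to $c$ using a linear algorithm. The mapped signal can be defined as follows:

$$z_j^{c} = \mathrm{round}(c \cdot y_j + 0.5)$$

(3) Define the embedding vector $z_i^{m,c}$ with embedding dimension $m$ and time delay $d$ as:

$$z_i^{m,c} = \left\{ z_i^{c}, z_{i+d}^{c}, \cdots, z_{i+(m-1)d}^{c} \right\}, \quad i = 1, 2, \cdots, N-(m-1)d$$

Each vector $z_i^{m,c}$ is mapped to a dispersion pattern $\pi_{v_0 v_1 \cdots v_{m-1}}$, where:

$$z_i^{c} = v_0,\; z_{i+d}^{c} = v_1,\; \cdots,\; z_{i+(m-1)d}^{c} = v_{m-1}$$

(4) For each dispersion pattern, the relative frequency can be obtained as:

$$p(\pi_{v_0 v_1 \cdots v_{m-1}}) = \frac{\mathrm{Number}\left\{ i \mid i \le N-(m-1)d,\; z_i^{m,c} \text{ has type } \pi_{v_0 v_1 \cdots v_{m-1}} \right\}}{N-(m-1)d}$$

where $p(\pi_{v_0 v_1 \cdots v_{m-1}})$ represents the number of embedding vectors $z_i^{m,c}$ assigned to the dispersion pattern $\pi_{v_0 v_1 \cdots v_{m-1}}$, divided by the total number of embedding vectors with embedding dimension $m$.
(5) Following the definition of Shannon entropy, the dispersion entropy is calculated as:

$$DE(x, m, c, d) = -\sum_{\pi=1}^{c^m} p(\pi_{v_0 v_1 \cdots v_{m-1}}) \ln p(\pi_{v_0 v_1 \cdots v_{m-1}})$$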
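The dispersion entropy steps can be illustrated with a short, dependency-free Python sketch. The function name `dispersion_entropy` and the optional `normalize` flag are illustrative assumptions, and the default values of `m`, `c` and `d` follow the settings reported later in this paper.

```python
import math
from collections import Counter

def dispersion_entropy(x, m=6, c=2, d=1, normalize=False):
    """Dispersion entropy of a 1-D sequence x (illustrative sketch)."""
    n = len(x)
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    if sigma == 0:
        raise ValueError("constant signal: standard deviation is zero")

    # Step (1): normal cumulative distribution function maps x into (0, 1).
    y = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in x]

    # Step (2): linear mapping to integer classes 1..c (clamped for safety).
    z = [min(c, max(1, round(c * v + 0.5))) for v in y]

    # Step (3): embedding vectors with dimension m and delay d, used as patterns.
    n_vec = n - (m - 1) * d
    if n_vec <= 0:
        raise ValueError("signal too short for the chosen m and d")
    patterns = Counter(tuple(z[i + k * d] for k in range(m)) for i in range(n_vec))

    # Steps (4) and (5): relative frequencies and Shannon entropy.
    de = -sum((cnt / n_vec) * math.log(cnt / n_vec) for cnt in patterns.values())

    # Optional normalization by the maximum entropy ln(c^m).
    return de / (m * math.log(c)) if normalize else de
```

Only the patterns that actually occur are counted, since patterns with zero relative frequency contribute nothing to the sum.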

Multiscale Dispersion Entropy
Multiscale dispersion entropy (MDE) combines coarse-graining with dispersion entropy. In MDE, the original signal $x = \{x_1, x_2, \cdots, x_N\}$ of length $N$ is first divided into non-overlapping segments of length $\tau$, where $\tau$ is the scale factor. The coarse-grained signals are then obtained by averaging each segment:

$$x_j^{(\tau)} = \frac{1}{\tau} \sum_{b=(j-1)\tau+1}^{j\tau} x_b, \quad 1 \le j \le \left\lfloor N/\tau \right\rfloor$$

The entropy value of each coarse-grained signal of length $N/\tau$ is then calculated with the dispersion entropy method:

$$MDE(x, m, c, d, \tau) = DE\left(x^{(\tau)}, m, c, d\right)$$

Hypersphere Multiclass Support Vector Machine
HMSVM can classify the samples directly. Each class of samples needs only one hypersphere to be trained. All training samples are mapped into a high-dimensional space, and for each class a hypersphere is sought that has a small radius and encloses as many target samples as possible. The HMSVM classification model is shown in Figure 1.
For an M-class problem, a collection of sets $X_m$ ($m = 1, 2, \ldots, M$) is given. Assume that each $X_m$ contains the samples $x_{mi}$, $i = 1, 2, \ldots, l_m$, where $x_{mi}$ represents the i-th element of class $m$.
Assign one hypersphere $(a_m, R_m)$ to each class $X_m$, where $a_m$ is the center of the sphere and $R_m$ is its radius. The objective function of the m-th hypersphere can be defined as follows:

$$\min\; R_m^2 + C_m \sum_{i=1}^{l_m} \xi_{m,i} \quad \text{s.t.} \quad \left\| x_{mi} - a_m \right\|^2 \le R_m^2 + \xi_{m,i}, \quad \xi_{m,i} \ge 0$$

where $C_m$ is the penalty factor, representing the trade-off between $R_m$ and the number of target samples, and $\xi_{m,i}$ is the slack variable of HMSVM, which allows remote samples to stay outside the sphere. The Lagrange function is obtained after introducing the Lagrange multipliers $\alpha_{m,i} \ge 0$ and $\beta_{m,i} \ge 0$:

$$L = R_m^2 + C_m \sum_{i=1}^{l_m} \xi_{m,i} - \sum_{i=1}^{l_m} \alpha_{m,i} \left[ R_m^2 + \xi_{m,i} - \left\| x_{mi} - a_m \right\|^2 \right] - \sum_{i=1}^{l_m} \beta_{m,i}\, \xi_{m,i} \tag{14}$$

Setting the derivatives of Equation (14) with respect to $R_m$, $a_m$ and $\xi_{m,i}$ to zero yields the dual optimization problem:

$$\max\; \sum_{i=1}^{l_m} \alpha_{m,i} K(x_{mi}, x_{mi}) - \sum_{i=1}^{l_m} \sum_{j=1}^{l_m} \alpha_{m,i}\, \alpha_{m,j} K(x_{mi}, x_{mj}) \tag{15}$$

where $K(\cdot,\cdot)$ is the kernel function that replaces the inner product after mapping to the high-dimensional space. The constraints that the objective function should satisfy are as follows:

$$\sum_{i=1}^{l_m} \alpha_{m,i} = 1, \quad 0 \le \alpha_{m,i} \le C_m \tag{16}$$

For an unknown fault sample $d$, we first calculate the square of the distance between $d$ and $a_m$ using the formula below:

$$D^2(d) = K(d, d) - 2 \sum_{i=1}^{l_m} \alpha_{m,i} K(x_{mi}, d) + \sum_{i=1}^{l_m} \sum_{j=1}^{l_m} \alpha_{m,i}\, \alpha_{m,j} K(x_{mi}, x_{mj}) \tag{17}$$

The radius of the hypersphere is defined as $R_m = D(x_i)$, where $x_i$ represents a support vector. Therefore, the category assigned to the unknown sample $d$ can be determined by comparing $R_m$ with $D(d)$.
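The one-hypersphere-per-class idea can be illustrated with a small NumPy sketch. It makes several simplifying assumptions that are not taken from this paper: it solves only the hard-margin case (no slack variables, which corresponds to a penalty factor of at least 1), it uses a basic Frank-Wolfe iteration on the dual instead of a quadratic programming solver, it uses a Gaussian RBF kernel with width `sigma`, and it resolves overlapping or empty decisions with the relative distance $D^2(d)/R_m^2$. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_hypersphere(X, sigma=1.0, n_iter=500):
    """Hard-margin hypersphere for one class via Frank-Wolfe on the dual.

    ASSUMPTION: no slack variables, so the only constraints are
    sum(alpha) = 1 and alpha >= 0.
    """
    X = np.asarray(X, dtype=float)
    K = rbf_kernel(X, X, sigma)
    alpha = np.full(len(X), 1.0 / len(X))          # feasible starting point
    for t in range(n_iter):
        # Gradient of alpha^T K alpha - sum_i alpha_i K_ii (negated dual objective).
        grad = 2.0 * K @ alpha - np.diag(K)
        s = np.zeros_like(alpha)
        s[np.argmin(grad)] = 1.0                   # best vertex of the simplex
        alpha += 2.0 / (t + 2.0) * (s - alpha)     # standard Frank-Wolfe step size
    quad = alpha @ K @ alpha
    r2 = alpha @ np.diag(K) - quad                 # dual objective approximates R^2
    return {"X": X, "alpha": alpha, "quad": quad, "r2": r2, "sigma": sigma}

def squared_distance(model, d):
    """Squared kernel distance D^2(d) between samples d and the sphere center."""
    d = np.atleast_2d(np.asarray(d, dtype=float))
    k = rbf_kernel(model["X"], d, model["sigma"])
    return 1.0 - 2.0 * model["alpha"] @ k + model["quad"]   # K(d, d) = 1 for RBF

def classify(models, d):
    """Assign each sample to the class with the smallest D^2(d) / R_m^2."""
    ratios = np.stack([squared_distance(m, d) / m["r2"] for m in models])
    return ratios.argmin(axis=0)
```

One model is fitted per class with `fit_hypersphere`, and `classify` then compares an unknown sample against every sphere. A soft-margin version would additionally cap each multiplier at the penalty factor.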

Kernel Function Selection
Due to the complexity of the different PD fault samples, a spherical distribution does not appear in low-dimensional space. PD fault samples need to be mapped into a high-dimensional space using kernel functions to obtain the optimal hypersphere. At present, the common kernel functions include the radial basis function (RBF) [33], the polynomial kernel function and the sigmoid function. In repeated tests, RBF shows outstanding performance. Therefore, RBF is selected as the kernel function for HMSVM. It can be defined in Equation (18):

$$K(x_i, x_j) = \exp\left( -\frac{\left\| x_i - x_j \right\|^2}{2\sigma^2} \right) \tag{18}$$

where $\sigma$ is the kernel parameter.

PD Fault Diagnosis Based on VMD-MDE and HMSVM
In this paper, the proposed PD fault diagnosis method combines feature extraction and pattern recognition. Firstly, the original PD signal is decomposed using VMD to obtain the intrinsic mode functions. Secondly, the MDE value of each intrinsic mode function is calculated. Then principal component analysis (PCA) [34] is introduced to select the principal components of the MDE values as PD feature vectors. Finally, the extracted vectors are sent to the HMSVM pattern classifier to recognize different PD faults. The fault diagnosis procedure is as follows:
Step 1: Extract different types of PD signals in the experimental environment, including floating discharge (FD), needle-surface discharge (ND), ball-surface discharge (BD) and corona discharge (CD).
Step 2: Select a proper initial number of IMFs according to the center frequency observation and decompose the PD signals using VMD into intrinsic mode functions with different characteristic scales.
Step 3: Calculate the correlation coefficient between each IMF and the original PD signal to select the effective IMFs [35,36]. If the coefficient is greater than the threshold value, keep the IMF as an effective one. Otherwise, abandon the IMF. In this paper, the threshold value of the correlation coefficient is set to 0.3.
Step 4: Fix the decomposition scale and calculate the MDE values of the selected IMFs, which form the original PD feature vectors.
Step 5: Analyze the PD feature vectors by PCA and extract a smaller number of representative principal components as PD characteristic parameters.
Step 6: Send the extracted PD characteristic parameters to the HMSVM classifier to diagnose the different PD fault modes and obtain the final diagnosis result.
The flow chart of PD fault diagnosis with the proposed method is shown in Figure 2.

Experimental Setup
Different PD types affect insulation materials in different ways and to different degrees. To analyze the characteristics of the different PD types, PD signals of different models are extracted in the laboratory [37]. According to the inner insulation structure of power transformers, there are four possible PD types: FD, ND, BD and CD. The PD models are shown in Figure 3. The experimental setup is shown in Figure 4.

PD signals are detected in the simulated transformer tank in the laboratory. The pulse current is collected by a current sensor with a 500 kHz-16 MHz bandwidth. The UHF signal is extracted by a UHF sensor with a 10-1000 MHz bandwidth. The signal received is imported into the PD analyzer. The test condition is shown in Table 1 and the experimental connection diagram is shown in Figure 5.
Table 1. Test condition of PD models.

Signal Extraction
In this paper, four different types of PD signals are extracted with the above experimental setup. The extracted PD waveforms are shown in Figure 6.

VMD Decomposition
In this paper, floating discharge is taken as an example for VMD decomposition. The number of IMFs, represented as K, is determined according to the central frequency observation. The central frequencies of the IMFs for different values of K are shown in Table 2. Table 2 shows that IMFs with similar central frequencies appear from K = 5 onward, which indicates excessive decomposition. Therefore, K = 4 is selected as the number of IMFs. In this paper, the balancing parameter is α = 2000 and the update step of the Lagrangian multiplier is τ = 0.1. The decomposition results with EMD and VMD are shown in Figures 7 and 8.

Figure 8 describes the results of VMD decomposition. It can be seen from this figure that the modal components obtained with VMD are close to the real signal. Figures 7 and 8 verify the effectiveness of VMD and its superiority over EMD. It can be concluded that VMD is more suitable for PD signal decomposition.
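The central frequency observation used to choose K can be illustrated with a small check that stops increasing K as soon as two modes have nearly the same central frequency. This is only a sketch: the function names and the relative tolerance `rel_tol` are hypothetical, and this paper selects K by inspecting Table 2 rather than by a fixed numerical rule.

```python
def has_similar_centers(center_freqs, rel_tol=0.1):
    """True if two neighbouring central frequencies differ by at most rel_tol (relative)."""
    fs = sorted(center_freqs)
    return any((b - a) <= rel_tol * b for a, b in zip(fs, fs[1:]))

def choose_k(centers_by_k, rel_tol=0.1):
    """Largest K reached before similar central frequencies first appear.

    centers_by_k maps each candidate K to the list of central frequencies
    obtained from a VMD run with that K. Returns None if even the smallest
    K already shows similar central frequencies.
    """
    chosen = None
    for k in sorted(centers_by_k):
        if has_similar_centers(centers_by_k[k], rel_tol):
            break                      # over-decomposition starts here
        chosen = k
    return chosen
```

The tolerance has to be tuned to the signal at hand, since what counts as "similar" depends on the frequency range of the modes.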

IMF Selection
In order to obtain the effective IMFs, the correlation coefficient (CC) between each IMF and the original PD signal is calculated. Given a threshold T, if the CC is greater than T, the IMF is selected as an effective component; otherwise it is regarded as a false component and abandoned. In this work, T is set to 0.3. The CC values of the IMFs for VMD and EMD are shown in Table 3. Table 3 shows that the CC values of the first three IMFs are larger than the given threshold, which means these IMFs can represent the real components of the PD signals. Therefore, the first three IMFs are selected and analyzed for VMD decomposition. Similarly, we can see that the CC value is smaller than the threshold from the fourth IMF, which means these IMFs contain less information about the PD signals. Consequently, the first four IMFs are kept for EMD decomposition.
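The correlation-based selection described above can be sketched in a few lines of NumPy. The function name `select_imfs` is hypothetical, and the Pearson correlation coefficient is assumed as the CC measure.

```python
import numpy as np

def select_imfs(imfs, signal, threshold=0.3):
    """Keep the IMFs whose correlation with the original signal exceeds the threshold.

    imfs: array of shape (n_imfs, n_samples); signal: array of shape (n_samples,).
    Returns the retained IMFs and the correlation coefficient of every IMF.
    """
    imfs = np.asarray(imfs, dtype=float)
    signal = np.asarray(signal, dtype=float)
    # Pearson correlation between each IMF and the original signal.
    cc = np.array([np.corrcoef(imf, signal)[0, 1] for imf in imfs])
    keep = cc > threshold
    return imfs[keep], cc
```

Returning the full list of coefficients alongside the retained IMFs makes it easy to tabulate them in the same way as Table 3.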

Feature Extraction
In this paper, four different types of PD signals are decomposed using the VMD method. The VMD decomposition parameters are shown in Table 4, where $K_s$ is the number of effective IMFs calculated as described in Section 4.2. Using the above parameters, the corresponding IMFs of the different PD types are obtained by VMD decomposition. Then the MDE value of each IMF is calculated. The MDE calculation requires several preset parameters: the scale factor s, the number of classes c, the time delay d and the embedding dimension m. Because aliasing may occur when d > 1, d is set to 1 as recommended. In order to avoid the trivial case of only one dispersion pattern, c is set to 2. For better detection of dynamic changes in the signals, m is set to 6. To analyze the variation of the MDE values with scale, s is initially set to 20. With the above parameters, the MDE values of the four types of PD signals extracted in the laboratory are calculated. For each type of PD, the MDE values are averaged over the IMFs, as shown in Figure 9.
Figure 9 shows that different types of PD signals have different MDE values as the scale factor varies. The reason is that the randomness of the PD signals changes when a PD fault occurs, which changes the MDE values. It also indicates that a single scale cannot completely reflect all the signal information, and that much important information is distributed over other scales. MDE can effectively detect the dynamic variation of PD signals, which represents the fault characteristics at different scales. It can be seen from the figure that the MDE values start to level off after Scale 12. Therefore, the scale factor is set to 12 in this paper. In the case of FD, the MDE values of the IMFs using VMD and EMD are shown in Figure 10.
Table 5. Initial feature vectors.
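The multiscale calculation used in this section can be sketched as coarse-graining followed by an entropy evaluation at every scale. In the sketch below, the entropy is passed in as a function so that any dispersion entropy implementation can be plugged in. The names `coarse_grain`, `multiscale_curve` and `plateau_scale`, as well as the tolerance-based plateau rule, are illustrative assumptions; this paper reads the levelling-off point from Figure 9.

```python
def coarse_grain(x, scale):
    """Average non-overlapping segments of length `scale` (incomplete tail dropped)."""
    n_seg = len(x) // scale
    return [sum(x[j * scale:(j + 1) * scale]) / scale for j in range(n_seg)]

def multiscale_curve(x, max_scale, entropy_fn):
    """Entropy of the coarse-grained signal for scales 1..max_scale."""
    return [entropy_fn(coarse_grain(x, s)) for s in range(1, max_scale + 1)]

def plateau_scale(values, tol):
    """Smallest 1-based scale after which every step changes by less than tol.

    ASSUMPTION: a simple tolerance rule stands in for visual inspection.
    """
    for s in range(1, len(values) + 1):
        tail = values[s - 1:]
        if all(abs(b - a) < tol for a, b in zip(tail, tail[1:])):
            return s
    return len(values)
```

With a dispersion entropy function `de`, a call such as `multiscale_curve(imf, 20, de)` gives one entropy value per scale, which corresponds to one curve in a plot like Figure 9.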

PCA-Based Dimension Reduction
The high dimension of the extracted feature vectors places a heavy burden on pattern classifiers and can directly affect the recognition accuracy. In this paper, the PCA method is employed for dimension reduction of the initial feature vectors. In the case of $K_1$, the covariance matrix is constructed to obtain the principal components. The eigenvalues and eigenvectors of the covariance matrix are solved for the linear transformation of the original vectors. To achieve the goal of dimension reduction, those factors whose eigenvalues are greater than 1 are selected as principal components. The eigenvalues and corresponding contribution rates of the covariance matrix are shown in Table 6. Table 6 shows that the first two eigenvalues are greater than 1, and that the accumulated contribution rate is larger than 90%. The variation of the contribution rate with the principal components is shown in Figure 11.
Figure 11. The variation of contribution rate with principal components.
It can be concluded from the above figure that the contribution rate starts to level off from the third principal component. In addition, the subsequent contribution rates decrease gradually and can be ignored. Therefore, the first two principal components, which represent most of the vector information, are suitable for further analysis. In this way, the original 12 indicators are reduced to 2 new ones. With a similar method, the principal components of $K_2$, $K_3$ and $K_4$ can be obtained, as shown in Table 7. It can be seen from Table 7 that nine principal components are extracted from 48 feature vectors, and that the contribution rate in each IMF is greater than 80%. Given the above, the dimension of the feature vectors is reduced to nine after dimension reduction using PCA. Similarly, with the above procedure, the calculated PD parameters of the different PD types are shown in Table 8.
Table 8. Principal components with different IMFs.
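The eigenvalue-based selection of principal components can be illustrated with NumPy. The sketch standardizes each feature before building the covariance matrix, which is an assumption made here because the "eigenvalue greater than 1" rule is usually applied to standardized data; the function name `pca_eigenvalue_rule` is hypothetical.

```python
import numpy as np

def pca_eigenvalue_rule(X):
    """PCA that keeps the components whose eigenvalue is greater than 1.

    X: array of shape (n_samples, n_features).
    Returns (scores, eigenvalues, contribution_rates), sorted by eigenvalue.
    """
    X = np.asarray(X, dtype=float)
    # ASSUMPTION: features are standardized before the covariance is computed.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    cov = np.cov(Z, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)          # symmetric matrix, ascending order
    order = np.argsort(eigval)[::-1]              # re-sort in descending order
    eigval, eigvec = eigval[order], eigvec[:, order]
    rate = eigval / eigval.sum()                  # contribution rate of each component
    keep = eigval > 1.0
    scores = Z @ eigvec[:, keep]                  # projection onto retained components
    return scores, eigval, rate
```

The cumulative sum of the contribution rates, `np.cumsum(rate)`, corresponds to the accumulated contribution rate discussed above.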

PD Pattern Recognition
In this paper, 400 PD samples, including FD, ND, BD and CD, are extracted in the laboratory, with 100 samples for each PD type. The MDE values of the four PD types are calculated, and 50 samples of each type constitute the initial feature vectors. To verify the effectiveness and superiority of the proposed method, feature extraction methods based on multi-scale sample entropy (MSE) and multi-scale permutation entropy (MPE) are introduced for comparison. The calculation of MSE and MPE is similar to that of MDE. Firstly, the PD signals are decomposed using EMD or VMD. After that, the MSE or MPE values of the extracted IMFs are calculated. Finally, PCA is applied for dimension reduction. The parameters used during signal decomposition are shown in Table 9. The PD feature vectors extracted with the above three methods are sent to the HMSVM classifier. Because the HMSVM parameters have a large impact on the fault diagnosis result, they are optimized with particle swarm optimization (PSO). In the case of the VMD-MDE method, the PD samples are first divided into training and testing samples. After multiple experimental trials, the particle population is set to 20, with w = 1, c1 = 2, c2 = 2 and a maximum number of iterations N = 200. The penalty parameter C is between 1/n and 1, while the search range of the kernel parameter σ is between 1 and 100. The optimum fitness reaches its maximum value of 96.98% after 19 iterations, when σ = 12.26 and C = 0.35. Similarly, the HMSVM parameters for the different feature extraction methods are obtained and listed in Table 10.
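The parameter search can be illustrated with a basic PSO loop that uses the population size, weights and iteration count quoted above. The sketch is not the optimizer used in this paper: the function name `pso_maximize`, the velocity clamp, the clipping of positions to the search bounds and the random seed are assumptions, and the fitness function (for example, a cross-validated recognition accuracy as a function of σ and C) has to be supplied by the caller.

```python
import numpy as np

def pso_maximize(fitness, lower, upper, n_particles=20, w=1.0, c1=2.0, c2=2.0,
                 n_iter=200, seed=0):
    """Basic particle swarm optimization that maximizes `fitness` inside box bounds."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    span = upper - lower
    pos = lower + rng.random((n_particles, lower.size)) * span
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                   # personal best positions
    pbest_fit = np.array([fitness(p) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]     # global best so far
    history = [gbest_fit]
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -0.2 * span, 0.2 * span)      # ASSUMPTION: velocity clamp
        pos = np.clip(pos + vel, lower, upper)           # keep particles in bounds
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
        history.append(gbest_fit)
    return gbest, gbest_fit, history
```

For the search ranges stated above, the bounds would be `lower=[1, 1/n]` and `upper=[100, 1]` for the pair (σ, C), with n as defined in the text. The velocity clamp is added because an inertia weight of 1 without any limit can let the velocities grow without bound.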
Using the parameters in Table 10, the HMSVM classifier is constructed for fault diagnosis with the three different PD features. The recognition results with EMD and VMD decomposition are shown in Figures 12 and 13.
Figures 12 and 13 demonstrate that the recognition result using EMD decomposition is significantly different from that using VMD decomposition. Figure 12 illustrates that the recognition accuracy for each PD type is not less than 80% but no more than 90%, which means that, with EMD decomposition, the extracted PD features cannot represent most of the signal characteristics. In contrast, Figure 13 shows that the recognition accuracy for each PD type is no less than 90%. Moreover, there is no misjudged sample with MDE in any PD type. This means that, with VMD decomposition, the PD features can effectively represent most of the signal information. In addition, both figures show that a satisfactory result is obtained with MDE parameters.

To compare the diagnosis results of PD features with different classifiers, artificial neural network (ANN) [38] and support vector machine (SVM) classifiers are introduced for PD pattern recognition. For the ANN, a back-propagation network is employed as the recognition model, which trains the weights with differentiable nonlinear functions. The classifier parameters are shown in Table 11, where σ is the kernel parameter of the RBF and C is the penalty factor of the SVM. With the parameters shown in Tables 10 and 11, the ANN, SVM and HMSVM classifiers are constructed for PD pattern recognition. The recognition results of the different classifiers with VMD-MDE features are shown in Figure 14. Table 12 shows the overall results using different PD features, in which running time means the time used for PD fault diagnosis.
As illustrated in Figure 14, with the same PD feature extraction method, the recognition results of the different classifiers are significantly different. The average classification accuracy achieved using HMSVM is 100.00%. HMSVM shows great advantages over ANN and SVM. Table 12 shows the diagnostic results with different PD features. Compared with the other PD feature types, VMD-MDE requires less running time and gives higher recognition accuracy, which means that the parameters extracted with VMD-MDE can represent most of the PD signal components. The quadratic programming calculation of HMSVM is smaller than that of SVM, which leads to shorter training and testing times. In addition, HMSVM shows better classification ability than the other two classifiers, ANN and SVM.

Conclusions
In this paper, a novel PD fault diagnosis method is proposed. This method combines PD feature extraction based on VMD-MDE with PD pattern recognition based on HMSVM. First of all, four types of PD signals are extracted in the experimental environment, including FD, ND, BD and CD. Then VMD is employed for PD signal decomposition. Next, proper IMFs are selected according to central frequency observation, and the MDE values of each IMF are calculated. Afterwards, PCA is introduced to select the effective principal components of the MDE values as the final PD characteristic parameters. Finally, the extracted principal components are used as PD features and sent to the HMSVM classifier. Experimental results show the following advantages: the proposed method can extract effective IMFs through VMD decomposition. The PD feature information in the IMFs can be quantified successfully with MDE. Using PCA, the principal components that represent the prominent characteristics are effectively selected. With its small data size and low computational complexity, this approach overcomes the limitations of traditional PD feature extraction methods. Compared with PD feature extraction methods based on EMD-MSE, EMD-MPE, EMD-MDE, VMD-MSE and VMD-MPE, the proposed approach based on VMD-MDE achieves higher recognition accuracy and needs less running time, which can improve the diagnosis efficiency to satisfy real-time requirements.
HMSVM uses one hypersphere per class for pattern recognition. HMSVM can not only separate two different classes, but also divide the sample space into two different parts. Using HMSVM, multi-class classification is realized directly. Compared with the ANN and SVM classifiers, HMSVM obtains a higher recognition rate and improves the accuracy and efficiency of PD fault diagnosis. On the whole, the proposed method provides a new scheme for PD fault diagnosis. In future work, the proposed fault diagnosis method can be employed in PD on-line monitoring and diagnosis.