A New Denoising Method for UHF PD Signals Using Adaptive VMD and SSA-Based Shrinkage Method

Noise suppression is one of the key issues for the partial discharge (PD) ultra-high frequency (UHF) method to detect and diagnose the insulation defect of high voltage electrical equipment. However, most existing denoising algorithms are unable to reduce various noises simultaneously. Meanwhile, these methods pay little attention to the feature preservation. To solve this problem, a new denoising method for UHF PD signals is proposed. Firstly, an automatic selection method of mode number for the variational mode decomposition (VMD) is designed to decompose the original signal into a series of band limited intrinsic mode functions (BLIMFs). Then, a kurtosis-based judgement rule is employed to select the effective BLIMFs (eBLIMFs). Next, a singular spectrum analysis (SSA)-based thresholding technique is presented to suppress the residual white noise in each eBLIMF, and the final denoised signal is synthesized by these denoised eBLIMFs. To verify the performance of our method, UHF PD data are collected from the computer simulation, laboratory experiment and a field test, respectively. Particularly, two new evaluation indices are designed for the laboratorial and field data, which consider both the noise suppression and feature preservation. The effectiveness of the proposed approach and its superiority over some traditional methods is demonstrated through these case studies.


Introduction
Power systems constitute critical societal infrastructure, and play an increasingly important role in the development of the national economy and the improvement of the population's living standard [1]. Accompanying inevitable technological advances, the probability of failure of electrical equipment and the resulting damage will also increase greatly. Therefore, timely and accurate monitoring of the operating status of electrical equipment plays a vital role to ensure its safe and stable operation, and to prevent large-scale blackouts. In most cases, high-voltage electrical equipment is accompanied by partial discharge (PD) phenomenon before catastrophic failure [2][3][4]. Among various PD detection methods, the ultra-high frequency (UHF) technique has been widely used due to its superior performance in location [5], type recognition [6] and severity evaluation of PD faults [7]. However, the complicated and unpredictable electromagnetic environment in a substation may cause the PD signals to be contaminated by noise, leading to an incorrect diagnosis. Hence, a reliable noise suppression method is a prerequisite for the accurate detection and diagnosis of PD [8]. (iii) For each selected eBLIMF, the dominant singular values (DSVs) are retained at first. Then, they will be used to reconstruct PD signal by diagonal averaging. Next, the rescaling thresholding technique [9] is applied to further remove the residual white noise in each eBLIMF. Finally, the denoised UHF PD signal is obtained by adding up all these denoised eBLIMFs.
The rest of this paper is organized as follows: Section 2 reviews the mathematical background. The proposed method is detailed in Section 3. Simulative case, laboratorial case and field case are respectively analyzed in Sections 4-6. Main contributions and some open questions of our paper is discussed in Section 7. Finally, the conclusion is drawn in Section 8.

Variational Mode Decomposition
In VMD, the principle modes of a signal are redefined as a set of AM-FM functions, which can be expressed as: where A k (t) and the derivative of φ k (t) are non-negative. The VMD algorithm mainly includes the following two steps: (i) Construction of the variational problem The purpose of this step is to obtain K mode functions (i.e., u k (t) in Equation (1)) that minimize the summation of the bandwidths of all modes, under the constraint that the sum of all modes is equal to the original signal. This problem can be formulated as the following constrained variational problem: where u k is the kth BLIMF, and w k is the center frequency of u k . Notation 2 2 denotes the squared L 2 norm, ∂ t is derivative operator, and δ(t) is the Dirac function.
(ii) Solving the above variational problem To solve the Equation (2), the constrained variational problem should be transformed into unconstrained first. This can be achieved by introducing the Lagrangian multiplier λ and quadratic penalty term α. The new unconstrained problem can be formulated as: Through a series of mathematical derivations based on the Alternate Direction Method of Multipliers (ADMM) and Parseval/Plancherel Fourier isometry, the above problem can be solved in the spectral domain as follows:û Based on the above, the complete process of the VMD method is summarized as follows: Sensors 2019, 19, 1594 4 of 23 Step 1: Initialize the parameters of the first loop u 1 k , w 1 k , λ 1 , k = 1, 2, . . . , K, and K is the predefined number of decomposed modes. In addition, set the cycle index n = 0; Step 2: Let n = n + 1, then begin the outer loop; Step 3: Execute the first inner loop according to Equation (4) to update the K BLIMFs in the spectral domain û n+1 k (w) ; Step 4: Execute the second inner loop according to Equation (5) to update the center frequencies of all BLIMFs in the spectral domain w n+1 k .
Step 5: Update the Lagrangian multiplier by the following expression: Step 6: Repeat the algorithm from Step 2 to Step 5 until the following condition is satisfied:

Singular Spectrum Analysis
For a discrete finite-duration signal X = {x n , n = 1...N}, SSA is usually applied to decompose X into a set of physically interpretable components. These components may contain different features of the X, thus SSA is considered to be a power tool for signal denoising or classification [22]. The main process of SSA can be summarized as follows: Step 1: Embedding In this step, the original signal X will be mapped into a trajectory matrix. Many matrix forms can be adopted to build a trajectory matrix, such as Toeplitz matrix, cycle matrix, or Hankel matrix. Among them, the Hankel matrix is most widely used due to its zero-phase shift property [23]. It is made up of M column vectors of length L, as shown in Equation (8).
where M = N − L + 1. One of the important features of Hankel matrix is that it has equal diagonal elements. In addition, it should be noted that the parameter L will have great impact on the results of SSA, thus it should be carefully selected for specific application.
Step 2: Decomposition By using SVD, the Hankel matrix will be decomposed into a sum of sub-space matrices, which are orthogonal with each other. This process can be formulated as: (9) where ∑ = diag(σ 1 , . . . , σ i , . . . σ R ), R = min(L, M) is the rank of the Hankel matrix, and σ i is the ith SV in descending order. In addition u i and v i are the column vectors of orthogonal matrices U and V, respectively.
Step 3: Grouping This step aims at dividing the matrices {A 1 , A 2 , . . . , A R } into r disjoint groups I m (m = 1, . . . , r) according to the characteristics of the sub-components within the raw signal. Therefore, summation Sensors 2019, 19, 1594 5 of 23 of the matrices in I m can be denoted as A I m = ∑ i∈I m A i . Adding all A I m together will reconstruct the original Hankel matrix A.
Step 4: Diagonal averaging A common way to recover the signal X is called the diagonal averaging method, which uses the average of the diagonal elements of reconstructed Hankel matrix as the element of X.

Adaptive VMD
As discussed before, the performance of VMD is heavily dependent on the choice of its parameters, especially for the number of modes K. Some related studies have discussed the parameter selection problem of VMD, which mainly focus on optimizing by heuristic optimization algorithms. For example, a searching method for quadratic penalty term α and mode number K is presented in [24] based on the artificial fish swarm (AFS) algorithm. Besides, the particle swarm optimization (PSO) algorithm has also been applied to VMD in [8]. Although these methods can obtain appropriate parameters to some extent, they may need to go through many iterations before convergence. Moreover, one may have to set initial parameters of heuristic optimization algorithms manually, making the VMD method more complicated.
In this paper, the quadratic penalty term a and time-step of the dual ascent τ are respectively set to 2000 and 0, and we focus on the optimization method of the mode number K. The basic idea behind this adaptive VMD (AVMD) method is simple and straight: increase the value of K (starting from 2) step by step and obtain the BLIMFs by VMD during each step. When there is no mode mixing happened in all BLIMFs for a certain K, then this value is considered as optimal. Based on this concept, the schematic diagram of the proposed method is demonstrated in Figure 1. Step 4: Diagonal averaging A common way to recover the signal X is called the diagonal averaging method, which uses the average of the diagonal elements of reconstructed Hankel matrix as the element of X .

Adaptive VMD
As discussed before, the performance of VMD is heavily dependent on the choice of its parameters, especially for the number of modes K . Some related studies have discussed the parameter selection problem of VMD, which mainly focus on optimizing by heuristic optimization algorithms. For example, a searching method for quadratic penalty term α and mode number K is presented in [24] based on the artificial fish swarm (AFS) algorithm. Besides, the particle swarm optimization (PSO) algorithm has also been applied to VMD in [8]. Although these methods can obtain appropriate parameters to some extent, they may need to go through many iterations before convergence. Moreover, one may have to set initial parameters of heuristic optimization algorithms manually, making the VMD method more complicated.
In this paper, the quadratic penalty term a and time-step of the dual ascent τ are respectively set to 2000 and 0, and we focus on the optimization method of the mode number K. The basic idea behind this adaptive VMD (AVMD) method is simple and straight: increase the value of K (starting from 2) step by step and obtain the BLIMFs by VMD during each step. When there is no mode mixing happened in all BLIMFs for a certain K, then this value is considered as optimal. Based on this concept, the schematic diagram of the proposed method is demonstrated in Figure 1. As can be seen from Figure 1, the key of this method is to determine whether there is mode mixing occurred in decomposed BLIMFs. For this purpose, the first important task is to find out the local maximum points (LMPs) of BLIMF spectrum, which are considered to belong to the potential effective components. This can be achieved by the following two steps: (i) Preliminary screening To begin with, the frequency spectrum is divided into several consecutive segments, and each segment has the same length 1 L (the last segment may not). Then, the maximal point in each segment will be picked out to form a new sequence  As can be seen from Figure 1, the key of this method is to determine whether there is mode mixing occurred in decomposed BLIMFs. For this purpose, the first important task is to find out the local maximum points (LMPs) of BLIMF spectrum, which are considered to belong to the potential effective components. This can be achieved by the following two steps: (i) Preliminary screening To begin with, the frequency spectrum is divided into several consecutive segments, and each segment has the same length L 1 (the last segment may not). Then, the maximal point in each segment will be picked out to form a new sequence segmax = seg_1 max , . . . , seg_n max , . . . , seg_ N max , where seg_n max denotes the maximal point of the nth segment, and N max is the number of segments.
(ii) Peaks confirmation After the previous process, it is easy to find that the LMPs of the frequency spectrum must be the points within segmax. Therefore, the LMPs can be obtained easily by searching the peak points of segmax, remarked as pks.
Due to the influences of potential multiple components or interfering noise, the number of pks is likely to be more than one. Consequently, the remaining problem is to determine whether other points in pks except for the maximum are active components. Denote the maximal point of pks as max_temp, then the proposed rules for mode-mixing determination is given as follows: for any point in pks except for max_temp, if the distance between max_temp and this point is larger than a pre-set value, and in the meanwhile, the amplitude of this point is higher than a pre-set value, then the corresponding frequency component at this point is considered as active. If the point satisfying the above conditions exists, it will imply that there are at least two active components in the current BLIMF (including max_temp). Obviously, this means mode-mixing has occurred. The pseudocode of this judgment algorithm is detailed in Algorithm 1. (ii) Peaks confirmation After the previous process, it is easy to find that the LMPs of the frequency spectrum must be the points within segmax. Therefore, the LMPs can be obtained easily by searching the peak points of segmax, remarked as pks.
Due to the influences of potential multiple components or interfering noise, the number of pks is likely to be more than one. Consequently, the remaining problem is to determine whether other points in pks except for the maximum are active components. Denote the maximal point of pks as max_temp, then the proposed rules for mode-mixing determination is given as follows: for any point in pks except for max_temp, if the distance between max_temp and this point is larger than a pre-set value, and in the meanwhile, the amplitude of this point is higher than a pre-set value, then the corresponding frequency component at this point is considered as active. If the point satisfying the above conditions exists, it will imply that there are at least two active components in the current BLIMF (including max_temp). Obviously, this means mode-mixing has occurred. The pseudocode of this judgment algorithm is detailed in Algorithm 1.
Follow the workflow in Figure 1 and the decision rule in Algorithm 1, the optimal mode number K can be obtained in just a few steps.

Effective BLIMF Selection
Based on the AVMD algorithm presented in the previous section, the PD signal is decomposed into K BLIMFs, and each BLIMF contains only one active component. However, not all these active components are PD components. In fact, the generating mechanism of PD electromagnetic signal and its propagation path determine that it exhibits characteristics of damped oscillation and steep rising edge in the time domain, while narrowband periodic noise or white noise does not show such characteristic. Therefore, in this paper, BLIMF with following features is defined as effective BLIMF (abbreviate as eBLIMF): (i) with central frequency in UHF band (i.e., higher than 0.3GHz); (ii) exhibits damped oscillation in time domain; (iii) has steep rising edge. Based on the above analysis, the kurtosis operator which is sensitive to abrupt change is selected as the indicator, and the decision rule is formulated as:  Figure 1 and the decision rule in Algorithm 1, the optimal mode number K can be obtained in just a few steps.

Effective BLIMF Selection
Based on the AVMD algorithm presented in the previous section, the PD signal is decomposed into K BLIMFs, and each BLIMF contains only one active component. However, not all these active components are PD components. In fact, the generating mechanism of PD electromagnetic signal and its propagation path determine that it exhibits characteristics of damped oscillation and steep rising edge in the time domain, while narrowband periodic noise or white noise does not show such characteristic. Therefore, in this paper, BLIMF with following features is defined as effective BLIMF (abbreviate as eBLIMF): (i) with central frequency in UHF band (i.e., higher than 0.3GHz); (ii) exhibits damped oscillation in time domain; (iii) has steep rising edge. Based on the above analysis, the kurtosis operator which is sensitive to abrupt change is selected as the indicator, and the decision rule is formulated as: eBLI MF = u k s.t. kurtosis(u k ) > ε 1 and f 0 (u k ) > 0.3GHz (10) where ε 1 is a pre-set threshold. f 0 ( ) is the central frequency operator, and kurtosis( ) is the kurtosis operator, which is calculated as: where X is a discrete signal defined in Section 2.2, and x denotes its mean value.

SSA-based Shrinkage method
In this section, a denoising algorithm based on SSA and shrinkage method is designed. For each eBLIMF, the Hankel matrix is constructed and be decomposed into several sub-space matrices A i by SVD. Since there is only one dominant component in each eBLIMF, the grouping step described in Section 2.2 becomes easy. Assume the SVs are sorted in descending order, and the ratio of each SV to the sum of all SVs are denoted as {q 1 , q 2 , . . . , q R } (R is the rank of Hankel matrix), a common way to choose DSVs is to compute the cumulative sum of q until its value reached a proper threshold ε 2 . This process can be expressed as: Therefore, the first N 1 SVs are decided as DSVs. For each DSV, its corresponding sub-space matrices A i is computed by Equation (9), then the reconstructed signalX i based on A i can be obtained by the diagonal averaging method. In order to further suppress the white noise inX i , the shrinkage technology typically used in WT-based denoising is employed [9]. Specifically, the multiplicative threshold rescaling scheme with sqtwolog rule is adopted, which is expressed as: where n i is the length ofX i , R DSVs is the set of DSVs, ζ i is the threshold ofX i . Afterwards, the hard threshold function is applied to denoise the recovered signalX i , given by: wherex ij is the jth element ofX i . At last, for all DSVs, the denoisedx ij are added to form the denoised eBLIMF: In Equation (15),X j is the jth element of the denoised eBLIMF, and N 1 has been explained in Equation (12). The whole procedure of SSA-based shrinkage method is illustrated in Algorithm 2.

Implementation Procedure of Proposed AVMDSSA Method
Based on the above descriptions, the schematic diagram of AVMDSSA method is demonstrated in Figure 2, and the overall process is summarized as follows: (i) Optimization of the number of modes K by gradually increasing its value and judging whether there is mode-mixing happened in each BLIMF at every step.
(ii) Decompose the UHF PD signal into a set of BLIMFs by VMD with the optimal K parameter, then a kurtosis-based selection method is employed to pick out the eBLIMFs.
(iii) For each eBLIMF, the SSA-based Shrinkage denoising method is applied to suppress the white noise, and summation of all denoised eBLIMFs will recover the denoised UHF PD signal.

Synthetic UHF PD Signal
/* diagonal averaging to recover signal */ denoised_sig i =shrinkage(sig i ); /* shrinkage denoising by Equation (13) Adding all of the denoised eBLIMFs together to reconstruct the denoised UHF PD signal

Implementation Procedure of Proposed AVMDSSA Method
Based on the above descriptions, the schematic diagram of AVMDSSA method is demonstrated in Figure 2, and the overall process is summarized as follows: (i) Optimization of the number of modes K by gradually increasing its value and judging whether there is mode-mixing happened in each BLIMF at every step. (ii) Decompose the UHF PD signal into a set of BLIMFs by VMD with the optimal K parameter, then a kurtosis-based selection method is employed to pick out the eBLIMFs. (iii) For each eBLIMF, the SSA-based Shrinkage denoising method is applied to suppress the white noise, and summation of all denoised eBLIMFs will recover the denoised UHF PD signal.

Implementation Procedure of Proposed AVMDSSA Method
Based on the above descriptions, the schematic diagram of AVMDSSA method is demonstrated in Figure 2, and the overall process is summarized as follows: (i) Optimization of the number of modes K by gradually increasing its value and judging whether there is mode-mixing happened in each BLIMF at every step.
(ii) Decompose the UHF PD signal into a set of BLIMFs by VMD with the optimal K parameter, then a kurtosis-based selection method is employed to pick out the eBLIMFs.
(iii) For each eBLIMF, the SSA-based Shrinkage denoising method is applied to suppress the white noise, and summation of all denoised eBLIMFs will recover the denoised UHF PD signal.

Synthetic UHF PD Signal
To verify the ability of the proposed method to deal with complex signal, a synthetic UHF PD signal with multiple PD components and various noises is simulated. Here, we use the Single Exponential Decay Oscillating (SEDO) pulse and Double Exponential Decay Oscillating (DEDO) pulse to simulate the PD signal. Their respective mathematical expressions are as follows [6]: where B is the amplitude, τ 1 and τ 2 are attenuation coefficients, f c is the center frequency. The synthetic UHF PD signal combines one SEDO pulse, two DEDO pulses and white noise. In addition, two periodic narrowband noises (PNN) are added, which is formulated as: Detailed parameters of this simulative signal are listed in Table 1, and its Signal to Noise Ratio (SNR) is −5.576 dB. Figure 3 shows the time-domain waveforms and frequency spectra of the pure and noisy PD signal, respectively. One can see from Figure 3c that the original components are almost unrecognizable from the noisy signal. Here, the proposed denoising method is applied to suppress the complex noise, and some relevant parameters are shown in Table 2. To verify the ability of the proposed method to deal with complex signal, a synthetic UHF PD signal with multiple PD components and various noises is simulated. Here, we use the Single Exponential Decay Oscillating (SEDO) pulse and Double Exponential Decay Oscillating (DEDO) pulse to simulate the PD signal. Their respective mathematical expressions are as follows [6]: where B is the amplitude, 1 τ and 2 τ are attenuation coefficients, c f is the center frequency. The synthetic UHF PD signal combines one SEDO pulse, two DEDO pulses and white noise. In addition, two periodic narrowband noises (PNN) are added, which is formulated as: Detailed parameters of this simulative signal are listed in Table 1, and its Signal to Noise Ratio (SNR) is -5.576 dB. Figure 3 shows the time-domain waveforms and frequency spectra of the pure and noisy PD signal, respectively. One can see from Figure 3c that the original components are almost unrecognizable from the noisy signal. Here, the proposed denoising method is applied to suppress the complex noise, and some relevant parameters are shown in Table 2.

Denoising Results
Starting with K = 2, the process for denoising of the PD signal shown in Figure 3c is depicted in Figure 4. It is observed from Figure 4b,d that the local maximums (dashed green line with square markers) and their peak points (red asterisks) can reflect well the active components in frequency spectrum. According to Algorithm 1 and the parameters listed in Table 2, Point A and Point B in Figure 4b are judged as belonging to two different components, respectively (criterion 1: 301−145 > 80 and criterion 2: 126.9 > 233.2 × 0.1 are satisfied simultaneously). Similarly, Point C and Point D in Figure 4d are decided to represent diverse components. The above results show that when K = 2, the mode-mixing occurs in the decomposed BLIMFs. In fact, when K = 2, the mode number of the VMD algorithm is much smaller than the number of the real components in original signal (i.e., 5), leading to the so-called undersegmentation phenomenon [18]. Therefore, according to the proposed algorithm, K should be increased.

Denoising Results
Starting with K = 2, the process for denoising of the PD signal shown in Figure 3c is depicted in Figure 4. It is observed from Figure 4b,d that the local maximums (dashed green line with square markers) and their peak points (red asterisks) can reflect well the active components in frequency spectrum. According to Algorithm 1 and the parameters listed in Table 2, Point A and Point B in Figure 4b are judged as belonging to two different components, respectively (criterion 1: 301−145 > 80 and criterion 2: 126.9 > 233.2 × 0.1 are satisfied simultaneously). Similarly, Point C and Point D in Figure 4d are decided to represent diverse components. The above results show that when K = 2, the mode-mixing occurs in the decomposed BLIMFs. In fact, when K = 2, the mode number of the VMD algorithm is much smaller than the number of the real components in original signal (i.e., 5), leading to the so-called undersegmentation phenomenon [18]. Therefore, according to the proposed algorithm, K should be increased. For each K value, the mode-mixing judgement of BLIMF is carried out. Considering the limited space of our paper, we omit the graphic illustration of every optimization process, and provide only the judgement results in Table 3. As can be seen, when K increases to 8, there is no mode-mixing happened in all BLIMFs, indicating that the noisy signal has been decomposed completely. This can be seen from Figure 5, which shows the optimizing procedure when K = 8. Comparing Figure 5d with Figure 3d intuitively, one can see that all the active components in original signal are separated well. In addition, by using the decision rule in Algorithm 1, only one dominant component is identified in every BLIMF, that is, there is no mode-mixing in all decomposed modes. Other subgraphs of Figure  5 demonstrate these decision processes. For each K value, the mode-mixing judgement of BLIMF is carried out. Considering the limited space of our paper, we omit the graphic illustration of every optimization process, and provide only the judgement results in Table 3. As can be seen, when K increases to 8, there is no mode-mixing happened in all BLIMFs, indicating that the noisy signal has been decomposed completely. This can be seen from Figure 5, which shows the optimizing procedure when K = 8. Comparing Figure 5d with Figure 3d intuitively, one can see that all the active components in original signal are separated well. In addition, by using the decision rule in Algorithm 1, only one dominant component is identified in every BLIMF, that is, there is no mode-mixing in all decomposed modes. Other subgraphs of Figure 5 demonstrate these decision processes.    The second step of AVMDSSA is to pick out the eBLIMFs. To obtain an appropriate kurtosis threshold, we simulate six different types of signals, and each type contains one hundred randomly generated signals. Details of these signal are given below: (1) WGN: the white gaussian noise (WGN) The second step of AVMDSSA is to pick out the eBLIMFs. To obtain an appropriate kurtosis threshold, we simulate six different types of signals, and each type contains one hundred randomly generated signals. Details of these signal are given below: (1) WGN: the white gaussian noise (WGN) with power of 0 dBw; (2) Pulse1+WGN/ Pulse2+WGN/ Pulse3+WGN: PD signal Pulse1, Pulse2, Pulse3 added by WGN, respectively, and SNR = 0 dB; (3) PNN1/PNN2: the noise signal PNN1, PNN2 with arbitrary amplitude between 0.1 mV and 0.2 mV. The kurtosis results of all these signals are given in Figure 6. As can be readily seen, sequences containing PD signals have kurtosis values greater than those without PD signals. Particularly, kurtosis values of sequences containing PD signals are all larger than 10. Therefore, we choose this value as the threshold. It should be noted that when extremely severe noise is mixed in PD signal, the kurtosis value will decrease, thus this threshold should be reduced accordingly. However, considering that the SNR of BLIMF after VMD is not likely too low, the threshold 10 is suitable.
Sensors 2019, 19 FOR PEER REVIEW 12 with power of 0 dBw; (2) Pulse1+WGN/ Pulse2+WGN/ Pulse3+WGN: PD signal Pulse1, Pulse2, Pulse3 added by WGN, respectively, and SNR = 0 dB; (3) PNN1/PNN2: the noise signal PNN1, PNN2 with arbitrary amplitude between 0.1 mV and 0.2 mV. The kurtosis results of all these signals are given in Figure 6. As can be readily seen, sequences containing PD signals have kurtosis values greater than those without PD signals. Particularly, kurtosis values of sequences containing PD signals are all larger than 10. Therefore, we choose this value as the threshold. It should be noted that when extremely severe noise is mixed in PD signal, the kurtosis value will decrease, thus this threshold should be reduced accordingly. However, considering that the SNR of BLIMF after VMD is not likely too low, the threshold 10 is suitable. After obtaining the eBLIMFs, the SSA-based Shrinkage method is applied, and final results are depicted in Figure 7. Comparing with Figure 3, one can easily observe that the various noises in the noisy PD signal have been well suppressed. Moreover, those effective PD components in original signal have been completely reserved. This example shows the capability of the proposed method in not only the noise reduction, but also the feature preservation.

Noise Robustness
Before discussing the robustness of the proposed method, two common evaluation indices are introduced to evaluate the denoising performance, which are the normalized correlation coefficient (NCC) and SNR. Detailed calculating formulas of these indices can be found in [10,25]. In this subsection, noisy UHF PD signals with different SNR levels are generated by adding WGN to the pure signal shown in Figure 3a. One hundred signals are simulated in each SNR level, and the values After obtaining the eBLIMFs, the SSA-based Shrinkage method is applied, and final results are depicted in Figure 7. Comparing with Figure 3, one can easily observe that the various noises in the noisy PD signal have been well suppressed. Moreover, those effective PD components in original signal have been completely reserved. This example shows the capability of the proposed method in not only the noise reduction, but also the feature preservation.
Sensors 2019, 19 FOR PEER REVIEW 12 with power of 0 dBw; (2) Pulse1+WGN/ Pulse2+WGN/ Pulse3+WGN: PD signal Pulse1, Pulse2, Pulse3 added by WGN, respectively, and SNR = 0 dB; (3) PNN1/PNN2: the noise signal PNN1, PNN2 with arbitrary amplitude between 0.1 mV and 0.2 mV. The kurtosis results of all these signals are given in Figure 6. As can be readily seen, sequences containing PD signals have kurtosis values greater than those without PD signals. Particularly, kurtosis values of sequences containing PD signals are all larger than 10. Therefore, we choose this value as the threshold. It should be noted that when extremely severe noise is mixed in PD signal, the kurtosis value will decrease, thus this threshold should be reduced accordingly. However, considering that the SNR of BLIMF after VMD is not likely too low, the threshold 10 is suitable. After obtaining the eBLIMFs, the SSA-based Shrinkage method is applied, and final results are depicted in Figure 7. Comparing with Figure 3, one can easily observe that the various noises in the noisy PD signal have been well suppressed. Moreover, those effective PD components in original signal have been completely reserved. This example shows the capability of the proposed method in not only the noise reduction, but also the feature preservation.

Noise Robustness
Before discussing the robustness of the proposed method, two common evaluation indices are introduced to evaluate the denoising performance, which are the normalized correlation coefficient (NCC) and SNR. Detailed calculating formulas of these indices can be found in [10,25]. In this subsection, noisy UHF PD signals with different SNR levels are generated by adding WGN to the pure signal shown in Figure 3a. One hundred signals are simulated in each SNR level, and the values

Noise Robustness
Before discussing the robustness of the proposed method, two common evaluation indices are introduced to evaluate the denoising performance, which are the normalized correlation coefficient (NCC) and SNR. Detailed calculating formulas of these indices can be found in [10,25]. In this subsection, noisy UHF PD signals with different SNR levels are generated by adding WGN to the pure signal shown in Figure 3a. One hundred signals are simulated in each SNR level, and the values of SNR are as follows: −10 dB, −8 dB, −6 dB, −4 dB, −2 dB, 0 dB, 2 dB, 4 dB, 6 dB, 8 dB, 10 dB, 12 dB, 14 dB, 16 dB. Denoising results of these signals by AVMDSSA are depicted in Figure 8.
As can be seen from Figure 8a, the SNR value has been greatly improved after denoising. Even when the SNR of raw signal drops to −10 dB, it can still be raised to about 7.46 dB after denoising. In addition, with the increasement of SNR of the raw signals, the SNR results of the denoised signals also improve drastically, and gradually converge to their mean value. Similar denoising performance can also be found in Figure 8b by another index NCC. Therefore, we can conclude that the AVMDSSA method can achieve satisfactory denoising performance under different noise level. In the meanwhile, we should admit that the denoising performance will fluctuate to some extent at low SNR scenario, but still within the acceptable range.

Comparison with Traditional Denoising Methods
In this part, four other denoising algorithms are employed to compare with the proposed method, which are as follows.  As can be seen from Figure 8a, the SNR value has been greatly improved after denoising. Even when the SNR of raw signal drops to −10 dB, it can still be raised to about 7.46 dB after denoising. In addition, with the increasement of SNR of the raw signals, the SNR results of the denoised signals also improve drastically, and gradually converge to their mean value. Similar denoising performance can also be found in Figure 8b by another index NCC. Therefore, we can conclude that the AVMDSSA method can achieve satisfactory denoising performance under different noise level. In the meanwhile, we should admit that the denoising performance will fluctuate to some extent at low SNR scenario, but still within the acceptable range.

Comparison with Traditional Denoising Methods
In this part, four other denoising algorithms are employed to compare with the proposed method, which are as follows.    [15] is unable to remove the narrowband noise of UHF PD signal. This is mainly because the grouping scheme of singular components in this method is energy-oriented, and the periodic narrowband noise is exactly energy-rich. Thus, ASVD is failed to denoise this kind of noise. From Figure 9e,h, one can easily see that both of Method 2 and Method 3 loss the 5 GHz component (i.e., Pulse 1). Meanwhile, both methods introduce strong low frequency interferences. In addition, there are many small oscillations in Figure 9g which called the Pseudo-Gibbs phenomenon, resulting in severe high-frequency interferences in its spectrum. In contrast, Method 4 shows better performance in preserving PD components. However, slight noise is still existed in its spectrum. Comparing these results with Figure 7, there is no doubt that the presented method AVMDSSA shows superior performance not only in noise suppression, but also in feature preservation.
To quantitatively evaluate the performance of the above algorithms, noisy UHF PD signals in five different SNR levels are simulated by adding WGN and PNN to the pure signal shown in Figure  3a. To obtain a more credible result, one hundred signals are generated for each SNR level, and the average results of SNR and NCC after denoising are depicted in Figure 10. One can readily observe that the AVMDSSA algorithm exhibits better denoising performance over other methods in either SNR or NCC, even in an extremely noisy circumstance.   [15] is unable to remove the narrowband noise of UHF PD signal. This is mainly because the grouping scheme of singular components in this method is energy-oriented, and the periodic narrowband noise is exactly energy-rich. Thus, ASVD is failed to denoise this kind of noise. From Figure 9e,h, one can easily see that both of Method 2 and Method 3 loss the 5 GHz component (i.e., Pulse 1). Meanwhile, both methods introduce strong low frequency interferences. In addition, there are many small oscillations in Figure 9g which called the Pseudo-Gibbs phenomenon, resulting in severe high-frequency interferences in its spectrum. In contrast, Method 4 shows better performance in preserving PD components. However, slight noise is still existed in its spectrum. Comparing these results with Figure 7, there is no doubt that the presented method AVMDSSA shows superior performance not only in noise suppression, but also in feature preservation.
To quantitatively evaluate the performance of the above algorithms, noisy UHF PD signals in five different SNR levels are simulated by adding WGN and PNN to the pure signal shown in Figure 3a.
To obtain a more credible result, one hundred signals are generated for each SNR level, and the average results of SNR and NCC after denoising are depicted in Figure 10. One can readily observe that the AVMDSSA algorithm exhibits better denoising performance over other methods in either SNR or NCC, even in an extremely noisy circumstance.

Laboratorial PD Measurement Setup
To obtain the measured UHF PD data under different insulation defects, PD experiments were carried out in the high voltage laboratory of the China Electric Power Research Institute. Experimental setup and necessary descriptions of its accessories are given in Figure 11. Additionally, Figure 12 shows the physical drawings of all artificial defect models. During the test, the PD signals were coupled by the built-in UHF sensor firstly, then they were sampled and stored by a high-speed digital storage oscilloscope (LeCroy WaveRunner 204Xi-A, 10GS/s, 2GHz).  In our experiments, all defects were tested, and we denote the UHF PD signals from the floating discharge model, protrusion discharge model, particle discharge model and air-gap discharge model as Type1, Type 2, Type 3 and Type 4, respectively. Typical waveforms of these discharge signals are

Laboratorial PD Measurement Setup
To obtain the measured UHF PD data under different insulation defects, PD experiments were carried out in the high voltage laboratory of the China Electric Power Research Institute. Experimental setup and necessary descriptions of its accessories are given in Figure 11. Additionally, Figure 12 shows the physical drawings of all artificial defect models. During the test, the PD signals were coupled by the built-in UHF sensor firstly, then they were sampled and stored by a high-speed digital storage oscilloscope (LeCroy WaveRunner 204Xi-A, 10GS/s, 2GHz).

Laboratorial PD Measurement Setup
To obtain the measured UHF PD data under different insulation defects, PD experiments were carried out in the high voltage laboratory of the China Electric Power Research Institute. Experimental setup and necessary descriptions of its accessories are given in Figure 11. Additionally, Figure 12 shows the physical drawings of all artificial defect models. During the test, the PD signals were coupled by the built-in UHF sensor firstly, then they were sampled and stored by a high-speed digital storage oscilloscope (LeCroy WaveRunner 204Xi-A, 10GS/s, 2GHz).  In our experiments, all defects were tested, and we denote the UHF PD signals from the floating discharge model, protrusion discharge model, particle discharge model and air-gap discharge model as Type1, Type 2, Type 3 and Type 4, respectively. Typical waveforms of these discharge signals are

Laboratorial PD Measurement Setup
To obtain the measured UHF PD data under different insulation defects, PD experiments were carried out in the high voltage laboratory of the China Electric Power Research Institute. Experimental setup and necessary descriptions of its accessories are given in Figure 11. Additionally, Figure 12 shows the physical drawings of all artificial defect models. During the test, the PD signals were coupled by the built-in UHF sensor firstly, then they were sampled and stored by a high-speed digital storage oscilloscope (LeCroy WaveRunner 204Xi-A, 10GS/s, 2GHz).  In our experiments, all defects were tested, and we denote the UHF PD signals from the floating discharge model, protrusion discharge model, particle discharge model and air-gap discharge model as Type1, Type 2, Type 3 and Type 4, respectively. Typical waveforms of these discharge signals are In our experiments, all defects were tested, and we denote the UHF PD signals from the floating discharge model, protrusion discharge model, particle discharge model and air-gap discharge model as Type1, Type 2, Type 3 and Type 4, respectively. Typical waveforms of these discharge signals are shown in Figure 13. As can be seen, these PD signals are polluted by different noise. Furthermore, one can observe that there are multiple PD components in Figure 13b,h, making the denoising process more difficult.
Sensors 2019, 19 FOR PEER REVIEW 16 shown in Figure 13. As can be seen, these PD signals are polluted by different noise. Furthermore, one can observe that there are multiple PD components in Figure 13b,h, making the denoising process more difficult.

New Evaluation Indices for Practical Situation
Considering that the pure PD signal is not available in practice, it is impossible to calculate the evaluation index such as SNR or NCC. In addition, existing indicators only focus on the degree of noise reduction, while ignoring the preservation of the signal features after denoising. To give a more comprehensive evaluation of the denoising algorithm under practical conditions, two new indices are designed in this paper: (i) Index 1: The ratio of the Shannon entropy of the signal before and after denoising As we all know, the measured PD signals in practice are usually mixed with randomly distributed, uncertain noise. This uncertainty will decrease as the noise is suppressed, and the more thoroughly the noise is removed, the less the uncertainty will be. Therefore, the Shannon entropy, a measure of uncertainty, is employed to quantify this kind of uncertainty. The designed Index 1 can be calculated by the following formula: entropy nor denoised sig Index entropy nor noisy sig = (19) where denoised_sig and noisy_sig denote the UHF PD signal after and before denoising, respectively. nor( ) and entropy( ) are the operators of normalization and Shannon entropy. Obviously, lower Index 1 implies better denoising performance (ii) Index 2: The ratio of the summation of eBLIMFs' maximum frequency values before and after denoising Another key issue of denoising is the feature preservation. The basic idea behind this index is that the more the features are retained after denoising, the less the spectrum amplitude of effective components will decrease. Therefore, we adopt the criterion in Section 3.2 to decide the eBLIMFs of noisy_sig and denoised_sig, expressed as eBLIMFs1 and eBLIMFs2 respectively. Then, the maximum

New Evaluation Indices for Practical Situation
Considering that the pure PD signal is not available in practice, it is impossible to calculate the evaluation index such as SNR or NCC. In addition, existing indicators only focus on the degree of noise reduction, while ignoring the preservation of the signal features after denoising. To give a more comprehensive evaluation of the denoising algorithm under practical conditions, two new indices are designed in this paper: (i) Index 1: The ratio of the Shannon entropy of the signal before and after denoising As we all know, the measured PD signals in practice are usually mixed with randomly distributed, uncertain noise. This uncertainty will decrease as the noise is suppressed, and the more thoroughly the noise is removed, the less the uncertainty will be. Therefore, the Shannon entropy, a measure of uncertainty, is employed to quantify this kind of uncertainty. The designed Index 1 can be calculated by the following formula: where denoised_sig and noisy_sig denote the UHF PD signal after and before denoising, respectively. nor( ) and entropy( ) are the operators of normalization and Shannon entropy. Obviously, lower Index 1 implies better denoising performance (ii) Index 2: The ratio of the summation of eBLIMFs' maximum frequency values before and after denoising Another key issue of denoising is the feature preservation. The basic idea behind this index is that the more the features are retained after denoising, the less the spectrum amplitude of effective components will decrease. Therefore, we adopt the criterion in Section 3.2 to decide the eBLIMFs of noisy_sig and denoised_sig, expressed as eBLIMFs 1 and eBLIMFs 2 respectively. Then, the maximum frequency values of each eBLIMF in eBLIMFs 1 are added to form the MF 1 . Following the same way, we can calculate the other parameter MF 2 . Then, the Index 2 is obtained by the ratio of MF 2 to MF 1 . This process can be formulated as follows: where fft( ) is the Fast Fourier Transform (FFT) operator. And clearly, a larger Index 2 means the superior performance of the denoising algorithm in feature preservation. Figure 14 gives the denoising results of the UHF PD signals shown in Figure 13 by using the proposed AVMDSSA algorithm. Comparing with Figure 13, one can see that the low-frequency noise and communication carrier noise in each raw signal are suppressed completely, especially for Type 1 and Type 4, in which multiple PD components are existed. More importantly, all possible PD components marked in Figure 13 are well retained in denoised signals.

Denoising Results and Comparisons
Sensors 2019, 19 FOR PEER REVIEW 17 frequency values of each eBLIMF in eBLIMFs1 are added to form the MF1. Following the same way, we can calculate the other parameter MF2. Then, the Index 2 is obtained by the ratio of MF2 to MF1. This process can be formulated as follows: where fft( ) is the Fast Fourier Transform (FFT) operator. And clearly, a larger Index 2 means the superior performance of the denoising algorithm in feature preservation. Figure 14 gives the denoising results of the UHF PD signals shown in Figure 13 by using the proposed AVMDSSA algorithm. Comparing with Figure 13, one can see that the low-frequency noise and communication carrier noise in each raw signal are suppressed completely, especially for Type 1 and Type 4, in which multiple PD components are existed. More importantly, all possible PD components marked in Figure 13 are well retained in denoised signals. One may have noticed that one possible PD component in Figure 13h is lost in Figure 14h. In fact, it is easy to observe from the decomposed BLIMFs shown in Figure 15 (only plot the 4th, 5th and 6th BLIMF for example) that the missed component (i.e., the 4th BLIMF) is one kind of narrowband noise, rather a PD component. On the contrary, the time-domain waveform of the frequency component in Figure 15f shows obvious characteristics of PD (according to the decision criteria of eBLIMF discussed in Section 3.2), thus it can be identified as a real PD component. The above results demonstrate the effectiveness of our method in noise suppression and feature preservation for different kinds of measured UHF PD data. One may have noticed that one possible PD component in Figure 13h is lost in Figure 14h. In fact, it is easy to observe from the decomposed BLIMFs shown in Figure 15 (only plot the 4th, 5th and 6th BLIMF for example) that the missed component (i.e., the 4th BLIMF) is one kind of narrowband noise, rather a PD component. On the contrary, the time-domain waveform of the frequency component in Figure 15f shows obvious characteristics of PD (according to the decision criteria of eBLIMF discussed in Section 3.2), thus it can be identified as a real PD component. The above results demonstrate the effectiveness of our method in noise suppression and feature preservation for different kinds of measured UHF PD data. To compare the AVMDSSA with traditional methods quantitatively, the evaluation indices are computed by Equations (19) and (20). For each PD type, one hundred signals are used as the input, and the average results of the two indices are listed in Table 4. It can be observed that Method 1 has the worst denoising results, indicating that the ASVD method may not be suitable for processing complex UHF PD signals. In contrast, the AVMDSSA method shows the best results in either Index1 or Index2, showing that it has superior performance over traditional methods in both noise suppression and feature retention. In addition, it should be noted that although Method 1 has the largest Index2 among all methods, it does not mean that it has the best feature retention capability. In fact, the main reason for this is that it fails to remove the noise very well, causing a relatively smaller decrease of the amplitude than other methods.

Field Case Study
In this section, the field UHF PD data from a 220 kV current transformer are analyzed. The suspected PD signal radiates outward through the resin sprue on the basin insulator, and then be coupled by the UHF sensor and sent to the oscilloscope (25 GS/s, 6GHz) for sampling and storage. The picture of the field test is given in Figure 16, and the typical time-domain waveform and its spectrum are depicted in Figure 17. As can be seen, although the metal shell of GIS can shield most To compare the AVMDSSA with traditional methods quantitatively, the evaluation indices are computed by Equations (19) and (20). For each PD type, one hundred signals are used as the input, and the average results of the two indices are listed in Table 4. It can be observed that Method 1 has the worst denoising results, indicating that the ASVD method may not be suitable for processing complex UHF PD signals. In contrast, the AVMDSSA method shows the best results in either Index1 or Index2, showing that it has superior performance over traditional methods in both noise suppression and feature retention. In addition, it should be noted that although Method 1 has the largest Index2 among all methods, it does not mean that it has the best feature retention capability. In fact, the main reason for this is that it fails to remove the noise very well, causing a relatively smaller decrease of the amplitude than other methods.

Field Case Study
In this section, the field UHF PD data from a 220 kV current transformer are analyzed. The suspected PD signal radiates outward through the resin sprue on the basin insulator, and then be coupled by the UHF sensor and sent to the oscilloscope (25 GS/s, 6GHz) for sampling and storage. The picture of the field test is given in Figure 16, and the typical time-domain waveform and its spectrum are depicted in Figure 17. As can be seen, although the metal shell of GIS can shield most of the external interferences, the captured signal still has a complex composition, which makes the denoising work more challenging.
Sensors 2019, 19 FOR PEER REVIEW 19 of the external interferences, the captured signal still has a complex composition, which makes the denoising work more challenging. By using the proposed method, the mode number of VMD is optimized first, and the optimal value of K is 8 in this example. Figure 18 shows the decomposed BLIMFs of the above signal by applying the optimal VMD. A comparison between Figure 17b and Figure 18b shows that there are no active frequency components mixing or missing, which means the decomposition is complete. Next, the eBLIMFs will be decided according to Equation (10) by calculating their central frequencies and kurtosis values. Results of this step are listed in Table 5, in which can we see that only BLIMF2 and BLIMF3 are judged as eBLIMFs. At last, the SSA-based Shrinkage scheme is used to further remove the white noise in each eBLIMF. The final denoising result is shown in Figure 19.
Again, we use the newly designed indices to evaluate the performance of AVMDSSA under this field situation. The average computation results of fifty measured signals are as follow: Index1: 0.4860; Index2: 0.9723. This result demonstrates that the AVMDSSA method can effectively reduce the uncertainty of the measured data. And what's more, there is almost no loss of the characteristic components after the denoising process, which is critical to the subsequent processing and diagnosis. Sensors 2019, 19 FOR PEER REVIEW 19 of the external interferences, the captured signal still has a complex composition, which makes the denoising work more challenging. By using the proposed method, the mode number of VMD is optimized first, and the optimal value of K is 8 in this example. Figure 18 shows the decomposed BLIMFs of the above signal by applying the optimal VMD. A comparison between Figure 17b and Figure 18b shows that there are no active frequency components mixing or missing, which means the decomposition is complete. Next, the eBLIMFs will be decided according to Equation (10) by calculating their central frequencies and kurtosis values. Results of this step are listed in Table 5, in which can we see that only BLIMF2 and BLIMF3 are judged as eBLIMFs. At last, the SSA-based Shrinkage scheme is used to further remove the white noise in each eBLIMF. The final denoising result is shown in Figure 19.
Again, we use the newly designed indices to evaluate the performance of AVMDSSA under this field situation. The average computation results of fifty measured signals are as follow: Index1: 0.4860; Index2: 0.9723. This result demonstrates that the AVMDSSA method can effectively reduce the uncertainty of the measured data. And what's more, there is almost no loss of the characteristic components after the denoising process, which is critical to the subsequent processing and diagnosis. By using the proposed method, the mode number of VMD is optimized first, and the optimal value of K is 8 in this example. Figure 18 shows the decomposed BLIMFs of the above signal by applying the optimal VMD. A comparison between Figures 17b and 18b shows that there are no active frequency components mixing or missing, which means the decomposition is complete. Next, the eBLIMFs will be decided according to Equation (10) by calculating their central frequencies and kurtosis values. Results of this step are listed in Table 5, in which can we see that only BLIMF2 and BLIMF3 are judged as eBLIMFs. At last, the SSA-based Shrinkage scheme is used to further remove the white noise in each eBLIMF. The final denoising result is shown in Figure 19.
Again, we use the newly designed indices to evaluate the performance of AVMDSSA under this field situation. The average computation results of fifty measured signals are as follow: Index1: 0.4860; Index2: 0.9723. This result demonstrates that the AVMDSSA method can effectively reduce the uncertainty of the measured data. And what's more, there is almost no loss of the characteristic components after the denoising process, which is critical to the subsequent processing and diagnosis.

Discussion
In this paper, a novel denoising method namely AVMDSSA is developed. To comprehensively evaluate the performance of AVMDSSA, three case studies are conducted, and we discuss the results as follows: (i) Simulative case In the simulative case study, complex UHF PD signals with heavy noise are simulated. One can easily see from Figure 4b,d that if the decomposition by VMD is incomplete (i.e., the value of K is smaller than the actual number of active components in the raw signal), there will be multiple local maximum points in the BLIMF. On the contrary, every decomposed BLIMF should contain only one active component if K is equal to the number of actual components, as can be seen from Figure 5. The optimization process by the proposed AVMD algorithm is given in Table 3. After the selection of

Discussion
In this paper, a novel denoising method namely AVMDSSA is developed. To comprehensively evaluate the performance of AVMDSSA, three case studies are conducted, and we discuss the results as follows: (i) Simulative case In the simulative case study, complex UHF PD signals with heavy noise are simulated. One can easily see from Figure 4b,d that if the decomposition by VMD is incomplete (i.e., the value of K is smaller than the actual number of active components in the raw signal), there will be multiple local maximum points in the BLIMF. On the contrary, every decomposed BLIMF should contain only one active component if K is equal to the number of actual components, as can be seen from Figure 5. The optimization process by the proposed AVMD algorithm is given in Table 3. After the selection of

Discussion
In this paper, a novel denoising method namely AVMDSSA is developed. To comprehensively evaluate the performance of AVMDSSA, three case studies are conducted, and we discuss the results as follows: (i) Simulative case In the simulative case study, complex UHF PD signals with heavy noise are simulated. One can easily see from Figure 4b,d that if the decomposition by VMD is incomplete (i.e., the value of K is smaller than the actual number of active components in the raw signal), there will be multiple local maximum points in the BLIMF. On the contrary, every decomposed BLIMF should contain only one active component if K is equal to the number of actual components, as can be seen from Figure 5. The optimization process by the proposed AVMD algorithm is given in Table 3. After the selection of eBLIMFs and an SSA-based shrinkage algorithm, the final denoising result is obtained. Comparing Figure 7 with Figure 9, one can readily observe the superiority of AVMDSSA over other methods. In addition, the results of the noise robustness tests (depicted in Figure 8) and quantitative comparison (depicted in Figure 10) are all demonstrate the effectiveness of our method.
(ii) Laboratorial case To examine the performance of AVMDSSA by real UHF PD signal, four kinds of laboratorial PD data are collected. In addition, we define two new evaluation indices for denoising method from the perspective of noise suppression and feature retention, respectively. The effectiveness of the proposed AVMDSSA method is readily shown in Figure 14, even for the signal that contains multiple PD components. By using the newly defined indices for each method, we can conclude from Table 4 that AVMDSSA shows the best performance (lowest Index1, highest Index2).
(iii) Field Case The UHF PD signals from a real high-voltage current transformer are also employed in this paper, and the denoising results shown in Figure 19 are satisfactory, suggesting that our method has the application potential in practical scenarios.
The above discussions prove the effectiveness of the presented method. Moreover, it is worth mentioning that all the denoising results of AVMDSSA are obtained by using the same parameters listed in Table 2, which indicates the robustness of this method.
Compared with our previous denoising study in reference [8], contributions in this work are summarized as follows: (i) the parameter optimization method adopted in this paper has more explicit physical meaning and faster searching speed; (ii) the SSA-based shrinkage method developed in this work shows better denoising performance compared with the Wavelet Shrinkage method used in [8]; (iii) more comprehensive assessment of the denoising method is given in this study, including the noise robustness test, newly designed evaluation indices, and the field test. All these advancements show that the current method achieves much improvement compared to our previous work.
Despite all the advantages, we should admit that the efficiency of AVMDSSA needs to be improved. For example, the average time consumption of AVMDSSA for a signal with length of 5000 is 3.6 s (hardware: Intel Core i5 CPU, 16 GB RAM; software: MATLAB 2018b), while this value can rise to more than 10 s when dealing with a signal with length of 10000. Such high computational cost mainly comes from the SVD algorithm used in SSA. To a certain extent, this restricts the real-time application of our method in embedded hardware. However, for offline applications that are not very sensitive to computational complexity, the proposed method still shows great advantages.
To suppress the electromagnetic interference in PD signals more thoroughly, improvements in hardware design are also important. For example, a reliable metal shell for the UHF sensor will prevent the external noise from coupling into the detection system to a large extent. Besides, considering that the optical fiber is a kind of ideal transmission medium due to its advantages such as high sensitivity, immunity to electromagnetic interference and stability in harsh environments, it can be used in the data transmission module of the PD detection system. In summary, only when the noise suppression measures are taken into account in both of hardware and software design, can the PD detection system play a more stable and reliable role in practice.

Conclusions
A key challenge for PD detection and diagnosis is how to suppress the complicated interfering noise. In this paper, a novel denoising method for UHF PD signal is proposed. First, the mode number of VMD is automatically decided by a mode-mixing judgement algorithm. Next, those decomposed BLIMFs which contain PD components are selected based on their central frequency and kurtosis values, namely eBLIMFs. Finally, the residual white noise in each eBLIMF is further reduced by SSA-based Shrinkage method, and the denoised PD signal is obtained by adding all these denoised eBLIMFs together. The proposed AVMDSSA method is applied to three cases, and the following conclusions are obtained: (i) The mode-mixing decision rule proposed in this paper works very well in all cases, enabling the AVMDSSA method to quickly determine the appropriate K value. (ii) In the simulative case, a complex synthetic UHF PD which contains three PD pulses and two kinds of noises is employed to examine our method. The results show that the proposed method can reduce all kinds of noises to a large extent, and in the meanwhile, all PD components are well retained. In addition, the results of robustness testing and comparison demonstrate its reliability and superiority. (iii) For the measured data, two new evaluation indices are presented by considering both of the capabilities of noise suppression and feature preservation. By using these newly designed indices, the effectiveness of AVMDSSA in laboratory experiments and field tests are identified.
Our future work will focus on improving the efficiency of the denoising method, making it more valuable in practical application.

Conflicts of Interest:
The authors declare no conflict of interest.