Partial Discharge Data Matching Method for GIS Case-Based Reasoning

Abstract: With the accumulation of partial discharge (PD) detection data from substations, case-based reasoning (CBR), which computes the match degree between detected PD data and historical case data, provides new ideas for the interpretation and evaluation of partial discharge data. To address the problem of partial discharge data matching, this paper proposes a data matching method based on a variational autoencoder (VAE). A VAE network model for partial discharge data is constructed to extract the deep eigenvalues. Cosine distance is then used to calculate the match degree between different partial discharge data. To verify the advantages of the proposed method, a partial discharge dataset was established through partial discharge experiments and live detection at substation sites. The proposed method was compared with other feature extraction methods and matching methods, including statistical features, deep belief networks (DBN), deep convolutional neural networks (CNN), Euclidean distances, and correlation coefficients. The experimental results show that the cosine distance match degree based on the VAE feature vector detects similar partial discharge data more effectively than the other data matching methods.


Introduction
CIGRE's statistics show that about 30% of the dielectric failures of gas insulated switchgear (GIS) are related to design deficiencies [1]. Through the analysis of a large amount of partial discharge (PD) data from GIS in service, we also found that the proportion of PD cases caused by design reasons is high. As a result, GIS equipment of the same type from the same manufacturer is susceptible to repeated partial discharge at similar locations. This provides the basis for case-based reasoning (CBR) in GIS. Case-based reasoning is a branch of artificial intelligence (AI) that provides answers to new questions based on experience from historical cases [2,3]. In recent studies, CBR has been used in load forecasting, energy management, grid system safety assessment, and power equipment failure assessment [4][5][6][7]. In Reference [8], a case-based reasoning method is used to diagnose incipient faults of power transformers, with pretreated dissolved gas analysis (DGA) data as the input to the CBR system. Reference [9] developed a case-based reasoning approach for identifying and filtering acoustic emission (AE) noise signals and proposed a parametric case representation method for AE signal processing. Because CBR requires the accumulation of cases and data in the early stage, no CBR-related literature has yet been published in the field of partial discharge. After a large amount of GIS PD detection data has been accumulated from substation sites, CBR can provide new ideas for the interpretation and evaluation of partial discharge data.
In Figure 1, phase resolved pulse sequence (PRPS) graphs represent the data detected by GIS partial discharge ultra-high frequency (UHF) detection. The specific procedure is as follows: first, the historical data are retrieved from the historical case database according to the operating conditions of the detected equipment, the manufacturer, and other search conditions; the detected data are then matched with the historical data, and those cases for which the data match degree exceeds a threshold are considered match cases; finally, from the match cases, we can obtain information such as the most probable PD location in the detected power equipment, the most likely cause of PD in the detected power equipment, and pictures of the disintegrated power equipment in the historical cases. Maintenance plans can be developed based on the match information. Therefore, historical PD detection data can be utilized more effectively and can provide a basis for data-driven equipment status evaluation.
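The three-step procedure described above (retrieve, match, filter by threshold) can be sketched as follows. This is only an illustrative sketch: the record fields (`manufacturer`, `features`, `cause`), the function name `find_match_cases`, the 0.8 threshold, and the placeholder `toy_match` function are assumptions made for the example and are not taken from the paper.

```python
def find_match_cases(detected, history, manufacturer, match_fn, threshold=0.8):
    """Retrieve, match, and filter historical cases for one detected record."""
    # Step 1: retrieve candidates by search conditions (only the manufacturer here).
    candidates = [c for c in history if c["manufacturer"] == manufacturer]
    # Step 2: compute the match degree between the detected data and each candidate.
    scored = [(match_fn(detected, c["features"]), c) for c in candidates]
    # Step 3: keep the cases whose match degree exceeds the threshold, best first.
    matches = [(md, c) for md, c in scored if md > threshold]
    return sorted(matches, key=lambda item: item[0], reverse=True)


def toy_match(a, b):
    """Placeholder match degree (hypothetical): 1 minus the mean absolute difference."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical historical case records with already-extracted feature vectors.
history = [
    {"id": "case-A", "manufacturer": "M1", "features": [0.9, 0.1, 0.3], "cause": "floating electrode"},
    {"id": "case-B", "manufacturer": "M1", "features": [0.1, 0.9, 0.0], "cause": "free particle"},
    {"id": "case-C", "manufacturer": "M2", "features": [0.9, 0.1, 0.3], "cause": "floating electrode"},
]

matches = find_match_cases([0.8, 0.2, 0.3], history, "M1", toy_match)
for md, case in matches:
    print(case["id"], round(md, 3), case["cause"])
```

Passing the match function as a parameter keeps the retrieval logic independent of how the match degree is computed, so any feature extractor and distance measure can be plugged in.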
Two key processes are used to calculate the PD data match degree. The first is to extract valid eigenvalues from the PD data, and the second is to obtain the match degree (MD) based on the eigenvectors. The traditional feature extraction methods used for PD data extract a variety of statistical features from, for example, histograms, scatter plots, and grayscale images based on phase resolved partial discharge (PRPD) data [10][11][12]. Other algorithms have also been applied to PD data feature extraction, such as principal component analysis (PCA) [13], wavelet packet transformation [14], sparse representation [15], and signal norms [16]. The algorithms proposed in these references performed well in the task of PD pattern recognition. However, due to the multi-source heterogeneity of the data accessed in big data centers, the large differences in the performance of PD detection instruments, and the complex operating environments in substations, the statistical characteristics obtained by the traditional statistical methods have become inadequate for identifying typical partial discharge types. In addition, PD data matching has even more stringent requirements than PD pattern recognition.
In recent years, related technologies such as deep auto-encoders, deep convolutional networks, recurrent neural networks, and deep belief networks have shown good performance in many fields, including image processing and speech processing [17][18][19][20][21]. Reference [22] studied the application of deep neural networks in the diagnosis of partial discharges and demonstrated the improvements in accuracy and visualization that can be obtained through the deep learning method. Reference [23] obtained a two-dimensional spectral frame representation of a UHF signal by employing a time-frequency analysis and then used a deep convolutional network to obtain enhanced features under different PD sources. Auto-encoding (AE) is an unsupervised feature learning method, and its hidden layer can effectively extract the internal expression of data. Its deep structure makes the network closer to the hierarchical information processing of the human brain, with better nonlinear modelling ability [24,25]. The variational autoencoder (VAE) proposed by Kingma et al. is a generative network based on variational Bayesian inference [26]. It avoids the computational complexity of dataset likelihood calculations and traditional Monte Carlo sampling and has therefore attracted considerable research interest in text classification, semi-supervised learning, and other related fields.
Energies 2019, 12, 3677 3 of 15
This paper presents a PD data matching method based on VAE. The network uses a variational Bayesian method to quickly approximate the posterior probability and extract the deep features of PD data. Euclidean distance, cosine distance, and correlation coefficient (Cc) methods were used to measure the similarity between different data, and their comparative results are also shown in this paper.
The rest of the paper is organized as follows. Section 2 introduces basic information on variational autoencoder networks. Section 3 provides further information on the proposed partial discharge data matching approach. The dataset used in this paper is described in Section 4. Section 5 validates the data matching approach with different case studies and discusses the results obtained. The conclusions are presented in Section 6.

Variational Autoencoder
Variational Bayes inference [27] is a deterministic approximation method that maximizes the lower bound of the marginal likelihood function of the observed data by iteratively updating the variational parameters and approximates the posterior probability of unobservable variables.
For a sample set X, the eigenvalues of the data are defined as latent variables z because they cannot be directly observed. According to Bayes' rule, the posterior probability of the latent variables z is

$$p(z|x) = \frac{p(x|z)\,p(z)}{p(x)} \tag{1}$$

It is difficult to obtain an exact analytical solution for p(x). Therefore, in variational Bayes inference, an approximate distribution q(z|x) is introduced to fit the real posterior distribution p(z|x). Kullback-Leibler (KL) divergence is used to compare the similarity of the two distributions.
The approximate distribution q(z|x) is estimated by an auto-encoder network in VAE.VAE consists of a probabilistic encoder and a probabilistic decoder and uses a stochastic gradient variational Bayes algorithm to achieve a posterior distribution model that optimizes the hidden layer.
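According to the variational Bayes method, the log marginal likelihood of the sample data X can be written as shown below.

$$\log p_\theta(x^{(i)}) = D_{KL}\left(q_\phi(z|x^{(i)}) \,\|\, p_\theta(z|x^{(i)})\right) + L(\theta, \phi; x^{(i)}) \tag{2}$$

where φ denotes the parameters of the approximate distribution $q_\phi(z|x)$ of the hidden layer, and θ denotes the parameters of the generative model and hence of the real posterior distribution $p_\theta(z|x)$. The first term is the KL divergence between the approximate distribution of the hidden layer and the real posterior distribution. The KL divergence is nonnegative and is zero only if the two distributions are exactly the same [28]. Thus,

$$\log p_\theta(x^{(i)}) \geq L(\theta, \phi; x^{(i)}) \tag{3}$$

Equation (3) can be expanded:

$$L(\theta, \phi; x^{(i)}) = -D_{KL}\left(q_\phi(z|x^{(i)}) \,\|\, p_\theta(z)\right) + E_{q_\phi(z|x^{(i)})}\left[\log p_\theta(x^{(i)}|z)\right] \tag{4}$$

The optimal approximation of the sample set $p_\theta(x^{(i)})$ can be obtained by maximizing the variational bound $L(\theta, \phi; x^{(i)})$ [29].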
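The property used in the argument above, that the KL divergence is nonnegative and vanishes only for identical distributions, can be checked numerically for the simple case of two univariate Gaussians, for which a closed form exists. The function name `kl_gaussian` and the example parameters are illustrative assumptions, not part of the paper.

```python
import math


def kl_gaussian(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) for univariate Gaussians q = N(mu_q, sigma_q^2) and p = N(mu_p, sigma_p^2)."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)


kl_same = kl_gaussian(0.0, 1.0, 0.0, 1.0)     # identical distributions
kl_shifted = kl_gaussian(1.0, 1.0, 0.0, 1.0)  # approximate distribution with a shifted mean
kl_narrow = kl_gaussian(0.0, 0.5, 0.0, 1.0)   # approximate distribution that is too narrow
print(kl_same, kl_shifted, kl_narrow)
```

Because the divergence term is never negative, dropping it from the decomposition of the log marginal likelihood can only lower the value, which is why the remaining term acts as a lower bound.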

Data Matching Method of Partial Discharge Based on VAE
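The encoder section of the VAE model for partial discharge data can be represented by Equation (5).

$$\begin{aligned} h_1 &= f(W_1 x + b_1) \\ \mu_{enc} &= W_2 h_1 + b_2 \\ \log \sigma_{enc}^2 &= W_3 h_1 + b_3 \\ z &= \mu_{enc} + \sigma_{enc} \odot \varepsilon, \quad \varepsilon \sim N(0, I) \end{aligned} \tag{5}$$

where W and b are the weights and biases of each layer, and x is the input vector. $h_1$, $\mu_{enc}$, and $\sigma_{enc}$ are the outputs of the first and second layers of the network, and f is the activation function. Based on the Gaussian distribution parameters $\mu_{enc}$ and $\sigma_{enc}$, the hidden layer output z is obtained by sampling $q(z|x^{(i)})$, and N(0, I) is the standard normal distribution.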
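The encoder step described above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the activation f is taken to be tanh, the second layer is assumed to output the mean and the log-variance, and the tiny weight matrices in `params` are invented purely so the example runs. None of these choices is confirmed by the paper.

```python
import math
import random


def dense(weights, bias, x):
    """Affine layer: returns W x + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def encode(x, params, rng):
    """Return (mu, sigma, z) for one input vector x."""
    # First layer with the assumed tanh activation.
    h1 = [math.tanh(v) for v in dense(params["W1"], params["b1"], x)]
    # Second layer: Gaussian mean and (assumed) log-variance of q(z|x).
    mu = dense(params["W2"], params["b2"], h1)
    log_var = dense(params["W3"], params["b3"], h1)
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    # Sampling step: z = mu + sigma * eps with eps drawn from N(0, I).
    z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
    return mu, sigma, z


rng = random.Random(0)  # fixed seed so the sketch is repeatable
params = {
    "W1": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], "b1": [0.0, 0.1],
    "W2": [[1.0, 0.0], [0.0, 1.0]], "b2": [0.0, 0.0],
    "W3": [[0.0, 0.0], [0.0, 0.0]], "b3": [0.0, 0.0],
}
mu, sigma, z = encode([0.2, 0.4, 0.6], params, rng)
print(mu, sigma, z)
```

Writing the sample as a deterministic function of the mean, the standard deviation, and an independent noise term is what allows gradients to flow through the sampling step during training.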
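The decoder section of the VAE model for partial discharge data can be represented by the following equation.

$$\begin{aligned} h_2 &= f(W_4 z + b_4) \\ \mu_{dec} &= W_5 h_2 + b_5 \\ \log \sigma_{dec}^2 &= W_6 h_2 + b_6 \\ \log p(x|z) &= \log N(x; \mu_{dec}, \sigma_{dec}^2 I) \end{aligned} \tag{6}$$

where W and b are the weights and biases of each layer, $h_2$, $\mu_{dec}$, and $\sigma_{dec}$ are the outputs of each layer of the decoder, and f is the activation function.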
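The target optimization function of Equation (4) can be rewritten as Equation (7).

$$L(\theta, \phi; x^{(i)}) \approx \frac{1}{2} \sum_{j=1}^{J} \left(1 + \log\left((\sigma_j^{(i)})^2\right) - (\mu_j^{(i)})^2 - (\sigma_j^{(i)})^2\right) + \frac{1}{L} \sum_{l=1}^{L} \log p_\theta(x^{(i)}|z^{(i,l)}) \tag{7}$$

where J is the dimension of the latent variables z, and L is the number of samples of the latent variables z drawn from the posterior distribution.
The parameters of the probabilistic encoder and the probabilistic decoder are then optimized by the stochastic gradient descent algorithm. When Equation (7) converges or stabilizes, the output of the encoder part of the VAE is taken as the extracted eigenvalues.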
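The objective referred to as Equation (7) can be evaluated for a single sample as in the sketch below. It assumes the standard formulation of Kingma et al.: a standard normal prior on z, which gives the closed-form KL term, and a Gaussian decoder with diagonal covariance. The function names and the example values are illustrative assumptions.

```python
import math


def neg_kl_term(mu, sigma):
    """0.5 * sum_j (1 + log sigma_j^2 - mu_j^2 - sigma_j^2), i.e. minus KL(q || N(0, I))."""
    return 0.5 * sum(1.0 + math.log(s * s) - m * m - s * s for m, s in zip(mu, sigma))


def gaussian_log_likelihood(x, mu_dec, sigma_dec):
    """log N(x; mu_dec, diag(sigma_dec^2)) for one decoded sample."""
    return sum(-0.5 * math.log(2.0 * math.pi * s * s) - (v - m) ** 2 / (2.0 * s * s)
               for v, m, s in zip(x, mu_dec, sigma_dec))


def elbo_estimate(x, mu_enc, sigma_enc, decoder_outputs):
    """Lower-bound estimate; decoder_outputs holds L pairs (mu_dec, sigma_dec)."""
    recon = sum(gaussian_log_likelihood(x, m, s) for m, s in decoder_outputs)
    return neg_kl_term(mu_enc, sigma_enc) + recon / len(decoder_outputs)


x = [0.2, 0.7]
# One latent sample (L = 1) whose decoder output reproduces x exactly.
bound = elbo_estimate(x, [0.0, 0.0], [1.0, 1.0], [([0.2, 0.7], [1.0, 1.0])])
print(bound)
```

In training, the negative of this quantity would be minimized by stochastic gradient descent over mini-batches; the sketch only shows how the two terms are combined.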
Figure 2 shows the matching process of partial discharge data based on the VAE model. The match degree of partial discharge data can be obtained by calculating the distance between the partial discharge data using the cosine algorithm of Equation (8).

$$MD = \cos(V_a, V_b) = \frac{V_a \cdot V_b}{\|V_a\| \, \|V_b\|} \tag{8}$$

where $V_a$ and $V_b$ are the eigenvectors extracted from the two PD data, and $\|\cdot\|$ is the length of the vector.
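The cosine match degree between two extracted eigenvectors can be computed as in the following sketch. The function name `cosine_md` and the example vectors are illustrative assumptions; the vectors stand in for VAE latent features.

```python
import math


def cosine_md(v_a, v_b):
    """Cosine of the angle between two eigenvectors, used as the match degree."""
    dot = sum(a * b for a, b in zip(v_a, v_b))
    norm_a = math.sqrt(sum(a * a for a in v_a))
    norm_b = math.sqrt(sum(b * b for b in v_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine match degree is undefined for a zero vector")
    return dot / (norm_a * norm_b)


md_same_direction = cosine_md([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # parallel vectors
md_orthogonal = cosine_md([1.0, 0.0], [0.0, 1.0])                # orthogonal vectors
print(md_same_direction, md_orthogonal)
```

Because the measure depends only on the direction of the vectors, two PD records whose features differ by an overall scale still receive a match degree of one, and no dataset-wide normalization constant is needed.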

Dataset
For this paper, the PD data sample sets were built from laboratory partial discharge simulation and live partial discharge detection in substations. We used the ultra-high frequency detection method and the PRPS data format that is commonly used in UHF partial discharge detection. The dataset contained more than 20,000 pieces of simulated experimental data and more than 20,000 pieces of field test data.

Laboratory Experiment
Four typical partial discharge defect models were designed, and the experiment was conducted on a real GIS platform, the experimental connection diagram of which is shown in Figure 3. Typical design defects include floating electrode defects, metallic protrusion defects, insulation void discharge defects, and free metal particle discharge defects.
(1) Floating electrode defect: Epoxy resin was used to cast copper sheets of different sizes. The amount of discharge can be controlled by changing the epoxy block, as shown in Figure 4a.
(2) Metallic protrusion defect: The high-voltage terminal is connected to an aluminum tip electrode, and the ground terminal is connected with a Ø54 mm aluminum disc.By adjusting the size of the tip electrode and the height of the air gap between it and the ground electrode, it is possible to control the amount of discharge, as shown in Figure 4b.
(3) Particle discharge: The high-voltage terminal is connected to a ball electrode, and the low-voltage ground terminal is connected to a concave disk electrode, with free metal particles of different sizes and numbers placed in the center of it, as shown in Figure 4c.
(4) Insulation discharge: Casting the epoxy into a cylinder will leave bubbles of different sizes inside during the casting process, as shown in Figure 4d.
The nominal voltage of the GIS used in the partial discharge experiment is 145 kV, and the output voltage of the experimental power supply is 0-220 kV. Typical partial discharge inception voltage (PDIV) and partial discharge extinction voltage (PDEV) for each type of defect are listed in the corresponding table. The typical PRPS data detected in the simulation experiment were normalized, as shown in Figure 5.

Substation On-Site Detection
In the past five years, we have accumulated a large amount of on-site detection data by periodically conducting PD tests at more than 30 substations in China. Among those data, there are 42 cases in which the power equipment defects have been verified by disassembly overhaul, including floating electrode defects, metallic protrusion defects, insulation void discharge defects, and free metal particle discharge defects. The statistical information related to the cases is shown in Table 3.
As seen from Table 3, among all the disintegration cases, the proportion of similar cases for the same PD type was high. Therefore, for detected data from equipment suspected of being defective, there is a high probability that similar data can be found in the historical cases, especially for data that can be recognized as floating and insulation discharges.

Experiment Setup
The main flow of the comparative experiment is shown in Figure 6. First, the training set was composed of both laboratory experimental data and substation field detection data, which were mixed and shuffled. Unsupervised training of the established VAE model was performed on this dataset to obtain a feature extraction model with better generalization performance. The test set consisted only of substation field detection case data for which the defect had been verified by GIS disintegration. The data from four GIS disintegration cases were selected to examine the matching performance of the data matching model on case data. These four cases covered different degrees of similarity. The match degrees between the data of the four cases were calculated, and the different feature extraction methods and match degree calculation methods were compared. Finally, the generalization capabilities of the different methods were analyzed on all 42 cases.
The baseline systems that were used for feature extraction are now briefly described.
(1) Statistical eigenvalues: These are commonly used features in PD data processing. The traditional statistical eigenvalues consist of 16 characteristic parameters such as skewness (Sk), kurtosis (Ku), asymmetry (Q), the cross correlation coefficient (Cc) of the PD amplitude, and PD numbers in the positive and negative halves of the power frequency cycle [30].
(2) DBN: A deep belief network (DBN) consists of multiple layers of restricted Boltzmann machines (RBMs). The DBN used for comparison had six layers, and the numbers of units in the layers were 3600, 1000, 500, 100, 10, and 4. The output of the second-to-last layer was used as the extracted eigenvalue. The detailed calculation can be seen in Reference [31].
(3) CNN: A deep convolutional neural network (CNN) consists of a number of two-dimensional convolutional kernels and uses multi-layer convolutional and pooling operations to obtain deep features of data. The CNN input layer used for this paper was 50 × 72, the two convolutional layers had six 3 × 3 convolutional kernels and 36 3 × 3 convolutional kernels, respectively, and the corresponding pooling layers were 1 × 2 and 1 × 11. The two fully connected layers had 500 and 10 units, and the output layer had four units. The input of the output layer was used as the extracted eigenvalue. Detailed calculations can be seen in Reference [23].
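As an illustration of the statistical baseline in item (1), the sketch below computes two of the listed parameters, skewness (Sk) and kurtosis (Ku), for a phase distribution. It assumes the weighted-moment definitions commonly used for PD phase distributions, with Ku taken as excess kurtosis; the exact definitions used in Reference [30] are not given in the text, and the function name and sample data are hypothetical.

```python
import math


def phase_distribution_moments(phases, amplitudes):
    """Return (Sk, Ku) of a phase distribution, weighting each phase by its amplitude."""
    total = sum(amplitudes)
    probs = [a / total for a in amplitudes]
    mean = sum(x * p for x, p in zip(phases, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(phases, probs))
    std = math.sqrt(var)
    sk = sum((x - mean) ** 3 * p for x, p in zip(phases, probs)) / std ** 3
    # Excess kurtosis (assumed convention): zero for a normal distribution.
    ku = sum((x - mean) ** 4 * p for x, p in zip(phases, probs)) / std ** 4 - 3.0
    return sk, ku


# Hypothetical symmetric distribution over five phase windows.
sk, ku = phase_distribution_moments([0, 1, 2, 3, 4], [1, 2, 3, 2, 1])
print(sk, ku)
```

A symmetric distribution gives a skewness of zero, and a negative Ku indicates a distribution flatter than the normal distribution.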
The baseline systems used for the MD calculation are now briefly described.
(1) Euclidean distance MD: The MD is obtained based on the Euclidean distance [32] between two groups of vectors. The problem with matching based on Euclidean distance is that it is difficult to determine an appropriate reference value, and thus normalization is difficult. For this paper, the maximum distance among all sample data was selected as the reference, and the MD was calculated according to the following formula:

$$MD = 1 - \frac{D_{ab}}{D_{max}} \tag{9}$$

where $D_{ab}$ is the Euclidean distance between the eigenvectors extracted from the two PD data, and $D_{max}$ is the maximum Euclidean distance between the PD data in the dataset.
(2) Correlation coefficient: A correlation coefficient is a measure of the linear correlation between two variables [33]. It has a value between +1 and −1, where +1 is total positive linear correlation, zero is no linear correlation, and −1 is total negative linear correlation. Therefore, the MD can be calculated from $r_{ab}$, the correlation coefficient between the eigenvectors extracted from the two PD data.
The experimental platform was a Core i7 processor (Intel, Santa Clara, CA, USA) operating at 3.9 GHz with 16 GB of memory, the operating system was Ubuntu 14.0 (Canonical, London, UK), and the code was implemented in Python. For the results presented in this paper, the dimension of each PD data was 50 × 72. The VAE used in the study consisted of seven layers: an input layer, an output layer, a latent layer, and four intermediate layers. The structure of the network is shown in Table 4. Network layers 1-4 formed the encoder part of the VAE, and layers 4-7 formed the decoder part. The output of the latent variable layer was the extracted eigenvalues. Using the established PD dataset, the VAE was trained without supervision, as described in Section 3.
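The two baseline measures described above can be sketched as follows. The normalization `1 - d_ab / d_max` is an assumption based on the description of the maximum distance as the reference, and only the Pearson coefficient itself is computed for the second baseline, since the text does not specify how it is mapped to a match degree. The function names and example vectors are illustrative.

```python
import itertools
import math


def euclidean_distance(v_a, v_b):
    """Euclidean distance between two eigenvectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v_a, v_b)))


def max_pairwise_distance(vectors):
    """Largest Euclidean distance between any two vectors in the dataset."""
    return max(euclidean_distance(a, b) for a, b in itertools.combinations(vectors, 2))


def euclidean_md(v_a, v_b, d_max):
    """Assumed normalization: 1 at zero distance, 0 at the maximum distance."""
    return 1.0 - euclidean_distance(v_a, v_b) / d_max


def pearson_r(v_a, v_b):
    """Pearson correlation coefficient between two eigenvectors."""
    n = len(v_a)
    mean_a, mean_b = sum(v_a) / n, sum(v_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(v_a, v_b))
    var_a = sum((a - mean_a) ** 2 for a in v_a)
    var_b = sum((b - mean_b) ** 2 for b in v_b)
    return cov / math.sqrt(var_a * var_b)


features = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]  # hypothetical eigenvectors
d_max = max_pairwise_distance(features)
md_near = euclidean_md(features[0], features[2], d_max)
md_far = euclidean_md(features[0], features[1], d_max)
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(d_max, md_near, md_far, r)
```

The example makes the dependence on the dataset visible: a single distant sample raises `d_max` and therefore shifts every Euclidean match degree toward one.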

The Comparison between Different Feature Extraction Methods
We selected four cases of partial discharge detection verified by disintegration and numbered them cases 1-4.The case information is shown in Table 5.
The equipment in Case 1 and Case 2 belonged to the same manufacturer and was of the same type, and the discharge location was also the same. Case 3 had the same PD pattern as Cases 1 and 2, but the equipment manufacturer and discharge location differed. Case 4 was a comparative case with a different PD type. The partial discharge data detected in the four cases are shown in Figure 7.
The trained VAE network model was used to extract the features of the partial discharge data in the four cases and to calculate the MD between them. At the same time, the statistical characteristics, DBN eigenvalues, and CNN eigenvalues of the data from the four cases were used to calculate the MD. All the MDs were based on cosine distance. The results are shown in Table 6.
The four cases in Table 6 are described in Table 5. Ideally, the similarity between case 1 and case 2 should be 100%, because they involve GIS devices produced by the same manufacturer with partial discharge occurring at the same position, so the higher the similarity result, the better. Case 3 has the same PD type as cases 1 and 2, but case details such as the PD location and cause are different. Case 4 has a completely different PD type, and its similarity should ideally be 0%, so the lower the similarity result, the better. As seen from Table 6, using the VAE method, case 1-2 had a higher MD than the other case pairs: it was 23.09% higher than that of case 1-3 and 89.94% higher than that of case 1-4. By comparison, for the MD results based on statistical eigenvalues, case 1-2 was only 7.09% higher than case 1-3 and 26.01% higher than case 1-4, which means that the MDs based on statistical eigenvalues were relatively close to each other. This method therefore had a lower distinguishing ability, even between different PD patterns. Regarding the MD results based on DBN and CNN, the MDs of case 1-4 and case 2-4 were clearly lower than those of case 1-2 and case 1-3. Therefore, the DBN and CNN models performed better than the traditional statistical method for data from different PD types. However, for cases 1-2, 1-3, and 2-3, the MDs were too close to distinguish similar from dissimilar cases, so these models were less effective than the VAE model.

The Comparison between Different Match Degree Calculation Methods
To investigate the effects of the different match degree calculation methods, the MDs were calculated by cosine distance, Euclidean distance, and correlation coefficient, respectively, based on the VAE eigenvalues for the four cases in Table 5. The results are shown in Table 7. It can be seen that there were slight differences in the specific values among the methods, but overall, all the methods distinguished well between similar and dissimilar cases.
Furthermore, we calculated the MDs on all the data for the 42 cases. The VAE model was used to extract the eigenvalues, and the MDs were calculated by cosine distance, Euclidean distance, and correlation coefficient, respectively. A match result is defined as accurate if the MD exceeds 80% for similar cases and is less than 20% for dissimilar cases. The accuracies are shown in Table 8. It can be seen from Table 8 that, over a large number of cases, the accuracies of the MDs based on Euclidean distance and the correlation coefficient were lower than that based on cosine distance. The reason is that in the calculation of the Euclidean distance MD, data from all kinds of cases were compared with a fixed maximum distance, so outliers degraded the overall result. In the calculation of the MD based on the correlation coefficient, more MD values exceeded 20% for dissimilar cases. In addition, because a large amount of PD data was stored in each case, there may have been some low-quality data that differed greatly from the other data in the same case. To improve performance in big data engineering applications, data cleaning methods need to be used for data filtration in future research.
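The accuracy criterion stated above (MD above 80% for similar pairs, below 20% for dissimilar pairs) can be expressed as a short function. The function name `match_accuracy` and the sample match degrees are hypothetical and are not results from the paper.

```python
def match_accuracy(pairs, high=0.8, low=0.2):
    """Fraction of accurate matches; pairs is a list of (md, is_similar) tuples."""
    correct = sum(1 for md, is_similar in pairs
                  if (md > high if is_similar else md < low))
    return correct / len(pairs)


# Hypothetical match degrees: three similar pairs followed by two dissimilar pairs.
pairs = [(0.95, True), (0.85, True), (0.60, True), (0.05, False), (0.30, False)]
accuracy = match_accuracy(pairs)
print(accuracy)
```

Under this definition, a match degree between the two limits counts as inaccurate for both kinds of pairs, which is why methods whose values cluster in the middle of the range score poorly.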

The Comparison between Different Thresholds
The definition of an accurate match also has an important impact on the final effect of the CBR system. In Section 5.3, a match result is defined as accurate if the MD exceeds 80% for similar cases and is less than 20% for dissimilar cases. The accuracies change as the thresholds change. In classification problems, 50% is usually used as the threshold for the classification output: if the output for a category is greater than 50%, the sample is classified into this category; if it is less than 50%, the sample is not considered to belong to the category. In data matching applications, it is necessary to adopt differentiated thresholds to obtain more accurate case results. However, different algorithms adapt differently to different threshold settings. To investigate the accuracy of different algorithms at different thresholds, we performed the following experiments.
Firstly, 1000 sets of data were selected from similar cases. The eigenvalues of the data in each case were calculated by VAE, statistical eigenvalues, DBN, and CNN, and the MDs were obtained by the cosine algorithm. The thresholds were defined as 50%, 60%, 70%, 80%, and 90%, respectively. A match result is defined as accurate if the MD exceeds the threshold, and the accuracy obtained by the different algorithms is shown in Figure 8a.
Secondly, 1000 sets of data were selected from dissimilar cases. The eigenvalues of the data in each case were calculated by VAE, statistical eigenvalues, DBN, and CNN, and the MDs were obtained by the cosine algorithm. The thresholds were defined as 50%, 40%, 30%, 20%, and 10%, respectively. A match result is defined as accurate if the MD is less than the threshold, and the accuracy obtained by the different algorithms is shown in Figure 8b.
As seen from Figure 8a, in the comparative analysis under similar case data, when the threshold was set to 70% or less, the accuracies obtained by CNN, DBN, and VAE differed only slightly. When the threshold was set to 80% or more, the accuracies of CNN and DBN decreased more noticeably, while the accuracy of VAE still exceeded 60% at the threshold of 90%. The reason is that under similar cases, the MDs obtained by VAE were mainly distributed above 0.9, while the MDs calculated by CNN and DBN were mainly distributed between 0.7 and 0.8. The MDs calculated by the statistical eigenvalues were mainly distributed between 0.5 and 0.6. In the comparative analysis under dissimilar case data (Figure 8b), when the threshold was set to 40% or more, the accuracies obtained by the four methods were not much different. When the threshold was set to 20% or less, the accuracies of all four methods decreased, while the accuracy of VAE still exceeded 40% at the threshold of 10%. Under dissimilar cases, the MDs obtained by VAE were mainly distributed below 0.2, while the MDs calculated by CNN and DBN were mainly distributed between 0.1 and 0.4. The MDs calculated by the statistical
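The two threshold experiments described above amount to a simple sweep, sketched below in Python. The function names and the toy MD values are hypothetical placeholders chosen for illustration; they are not the 1000 measured data sets and do not reproduce Figure 8.

```python
def accuracy_above(mds, threshold):
    """Similar cases: fraction of MDs that exceed the threshold."""
    return sum(md > threshold for md in mds) / len(mds)

def accuracy_below(mds, threshold):
    """Dissimilar cases: fraction of MDs that fall below the threshold."""
    return sum(md < threshold for md in mds) / len(mds)

def sweep(similar_mds, dissimilar_mds):
    """Evaluate both accuracy definitions at the thresholds listed in the text."""
    similar_thresholds = [0.5, 0.6, 0.7, 0.8, 0.9]
    dissimilar_thresholds = [0.5, 0.4, 0.3, 0.2, 0.1]
    similar_acc = {t: accuracy_above(similar_mds, t) for t in similar_thresholds}
    dissimilar_acc = {t: accuracy_below(dissimilar_mds, t) for t in dissimilar_thresholds}
    return similar_acc, dissimilar_acc

# Made-up MDs for one feature extraction method (illustrative only).
toy_similar = [0.95, 0.92, 0.85, 0.75, 0.55]
toy_dissimilar = [0.05, 0.15, 0.25, 0.35, 0.45]

similar_acc, dissimilar_acc = sweep(toy_similar, toy_dissimilar)
for t, acc in similar_acc.items():
    print(f"similar, threshold {t:.0%}: accuracy {acc:.0%}")
for t, acc in dissimilar_acc.items():
    print(f"dissimilar, threshold {t:.0%}: accuracy {acc:.0%}")
```

Running the same sweep once per feature extraction method (VAE, statistical eigenvalues, DBN, CNN) would yield one accuracy curve per method for each panel.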

Figure 1. The framework of partial discharge (PD) data matching.

Figure 2. The procedure of feature extraction and match degree computation based on variational autoencoder (VAE).

Figure 3. PD experiment circuit on a true gas insulated switchgear (GIS) model.

Figure 5. Typical PD phase resolved pulse sequence (PRPS) data from a true GIS model experiment. (a) Floating electrode discharge; (b) metallic protrusion discharge; (c) free metal particle discharge; (d) insulation void discharge.

Typical partial discharge inception voltage (PDIV) and partial discharge extinction voltage (PDEV) for each type of defect are listed in Table 1. The main parameters of the instrument used in the experiment are shown in Table 2.

Table 1. Typical partial discharge inception voltage (PDIV) and partial discharge extinction voltage (PDEV) in the experiment.

Table 2. The main parameters of instrument used in the experiment.

Table 3. Information of on-site power equipment disintegration verification cases.

Table 4. VAE model parameters for partial discharge data feature extraction.

Table 5. Information for four partial discharge detection substation site cases.

Table 6. Match degree for the different feature extraction methods between data from four on-site detection cases.

Table 7. Comparison of different matching calculation methods on four detection cases on-site.

Table 8. Comparison of the different matching calculation methods for all the on-site cases.