Research on Transformer Partial Discharge UHF Pattern Recognition Based on CNN-LSTM

Abstract: Because the statistical features used in traditional partial discharge (PD) pattern recognition rely on expert experience and generalize poorly, this paper develops a PD pattern recognition method based on the convolutional neural network (CNN) and the long short-term memory network (LSTM). First, we construct the CNN-LSTM PD pattern recognition model, which combines the strength of the CNN in mining local spatial information of the PD spectrum with the strength of the LSTM in mining the time series features of the PD spectrum. Then, a transformer PD UHF (Ultra High Frequency) experiment is carried out, and the performance of the constructed CNN-LSTM pattern recognition network is tested using different types of typical PD spectrums. Experimental results show that: (1) for floating potential defects, the recognition rates of CNN-LSTM and CNN are both 100%; (2) CNN-LSTM has better recognition ability than CNN for metal protrusion defects, oil-paper void defects, and surface discharge defects; and (3) CNN-LSTM has better overall recognition accuracy than CNN and LSTM.


Introduction
As a core component of the power grid, the transformer directly affects the safety and reliability of the power system. During long-term operation, its insulation system inevitably deteriorates, resulting in partial discharge (PD) [1,2]. Effective identification of PD defects helps to accurately characterize the internal insulation defects and is of important engineering significance for assessing the insulation condition of the transformer [3].
At present, the main methods for PD pattern recognition are distance-based pattern classification [3,4], Fisher classification [5,6], principal component classification [7,8], neural network classification [9], fuzzy cluster classification [10], Bayesian classification [11,12], and support vector machine classification [13][14][15], among which neural network classification and support vector machine classification are the most mature and widely used. However, these algorithms are based on statistical features, and extracting those features relies on expert experience, which limits the generalization ability of these pattern recognition algorithms [16][17][18].
In recent years, with the rapid development of recognition technologies in research fields such as images, texts, and videos, classification algorithms based on deep learning, for example, DBN (deep belief networks), DNN (deep neural networks), and CNN (convolutional neural networks), have gradually begun to be applied in the field of PD pattern recognition [19][20][21]. In 2016, Mingzhe Rong used a convolutional neural network for PD pattern recognition. The simulation results show that the recognition accuracy of the CNN is better than that of support vector machines (SVM) characterized by the Hilbert-Huang transform and by wavelet entropy [22]. In 2017, Zheng used a CNN to identify the pattern of transformer PD. The experimental results show that the recognition rate of the CNN is better than that of the traditional identification method [18]. In 2018, Jiang and Sheng applied CNN to the pattern recognition of GIS PD, using the PRPS (phase resolved pulse sequence) map and the whole-period time-domain waveform map as input. The research shows that the recognition accuracy of the CNN is better than that of the SVM and the BP (back propagation) neural network [23,24]. In the same year, Peng used a CNN for pattern recognition of cable PD. The results showed that the overall recognition accuracy of the CNN was 3.71% and 4.06% higher than that of the SVM and the BP neural network, respectively [25]. Nguyen used LSTM to identify the partial discharge pattern of GIS (gas-insulated switchgear). The experimental results show that the recognition accuracy of the LSTM is better than that of the SVM [26]. Adam applied LSTM to the identification of different PD defect types in insulating oil, using the time-domain single-pulse electrical signal as input. Experimental results show that the recognition accuracy of the LSTM is slightly lower than that of the random forest method. However, since the LSTM method does not require manually designed statistical recognition features, it is still recommended by the author [27].
Currently, on-site PD online monitoring systems for power equipment usually display their results in the form of a spectrum. As the discharge time increases and the discharge intensity changes, the PD spectrum changes accordingly. Viewed along the time axis, the PD spectrum can be regarded as a video with local fluctuations over time: characteristic parameters such as phase, amplitude, and number of discharges change, and these parameters exhibit time series characteristics. CNN has excellent local spatial information mining capability, while LSTM has excellent time series information mining capability. Therefore, if the CNN and the LSTM are combined so that the local spatial information and the time series information of the PD spectrum are extracted at the same time, the recognition accuracy for PD types with fluctuating characteristics can be effectively improved.

CNN
The CNN has a local perception ability that allows it to extract the details of local areas of the input image. By sharing network weights, it greatly reduces the number of variables in the network, lowers the demands that network operation places on the computing power of the hardware, and improves recognition efficiency. A typical convolutional neural network usually consists of five layers: the input layer, convolution layer, pooling layer, fully connected layer, and output layer [28][29][30], as shown in Figure 1.
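The effect of weight sharing on the number of variables can be illustrated with a small parameter count. The sketch below is only an illustration: the image size, kernel size, and channel counts are hypothetical values that do not come from the paper, and the two helper functions are introduced here for the comparison.

```python
def conv_layer_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Shared kernel weights plus one bias per output channel.

    The count does not depend on the image size, because the same kernel
    is reused at every position of the input.
    """
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def dense_layer_params(height: int, width: int, in_channels: int, out_units: int) -> int:
    """Every input pixel has its own weight to every output unit, plus biases."""
    return height * width * in_channels * out_units + out_units


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
conv = conv_layer_params(kernel_size=3, in_channels=1, out_channels=16)
dense = dense_layer_params(height=64, width=64, in_channels=1, out_units=16)
print(f"convolution layer: {conv} parameters, fully connected layer: {dense} parameters")
```

For these assumed sizes the convolution layer needs 160 parameters, while a fully connected layer on the same single-channel 64 × 64 input needs 65,552.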


LSTM
The long short-term memory network (LSTM) is a recurrent neural network with a gated structure, proposed by Hochreiter and Schmidhuber. The gate structure mainly includes three gates: the forget gate, the input gate, and the output gate. The forget gate decides how much information is discarded, the input gate decides how much information is input, and the output gate decides how much information is output [31][32][33]. The gate unit structure of the LSTM, with its forget gate, input gate, and output gate, is shown in Figure 2. In the figure, vectors are combined by element-wise multiplication and by concatenation; $x_t$ is the input of the gated unit at the current time, $h_{t-1}$ is the output of the gated unit at the previous time step, $c_{t-1}$ is the state of the gated unit at the previous time step, $c_t$ is the state of the gated unit at the current time, $h_t$ is the output of the gated unit at the current time, $\sigma$ is the sigmoid function, and tanh is the hyperbolic tangent function.
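The interplay of the three gates can be sketched as a single update step. The code below is a minimal pure-Python sketch of the standard LSTM cell, not the authors' implementation. The `params` dictionary, the helper names, and the use of plain lists for vectors are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def gate(params, name, z, activation):
    """Apply one gate: activation(W z + b), element by element."""
    W, b = params[name]
    return [activation(a) for a in affine(W, z, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update. params maps 'f', 'i', 'o', 'c' to (W, b) pairs (assumed layout)."""
    z = h_prev + x_t                           # concatenation [h_{t-1}, x_t]
    f = gate(params, "f", z, sigmoid)          # forget gate: how much old state to keep
    i = gate(params, "i", z, sigmoid)          # input gate: how much new information to admit
    o = gate(params, "o", z, sigmoid)          # output gate: how much state to expose
    c_tilde = gate(params, "c", z, math.tanh)  # candidate state
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


# With all weights and biases zero, every gate equals 0.5 and the candidate is 0,
# so the new state is simply half of the previous state.
zero_params = {name: ([[0.0, 0.0]], [0.0]) for name in "fioc"}
h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], params=zero_params)
print(h_t, c_t)
```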

CNN-LSTM Network Architecture
CNN-LSTM is a combination of the convolutional neural network (CNN) and the long short-term memory network (LSTM) that brings together the advantages of both: the CNN facilitates the extraction of local spatial features of the input, the LSTM facilitates the extraction of temporal features of the input, and pattern recognition is finally performed by the Softmax classifier. Figure 3 shows the network architecture of the CNN-LSTM. The CNN extracts spatial information from the input spectrum and generates the corresponding feature vectors. Let $N$ be the number of input PD spectrums. For a single input spectrum, assume that the pixel matrix of the spectrum is $m \times n$ and that the size of the convolution kernel is $k \times k$. The convolution uses "same" padding, in which the periphery of the input spectrum is filled with zero elements. The padding length is $k/2$, and the spectrum matrix is expanded from $m \times n$ to $(m + k) \times (n + k)$, so that the dimensions of the output after the convolution operation are the same as those of the input.
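The padding step can be checked with a few lines of code. The sketch below assumes an odd kernel size and takes the padding as the integer `k // 2` on each side, so the padded matrix is $(m + k - 1) \times (n + k - 1)$ and the convolution output returns to $m \times n$. The toy spectrum and the identity kernel are illustrative values, not data from the experiment.

```python
def zero_pad(image, p):
    """Surround a 2-D list with p rows and p columns of zeros on every side."""
    width = len(image[0]) + 2 * p
    border = [[0.0] * width for _ in range(p)]
    body = [[0.0] * p + list(row) + [0.0] * p for row in image]
    return border + body + [row[:] for row in border]


def conv2d_valid(image, kernel):
    """Slide a k x k kernel over the image without any extra padding."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Toy 3 x 4 "spectrum" and a 3 x 3 identity kernel (illustrative values only).
spectrum = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

padded = zero_pad(spectrum, len(kernel) // 2)
output = conv2d_valid(padded, kernel)
print(len(padded), len(padded[0]))   # padded size
print(len(output), len(output[0]))   # output size equals the input size
```

The identity kernel is used so that the output can be compared directly with the input: after padding and convolution, the spectrum is reproduced with unchanged dimensions.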
Let $v_{ij}$ be the pixel value in the $i$th row and $j$th column of the expanded input spectrum, and let $X_{ij}$ be the convolution kernel window matrix built from $v_{ij}$ (Equation (1)). The convolution operation is applied to the window matrix $X_{ij}$ to give the convolution layer output $Y_{ij}$ (Equation (2)), where $w_i$ and $b_i$ are the weight vector and bias term of the $i$th convolution layer. The activation function in Equation (2) is the ReLU function.
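Equations (1) and (2) can be sketched from the definitions above. The block below is a reconstruction under assumptions, not the authors' typeset formulas. It assumes that the $k \times k$ window has $v_{ij}$ at its top-left corner, and that the weights $w_i$ are applied to $X_{ij}$ as an element-wise product followed by a sum.

```latex
\begin{equation}
X_{ij} =
\begin{bmatrix}
v_{i,j}     & \cdots & v_{i,j+k-1}     \\
\vdots      & \ddots & \vdots          \\
v_{i+k-1,j} & \cdots & v_{i+k-1,j+k-1}
\end{bmatrix}
\tag{1}
\end{equation}

\begin{equation}
Y_{ij} = \mathrm{ReLU}\left( w_i \cdot X_{ij} + b_i \right),
\qquad
\mathrm{ReLU}(z) = \max(0, z)
\tag{2}
\end{equation}
```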
After the convolution operation on the input spectrum is completed, max pooling is applied to the output $Y_{ij}$ of the convolution layer to obtain the maximum feature value in each pooling window, which realizes a second optimization of the feature set. Let $R_n$ denote the feature spectrum of the $n$th input spectrum after convolution and pooling. After the feature extraction of all input PD spectrums is completed, the feature spectrums $R_n$ together form the feature matrix extracted by the convolutional neural network layer. Suppose that the acquisition time corresponding to the $n$th input spectrum is $t$; then $R_n$ can be used as the input of the long short-term memory network at time $t$, and $R_t$ is used to denote $R_n$. In the corresponding update process of the long short-term memory network, $W_f$, $W_i$, and $W_o$ are the weight matrices of the forget gate, the input gate, and the output gate, respectively; $b_f$, $b_i$, and $b_o$ are the bias terms of the forget gate, the input gate, and the output gate, respectively; $\tilde{C}_t$ is the state of the input unit; and $W_c$ and $b_c$ are the weight matrix and the bias term of the input unit state, respectively.
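The quantities defined above fit the standard LSTM update, which can be written out as follows. This is a sketch assembled from the symbol definitions in the text, not a copy of the authors' equations. It assumes the usual gate ordering, writes the input unit state as $\tilde{C}_t$ and the cell state as $c_t$, uses $N$ for the number of input spectrums, and denotes element-wise multiplication by $\odot$.

```latex
\begin{align}
R_n &= \operatorname{maxpool}\left( Y^{(n)} \right), \qquad
R = \left[ R_1, R_2, \ldots, R_N \right] \\
f_t &= \sigma\left( W_f \left[ h_{t-1}, R_t \right] + b_f \right) \\
i_t &= \sigma\left( W_i \left[ h_{t-1}, R_t \right] + b_i \right) \\
\tilde{C}_t &= \tanh\left( W_c \left[ h_{t-1}, R_t \right] + b_c \right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{C}_t \\
o_t &= \sigma\left( W_o \left[ h_{t-1}, R_t \right] + b_o \right) \\
h_t &= o_t \odot \tanh\left( c_t \right)
\end{align}
```

Here $Y^{(n)}$ stands for the convolution layer output of the $n$th spectrum, and $[h_{t-1}, R_t]$ is the concatenation of the previous output with the current CNN feature.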

Flow of PD Recognition
The CNN-LSTM network combines the advantages of CNN and LSTM. It can not only extract local spatial features, but also acquire the temporal information of different PD spectrums. The PD recognition flow of the CNN-LSTM network is shown in Figure 4.
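The last stage of the flow, classification by Softmax, can be sketched as follows. The four class names are the defect types studied in this paper; the score vector, the class ordering, and the function names are hypothetical and stand in for the output of the fully connected layer.

```python
import math

# The four defect types studied in the paper; the ordering is an assumption.
DEFECT_TYPES = ["metal protrusion", "oil-paper void", "surface discharge", "floating potential"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable defect type and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DEFECT_TYPES[best], probs[best]


# Hypothetical scores for one PD spectrum sequence.
label, confidence = classify([0.2, 1.5, 0.1, 3.0])
print(label, round(confidence, 3))
```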

Experiment Circuit
The transformer PD UHF (Ultra High Frequency) experimental circuit is shown in Figure 5, where the insulating oil is 25# transformer oil and the sensor is a Vivaldi antenna [34]. The detected electromagnetic wave signal is shown in Figure 6.

Construction of Transformer PD Model
According to operating experience, the parts of a power transformer that are prone to PD include the winding turns, outboard insulation board, oil screen, pressure plate, fasteners, oil bubbles, and impurity particles. The PD types of power transformers can be summarized as follows: oil gap discharge; PD in oil-paper insulation such as lead and lap wiring; partial breakdown of winding turn insulation; and creeping discharge along the surface of insulating paper.
In order to verify the PD pattern recognition performance of the CNN-LSTM network, four typical PD insulation defects of the transformer are set up in this paper: the metal protrusion defect, oil-paper void defect, surface discharge defect, and floating potential defect, as shown in Figure 7. The experimental voltage is gradually increased by a step-up method. Let the initial discharge voltage of the PD defect model be $U_a$ and the breakdown voltage be $U_b$; the experimental voltage is then adjusted to $U_m$, which is defined in terms of $U_a$ and $U_b$. With the experimental voltage held stable, a Tek DPO 7000 oscilloscope (Tektronix, Beaverton, OH, USA) is used to simultaneously acquire the AC phase voltage from the voltage divider capacitor and the high-frequency electromagnetic wave signal captured by the Vivaldi antenna sensor [34,35]. The sampling rate is 50 MHz, and the zero-crossing point at the start of the positive half-cycle of the AC voltage signal is used as the trigger signal. PD UHF signals over 2500 AC cycles are collected in each sampling process.
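Because acquisition is triggered at the positive zero crossing, each sample index maps directly to an AC phase angle, which is the basis of a phase-resolved (PRPD) spectrum. The sketch below illustrates this mapping. The 50 MHz sampling rate comes from the text, whereas the 50 Hz power frequency, the bin counts, the amplitude scale, and the pulse list are assumptions made for illustration.

```python
SAMPLE_RATE_HZ = 50e6   # sampling rate stated in the text
AC_FREQ_HZ = 50.0       # assumption: the power frequency is not stated in the text


def sample_to_phase(sample_index):
    """Phase in degrees, in [0, 360), with index 0 at the positive zero crossing."""
    samples_per_cycle = SAMPLE_RATE_HZ / AC_FREQ_HZ
    return (sample_index % samples_per_cycle) / samples_per_cycle * 360.0


def prpd_histogram(pulses, phase_bins=36, amp_bins=4, amp_max=1.0):
    """Count pulses per (amplitude bin, phase bin).

    pulses is a list of (sample_index, amplitude) pairs; amplitudes are assumed
    to be normalised so that amp_max is the largest expected value.
    """
    grid = [[0] * phase_bins for _ in range(amp_bins)]
    for sample_index, amplitude in pulses:
        col = int(sample_to_phase(sample_index) / 360.0 * phase_bins) % phase_bins
        row = min(int(abs(amplitude) / amp_max * amp_bins), amp_bins - 1)
        grid[row][col] += 1
    return grid


# Hypothetical pulses: two near the positive peak (90 degrees) in different
# cycles, and one near the negative peak (270 degrees).
pulses = [(250_000, 0.9), (1_750_000, 0.3), (2_250_000, 0.8)]
grid = prpd_histogram(pulses)
print(sample_to_phase(250_000), grid[3][9], grid[1][27])
```

Accumulating such grids over successive time windows yields a sequence of gray-scale images, which is the kind of time-ordered input that the CNN-LSTM network is designed to consume.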

Analysis of Transformer PD UHF Experimental Results
Assuming that the initial sampling time of the discharge pulse of an insulation defect is t and that the whole sampling time of the PD pulse signal of each insulation defect is T, the PD spectrums of the four typical insulation defects of the power transformer at times X, Y, and Z are shown in Figures 8-11. Figure 8 is the PD UHF PRPD (Phase Resolved Partial Discharge) spectrum of the metal protrusion insulation defect. It shows that the discharge pulse spectrum fluctuates to some extent at different discharge times. The PD spectrum of the metal protrusion insulation defect in transformer oil differs from the PD spectrum in air and SF6 insulation environments. In the transformer oil insulation environment, the PD pulse signal at the initial sampling time is mainly concentrated near the positive half-cycle and negative half-cycle peaks of the AC phase, and the number of discharge pulses near the positive half-cycle peak is slightly larger than that near the negative half-cycle peak. As the experiment time increases, the PD spectrum gradually extends to the whole AC phase range. In general, however, the PD pulses of the metal protrusion defect in transformer oil are mainly distributed near the peaks of the positive and negative half-cycles of the AC phase. In addition, during the experiment, it was found that the metal protrusion insulation defect produced an obvious "cracking" sound before breakdown.
It can be seen from Figure 9 that, as the voltage application time increases, the discharge pulse spectrum also fluctuates to some extent at different discharge times. At the initial sampling time, the PD pulses are mainly concentrated in the acute-angle phase interval of the positive half-cycle and the acute-angle phase interval of the negative half-cycle. As the experiment time increases, the phase interval over which the PD pulses are distributed gradually widens. In the positive half-cycle of the AC phase, the PD pulses expand from the acute-angle phase interval toward the zero-crossing phase interval at the beginning of the positive half-cycle. In the negative half-cycle of the AC phase, the PD pulses expand from the acute-angle phase interval toward the zero-crossing phase interval at the beginning of the negative half-cycle.
It can be seen from Figure 10 that the PD pulses along the oil-paper surface are less stable than those of the metal protrusion defect and the void defect. At the initial sampling time, the PD pulses are mainly concentrated toward the acute-angle interval of the positive half-cycle and toward the acute-angle interval of the negative half-cycle. As the experiment time increases, the distribution interval of the PD pulses gradually expands to the zero-crossing points of the positive and negative half-cycles of the AC phase. With a further increase in experiment time, the distribution interval of the PD pulses covers nearly the entire AC phase interval.

PD Recognition Accuracy of CNN-LSTM
The configuration of the PD pattern recognition platform is as follows: an Intel i5 processor (2.5 GHz) (Intel, Santa Clara, CA, USA), 8 GB of memory, and TensorFlow (Anaconda) as the operating environment. The recognition objects are the PD spectrums of four typical transformer insulation defects: the metal protrusion defect, oil-paper void defect, surface discharge defect, and floating potential defect. The input data are PD UHF PRPD spectrums, of which the training set accounts for 80% of the data samples and the test set accounts for 20%. The main parameter settings of the CNN-LSTM hybrid network are shown in Table 1. According to Table 1, the single-channel matrix data of the gray-scale PD spectrums are input into the network for training and recognition, and the recognition rate $P_r$ is used as the evaluation parameter of the pattern recognition ability of the CNN-LSTM network. $P_r$ is calculated as $P_r = \frac{N_p}{N_{sum}} \times 100\%$, where $N_p$ is the number of samples that the CNN-LSTM network predicts correctly and $N_{sum}$ is the total number of samples.
The recognition results of the CNN-LSTM network for different types of typical insulation defects are shown in Table 2. It can be seen from the table that CNN-LSTM has the highest recognition rate for the floating potential defect, reaching 100%, and the lowest recognition rate for the surface discharge defect, which nevertheless reaches 89%. Then, the same PD spectrum data are input to the CNN network and the LSTM network, respectively, with the parameter settings of the CNN network and the LSTM network kept the same as those of the CNN-LSTM network. The pattern recognition results of the CNN-LSTM, CNN, and LSTM networks for typical transformer PD insulation defects are compared in Table 3.
From Table 3, it can be seen that the recognition rate of the CNN-LSTM network is higher than that of the CNN network and the LSTM network for the metal protrusion defect, oil-paper void defect, and surface discharge defect, whereas for the floating potential defect the recognition rate of the CNN-LSTM network is the same as that of the CNN network, namely 100%. This is because the discharge spectrum of the floating potential defect is obviously more stable than that of the other discharge types, while the advantage of the LSTM network lies in the extraction of time series feature information. Therefore, when the CNN-LSTM network recognizes floating potential defects, the LSTM network does not provide much help. For metal protrusion defects, oil-paper void defects, and surface discharge defects, however, the discharge patterns fluctuate obviously as the experiment time increases under constant voltage. In this case, the advantage of the LSTM network in extracting the temporal feature information of the spectrum comes into play, which improves on the recognition accuracy of the CNN network for these defect types.

Conclusions
This paper constructs a transformer PD pattern recognition algorithm based on the CNN-LSTM hybrid network, which combines the advantage of CNN in mining local spatial information of the PD spectrum with the advantage of LSTM in mining temporal feature information of the PD spectrum, and addresses the problem that traditional partial discharge pattern recognition algorithms lack generality. Through the research, the following conclusions are obtained:
(1) The CNN-LSTM hybrid network PD pattern recognition architecture is constructed. The local spatial feature information of the PD spectrum is extracted by the convolutional layer and the pooling layer, and the extracted feature information is then input to the LSTM network to obtain the complete PD spectrum recognition features, including the temporal feature information. Finally, pattern recognition of the PD insulation defect is realized through the fully connected layer and the Softmax classifier.
(2) Four typical insulation defect models of the oil-immersed power transformer are constructed, and the spectrum information of the different insulation defects at different experiment times is obtained. Experimental results show that CNN and CNN-LSTM both reach 100% recognition accuracy for the floating potential defect, whose PD spectrum is stable, while the recognition accuracy of CNN-LSTM is better than that of CNN and LSTM for the metal protrusion defect, oil-paper void defect, and surface discharge defect.