A Novel Integrated Method to Diagnose Faults in Power Transformers

In a smart grid, many transformers are equipped for both power transmission and conversion. Because stable transformer operation is essential to grid security, studying transformer fault diagnosis methods can improve both fault detection and fault prevention. In this paper, a data-driven method that combines Principal Component Analysis (PCA), Particle Swarm Optimization (PSO), and Support Vector Machines (SVM) to enable better fault diagnosis of transformers is proposed and investigated. PCA is used to reduce the dimension of the transformer fault state data, and an improved PSO algorithm is used to obtain the optimal parameters for the SVM model. The PSO-optimized SVM is then used for transformer-fault diagnosis. Diagnostic results for actual transformers confirm that the new method is effective. We also verified the importance of data richness to the accuracy of transformer-fault diagnosis.


Introduction
Several advanced technologies can be used to monitor power equipment, and the large amount of equipment status data helps make the power grid "smarter". Power transformers are expensive and important components of the smart grid, serving as hub devices for power transformation and transmission [1][2][3][4]. Because various faults, such as discharge and overheating, can occur during transformer operation, many characteristics are affected by these faults, such as dissolved gases (H2, CH4, C2H6, C2H4, C2H2, CO, etc.), organic compounds (methanol, ethanol and 2-furfural), and the current and power of the transformers [5]. Dissolved gas analysis (DGA) is a common tool for monitoring and identifying transformer faults. IEEE C57.104 and IEC 60599 provide different methods, such as the key gas, Doernenburg ratio, Rogers ratio, three basic gas ratio, and Duval triangle methods. However, owing to the complexity of the working environment and the process structure of transformers, these methods are not sufficient to reach a correct judgement and cannot resolve the fuzzy boundaries between faults. According to [6], their accuracy rates are about 60%, which means that the ratio methods cannot fully account for the diagnostic criteria [7]. In addition, the concentrations of cellulose chemical markers in oil, such as methanol, ethanol and 2-furfural, are used as indicators for diagnosing transformer insulation failure, but they still present many challenges for accurate interpretation in real transformers [8].
To improve the accuracy of fault diagnosis, artificial intelligence and machine learning algorithms have been introduced into the field of transformer-fault diagnosis (TFD), including fuzzy sets [9], artificial neural networks (ANN) [10], artificial immune networks [11], probabilistic neural networks [12], rough sets [13], and support vector machines (SVM) [14]. These algorithms provide ways to develop new TFD technologies. However, they have some disadvantages. For example, it is difficult to select the parameters of fuzzy sets, artificial immune networks and probabilistic neural networks; ANNs easily fall into local minima; and the fault tolerance and generalization ability of rough sets are weak.
Energies 2018, 11, 3041
SVM is usually used as a classification tool. Starting from the early 2-category techniques, multi-class SVMs have been developed and are more suitable for TFD. The accuracy of a multi-class SVM is determined by the parameters of its kernel function and its penalty factor. To improve the efficiency of SVM in processing large amounts of input fault data, principal component analysis (PCA) is used. Moreover, to reduce the influence of human experience and subjective judgment on these parameters, an improved Particle Swarm Optimization (PSO) algorithm is used to search for the optimal parameters. In this way, the most suitable SVM parameters for the effective input data reflecting the transformer's faults are found. SVM integrated with PCA and PSO can improve the speed and accuracy of TFD considerably. This paper is organized as follows. Section 2 introduces the complete TFD procedure implemented by the improved SVM. In Section 3, we compare the accuracy of transformer-fault diagnosis using different methods, verify the effectiveness of the proposed method, and analyze the effect of data richness on the accuracy of the fault diagnosis. Section 4 summarizes all results.

TFD Model Based on SVM Integrated with PCA and PSO
The TFD model based on SVM integrated with PCA and PSO is shown in Figure 1. It includes two main parts. In the first, a set of transformer fault data (the data set), such as the concentrations of the dissolved gases, is preprocessed by PCA. In the second, the parameters of the SVM model are searched and optimized by PSO.

Data Set Preprocessed by PCA
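TFD is a complicated task. To keep the SVM efficient when there are many transformer fault data, the data need to be pre-processed before they are used to train the SVM model. PCA reduces the dimension of the fault data by replacing them with fewer, mutually uncorrelated and non-overlapping variables (called principal components). The number of principal components is selected by the variance contribution rate, which indicates how much information is retained. Suppose the data set X has n groups and each group has p fault data, which together form the original data observation matrix:

$$X = (X_1, X_2, \cdots, X_n)^T = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$

To solve for the principal components, we need to find i (i ≤ p) linear functions of the form

$$y_{ji} = A_i^T X_j$$

whose variance $\mathrm{var}(A_i^T X_j)$ is as large as possible. Therefore, to obtain the maximum variance, the following conditional extremum problem is formed:

$$\max_{A_i} \; A_i^T \Sigma A_i \quad \text{s.t.} \quad A_i^T A_i = 1 \tag{2}$$

where $\Sigma$ represents the covariance matrix.
Here the Lagrange multiplier method is used to solve (2). The Lagrangian objective function is expressed as:

$$L(A_i, \lambda_i) = A_i^T \Sigma A_i - \lambda_i \left( A_i^T A_i - 1 \right)$$

Setting $\partial L / \partial A_i = 0$ gives $\Sigma A_i = \lambda_i A_i$, so the Lagrange multiplier $\lambda_i$ is a characteristic root of $\Sigma$ and $A_i$ is the corresponding eigenvector.
Because $A_i \neq 0$ and $A_i^T \Sigma A_i = \mathrm{var}(A_i^T X_j) > 0$, $\Sigma$ is positive definite and all characteristic roots are positive. Assume that they are sorted in descending order:

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p > 0$$

In practical applications, only the leading principal components are selected, namely those whose cumulative variance contribution rate reaches a preset threshold.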
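The selection of principal components by cumulative variance contribution rate can be sketched as follows. This is a minimal illustration in Python rather than the paper's MATLAB code; the function name `select_component_count` and the example characteristic roots are hypothetical, and the default threshold of 0.85 mirrors the value quoted in the paper for the contribution rate.

```python
def select_component_count(eigenvalues, threshold=0.85):
    """Return the smallest number of leading components whose cumulative
    variance contribution rate reaches `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)  # largest characteristic root first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Hypothetical characteristic roots of a covariance matrix (illustrative only).
example_roots = [4.0, 2.5, 1.5, 1.0, 0.6, 0.4]
print(select_component_count(example_roots))
```

With these illustrative roots, the first three components explain 80% of the variance and the first four explain 90%, so four components would be kept.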

Support Vector Machine
Suppose the j-th group of principal components $Y_j$ reflects the fault type $z_j$. We divide the n groups of fault data into two sets: a training set of l groups and a testing set of (n − l) groups. The training set is used to solve for the parameters of the SVM.
TFD is usually a multi-class problem with d (d ≥ 2) categories. In this paper, the one-versus-one (OVO) method is adopted to extend the 2-category SVM to a multi-class SVM. This means that an SVM classifier must be built for every pair of different fault types $F_1$ and $F_2$ ($F_1, F_2 \in z_j$), giving a total of d(d − 1)/2 classifiers. Assume that a hyperplane $\omega^T \phi(Y) + b = 0$ can accurately separate $F_1$ and $F_2$, whose category labels are −1 and 1, respectively. Here $\omega$ is the normal vector of the hyperplane, b is the offset, and $\phi(Y)$ is a nonlinear transformation function. For the optimal classification hyperplane, the following conditions should be satisfied:

$$\omega^T \phi(Y_j) + b \le -1 \quad \text{for } Y_j \text{ of type } F_1$$

and

$$\omega^T \phi(Y_j) + b \ge 1 \quad \text{for } Y_j \text{ of type } F_2$$

In this case, $Y_j$ is mapped into a high-dimensional space.
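The one-versus-one construction can be illustrated with a short Python sketch. It only enumerates the class pairs and combines pairwise decisions by majority vote; the voting rule is a common OVO convention assumed here rather than stated in the paper, the state names follow the six fault types plus the normal state described in Example 1, and the helper names and the toy pairwise outcomes are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Six fault types plus the normal state, as listed in TFD Example 1.
STATES = ["LE-D", "HE-D", "HT", "MT", "ML-T", "LT", "Normal"]


def ovo_pairs(states):
    """List every unordered pair of classes; one binary SVM is trained per pair."""
    return list(combinations(sorted(set(states)), 2))


def ovo_vote(pairwise_winners):
    """Combine pairwise decisions by majority vote (assumed combination rule)."""
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]


pairs = ovo_pairs(STATES)
print(len(pairs))  # d(d - 1)/2 classifiers for d = 7 states

# Toy pairwise outcomes: "HT" wins every pair it takes part in (illustrative only).
winners = {pair: ("HT" if "HT" in pair else pair[0]) for pair in pairs}
print(ovo_vote(winners))
```

For d = 7 states this gives 21 binary classifiers, which matches d(d − 1)/2.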
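The margin between the hyperplane and the nearest data is $1/\|\omega\|$. The greater it is, the better the classification confidence. To increase the misclassification tolerance of the SVM, a non-negative variable $e_j$ is introduced. Then the problem can be described as:

$$\min_{\omega, b, e_j} \; \frac{1}{2}\|\omega\|^2 + C \sum_{j=1}^{l} e_j \quad \text{s.t.} \quad z_j \left( \omega^T \phi(Y_j) + b \right) \ge 1 - e_j, \quad e_j \ge 0, \quad j = 1, \cdots, l \tag{6}$$

where $z_j$ here takes the category label (−1 or 1) of $Y_j$ in the two-class problem, and C is a constant named the penalty factor, which controls the degree of punishment for misclassified data. The Lagrange multiplier method is also used to solve (6). The corresponding Lagrangian function is:

$$L(\omega, b, e_j, \alpha_j, \beta_j) = \frac{1}{2}\|\omega\|^2 + C \sum_{j=1}^{l} e_j - \sum_{j=1}^{l} \alpha_j \left[ z_j \left( \omega^T \phi(Y_j) + b \right) - 1 + e_j \right] - \sum_{j=1}^{l} \beta_j e_j$$

where $\alpha_j > 0$ and $\beta_j > 0$ are the Lagrangian multipliers. After $\alpha_j$ ($j = 1, \cdots, l$), $\omega$ and b are solved, the final SVM classification function is:

$$f(Y) = \mathrm{sgn}\left( \sum_{j=1}^{l} \alpha_j z_j K(Y_j, Y) + b \right)$$

where $K(Y_j, Y) = \phi(Y_j)^T \phi(Y)$ is the kernel function, and we choose the Gaussian radial basis function:

$$K(Y_j, Y) = \exp\left( -\frac{\|Y_j - Y\|^2}{2\sigma^2} \right) \tag{9}$$

where $\sigma$ is the parameter of the kernel function.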
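As a rough illustration of how a trained two-class SVM with a Gaussian radial basis kernel classifies a new sample, consider the Python sketch below. It assumes the common kernel convention exp(−‖u − v‖² / (2σ²)); the paper's exact scaling of σ may differ. The function names, support vectors, multipliers and offset are hypothetical values chosen for illustration and do not come from a trained model.

```python
import math


def rbf_kernel(u, v, sigma):
    """Gaussian radial basis kernel, assuming the exp(-||u - v||^2 / (2 sigma^2)) form."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def classify(sample, support_vectors, labels, alphas, offset, sigma):
    """Sign of the kernel expansion: sum_j alpha_j * z_j * K(Y_j, Y) + b."""
    score = offset + sum(
        alpha * label * rbf_kernel(vector, sample, sigma)
        for vector, label, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hypothetical two-class model with one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]

print(classify((1.9, 2.0), support_vectors, labels, alphas, offset=0.0, sigma=1.0))
print(classify((0.1, 0.0), support_vectors, labels, alphas, offset=0.0, sigma=1.0))
```

A sample close to the support vector labelled 1 is assigned to class 1, and a sample close to the other support vector is assigned to class −1.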

Parameter Optimization in SVM Using Improved PSO
As mentioned before, when using SVM for fault diagnosis, we first need to determine the parameter σ in the kernel function (9) and the penalty factor C in (6). σ affects the classification performance and generalization ability of the SVM. C balances the learning machine's complexity against the empirical risk when the objective function is minimized. Therefore, σ and C should be optimized. We use an improved PSO algorithm for this optimization.
Assume that in a 2-dimensional search space there is a swarm of S particles, $q_s = (q_{s1}, q_{s2})^T$ ($s = 1, \cdots, S$). Each particle represents a potential solution and corresponds to a point in the 2-dimensional search space. Its velocity is $v_s = (v_{s1}, v_{s2})^T$ and its optimal position is $P_s = (P_{s1}, P_{s2})^T$. The optimal position within the S-particle population represents the global extremum and is denoted $P_g = (P_{g1}, P_{g2})^T$. The updating method for the particle's velocity and position is expressed as:

$$v_s(t+1) = wv(t)\, v_s(t) + c_1(t)\, r_1(t) \left[ P_s(t) - q_s(t) \right] + c_2(t)\, r_2(t) \left[ P_g(t) - q_s(t) \right]$$

$$q_s(t+1) = q_s(t) + v_s(t+1)$$

where $c_1(t)$ and $c_2(t)$ are acceleration constants, $r_1(t)$ and $r_2(t)$ obey the uniform distribution on (0, 1), and $wv(t)$ is the inertia weight of the velocity update, representing the effect of the previous generation's velocity on the next generation's velocity during the particle updating process. Generally, the algorithm has a relatively strong global optimization capability when $wv(t)$ is large, and a relatively strong local optimization capability when $wv(t)$ is small. However, the linear weight-adjustment method follows a single fixed pattern and thus limits the search ability. To move away from this single adjustment mode and adapt better to complex environments, we present a new scheme for the stochastic inertia weight $wv(t)$, which is defined in terms of the following quantities: $fit(t)$ represents the optimal fitness value of the t-th generation and $fit(t-10)$ is the optimal fitness value of the (t − 10)-th generation; $\alpha_1$ and $\alpha_2$ are set to 0.5 and 0.4, respectively, reflecting the search ability in different situations; and ε is a random value between 0 and 1. The acceleration constants $c_1(t)$ and $c_2(t)$ are modified as:

$$c_1(t) = c_{11} + \left( c_{12} - c_{11} \right) \frac{t}{t_{\max}}, \qquad c_2(t) = c_{21} + \left( c_{22} - c_{21} \right) \frac{t}{t_{\max}}$$

where $t_{\max}$ is the maximum iteration number, so that $c_1(t)$ decreases linearly from the initial value $c_{11}$ to the final value $c_{12}$, while $c_2(t)$ increases linearly from $c_{21}$ to $c_{22}$.
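A single particle update with linearly varying acceleration constants can be sketched in Python as follows. This is only an illustration under stated assumptions: the inertia weight is passed in as a plain number because the stochastic inertia weight formula is not reproduced here, the endpoint values used for c11, c12, c21 and c22 are hypothetical, the random factors are drawn independently per dimension, and the function names are not from the paper.

```python
import random


def linear_schedule(start, end, t, t_max):
    """Move linearly from `start` (at t = 0) to `end` (at t = t_max)."""
    return start + (end - start) * t / t_max


def update_particle(position, velocity, personal_best, global_best,
                    inertia, c1, c2, rng):
    """Apply one velocity and position update to a single particle."""
    new_velocity = []
    new_position = []
    for q, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()  # uniform random factors in [0, 1)
        v_next = inertia * v + c1 * r1 * (p - q) + c2 * r2 * (g - q)
        new_velocity.append(v_next)
        new_position.append(q + v_next)
    return new_position, new_velocity


rng = random.Random(0)
t, t_max = 50, 200
c1 = linear_schedule(2.5, 0.5, t, t_max)  # hypothetical c11 -> c12 (decreasing)
c2 = linear_schedule(0.5, 2.5, t, t_max)  # hypothetical c21 -> c22 (increasing)
position, velocity = update_particle(
    position=[10.0, 5.0], velocity=[0.0, 0.0],
    personal_best=[12.0, 4.0], global_best=[20.0, 1.0],
    inertia=0.7,  # placeholder constant, not the paper's stochastic weight
    c1=c1, c2=c2, rng=rng,
)
print(position, velocity)
```

Early iterations weight the particle's own best position more heavily (large c1), while later iterations weight the swarm's best position more heavily (large c2).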

Verification and Discussion
Based on the above SVM diagnosis model optimized using PSO, the method was coded in MATLAB, with the SVM algorithm implemented directly by a MATLAB toolkit [15]. Several real TFD examples are analyzed.

TFD Example 1
We analyze the dissolved-gas data of 157 existing groups of transformers under normal and various fault conditions. The dissolved-gas data cover 6 types of real transformer faults: low-energy discharge (LE-D), high-energy discharge (HE-D), high temperature overheating (HT), medium temperature overheating (MT), medium and low temperature overheating (ML-T), and low temperature overheating (LT). 112 groups of data were selected as training samples, and the remaining 45 groups were used for testing. The distribution of the fault and normal-state samples is shown in Table 1. In this analysis, the particle swarm size is 20, the maximum iteration number is 200, and the search intervals for the parameters C and σ are both [0.01, 1000]. For comparison, we apply the three-ratio method, the Duval triangle method, a back propagation neural network (BPNN), and SVM methods to the testing data set. The same set of data was used for all methods. During the test, the BPNN used a network structure with 13 hidden nodes.
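The search settings above (20 particles, 200 iterations, both parameters searched in [0.01, 1000]) could be set up along the following lines. This Python sketch is an assumption-laden illustration: uniform random initialization and clamping to the search interval are common PSO practices that the paper does not spell out, and the constant and function names are hypothetical.

```python
import random

SWARM_SIZE = 20
MAX_ITERATIONS = 200
C_BOUNDS = (0.01, 1000.0)      # search interval for the penalty factor C
SIGMA_BOUNDS = (0.01, 1000.0)  # search interval for the kernel parameter sigma


def init_swarm(rng, size=SWARM_SIZE):
    """Draw each particle (C, sigma) uniformly inside the search intervals (assumed)."""
    return [(rng.uniform(*C_BOUNDS), rng.uniform(*SIGMA_BOUNDS)) for _ in range(size)]


def clamp(particle):
    """Pull a particle back inside the search intervals after an update (assumed)."""
    c_value, sigma = particle
    c_value = min(max(c_value, C_BOUNDS[0]), C_BOUNDS[1])
    sigma = min(max(sigma, SIGMA_BOUNDS[0]), SIGMA_BOUNDS[1])
    return (c_value, sigma)


swarm = init_swarm(random.Random(42))
print(len(swarm), clamp((-5.0, 2000.0)))
```

Each particle is one candidate pair (C, σ); its fitness would be the diagnostic accuracy of an SVM trained with those two values.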
Table 2 shows the fault-diagnosis accuracy of the different methods when tested on the same transformer samples. The Duval triangle method shows the lowest accuracy. The three-ratio method is more accurate than the Duval triangle method but less accurate than the other methods. Both the three-ratio and Duval triangle methods are derived from typical accidents, and they fail when dealing with some complicated faults. The accuracy of the neural-network algorithm (BPNN) is 60%, and it is expected to improve if more data are available. Compared with the BPNN and IEC methods, the SVM method gives a relatively good diagnosis. When the SVM parameters are optimized, the accuracy of the fault diagnosis improves substantially.

TFD Example 2
This section uses SVM optimized by PSO to analyze the fault and normal states in 132 groups of data collected from real transformers. The data come from oil-dissolved gas measurements and SCADA. We also verify the impact of data richness on the results. The dissolved gases in the oil include C2H2, C2H4, C2H6, CH4, CO, CO2, H2 and total hydrocarbon. The SCADA data include maximum current, minimum current, average current, maximum active power, minimum active power, average active power, maximum reactive power, minimum reactive power, and average reactive power. SVM optimized by PSO is used to diagnose the faults with three kinds of input data: only the dissolved gas data in oil, only the SCADA data, and all data. We used 112 groups as the training set and 20 groups as the testing set, and then judged the effect of the data types on fault diagnosis.
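The three input configurations compared in this example can be expressed as feature groups. The Python sketch below only shows how one record might be projected onto each group; the feature lists follow the quantities named above, while the identifier spellings, the `select_features` helper and the sample record values are hypothetical.

```python
# Dissolved-gas quantities named in the text.
DGA_FEATURES = ["C2H2", "C2H4", "C2H6", "CH4", "CO", "CO2", "H2", "total_hydrocarbon"]

# SCADA quantities named in the text (identifier spellings are illustrative).
SCADA_FEATURES = [
    "max_current", "min_current", "avg_current",
    "max_active_power", "min_active_power", "avg_active_power",
    "max_reactive_power", "min_reactive_power", "avg_reactive_power",
]

FEATURE_SETS = {
    "dga": DGA_FEATURES,
    "scada": SCADA_FEATURES,
    "all": DGA_FEATURES + SCADA_FEATURES,
}


def select_features(record, feature_set):
    """Return the values of one record restricted to the chosen feature group."""
    return [record[name] for name in FEATURE_SETS[feature_set]]


# Hypothetical record with placeholder values, one per feature.
record = {name: float(index) for index, name in enumerate(FEATURE_SETS["all"])}
for key in ("dga", "scada", "all"):
    print(key, len(select_features(record, key)))
```

The same PCA, PSO and SVM steps would then be run once per feature group so that the resulting accuracies can be compared.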
In this experiment, the particle swarm size is 20, the maximum iteration number is 200, and the search intervals of the parameters C and σ are both [0.01, 1000].
The optimized parameter values and accuracy rate of different data types are shown in Table 3.

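For the PCA preprocessing, the selected principal components satisfy the requirement that their cumulative variance contribution rate, i.e., the sum of their characteristic roots λ divided by the sum of all characteristic roots, is ≥ 0.85. The k-th principal component for the j-th group is $y_{jk} = A_k^T X_j$. All the principal components form a vector $Y_j = (y_{j1}, y_{j2}, \cdots, y_{jp})^T$.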

Table 1. Statistics of samples for training and testing, corresponding to various types of real faults.

Table 2. Accuracy rate for the different diagnostic methods of transformer.

Table 3. Fault-diagnostic results of transformer of different methods.