1. Introduction
Power transformers suffer from the long-term effects of high-voltage electric, thermal, and mechanical stresses during operation [1]. In the event of a fault, not only is the transformer seriously damaged, but people’s daily life and industrial production are also greatly threatened. Predicting the state of a transformer helps to recognize a potential threat in time and to track how a fault develops. State prediction provides more opportunities to handle potential faults in advance and greatly reduces the negative impact on the transformer’s reliability and availability when a fault occurs [2,3].
Assessment and prediction technologies for determining the health condition of power transformers have been reported in several directions. Some studies have focused on predicting specific state parameters, such as the concentration of gases dissolved in the oil [2,3], top oil temperature [4], residual flux [5,6], inrush current [7], moisture in the insulating cellulose [8], and furan [9], to characterize the development of the transformer’s status. A small number of studies have put forward new ideas for establishing a transformer failure rate model [10,11]. In addition, some scholars have paid special attention to the remaining life [12,13,14,15] of the transformer. The state prediction models proposed in these studies include the neural network [4], support vector machine regression [2,3], fuzzy logic [14], nonparametric regression [10], and probabilistic graph [16]. These methods have demonstrated their effectiveness in a number of circumstances.
The transformer often deteriorates gradually rather than abruptly. Correspondingly, the related parameters change continuously toward the fault state. Thus, it is natural to employ temporal analysis methods to model the sequential dependency between the state parameters over time. Recurrent neural networks (RNNs) [17] have proven to be an effective tool for modeling temporal dependency in various applications. Xu et al. [18] introduced a method based on the RNN to assess the health status of hard drives via the sequence of their attributes. Their experimental results show that the RNN method can effectively evaluate the health status of hard drives and predict faults. Tian and Zuo [19] developed an extended recurrent neural network (ERNN)-based approach for predicting the health condition of gearboxes based on the vibration data collected from an experimental gearbox system. The long short-term memory (LSTM) network [20,21], an improved RNN structure, relieves to some extent the vanishing and exploding gradient problems that arise when an RNN models long sequences, and it has attracted considerable attention from the research community. An LSTM approach for the estimation of remaining useful life was proposed by Zheng et al. [22]. This method can make full use of the sensor sequence information and expose hidden patterns within the sensor data with multiple operating conditions, faults, and degradation models. Kong et al. [23] proposed an LSTM RNN-based framework to tackle the issue of short-term load forecasting for individual electric customers.
The existing assessment and prediction methods mainly rely on one or a few state parameters for analysis and judgment. The resulting status assessments are far from comprehensive and cannot reflect the objective relationship between fault evolution and state characteristics [24]. With the improvement of information and network technology, relevant application systems such as on-line monitoring systems, production management systems (PMS), dispatching automation systems, and meteorological information systems can share and exchange data. It is thus urgent to fuse and analyze these heterogeneous data to extract the characteristic information that represents the operating state of a transformer.
The accumulation of panoramic state information on transformers provides the prerequisite for evaluating and predicting the transformer operating state. In this paper, we use transformer-condition-related data to analyze the evolution of transformer status via a deep learning method. A data-driven equipment state correlation analysis and state prediction model is built to give preliminary warnings of potential equipment failures. This can help identify the equipment that needs specific attention. Based on the key parameters of the operating state, this paper proposes a method for predicting the running conditions of power transformers based on the LSTM network. By combining the quantitative and qualitative indicators, the LSTM prediction model explores the relationship between the characteristic parameters and the transformer state. The feasibility and accuracy of this method are verified through case studies.
The rest of the paper is organized as follows: Section 2 introduces the basic information on long short-term memory recurrent neural networks. Section 3 provides further information on the proposed transformer operating state prediction approach. Section 4 validates the prediction approach with different case studies and discusses the obtained results. Finally, conclusions are presented in Section 5.
3. Transformer Operating State Prediction Using the LSTM-Based Approach
3.1. Input Characteristic Parameters Based on Panoramic Information
The relevant data needed for the research were provided by State Grid Corporation of China (Beijing, China). The voltage levels of the transformers are from 35 kV to 750 kV. Among them, the transformers appearing in the historical fault data are from across the 28 provinces in China, and the transformers were put into operation starting in the year 1989. Information relating to the defects and fault cases includes the basic transformer account information, inspection record information, poor working condition records, defects and fault dates, defect types, causes, disassembly photos, the corresponding routine test and diagnostic test data, H2, CO, CO2, CH4, C2H4, C2H2, C2H6, gas production rates, main gas ratios (C2H2/C2H4, CH4/H2, C2H4/C2H6, CO2/CO), corresponding load data (active and reactive power), and the corresponding meteorological data (temperature, humidity, sunlight intensity, wind speed, rainfall, and snowfall).
In the above database, there are two categories of data for assessing and predicting the transformer operating state: quantitative and qualitative indicators. Quantitative indicators represent data with different dimensions and magnitudes, and qualitative indicators represent the state in descriptive language. Qualitative indicators cannot be used directly in the assessment of the transformer status and must be quantified for calculation. The following section introduces the specific quantification method used.
3.2. Output Target Defined from the Transformer Operating Status
In general, the transformer operating state is divided into four patterns: normal operating state, minor defects, severe defects, and critical state [27]. The corresponding set of states is V = {v1, v2, v3, v4} = {good, poor, severe, worst}.
v1 indicates that the equipment is stable and that all the state parameters are in accordance with the standard.
v2 indicates that some of the parameters are trending toward the standard limit but have not exceeded it, and that the transformer can continue to run.
v3 indicates that some of the characteristic parameters have changed significantly and are close to the standard limit or that some of the parameters exceed the standard limit.
v4 indicates that some of the characteristic parameters have exceeded the standard limit and manifested as one or more critical defects. Power outage maintenance must be arranged immediately.
Table 1 shows the operating state of a transformer and the corresponding maintenance strategy.
3.3. Methods for Indicator Quantification
In this paper, the relative degree of degradation (RDD) [28] is used to characterize the current state of the transformer compared to the fault state. The RDD reflects the degree of conversion of the transformer state from normal to fault patterns, and is expressed as a value in [0, 1]. The smaller the value is, the better the state is. A value of 0 indicates that the transformer is in good and normal condition, and a value of 1 signifies that the transformer is in the critical fault condition.
Let the optimal value of a quantitative indicator be a, the alarm value be b, and the current measured value be d. The RDD of the indicator can be expressed as
$$r = G(a, b, d)$$
where r represents the RDD of the indicator and G represents the deterioration function.
In this paper, the quantitative indicator function of the transformer status is established from the perspective of natural degradation. For maximal indicators, such as the absorption ratio, the larger the value is, the better the state is. For minimal indicators, such as gases dissolved in the insulation oil, the smaller the value is, the better the state is. The RDDs of the maximal indicator and minimal indicator are respectively expressed by
where r_l and r_m represent the RDDs of the maximal indicator and minimal indicator, respectively; a_l and a_m represent the optimal values of the maximal indicator and minimal indicator, respectively; b_l and b_m represent the alarm values of the maximal indicator and minimal indicator, respectively; and d represents the current measured value.
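As a minimal sketch of how such RDD values could be computed, the code below assumes a linear deterioration function clipped to [0, 1]: r_l = (a_l − d)/(a_l − b_l) for maximal indicators and r_m = (d − a_m)/(b_m − a_m) for minimal indicators. This linear form is an assumption made for illustration and is not necessarily the deterioration function used in this paper; the function names and the numerical values are hypothetical.

```python
def rdd_maximal(d, a_l, b_l):
    """RDD of a maximal indicator (larger is better), assumed linear form."""
    r = (a_l - d) / (a_l - b_l)
    return min(1.0, max(0.0, r))  # clip to [0, 1]


def rdd_minimal(d, a_m, b_m):
    """RDD of a minimal indicator (smaller is better), assumed linear form."""
    r = (d - a_m) / (b_m - a_m)
    return min(1.0, max(0.0, r))  # clip to [0, 1]


# Hypothetical values: an absorption ratio with optimal value 2.0 and alarm value 1.3
r_absorption = rdd_maximal(1.65, a_l=2.0, b_l=1.3)

# Hypothetical values: an H2 concentration with optimal value 0 and alarm value 150
r_h2 = rdd_minimal(75.0, a_m=0.0, b_m=150.0)
```

Both hypothetical measurements sit halfway between the optimal and alarm values, so both RDDs come out at about 0.5. Measurements beyond the alarm value are clipped to 1, and measurements better than the optimal value are clipped to 0.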
For quantitative data, we use the fuzzy distribution method to establish the mapping of each indicator to the different operating states. The triangular–trapezoid combination membership function has a simple distribution and gives intuitive results [27], and it is consistent with the four types of power transformer operating states [29]. The support vector machine (SVM), which handles small-sample data well, is used to fit the distribution function, with the quantitative monitoring data as the input characteristic parameter and the RDD as the output target, as shown in Figure 3.
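To illustrate the shape of a triangular–trapezoid combination membership function, the following sketch maps an RDD value to membership degrees for the four states. The breakpoints (0.2, 0.4, 0.6, 0.8) and the function name are hypothetical choices for illustration only; in this paper the distribution function is fitted from data.

```python
def memberships(r, knots=(0.2, 0.4, 0.6, 0.8)):
    """Membership degrees of an RDD value r in states v1..v4.

    v1 and v4 use half-trapezoids (shoulders), v2 and v3 use triangles.
    The knots are hypothetical breakpoints, not values from the paper.
    """
    k1, k2, k3, k4 = knots

    def rise(x, lo, hi):
        # Linear ramp from 0 at lo to 1 at hi, clipped to [0, 1]
        return min(1.0, max(0.0, (x - lo) / (hi - lo)))

    v1 = 1.0 - rise(r, k1, k2)                            # full membership below k1
    v2 = min(rise(r, k1, k2), 1.0 - rise(r, k2, k3))      # triangle peaking at k2
    v3 = min(rise(r, k2, k3), 1.0 - rise(r, k3, k4))      # triangle peaking at k3
    v4 = rise(r, k3, k4)                                  # full membership above k4
    return [v1, v2, v3, v4]


mu_low = memberships(0.1)   # entirely v1
mu_mid = memberships(0.5)   # split between v2 and v3
mu_high = memberships(0.9)  # entirely v4
```

With these breakpoints the four memberships always sum to 1, so an RDD between two peaks is shared between the two neighbouring states.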
For qualitative indicators using descriptive language, such as manual inspection records and some technical performance parameters, we used a fuzzy statistical experiment to determine the membership. First, a number of experts gave the basis of the evaluation criteria and set the score range as [0, 100]. The higher the score is, the worse the degree of deterioration is. Then, the score was normalized to [0, 1] to determine the degree of membership. Based on the relative importance of job title, seniority, and academic qualifications, which are related to the level of technical experience, experts were given different weights to reduce the subjective influence on the quantitative results. The data set provided by the State Grid Corporation of China is relatively complete and there is no missing input information. The weighted scoring mechanism is given by:
$$l_i = \sum_{j=1}^{n} w_j l_{ij}$$
where l_i is the score for different state levels of indicator i, l_ij is the score for different state levels of indicator i given by the jth expert, and w_j is the weight of the jth expert. The weights satisfy ∑ w_j = 1, and the total number of experts is n.
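The weighted scoring can be sketched in a few lines. The expert scores and weights below are hypothetical, and the function name is an illustrative choice; only the weighted-sum structure and the normalization from [0, 100] to [0, 1] follow the description above.

```python
def weighted_score(scores, weights):
    """Weighted expert score for one indicator; weights must sum to 1."""
    if len(scores) != len(weights):
        raise ValueError("one weight per expert is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("expert weights must sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical scores on [0, 100] from five experts (higher means worse deterioration)
expert_scores = [60, 70, 65, 80, 75]
# Hypothetical expert weights reflecting job title, seniority, and qualifications
expert_weights = [0.30, 0.25, 0.20, 0.15, 0.10]

l_i = weighted_score(expert_scores, expert_weights)
membership = l_i / 100.0  # normalize the score to [0, 1]
```

Checking that the weights sum to 1 before scoring guards against a silently rescaled result when the expert panel changes.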
In the comprehensive state evaluation of the transformer, the contribution of each indicator is different. Different weights can distinguish the importance of the indicator. Therefore, determining the weights reasonably is the key to an accurate assessment. In view of the complexity of the transformer system and to minimize the subjective factors, the analytic hierarchy process (AHP) [29] was used to give weights to each indicator.
By using the above quantitative process, we can assess the health index of the transformer and ultimately determine the transformer operating state. This can allow us to revise and supplement the labels for the operating state of the transformers.
3.4. The Proposed LSTM Prediction Model
The transformer panoramic state information is taken as the input characterization parameters. From the information, the quantitative data are normalized, and the qualitative information is transformed into state membership. The state probability interval to be predicted at the next moment is taken as the output. Through nonlinear transformations and LSTM correlation feature extraction, the Softmax classifier predicts the probability of the next moment to determine the state of the transformer.
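As a minimal sketch of the final classification step, the code below applies a Softmax function to four output-layer activations and selects the most probable state. The activation values are hypothetical; in the proposed model they would come from the LSTM output layer.

```python
import math


def softmax(logits):
    """Convert raw output-layer activations into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical activations for states v1..v4
probs = softmax([0.0, 1.0, 3.0, 2.0])

# Index of the most probable state, reported as 1..4 to match v1..v4
predicted_state = probs.index(max(probs)) + 1
```

Here the third activation is the largest, so the predicted state is v3.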
Figure 4 shows the transformer state prediction architecture based on the LSTM network. The detailed steps are given below.
(1) Samples are collected and divided into training sets and test sets.
(2) To reduce the influence of the data dispersion, quantitative data are normalized by min–max scaling:
$$d'_k = \frac{d_k - d_{\min k}}{d_{\max k} - d_{\min k}}$$
where d_min k is the minimum monitoring data of the indicator k, d_max k is the maximum monitoring data of the indicator k, d_k is the monitoring data of the indicator k, and d'_k is the normalized value.
(3) Quantitative data are fit to the membership function of the RDD and operating state using the SVM.
(4) Qualitative indicators are quantified according to the fuzzy statistical experiment.
(5) The AHP method is used to determine the weight of each indicator.
(6) The fuzzy evaluation results from Steps (3) and (4) are weighted according to Step (5) to give the comprehensive fuzzy evaluation results corresponding to v1–v4, which are taken as the LSTM output labels.
(7) The LSTM network model is trained with the BPTT algorithm to extract the feature relationships between the key parameters and the predicted transformer status, and the parameters of the prediction model are obtained.
(8) The trained LSTM model is used to predict the operating state of the transformers in the test set, and the accuracy of the model is verified.
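Steps (2) and (6) above can be sketched as follows. The membership vectors and indicator weights are hypothetical, and the function names are illustrative; the sketch only shows min–max scaling and the weighted combination of per-indicator memberships into a state label.

```python
def min_max_normalize(values):
    """Scale the monitoring data of one indicator to [0, 1] (Step 2)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


def comprehensive_evaluation(memberships, weights):
    """Weight per-indicator memberships in v1..v4 by indicator weights (Step 6)."""
    n_states = len(memberships[0])
    return [
        sum(w * m[s] for w, m in zip(weights, memberships))
        for s in range(n_states)
    ]


def state_label(evaluation):
    """Return the state number (1..4) with the largest evaluation value."""
    return max(range(len(evaluation)), key=evaluation.__getitem__) + 1


normalized = min_max_normalize([10.0, 20.0, 30.0])

# Hypothetical memberships of three indicators in states v1..v4
indicator_memberships = [
    [0.0, 0.2, 0.8, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3, 0.0],
]
indicator_weights = [0.5, 0.3, 0.2]  # hypothetical AHP weights, summing to 1

evaluation = comprehensive_evaluation(indicator_memberships, indicator_weights)
label = state_label(evaluation)
```

For these hypothetical inputs the evaluation vector is approximately [0.02, 0.37, 0.61, 0], so the label is v3.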
4. Case Studies and Analysis
A total of 206 transformers showing confirmed existence of abnormal defects/faults and 174 transformers indicating early warnings/alarms from the oil chromatographic online monitoring devices formed the sample library of the prediction model. According to the data from the 380 transformers in the sample database, 228 transformers were randomly selected to form the training set, and the remaining 152 transformers were used to form the test set. The LSTM networks were used to extract the correlation between the predicted transformer state and the panoramic information.
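A random split of this kind can be sketched as below. The integer identifiers and the fixed seed are assumptions for illustration; the split is made per transformer, so all records of one transformer fall on the same side.

```python
import random


def split_transformers(ids, n_train, seed=0):
    """Randomly split transformer identifiers into training and test sets."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 380 transformers in the sample library, 228 of them for training
train_ids, test_ids = split_transformers(range(380), n_train=228)
```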
To increase the learning speed and reduce the risk of the network falling into a local minimum, the weight matrix in the LSTM was initialized from a Gaussian distribution with a mean of 0 and variance of 1, and an orthogonal matrix was then obtained from its singular value decomposition [30]. The LSTM bias term and the output layer bias were initialized to 0. The output layer weight matrix was initialized with random numbers from a Gaussian distribution with a mean of 0 and variance of 1, multiplied by 0.01. The input layer size of the prediction model was 72, the number of LSTM hidden layer neurons was 100, and the output layer size was 4. To prevent over-fitting, the dropout rate was set to 0.2.
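To give a sense of the model size implied by these layer dimensions, the sketch below counts trainable parameters. It assumes a single standard LSTM layer (four gate blocks, one bias vector per gate, no peephole connections) followed by a fully connected output layer; the paper does not state these details, so the totals are indicative only.

```python
def lstm_param_count(n_in, n_hidden):
    """Parameters of one standard LSTM layer (assumed: no peepholes).

    Each of the four gate blocks (input, forget, output, candidate) has
    input weights, recurrent weights, and one bias vector.
    """
    return 4 * (n_hidden * (n_in + n_hidden) + n_hidden)


def dense_param_count(n_in, n_out):
    """Parameters of a fully connected layer with bias."""
    return n_in * n_out + n_out


lstm_params = lstm_param_count(n_in=72, n_hidden=100)
output_params = dense_param_count(n_in=100, n_out=4)
total_params = lstm_params + output_params
```

Under these assumptions the LSTM layer has 69,200 parameters and the output layer 404, which is a large model relative to 228 training transformers and helps explain the use of dropout.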
Meanwhile, with the same input characteristic parameters and output targets, a support vector machine (SVM) model and a backpropagation neural network (BPNN) model were constructed to predict the transformer operating state using the training samples. The SVM model used the radial basis function (RBF) as the kernel. The optimal penalty factor was 0.1, and the RBF kernel parameter was 10^−3, both obtained through cross-validation. The structure of the BPNN consisted of an input layer, a hidden layer, and an output layer. By trial and error, the numbers of neurons in the three layers were chosen to be 72, 200, and 4, respectively. The learning rate in the BPNN model was 0.03, and the number of learning cycles was 1000. The prediction models were implemented in Python in an Ubuntu 15 operating environment.
To evaluate the performance of the prediction model, we used the overall average accuracy. Accuracy expresses the probability that the state predicted by the model for a randomly chosen sample matches the actual state. The overall average accuracy is defined as
$$\text{Accuracy} = \frac{N_P}{N_T} \times 100\%$$
where N_P denotes the number of correctly predicted samples and N_T denotes the total number of samples in the entire dataset.
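The accuracy measure is simple enough to check by hand. The short sketch below (the function name is illustrative) applies it to the counts reported for the DGA fitting test set in this section, 66 correct predictions out of 69 test samples.

```python
def overall_accuracy(n_correct, n_total):
    """Overall average accuracy in percent: N_P / N_T * 100."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_correct / n_total


# Counts reported for the DGA fitting test set
dga_accuracy = overall_accuracy(n_correct=66, n_total=69)
dga_accuracy_rounded = round(dga_accuracy, 1)
```

Rounded to one decimal place this gives 95.7%, in agreement with the reported figure.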
The relationship between the quantitative data represented by the dissolved gas in the oil and the operating state is calculated as follows. The RDDs of H2, CH4, C2H4, C2H6, C2H2, CO/CO2, and total hydrocarbons are taken as the input characteristic parameters, and the operating state is the output target. A least squares support vector machine (LS-SVM) was used to fit the distribution function.
The fitting sample database was composed of the off-line dissolved gas analysis (DGA) test data of the 206 transformers with confirmed abnormal defects/faults. The sample data included monitoring information from the normal operation period, the deterioration period, and the fault period of the equipment, so they dynamically characterized the trend of the equipment status. Of these, 137 cases were used for training, and 69 cases were used for testing. The operating states of 66 samples in the test set were predicted correctly, giving an accuracy of 95.7%. The RDD of the gases dissolved in the oil corresponds to the v1–v4 membership functions as follows.
We take the linguistic description of the maintenance history as an example to provide quantitative results of the qualitative variable. The results are shown in Table 2.
Five experts were invited to give the relative importance of the pairwise comparisons between the indicators according to the AHP requirements. We used these data to calculate the weights. The traditional method is to construct a judgment matrix and find its maximum eigenvalue and the corresponding eigenvector, whose normalized components are the indicator weights. However, in practice, the judgment matrix is constructed from rough estimates. It is somewhat arbitrary and often requires multiple adjustments to pass the consistency check. An improved method [31] can be adopted to calculate the optimal transfer matrix, which satisfies the consistency requirement naturally, so that the relative weight of each evaluation factor can be obtained directly. The calculation results are shown in Table 3, and the specific calculation process can be found elsewhere [31].
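For reference, the traditional eigenvector method mentioned above can be sketched with a power iteration. This is not the improved optimal transfer matrix method of [31]; the judgment matrix is hypothetical and deliberately built to be perfectly consistent so that the result is easy to verify.

```python
def ahp_weights(judgment, iterations=100):
    """Principal-eigenvector weights of an AHP judgment matrix (n >= 2).

    Returns the normalized weights, the estimated maximum eigenvalue,
    and the consistency index CI = (lambda_max - n) / (n - 1).
    """
    n = len(judgment)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(judgment[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # renormalize so the weights sum to 1
    aw = [sum(judgment[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return w, lambda_max, ci


# Hypothetical, perfectly consistent matrix built from weights 0.5, 0.3, 0.2
# (entry [i][j] is the importance of indicator i relative to indicator j)
true_w = [0.5, 0.3, 0.2]
judgment = [[wi / wj for wj in true_w] for wi in true_w]

weights, lambda_max, ci = ahp_weights(judgment)
```

For a perfectly consistent matrix the maximum eigenvalue equals the matrix order and the consistency index is zero; judgment matrices elicited from experts generally deviate from this, which is why the consistency check is needed.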
In this work, we adopted the weighted average of the comprehensive evaluation. The element vi corresponding to the maximum evaluation value is determined as the evaluated operating state. The assessment results are the labels used to construct the prediction models.
4.1. Short-Term Prediction of the Transformer Operating State
To evaluate the short-term prediction performance of the three models, experiments with a forecast horizon of one week were implemented. The overall average accuracies generated from the different models for the training and test datasets are shown in Figure 5.
Based on the prediction results, for the one-week prediction horizon the accuracies increase in the order BPNN, SVM, and LSTM. Compared with the BPNN and SVM models, the training accuracy of the LSTM model is higher by 10.7% and 6.2%, respectively, and the test accuracy is higher by 10.6% and 6.3%, respectively.
Taking the 500 kV #2 transformer as an example, the basic condition of the transformer is as follows. The date of production is July 2006 and the date of initial operation is November 2006. Routine tests on 19 March 2008 and 26 May 2011 showed no abnormalities. The transformer top oil temperature varies in the range of 30–60 °C. In the summer of 2009, the transformer suffered a lightning over-voltage, and a defect occurred. The running environment is harsh, and the pollution level is II. The poor working condition records show that a 30% overload lasted for 43 min on 18 July 2011. The on-line monitoring data of the oil chromatography from 14 to 26 March 2012 are shown in Table 4.
The data from 14 to 26 March 2012 are used to predict the operating state of the transformer one week later, on 2 April 2012. The probabilities predicted from the BPNN, SVM, and LSTM models corresponding to states v1–v4 are [0.2434, 0.6166, 0.1700, 0], [0, 0.4356, 0.4419, 0.1225], and [0, 0.0191, 0.7308, 0.2501], respectively. According to the principle of maximum confidence, the BPNN prediction result corresponds to a v2 (or poor) state. The SVM prediction result corresponds to a v3 (or severe) state, with a small difference between the v2 and v3 state reliabilities, indicating that the prediction recognition effect is not distinct. The LSTM prediction results correspond to the v3 (or severe) state, with an obvious identification effect.
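The comparison above can be reproduced directly from the reported probability vectors. The sketch below picks the most probable state for each model and also reports the margin over the runner-up state, which is one simple way (an illustrative choice, not a measure defined in this paper) to quantify how distinct a prediction is.

```python
STATES = ["v1 (good)", "v2 (poor)", "v3 (severe)", "v4 (worst)"]


def decide(probs):
    """Return the most probable state and its margin over the second-best state."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    return STATES[best], probs[best] - probs[runner_up]


# Probabilities reported for the 500 kV #2 transformer, one-week horizon
predictions = {
    "BPNN": [0.2434, 0.6166, 0.1700, 0.0],
    "SVM": [0.0, 0.4356, 0.4419, 0.1225],
    "LSTM": [0.0, 0.0191, 0.7308, 0.2501],
}

results = {name: decide(p) for name, p in predictions.items()}
```

The SVM and LSTM models both select v3, but the SVM margin is only about 0.006 while the LSTM margin is about 0.48, which makes the difference in identification effect concrete.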
On 2 April 2012, the content of H2 dissolved in the oil reached 185.76 µL/L and the content of C2H2 reached 2.98 µL/L. The online monitoring system was activated. Then, the ultrasonic partial discharge test of the transformer was carried out, and an internal discharge phenomenon was found. During the overhaul of this transformer, the maintenance personnel found that the overhang angle of the silicon steel sheet on the transformer core iron yoke parts exhibited severe deformation, as shown in Figure 6. The protrusive tips of the overhang angles in magnetic fields vibrated strongly and caused the contact discharge, resulting in the abnormal content of dissolved gases in the transformer oil. The discharge did not affect the solid insulation, so the contents of CO and CO2 exhibited no significant change. The predicted results of the LSTM model are consistent with the actual transformer running status.
4.2. Long-Term Prediction of the Transformer Operating State
To evaluate the long-term prediction performance of the three models, experiments with a forecast horizon of one month were implemented. The results are shown in Figure 7.
Compared with Figure 5, with an increase in the prediction horizon, the prediction accuracies of the three models are reduced. As seen from Figure 7, the accuracies with the prediction horizon of one month ranked in the order of BPNN, SVM, and LSTM from worst to best. Compared with the BPNN and SVM models, the accuracy of the LSTM model in the training set is increased by 17.9% and 8.8%, respectively, and the accuracy in the test set is increased by 18.3% and 9.7%, respectively.
Taking a 220 kV #1 transformer as an example, the basic condition of the transformer is as follows. The date of production is April 2000, and the date of initial operation is June 2000. The transformer is in basically good operation, and the overall load rate is relatively high. Periodic chromatographic testing found that the total hydrocarbon content of the transformer increased markedly in 2010, after the summer peak-demand period. Subsequently, the total hydrocarbon content slowly increased year after year but did not exceed the alarm value. Except for the total hydrocarbons, the remaining characteristic gases dissolved in the insulation oil were normal. Some of the oil chromatography online monitoring data from June to July 2013 are shown in Table 5.
The data from June to July 2013 were used to predict the operating state of the transformer one month later, in August 2013. Because the load rate of this transformer is relatively high, we consider both the under-load and full-load cases in the predictions.
In the under-load condition, the probabilities predicted from the BPNN, SVM, and LSTM models corresponding to states v1–v4 are [0.0027, 0.2956, 0.5711, 0.1306], [0, 0.2735, 0.5172, 0.2093], and [0, 0.1038, 0.6096, 0.2866], respectively. The BPNN, SVM, and LSTM model prediction results all correspond to the v3 (severe) state. We predict that there should be a severe fault inside the transformer. Although the transformer can continue to run, operation and maintenance personnel should arrange maintenance work as soon as possible.
In the full-load condition, the probabilities predicted from the BPNN, SVM, and LSTM models corresponding to states v1–v4 are [0, 0.2173, 0.2748, 0.5079], [0, 0.1985, 0.2561, 0.5454] and [0, 0.1835, 0.2149, 0.6016], respectively. The BPNN, SVM, and LSTM model prediction results all correspond to the v4 (the worst) state. Thus, the transformer needs to be repaired immediately.
In fact, the operation and maintenance personnel in the substation contacted the dispatching department and prohibited the operation of this transformer under full-load conditions. After the summer, during the overhaul of this transformer, the maintenance personnel found that a severe overheating fault had occurred at part of the potential connection between the clamp and the tank, with obvious signs of overheating discoloration. Because the load had been controlled, the fault had not yet developed into the critical (worst) state.
5. Conclusions
This paper studied the status of early warning technologies for power transformers and proposed a transformer operating state prediction method based on the data-driven LSTM network. The conclusions are as follows.
(1) By analyzing the state panoramic information of the transformer, the degree of deterioration of the transformer is depicted in the RDD. The membership relationship between the RDD of each indicator and the state of the transformer is obtained using fuzzy treatment. Then, the LSTM network is constructed to automatically extract the feature relationship between each indicator and the predicted operating state.
(2) The case studies show that the proposed method can effectively predict the operating state of power transformers. The model based on LSTM networks predicts the state of the transformer with an accuracy of 94.4% for a one-week forecast horizon and 81.2% for a one-month forecast horizon. Compared with the traditional BPNN and SVM methods, the LSTM model can more accurately reflect the real situation of the transformers.
(3) By predicting and analyzing the operating state of the transformers, the prediction results based on the LSTM network are in accordance with the actual conditions. The difference in the predicted state probabilities is more obvious, and the results are more convincing.
We will focus on improving the LSTM model in future research. The deep learning methods will be combined with intelligent optimization algorithms to determine the optimal parameters of the prediction model.