1. Introduction
As the transformer is one of the most important units of an electrical system, it is natural that efforts are made to preserve its integrity and increase its availability [1]. For these purposes, maintenance policies and procedures are planned and applied to minimize interruptions of such equipment [1,2,3]. In fact, any failure in this equipment can affect the whole network, compromise other elements in the grid and generate significant economic impacts [1,2].
Especially for oil-filled transformers, maintenance operations should be carried out with additional caution to minimize the potential problem of flammability of the thermal insulation material [4]. Due to its complexity and importance, the degree of aging of paper insulation has been the object of study in many recent works [2,3,4]. Several other tests of insulation items have been an important part of transformer fault diagnosis systems, with emphasis on chromatographic analysis of oil-dissolved gases, namely dissolved gas analysis (DGA) [4,5,6,7,8,9].
In this context, methods based on dissolved gas ratios allow gas concentration values to be associated with the occurrence of specific faults, such as partial discharges and thermal faults. Power transformer faults usually lead to the degradation of the insulating materials, which results in the release of certain gases that dissolve in the oil. Above a certain concentration level, these gases act as a thermal insulator, causing the equipment to overheat and, simultaneously, decreasing the dielectric strength of the oil, which may cause electrical insulation failure. In turn, overheating of the oil increases the levels of some gases, such as methane and ethylene. Thus, an accurate prediction of oil-dissolved gas concentrations is a valuable tool to monitor the transformer condition and to develop a fault diagnosis system [10].
In recent years, the analysis of dissolved gases in transformer oil, based on International Electrotechnical Commission (IEC) [11] and Institute of Electrical and Electronics Engineers (IEEE) guidelines [12], has become a widespread practice followed by all electric power companies. In this context, the use of artificial intelligence techniques combined with chromatographic analysis of oil-dissolved gases (DGA) deserves to be highlighted. In general, intelligent approaches have been proposed to circumvent the limitations of purely traditional DGA-based methods, with emphasis on artificial neural network approaches, including the generalized regression neural network [10], support vector machines (SVM) [5,13,14], expert systems (EPS) [15], and fuzzy systems [16]. A recent survey by Cheng and Yu [4] shows that the use of these techniques has produced promising results in the development of high precision fault diagnosis systems. However, recent results indicate that these techniques still present limitations in the prediction of oil-dissolved gases, leaving room for further improvements.
Regarding expert systems, for example, an accurate simulation of the experience, skill, and reasoning process of the experts strongly depends on the quality of the established knowledge base, which is one of the main limitations of this approach in most cases. In general, the knowledge base rarely covers all possible cases, which leads to errors in identifying symptoms of faults not present in the database [4,6].
In relation to neural networks, the basic idea is to map a highly nonlinear relationship between inputs and outputs and, from this relationship, output a diagnosis conclusion about the fault [4]. Despite satisfactory results, this traditional approach is not able to predict multi-step ahead values and, above all, its performance is influenced by the input data and limited by the training samples and parameters. In [10], for example, the authors propose the use of principal component analysis (PCA) to improve prediction accuracy by selecting the most representative inputs for network training. In fact, the use of PCA in this context is widespread [10,17,18]. Moreover, non-recurrent models in common use, such as the Generalized Regression Neural Network (GRNN), perform only a nonlinear mapping between inputs and outputs and are not appropriate for estimating future oil-dissolved gas values.
Another important difficulty in neural network approaches is the adjustment of training parameters. GRNN models, for example, are strongly dependent on the smooth factor parameter. The authors of [10] overcame this limitation by applying an optimization method solely to select a suitable value of the smooth factor. That approach has the limitation of needing to recalculate the smooth factor whenever the data are updated. Those authors also apply principal component analysis to reduce the influence of the input data on the model.
Despite producing relatively accurate results, with good generalization capacity and little over-fitting, the SVM in regression problems is also strongly influenced by the quality of the input vectors [14].
On the other hand, despite addressing some issues regarding data uncertainty, the use of fuzzy logic increases the dependence on expert knowledge to create a set of fuzzy rules, which describes the relationships between input and output variables. Many authors propose the creation of a set of rules based on standards (IEC 599, IEEE), which clearly is not efficient, since combinations of different gas ratios contemplated by the standard may not occur in practice, leading to a serious problem of indecision or non-decision in diagnosis.
Considering this situation, we propose here a combination of a nonlinear autoregressive neural network model with the discrete wavelet transform for predicting dissolved gas concentrations and gas concentration ratios in transformer oil. The objective of this approach is to create a model that is invariant with respect to the time delay parameter and whose predictions are less sensitive to long-term time dependencies, besides presenting better generalization and learning capacities, resulting in a high-accuracy multi-step ahead forecast of in-oil gas concentrations.
The wavelet transform localizes features in the input data and concentrates them in a few wavelet coefficients without affecting the data quality [19]. As a result, we have a set of input data of reduced complexity that leads to a high accuracy prediction without any human intervention. So, the hypothesis is that the use of wavelet functions to create sparse versions of the initial data can increase the prediction accuracy of the model, increasing confidence in multi-step ahead predictions and reducing the effect of the delay parameter in a high precision fault diagnosis.
As the proposed model can predict future values of the oil-dissolved gas concentrations, it is applied in a transformer condition monitoring system, in conjunction with reliability techniques, to provide an early diagnosis of possible faults and to estimate the remaining life of the transformer based on historical data and events accumulated over a given period.
It should be noted that the proposed model is not intended to produce a conclusion about the transformer fault diagnosis, but rather a high precision prediction of concentrations and gas ratios at future time points, allowing the operator to anticipate possible faults and proceed with recommended protective measures. Thus, the main contributions of the paper are as follows:
- The development of a gas prediction model based on the combination of wavelet functions with a nonlinear autoregressive network, insensitive to the delay parameter and to the type of wavelet function;
- High precision forecasts of future values of the concentrations and ratios of oil-dissolved gases, contributing to increase the reliability of the monitoring system in anticipating possible faults;
- An alert system that monitors future values of gas concentrations, allowing abnormal situations to be anticipated and appropriate protection measures to be carried out.
The data fitting and accurate prediction ability of the proposed model is evaluated in a real-world example, showing better results in relation to several current prediction models and common time series techniques.
3. The Proposed Prediction Model
The presence of a small concentration of oil-dissolved gases in the transformer is a natural consequence of the normal operation of this equipment, due to the electric field, humidity, and oxidation [10]. However, an increase in the concentration of these gases, which include hydrogen (H2), carbon monoxide (CO), carbon dioxide (CO2), methane (CH4), acetylene (C2H2), ethylene (C2H4), and ethane (C2H6), may be related to the occurrence of failures and abnormalities. The elevation in methane and ethylene concentrations, for example, may indicate the occurrence of some thermal failure in the transformer, while variations in hydrogen and acetylene are indications of electrical faults [12].
Therefore, the variation of the gas concentrations over time is a critical issue in the transformer fault diagnosis analysis. Moreover, as some of these gases have a strong correlation in a situation of failure, many gas concentration ratios also must be considered.
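As a minimal sketch of how the three ratios used in this work (CH4/H2, C2H2/C2H4, C2H4/C2H6) could be computed from one sample of gas concentrations, consider the following. The function name `gas_ratios`, the sample values, and the handling of zero denominators are assumptions made for illustration only.

```python
from typing import Dict


def gas_ratios(ppm: Dict[str, float]) -> Dict[str, float]:
    """Return the ratios CH4/H2, C2H2/C2H4 and C2H4/C2H6 for one sample.

    Assumption of this sketch: a zero denominator yields inf when the
    numerator is positive and nan when both values are zero.
    """
    def ratio(num: str, den: str) -> float:
        n, d = ppm[num], ppm[den]
        if d == 0:
            return float("inf") if n > 0 else float("nan")
        return n / d

    return {
        "CH4/H2": ratio("CH4", "H2"),
        "C2H2/C2H4": ratio("C2H2", "C2H4"),
        "C2H4/C2H6": ratio("C2H4", "C2H6"),
    }


# Hypothetical concentrations in ppm, not taken from the data set.
sample = {"H2": 50.0, "CH4": 25.0, "C2H2": 1.0, "C2H4": 10.0, "C2H6": 5.0}
ratios = gas_ratios(sample)
print(ratios)
```

Applying such a function to each daily sample yields the ratio time series that can be forecast alongside the individual gas concentrations.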
So, the proposed prediction model (NAR–DWT) is applied to predict future values of the seven kinds of oil-dissolved gas and the IEC and Rogers ratios (CH4/H2, C2H2/C2H4, C2H4/C2H6), according to the following steps:
Step 1: A set of historical oil-dissolved gas data is collected from a transformer equipped with a GE Kelman-Transfix (GE—General Electric, São Paulo, Brazil) and GE Intellix BMT 330 (GE—General Electric, São Paulo, Brazil).
Step 2: The collected data set is evaluated using the discrete wavelet transform to create a sparse and simplified version with good approximation properties.
Step 3: Nonlinear autoregressive neural network models are trained and validated according to a k-fold cross-validation approach.
Step 4: The trained neural network model is applied to predict the oil-dissolved gas concentrations and ratios.
A flowchart of the proposed prediction model is presented in Figure 2.
Prediction of In-Oil Gas Concentrations
The nonlinear autoregressive model has been applied and the accuracy of the prediction is evaluated using the mean squared error performance between given target and predicted values.
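To make the autoregressive setup concrete, the sketch below builds lagged input/target pairs for a delay d, computes the mean squared error, and produces a multi-step forecast by feeding each prediction back as an input. The helper names and the `toy_model` predictor are hypothetical placeholders; in practice a trained neural network would take the place of `toy_model`.

```python
from typing import Callable, List, Sequence, Tuple


def make_lagged(series: Sequence[float], d: int) -> Tuple[List[List[float]], List[float]]:
    """Build pairs with inputs [y(t-d), ..., y(t-1)] and target y(t)."""
    inputs = [list(series[t - d:t]) for t in range(d, len(series))]
    targets = [series[t] for t in range(d, len(series))]
    return inputs, targets


def mse(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def forecast(model: Callable[[Sequence[float]], float],
             history: Sequence[float], d: int, steps: int) -> List[float]:
    """Multi-step forecast: each predicted value re-enters the input window."""
    window = list(history[-d:])
    out = []
    for _ in range(steps):
        y_next = model(window)
        out.append(y_next)
        window = window[1:] + [y_next]
    return out


def toy_model(window: Sequence[float]) -> float:
    """Placeholder predictor (linear extrapolation), standing in for a trained network."""
    return 2 * window[-1] - window[-2]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical values
X, y = make_lagged(series, d=2)
preds = [toy_model(x) for x in X]
print(mse(y, preds), forecast(toy_model, series, d=2, steps=3))
```

The recursive loop in `forecast` is what allows predictions several steps ahead from a model that only outputs one step at a time.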
The network was trained with a training function based on Bayesian regularization, using a random data division with 80% of the samples for training and 20% for testing. We also applied k-fold cross-validation, in which the data were divided into 10 subsets and the training was repeated 10 times, each time using one of the 10 subsets for testing/validation while the other 9 subsets formed a single training set. The error estimate is averaged over all 10 trials.
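The 10-fold procedure described above can be sketched as follows. This is an illustrative index splitter, not the implementation used in the study; the function name, the fixed seed, and the use of 176 as the number of samples are assumptions (with lagged inputs, the number of usable input/target pairs would be slightly smaller).

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0
                   ) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random division of the samples
    base, extra = divmod(n_samples, k)    # first `extra` folds get one more sample
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        start += size
        yield train, test


folds = list(k_fold_indices(176, k=10))
fold_sizes = [len(test) for _, test in folds]
print(fold_sizes)
```

Training one model per fold and averaging the ten test errors gives the cross-validated error estimate.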
The wavelets Symlets and Daubechies are applied to create a sparse and simplified version for the gas concentrations and gas ratios data, respectively.
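The idea of a sparse, simplified version of the data can be illustrated with a single-level discrete wavelet transform. The sketch below uses the Haar wavelet (db1, the simplest member of the Daubechies family) instead of the higher-order Symlets and Daubechies functions applied here, and zeroes small detail coefficients by hard thresholding. The threshold rule, the function names, and the sample values are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar DWT (input length must be even)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx: List[float], detail: List[float]) -> List[float]:
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def sparsify(signal: List[float], threshold: float) -> List[float]:
    """Zero detail coefficients with magnitude <= threshold, then reconstruct."""
    approx, detail = haar_dwt(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_idwt(approx, detail)


# Hypothetical gas concentrations (ppm) with one abrupt change.
signal = [10.0, 10.2, 10.1, 9.9, 14.0, 10.0, 10.0, 10.1]
smooth = sparsify(signal, threshold=0.5)
print(smooth)
```

Small fluctuations are averaged out while the abrupt change between the fifth and sixth samples is preserved, because its detail coefficient exceeds the threshold.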
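In order to enable a comparison regarding prediction accuracy and validity we adopted the same evaluation criteria of [1]:
- (a) The relative percentage error between target and predicted values (avg_err):
$$\mathrm{avg\_err} = \frac{1}{N}\sum_{i=1}^{N}\frac{\left|y_i - \hat{y}_i\right|}{y_i}\times 100\%$$
- (b) The maximum relative error (max_err):
$$\mathrm{max\_err} = \max_{1 \le i \le N}\left(\frac{\left|y_i - \hat{y}_i\right|}{y_i}\right)\times 100\%$$
in which N is the number of data samples, and $y_i$ and $\hat{y}_i$ are the target and predicted values, respectively.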
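A minimal sketch of the two criteria follows, assuming that avg_err is the mean and max_err the maximum of |target - predicted| / |target| over the N samples, both expressed in percent. The sample values are hypothetical, and targets are assumed to be non-zero.

```python
from typing import Sequence


def relative_errors(targets: Sequence[float], predictions: Sequence[float]):
    """Per-sample relative errors |target - predicted| / |target| (targets non-zero)."""
    return [abs(t - p) / abs(t) for t, p in zip(targets, predictions)]


def avg_err(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Average relative error in percent."""
    errs = relative_errors(targets, predictions)
    return 100.0 * sum(errs) / len(errs)


def max_err(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Maximum relative error in percent."""
    return 100.0 * max(relative_errors(targets, predictions))


targets = [100.0, 200.0, 50.0]      # hypothetical measured values
predictions = [98.0, 202.0, 50.0]   # hypothetical model outputs
print(avg_err(targets, predictions), max_err(targets, predictions))
```

Gases whose measured concentration can be exactly zero would need separate handling, since the relative error is undefined in that case.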
The proposed model is evaluated in a real-world example using a set of oil-dissolved gas concentration data from a transformer in a 13.8–230 kV, 190 MVA substation located in Brazil. The device is equipped with a GE Kelman-Transfix, which uses photo-acoustic detection technology to measure the gas concentrations. The data set is composed of seven months of daily observations, carried out in the period from November 2016 to May 2017, corresponding to 176 samples of the gases H2, CO, CO2, CH4, C2H2, C2H4 and C2H6.
4. Numerical Results
First of all, we evaluated the effect of the time delay parameter d on the performance of the training process, measured using the mean squared error (mse) and the coefficient of determination R, which is a goodness-of-fit measure for linear regression between the targets and the predictions. In this case, we varied d from 2 to 10, with step 1, and the error was averaged over all 10 trials according to the cross-validation approach described above.
Table 1 illustrates the results of this evaluation using the concentration of CO2 as an example. The NAR–DWT model presents a very accurate fit and a small mse regardless of the value of d. The results for the other gases are similar.
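As an illustration of the goodness-of-fit measure, the sketch below computes R under the assumption that it denotes the linear (Pearson) correlation coefficient between targets and predictions, as commonly reported in neural network regression plots. The function name and the sample values are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Linear correlation coefficient between targets and predictions."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return cov / math.sqrt(var_t * var_p)


targets = [1.0, 2.0, 3.0, 4.0]        # hypothetical values
predictions = [1.1, 1.9, 3.2, 3.8]    # hypothetical model outputs
r = pearson_r(targets, predictions)
print(round(r, 4))
```

A value of R close to 1 for every delay d in the tested range would indicate that the fit does not depend strongly on that parameter.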
We also tested different wavelet functions and the effect of this choice on the accuracy of the proposed model. For this, we selected some commonly used functions from three traditional wavelet families, Daubechies (db), Symlets (sym), and Coiflets (coif), and evaluated the results regarding mse, R, max_err, and avg_err. The corresponding results are highlighted in Table 2.
The prediction results for the oil-dissolved gas concentrations and the gas ratios are presented in Table 3 and Table 4, respectively. The proposed model achieves high accuracy regarding max_err and avg_err. Additionally, the output and target plot and the error for the gas H2 are presented in Figure 3, corroborating the good degree of fit of the NAR–DWT model. The plot clearly illustrates that the model can accurately reproduce the oscillatory behavior of the input data, presenting a target-output error of less than 0.01. Even at points at which the gas variation is greater, as at the 40 s and 128 s time instants, the model can keep up with the variation, despite producing a slightly larger average error in this case.
The results generated by the proposed method have been compared with important current prediction methods from the literature: KPCA-FFOA-GRNN, FFOA-GRNN, KPCA-GRNN, GRNN, BPNN from [10] and SVM from [13].
Some time series techniques were also used for comparison in predicting the in-oil dissolved gas concentrations and their ratios. The following statistical models were used: the autoregressive moving average model (ARMA), the autoregressive integrated moving average model (ARIMA), the seasonal autoregressive integrated moving average model (SARIMA), the autoregressive conditional heteroscedasticity model (ARCH), and the generalized autoregressive conditional heteroscedasticity model (GARCH) [29].
The selection of the most suitable prediction model among all the options tested was performed using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which indicate the model best fitted to the data through a relative quality analysis of the statistical models.
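For reference, the two criteria have the standard forms AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of samples, and L the maximized likelihood; lower values indicate a better trade-off between fit and complexity. The sketch below applies them to hypothetical candidates. The log-likelihoods and parameter counts are invented for illustration and are not results from this study.

```python
import math
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) per candidate model.
candidates: Dict[str, Tuple[float, int]] = {
    "ARMA(1,1)": (-120.0, 3),
    "ARIMA(1,1,1)": (-118.5, 3),
    "SARIMA": (-118.0, 5),
}
n_samples = 141  # assumed number of training samples

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_samples))
print(best_aic, best_bic)
```

Because BIC penalizes each parameter by ln n rather than 2, it tends to favor smaller models than AIC once n exceeds about 7 samples.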
This comparison is illustrated in Table 5 for the prediction of the ethylene gas concentration.
For this test we used 141 samples for training and 35 samples for testing.
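The 141/35 division corresponds to roughly 80% and 20% of the 176 samples. A minimal sketch of such a split follows, assuming the samples are kept in chronological order so that the test set contains the most recent observations; whether the split was chronological is an assumption of this example.

```python
from typing import List, Sequence, Tuple


def chrono_split(series: Sequence[float], train_fraction: float = 0.8
                 ) -> Tuple[List[float], List[float]]:
    """Split a series into an initial training part and a final test part."""
    n_train = round(len(series) * train_fraction)
    return list(series[:n_train]), list(series[n_train:])


series = [float(i) for i in range(176)]  # placeholder for 176 daily samples
train, test = chrono_split(series)
print(len(train), len(test))
```

Keeping the test samples at the end of the series avoids using future observations to fit models that are later evaluated on the past.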
5. Discussion
This section discusses the application of the proposed model in a real-world example, with respect to the accuracy of data fitting for predicting oil-dissolved gas concentrations in transformers.
Regarding the evaluation of the wavelet functions and the time delay parameter d in the training process, and the effect of these factors on the accuracy of the proposed model, the robustness of the model is clear. The combination of a nonlinear autoregressive neural network and the wavelet transform makes it possible to produce high precision predictions of oil-dissolved gas concentrations and ratios. The low influence of the wavelet function on the accuracy of the prediction indicates that the proposed model is less sensitive to variations in the input data used for training, a problem that is common in approaches based on neural networks [4]. In addition, the adoption of the cross-validation approach increases the reliability of the model and leads to better quality forecasts.
Results from a real example showed that the proposed model produced more accurate predictions than the other methods tested, such as the Generalized Regression Neural Network (GRNN), support vector machine (SVM), back propagation neural network (BPNN), and common time series techniques. Specifically, regarding the ethylene gas, the maximum prediction error of the NAR–DWT model was about 70% smaller than that of the other tested models. This superior performance was independent of the delay parameter, d, used in the model. This result is consistent with the literature, which states that nonlinear autoregressive network models are less sensitive to long-term time dependencies, besides presenting better generalization and learning capacities [22]. Moreover, an additional increase in performance is due to the application of the discrete wavelet transform to create simplified and sparse versions of the original data.
The ability to accurately predict future values of gas concentrations increases the reliability of the monitoring system and makes it possible to anticipate faults in the system. In this sense, a prediction of an increase in the concentration of a gas above the limit value issues an alert indicating the need for an immediate check of the operation status of the transformer. In addition, multiple gas concentration predictions are used to estimate the average time, in days, to an alert condition, as well as the lower and upper limits of the 95% confidence interval for that average. This information is then used to calculate the remaining life of the equipment.
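The alert logic can be sketched as follows. The limit values, the forecast series, and the normal-approximation confidence interval (z = 1.96) are assumptions made for illustration; the actual limits and the reliability technique used to build the interval may differ.

```python
import math
from statistics import mean, stdev
from typing import List, Optional, Sequence, Tuple


def days_to_alert(forecast: Sequence[float], limit: float) -> Optional[int]:
    """Return the first forecast day (1-based) exceeding the limit, or None."""
    for day, value in enumerate(forecast, start=1):
        if value > limit:
            return day
    return None


def mean_with_ci(samples: Sequence[float], z: float = 1.96
                 ) -> Tuple[float, float, float]:
    """Mean with a normal-approximation 95% confidence interval."""
    m = mean(samples)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    return m, m - half_width, m + half_width


# Hypothetical forecasts (ppm per day) and limits (ppm) for three gases.
forecasts = {
    "H2": ([90.0, 96.0, 103.0, 110.0], 100.0),
    "CH4": ([100.0, 110.0, 118.0, 125.0], 120.0),
    "C2H4": ([40.0, 42.0, 44.0, 46.0, 51.0], 50.0),
}
days: List[int] = []
for gas, (values, limit) in forecasts.items():
    day = days_to_alert(values, limit)
    if day is not None:  # gases that never reach the limit are skipped
        days.append(day)

avg_days, lower, upper = mean_with_ci(days)
print(days, avg_days, lower, upper)
```

With only a few predictions the normal approximation is coarse; a Student t quantile would give a wider and more cautious interval.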