1. Introduction
Distribution transformers are present in large numbers on distribution grids; large urban centers commonly have thousands of such devices. In addition, technologies related to the Internet of Things (IoT), edge computing and advanced metering infrastructure (AMI) have been combined with electric power systems, giving rise to the so-called smart grids [1,2,3,4,5,6,7,8]. This combination allows, for example, the migration from the paradigm of preventive maintenance to condition-based maintenance [9,10,11]. However, non-automated data collection and the need for specialists to interpret the results significantly limit the large-scale application of diagnostic methods to distribution transformers. As such, the development and improvement of methods that permit automatic diagnostics (i.e., the capacity to determine the operational health status of these devices autonomously) can make a significant contribution to the electric energy distribution sector. Some of the main challenges associated with these automatic diagnostic methods are the ability to execute autonomously, the use of algorithms that do not require human intervention or supervision, a simplified configuration with a minimum number of parameters that need to be estimated in advance, a structure that is adaptable and flexible to real-time measurements of the transformer, and portability (i.e., applicability of the method independent of the transformer model or class).
In general, automatic diagnostic methods are based on learning the performance features over the long term and on identifying significant anomalous patterns that can lead to possible failure modes (the set of operating patterns that the equipment presents when it fails). In addition, these techniques emphasize unsupervised learning algorithms, which can run online and in real time, to aid decision-making and prognosis by autonomously identifying tendencies that deviate from normal behavior, while also preventing possible failure states [12]. Deep learning, learning vector quantization, self-organizing maps and hidden Markov models are some of the tools typically used in automatic diagnostic methods [13]. However, the use of such tools usually involves parameters such as dissolved gas, short-circuit impedance, frequency domain spectroscopy, humidity, etc. [14,15,16,17,18,19,20,21]. Measuring these parameters on distribution transformers can be prohibitively costly.
In this sense, this article discusses a real-time unsupervised learning method that addresses some of the challenges cited, while proposing an automatic diagnostic strategy for distribution transformers. With this approach, a system that executes the proposed method gradually tunes itself to the specifics of a particular transformer, using only its representation by a set of preselected feature variables (e.g., voltage, current, power factor, ambient temperature, tank temperature, total harmonic distortion). Additionally, it provides the operator of the electric power distribution system with information on the actual operational condition of the transformer, in graph form and without requiring deep specialist knowledge. This reduces the need for detailed knowledge of the structural and dynamic properties of the transformer and of its possible failure modes.
The proposed method includes two main tasks: basic modeling of the transformer in real time and evaluation of its operational condition. The latter was implemented by means of two tools: the operation map and the health index. To this end, the method uses the concepts of k-nearest neighbors (k-NN) clustering and the Gaussian mixture model (GMM) to automatically identify and update the main operational modes (the set of stable patterns that the transformer presents in its various operating states) and to establish the limits of acceptable performance characteristics over the long term. The practical results presented in this study are based on a case study with one year of data collected from five in-service distribution transformers.
The remainder of the article is organized as follows. Section 2 describes the online data acquisition system, along with the magnitudes measured through it. Section 3 presents the real-time automatic diagnostic method used to represent the data in a dynamic structure that models the transformer. It also presents two tools, the operation map and the health index, which use this model to assess the condition of the transformer. Section 4 presents and discusses the results of a case study based on the implementation of the method in a pilot set of five in-service distribution transformers. The conclusions are given in Section 5.
2. Materials
The proposed method is supplied with data collected in real time from transformers on the distribution grid by a hardware device for wireless measurement and communication developed especially for this research. Figure 1 illustrates the hardware, which is designated as an IED (intelligent electronic device). The general specifications of the IEDs are presented in Table 1. Each IED was calibrated to the range specified in Table 1 using a programmable power source [22].
The IED reads the sensors installed on the transformer every second and, every three minutes, consolidates the measurements and transmits their average values. These averages form the raw feature vector, whose dimension is the number of magnitudes measured on the transformer by the IED. In this study, the 11 measured magnitudes are: the voltages in per-unit (pu), the currents in pu and the power factor on each of the three low-voltage phases (A, B, C); the rise in tank temperature relative to the ambient temperature measured on site; and the total voltage harmonic distortion (THDv).
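The consolidation step described above can be illustrated with a minimal sketch. It assumes that each per-second reading is available as a mapping from channel name to value; the channel names and the `consolidate` function are hypothetical and are not taken from the IED firmware.

```python
from statistics import fmean

WINDOW_S = 3 * 60                              # consolidation interval stated in the text
WINDOWS_PER_DAY = 24 * 60 * 60 // WINDOW_S     # averaged vectors per day if no window is lost

# Hypothetical channel names for the 11 measured magnitudes.
CHANNELS = [
    "v_a", "v_b", "v_c",        # phase voltages (pu)
    "i_a", "i_b", "i_c",        # phase currents (pu)
    "pf_a", "pf_b", "pf_c",     # power factor per phase
    "temp_rise",                # tank temperature rise over ambient
    "thdv",                     # total voltage harmonic distortion
]


def consolidate(readings):
    """Average one window of per-second readings into a raw feature vector.

    `readings` is a list of dicts mapping each channel name to a float.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    return [fmean(r[ch] for r in readings) for ch in CHANNELS]
```

With one vector every three minutes, a full day of monitoring yields `WINDOWS_PER_DAY` vectors, provided that no transmission is lost.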
4. Results and Discussion
The automatic diagnostic method proposed in this work was evaluated in a case study with five distribution transformers remotely monitored by the IEDs described in Section 2. The main construction and operational data of these distribution transformers are given in Table 3.
4.1. Initialization
According to Equation (1), it is necessary to have, at least,
independent feature vectors
in order to commence the initialization phase. To avoid this obligation, the initial sample set
was made up of
raw feature vectors
(i.e., covering the first day of monitoring of the transformer), thus making
. Furthermore, a full day is the most relevant cyclic unit of transformer operation. Initializing with one day of data therefore gives a more stable model, since relevant operating patterns are unlikely to be missed, which prevents the model from going through major adaptations right from the start. The initial set of each transformer in the case study can thus be submitted to the initialization phase described in Section 3.1, which builds the initial OMC model for each transformer. For transformer 1, for example, the OPs for each hour of the first day of monitoring and the two OMCs found to approximate the operation modes are shown in Figure 5a. The set of measurements used to initialize the OMC model for transformer 1 is shown in Figure 5b, which also indicates the OMC to which each measurement belongs. The figure shows that OMC2 is associated mainly with the period from 3 to 9 o'clock, when the current decreases, which causes the transformer to cool down, the voltage to rise and the THDv to decrease. The remaining transformers were also initialized using two OMCs, as shown in Figure 6a–d. Following similar reasoning to that for transformer 1, the OMCs approximate the low- and high-load operating modes.
Figure 7a shows the distribution of the variance over the principal components for the five evaluated transformers, while Figure 7b shows the accumulated variance up to the respective principal component. For the initialization data of the transformers, on average 83.32 ± 6.09% of the information (variance) was preserved using only the first two principal components. This result is important, as one of the premises of the method is the ability to synthesize remotely measured data using the two dominant principal components. Here, these two components alone account for more than 80% of the variation in the data.
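The fraction of variance preserved by the dominant principal components can be computed from the eigenvalues of the covariance matrix of the feature vectors. The sketch below is only an illustration under stated assumptions: it takes rows that are already scaled as required by the method, and it extracts the leading eigenvalues by power iteration with deflation so that it stays dependency-free. The function names are hypothetical, and a linear-algebra library would normally be used instead.

```python
import math


def covariance(rows):
    """Sample covariance matrix of a list of equal-length feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def dominant_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix.

    Uses power iteration from a uniform start vector; this assumes the start
    vector is not orthogonal to the dominant eigenvector.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector.
    mv = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
    return sum(v[a] * mv[a] for a in range(d)), v


def preserved_variance(rows, n_components=2):
    """Fraction of total variance carried by the leading principal components."""
    cov = covariance(rows)
    d = len(cov)
    total = sum(cov[j][j] for j in range(d))
    if total == 0.0:
        raise ValueError("data has zero variance")
    kept = 0.0
    for _ in range(n_components):
        lam, v = dominant_eigenpair(cov)
        kept += lam
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return kept / total
```

Calling `preserved_variance(rows, 2)` on the initialization set of one transformer would give the kind of quantity plotted as accumulated variance at the second principal component.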
4.2. Modeling
After the initialization phase, in which the dynamic base of operation of the transformers was identified, the modeling phase was started. In this phase, the automatic diagnostic algorithm is executed for each new feature vector, so the OMC model built in the initialization phase is updated dynamically and autonomously. The only parameter that is configured manually is the rate of forgetting α. The tuning of this parameter took into account three aspects: the importance of long-term evaluation, the reconstruction error and the variance preserved by the model.
Regarding the importance of long-term evaluation, the value of α controls the weight given to past feature vectors. Low values of α tend to forget old data more quickly. To adjust the rate of forgetting, besides considering an adequate sampling interval, it is also important that the variance preserved by the transformer OMC model, calculated by Equation (39), is high, and that the reconstruction error, calculated by Equation (40), is as low as possible. Figure 8a shows the average variance preserved by the five models as a function of α. This graph supports choosing values of α close to 1.
In contrast, the average reconstruction error, shown in Figure 8b, demonstrates that values of α that tend to 1 generate increasingly larger errors. This is because the model becomes increasingly slow to identify new patterns as α gets closer to 1. After all these considerations, a value of α was selected that gives a weight of around 90% to the data gathered over the last 10 days of monitoring, calculated according to Equation (C11). With this setting, an average reconstruction error of 0.16 ± 0.01 standard deviations and an average preserved variance of 80.22 ± 0.98% were expected, in accordance with the approximations shown in Figure 8a,b.
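The relation between α and the weight given to recent data can be illustrated with a short calculation. The sketch assumes plain exponential forgetting, in which a sample that is k steps old has a weight proportional to α^k, so that the newest n samples carry a share 1 − α^n of the total weight. This is an assumption made for illustration; Equation (C11) is not reproduced here and may differ. The function names are hypothetical.

```python
def recent_weight_share(alpha, n_recent):
    """Share of total weight carried by the n_recent newest samples.

    Assumes weights proportional to alpha**k for a sample k steps old.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return 1.0 - alpha ** n_recent


def alpha_for_share(share, n_recent):
    """Forgetting factor that gives `share` of the weight to the newest samples."""
    if not 0.0 < share < 1.0 or n_recent <= 0:
        raise ValueError("share must be in (0, 1) and n_recent positive")
    return (1.0 - share) ** (1.0 / n_recent)


# Ten days of three-minute vectors, assuming no lost samples.
SAMPLES_10_DAYS = 10 * 480
alpha = alpha_for_share(0.90, SAMPLES_10_DAYS)
```

Under this assumption, α comes out on the order of 0.9995, which is consistent with the observation that suitable values lie close to 1. It is not necessarily the value used in the study.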
4.3. Operation Map
The operation map needs to be assessed in terms of the amount of variance that the OMC transformer model preserves and of the reconstruction error, calculated by Equations (39) and (40), respectively. If the error becomes large and/or the variance small, the operation maps lose their usefulness. As shown in Figure 9, the reconstruction error and the preserved variance, averaged over the five transformers, remain considerably stable over the whole period of the case study. The highest average reconstruction error encountered was 0.26 standard deviations, and the lowest preserved variance was 86.12%.
A high level of preserved variance does not necessarily produce a small reconstruction error. However, a model that preserves more variance is expected to have a lower reconstruction error. This point is supported by Pearson's correlation coefficient, which showed a correlation of 93.36 ± 0.01% between the reconstruction error and the variance preserved by the models.
The reconstruction error is an important measure of the general performance of the model. To give a concrete sense of its magnitude, it is useful to calculate the error of each feature individually. Figure 10a–e show the errors for each feature, calculated according to Equation (41). The highest error values encountered were 0.006 ± 0.009 (pu) for the voltages, 0.014 ± 0.008 (pu) for the currents, 0.004 ± 0.003 for the power factor, 6.56 ± 3.55% for the THDv and 0.272 ± 0.272 (K) for the rise in temperature.
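A per-feature error of this kind can be sketched as follows. The example assumes that the error of a feature is the absolute difference between the measured value and the value reconstructed by the model, summarized as mean and standard deviation over the samples. Equation (41) is not reproduced here and may be defined differently, and the function name is hypothetical.

```python
from statistics import fmean, pstdev


def per_feature_error(measured, reconstructed):
    """Mean and standard deviation of the absolute error of each feature.

    `measured` and `reconstructed` are equal-length lists of feature vectors
    in physical units (pu, %, K, ...). Returns one (mean, std) pair per feature.
    """
    if not measured or len(measured) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    n_features = len(measured[0])
    summary = []
    for j in range(n_features):
        abs_err = [abs(m[j] - r[j]) for m, r in zip(measured, reconstructed)]
        summary.append((fmean(abs_err), pstdev(abs_err)))
    return summary
```

Reporting the error in the physical unit of each feature, rather than in standard deviations, makes it easier to judge whether the reconstruction is accurate enough for a given magnitude.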
The results from the implementation of the operation map, using the parameterization of Table 2 and a precision of 0.05, are shown in Figure 5a and Figure 6a–d. Note the well-defined formation of the operation categories. A visual analysis of these regions in Figure 5a and Figure 6a–d indicates that all the transformers are operating appropriately and that only transformer 3 is operating close to the precarious region, due to its voltage level.
4.4. Health Index
During the case study period, the method created OMCs to describe new modes of operation that were not described by the existing OMCs. The creation of these new OMCs, initially with few OPs, led to the degradation of the HI. The creation of new OMCs was often related to changes in the loading pattern (average of the currents on the three phases) of the transformer. Figure 11a–e therefore show the daily loading average and the HI during part of the case study period. The figure also shows the moving average of the HI, computed over a window of samples spanning 20 days. This shows that the HI of every transformer oscillates naturally, since the condition imposed by Equation (34) was always satisfied after a given period.
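The smoothing applied to the HI can be illustrated with a trailing moving average. This is a minimal sketch: the function name is hypothetical, and the window length is derived by assuming one HI value per three-minute feature vector with no lost samples.

```python
from collections import deque


def moving_average(values, window):
    """Trailing moving average; early outputs use the samples seen so far."""
    if window <= 0:
        raise ValueError("window must be positive")
    buffer = deque()
    total = 0.0
    smoothed = []
    for value in values:
        buffer.append(value)
        total += value
        if len(buffer) > window:
            total -= buffer.popleft()   # drop the oldest sample from the sum
        smoothed.append(total / len(buffer))
    return smoothed


# 20 days of three-minute samples (assumption: 480 samples per day).
WINDOW_20_DAYS = 20 * 480
```

A long window of this kind suppresses the short dips in the HI that follow the creation of a new OMC, so that only persistent changes in the operating pattern remain visible.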
The health index defined by Equation (33) has the advantage of not requiring invasive and expensive measurements as parameters, such as dissolved gas analysis [14,15], frequency response analysis [16,17,18] and partial discharge analysis [19,20,21]. In addition, unlike many HI methods in current use [14,26,27], it eliminates the need to manually define the parameter weights.
4.5. General Considerations about the Case Study
Throughout the case study, the transformers operated for most of the time within two OMCs. These two OMCs are predominantly associated with periods of lower and higher loads. As the transformers were not overloaded, they spent little time in the critical zone, except during grid events (such as overvoltages).
Even though the initialization step (scheduled to take place on the first day of monitoring) is responsible for defining the initial OMCs of the distribution transformer, variations in the transformer operation were observed as the adaptive transformer model adjusted these initial modes. This is precisely the purpose of the transformer model: to identify new modes of operation in real time.
Furthermore, as indicated in Figure 6, each transformer has its own map representation for specific zones of operation (appropriate, precarious, etc.), which are allocated within different regions of the operation map. However, what matters to the user is the location of the OPs on the map, which makes it easier to identify the operating condition in real time. In terms of how this fits into the reality of energy companies, this resource can contribute greatly when assessing their assets and their trends. The proximity of the OPs to critical or precarious regions can be seen as an indication of the trend in transformer operation.
In some situations in the case study used to illustrate the proposed method, the formation of OPs in regions classified as "precarious" was noted. In practice, analysis of the collected readings showed that grid phenomena (such as overvoltage) generated this operational change. The effects of cumulative damage need to be studied further so that they can be incorporated into the HI. Even so, the proposal presented herein already demonstrates the capacity to produce an indicator that is adequate for the operation of a distribution transformer, while using only electrical parameters that are typically measured and, above all, without the need for a specialist. Moreover, this is performed in real time and autonomously. Such resources, if applied on the scale of a distribution grid, can bring a series of insights that can be explored by the maintenance, power quality and network planning sectors of an energy company. In addition, the quantity of data produced and information generated paves the way for the use of data mining and artificial intelligence tools able to evaluate the grid as a whole, potentially contributing to a more sophisticated smart grid scenario.
5. Conclusions
In this article, an unsupervised learning method that models, in real time, the main modes of operation (OMs) of the distribution transformer was presented. The method used the concepts of k-nearest neighbors (k-NN) clustering and the Gaussian mixture model (GMM) to identify the main OMs and characterize them as operation mode clusters (OMCs). This model can be used to evaluate the operational condition of the distribution transformer by means of two tools: the operation map and the health index.
The use of this unsupervised approach in real time allowed the automatic diagnosis of the transformers to be performed through a preselected set of feature variables, thus reducing the need for experts to interpret the results. This is important since, with the advent of smart grids, the significant amount of data generated makes automatic diagnostic methods increasingly necessary.
The method was assessed using data collected remotely and in real time by a measurement IED. These data were collected over nearly one year from five distribution transformers. The 11 magnitudes measured on the transformer were summarized in two latent variables (the two dominant principal components), preserving at least 86% of the information (variance) during the case study. Additionally, the operation map was able to categorize the transformer operation with a low level of reconstruction error, never exceeding 0.26 standard deviations. Finally, the health index proved to be a viable automatic diagnostic index, permitting identification and quantification of abnormal variations in the operating pattern of the transformer.