Risk Assessment Algorithm for Power Transformer Fleets Based on Condition and Strategic Importance

In every electric power system, power transformers (PT) play a critical role. Under ideal circumstances, PT should receive the utmost care to maintain the highest operative condition during their lifetime. Through the years, different approaches have been developed to assess the condition and the inherent risk during the operation of PT. However, most proposed methodologies tend to analyze PT as individuals and not as a fleet. A fleet assessment helps the asset manager make sound decisions regarding the maintenance scheduling for groups of PT with similar conditions. This paper proposes a new methodology to assess the risk of PT fleets, considering the technical condition and the strategic importance of the units. First, the state of the units was evaluated using a health index (HI) with a fuzzy logic algorithm. Then, the strategic importance of each unit was assessed using a weighting technique to obtain the importance index (II). Finally, the analyzed units with similar HI and II were arranged into a set of clusters using the k-means clustering technique. A fleet of 19 PTs was used to validate the proposed method. The obtained results are also provided to demonstrate the viability and feasibility of the assessment model.


Introduction
At present, as a result of the constant growth of societies, the physical assets of electric power systems face a continuous demand for high system reliability, power quality and cost benefits. Under these circumstances, physical assets should be maintained in optimal condition. PTs constitute a vital component of the electric power system [1]. Due to their high replacement and maintenance costs, as well as the catastrophic consequences of failure, the condition of the units should be properly assessed [2].
An asset manager needs to justify all maintenance or replacement decisions. To achieve this goal, it is necessary to have reliable information about the condition and the importance of the unit in the context of the whole system. A set of diagnostic analyses can be carried out to understand the physical condition of the PT [3][4][5]. As for the strategic importance of the unit, it is necessary to know technical and operative characteristics, such as loading, geographic location, or the existence of critical loads. Once this information has been obtained, the acquired data need to be processed in a way that enables the asset manager to draw conclusions about the state of the unit. Different methodologies have been developed to assess the condition of PT, mostly known as health indices (HI) [6][7][8]. When the importance of the unit or the consequences of failure are also taken into account, the assessment methodology is known as a risk index (RI). The RI is a measure that integrates several pieces of information regarding condition, probability of failure and the importance of the transformer in a power system [9]. The main goal of the RI is to indicate the need for maintenance or inspection actions, especially when the company has adopted a scheme of reliability-based maintenance (RBM) and/or condition-based maintenance (CBM) [10].
In the literature, there are many methodologies to assess the risk of PTs [11][12][13][14]. What differentiates one from the other is the technique used to process the data and the entry criteria. While most of the experts consider the HI to evaluate the risk, some authors also consider the consequences of failure [15], the probability of shutdown [16], the technical condition [17] or the strategic importance of the PT [18].
It should be noted that the RI is an indicator that enables the prioritization or ranking of the assets to support maintenance and replacement decisions [19,20]. However, when dealing with a set of transformers, an asset manager needs more than a ranking: the units must be categorized into groups with similar characteristics so that maintenance actions can be applied to each group. Most of the current methodologies to assess the risk in PT analyze one unit at a time or use a ranking system. This paper proposes a practical application to determine the RI in fleets of PT and then categorize the units into groups of similar RI using the k-means clustering technique, so that it can serve as a support tool for improving the efficiency of PT maintenance. The contributions of this work can be listed as:
• An up-to-date approach to determine the degree of polymerization (DP) of the insulating paper and the HI.
• A standardized procedure to assess the strategic importance of the PT.
• A new method to categorize the risk in PT fleets employing a clustering technique.
The following sections describe the methodology developed to calculate the HI, the strategic importance of the asset or II, and the RI. Next, a fleet of 19 PTs is analyzed to validate the proposed method. Finally, the results, discussions and conclusions are presented.

Health Index Calculation
The health index (HI) is an indicator that quantifies the general condition of a PT and makes it straightforward to understand. Most of the methodologies developed to calculate the HI employ weighting techniques [21,22]. However, in recent years new methods based on Markov chains [23], fuzzy logic systems [24] and artificial neural networks [25] have been developed.
Some authors [26][27][28] take into account the uncertainty, vagueness, or imprecision of the available information. Based on this approach, they consider fuzzy inference systems (FIS) to be the most appropriate method to calculate the HI. The FIS methodology requires the fuzzification of all the input data, which are usually given as numerical values. The fuzzification process converts those numerical values into linguistic values, which is accomplished by modeling membership functions.
The proposed method has its roots in the works in Ref. [29] and Ref. [9]. The original methodology uses six entry criteria to calculate the final value: breakdown voltage, moisture content (humidity), acidity, power factor, furan content and dissolved gas analysis (DGA). The furan content analysis is carried out to assess the degree of degradation of the winding's insulating paper. However, recent studies have shown that the furan content alone may not accurately reflect the state of the insulating paper [30][31][32]. For this reason, this paper proposes replacing the furan content criterion with the DP criterion. The degree of polymerization (DP) value is usually estimated from the amount of 2-FAL found in the furan content analysis. Chendong's Equation (1) is one of the most widely accepted methods to calculate the DP value.
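For reference, Chendong's relation is commonly cited in the literature as log10(2FAL) = 1.51 − 0.0035·DP, with the 2-FAL concentration in mg/L (ppm). The following is a minimal sketch of that conversion, assuming this form of the equation; the function name `chendong_dp` is illustrative and not taken from the paper.

```python
import math


def chendong_dp(furfural_ppm: float) -> float:
    """Estimate DP from the 2-FAL concentration (mg/L, i.e. ppm).

    Assumes the commonly cited Chendong relation
    log10(2FAL) = 1.51 - 0.0035 * DP, solved for DP.
    """
    if furfural_ppm <= 0:
        raise ValueError("2-FAL concentration must be positive")
    return (1.51 - math.log10(furfural_ppm)) / 0.0035
```

Under this relation, a lower 2-FAL concentration maps to a higher DP estimate, for example about 717 at 0.1 ppm versus about 431 at 1 ppm.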
Established equations for the DP rely only on the 2-FAL content to calculate their estimates. The studies in [33][34][35] suggested that considering only the 2-FAL content leads to imprecise outcomes. Moreover, they highlight the importance of the CO2/CO ratio to estimate the real degree of degradation of the insulating paper. This work implements a FIS approach to calculate the DP value, and the entry parameters for this system are the furan content and the CO2/CO ratio. The membership functions of the entry criteria and the output of the FIS process are presented in Figures 1-3.
According to the authors of [4], for the CO2/CO ratio, the respective values of CO2 and CO should exceed 5000 µL/L (ppm) and 500 µL/L (ppm) in order to improve the certainty factor. Additionally, a normal CO2/CO ratio should be around 7. Abu-Elanien et al. [29] state that a DP value higher than 700 represents an insulating paper with its mechanical properties close to 100%.
The sets of membership functions for the breakdown voltage, moisture content, acidity, power factor, DGA and the final HI value were taken from Ref. [9] and are shown in the corresponding figures.
Using the Mamdani FIS methodology, the linguistic entry values are integrated with the linguistic output using fuzzy inference rules. For the DP value, a set of 20 inference rules was developed, while a group of 80 rules was implemented for the final HI value. Once the output membership function is obtained, a defuzzification process takes place to convert the linguistic result into a numerical value. The complete HI fuzzy model is shown in Figure 10.
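The Mamdani steps described above (fuzzification, rule evaluation, aggregation and defuzzification) can be sketched as follows for the DP estimator. This is only an illustration under stated assumptions: every membership breakpoint, the 0 to 1200 DP universe and the four rules are hypothetical placeholders, not the membership functions of Figures 1-3 or the 20 rules used in the paper.

```python
import numpy as np


def trapmf(x, a, b, c, d):
    """Trapezoidal membership: 0 below a, rises a->b, 1 on [b, c], falls c->d."""
    x = np.asarray(x, dtype=float)
    rise = np.clip((x - a) / (b - a), 0.0, 1.0) if b > a else (x >= a).astype(float)
    fall = np.clip((d - x) / (d - c), 0.0, 1.0) if d > c else (x <= d).astype(float)
    return np.minimum(rise, fall)


def estimate_dp(furan_ppm, co2_co_ratio):
    """Mamdani-style DP estimate from 2-FAL content and CO2/CO ratio."""
    # Clamp inputs to the assumed universes of discourse.
    furan = float(np.clip(furan_ppm, 0.0, 10.0))
    ratio = float(np.clip(co2_co_ratio, 0.0, 20.0))

    # Fuzzification (all breakpoints are illustrative assumptions).
    f_low = float(trapmf(furan, 0, 0, 0.1, 0.5))
    f_high = float(trapmf(furan, 0.1, 0.5, 10, 10))
    r_normal = float(trapmf(ratio, 3, 5, 9, 11))
    r_abnormal = max(float(trapmf(ratio, 0, 0, 3, 5)),
                     float(trapmf(ratio, 9, 11, 20, 20)))

    # Output fuzzy sets on the assumed DP universe.
    dp = np.linspace(0.0, 1200.0, 1201)
    poor = trapmf(dp, 0, 0, 250, 450)
    fair = trapmf(dp, 250, 450, 600, 800)
    good = trapmf(dp, 600, 800, 1200, 1200)

    # Hypothetical rule base: AND is min, each rule clips its consequent.
    rules = [
        (min(f_low, r_normal), good),
        (min(f_low, r_abnormal), fair),
        (min(f_high, r_normal), fair),
        (min(f_high, r_abnormal), poor),
    ]

    # Aggregate clipped consequents with max, then defuzzify by centroid.
    aggregated = np.zeros_like(dp)
    for strength, consequent in rules:
        aggregated = np.maximum(aggregated, np.minimum(strength, consequent))
    return float(np.sum(dp * aggregated) / np.sum(aggregated))
```

With these placeholder sets, a low furan content combined with a CO2/CO ratio near 7 yields a high DP estimate, while the same furan content with a ratio above 10 yields a lower one, which mirrors the role the text assigns to the CO2/CO ratio.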

Figure 10. Health index fuzzy model.

Importance Index Estimation
The importance index (II) of a PT indicates the strategic relevance of the asset within the reliability context of the electrical power distribution system. It considers the unit's technical parameters, the operational environment, and the penalties imposed by regulatory entities. For this work, a set of 11 criteria is used to determine the II; each criterion can take an integer value between 1 and 3. The final value for the II can be inferred using Equation (2).

$$II = \frac{\sum_{ii=1}^{11} S_{ii}\, W_{ii}}{\sum_{ii=1}^{11} S_{\max ii}\, W_{ii}} \qquad (2)$$

where II indicates the importance index of the PT, S_{ii} is the score assigned to each criterion, S_{\max ii} represents the maximum possible value, in this case, a value of three, and W_{ii} is the respective weight of each entry parameter. As stated before, those criteria can only take integer values between 1 and 3. For this reason, a normalizing process was carried out to obtain an II value between zero and one. For that purpose, Equation (3) was used.

$$II_{norm} = \frac{II - II_{\min}}{II_{\max} - II_{\min}} \qquad (3)$$

where II_{\max} and II_{\min} represent the maximum and minimum values that II can take; for this work, those values are one and one-third, respectively. Values close to zero indicate a low strategic importance of the PT, whereas values close to one denote a high relevance of the asset. The 11 criteria chosen for this method and their respective weights are presented in Table 1.
The N-1 criterion indicates the ability of the system to handle a sudden disconnection of the PT and still supply the demanded load. The critical loads criterion refers to essential loads, such as hospitals, airports and continuous production factories, powered by the evaluated transformation unit. Mean load is the estimated average power provided by the transformer in the last 30 days, while the unavailability penalty criterion depends on the regulations of each country. For this work, the measure is based on the provisions of Argentina's energy regulator.
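The weighting and normalization steps can be sketched as follows. The sketch assumes that Equation (2) is the weighted ratio of the assigned scores to the maximum scores and that Equation (3) is a min-max normalization, which matches the stated bounds of one-third and one. The function name and the weights in the usage note are hypothetical, since Table 1 is not reproduced here.

```python
def importance_index(scores, weights, s_max=3):
    """Normalized importance index in [0, 1] from per-criterion scores.

    scores  -- integer score per criterion, each between 1 and s_max
    weights -- positive weight per criterion (hypothetical values here)
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(s not in range(1, s_max + 1) for s in scores):
        raise ValueError("each score must be an integer between 1 and s_max")

    # Assumed form of Equation (2): weighted score over weighted maximum score.
    raw = sum(s * w for s, w in zip(scores, weights)) / sum(s_max * w for w in weights)

    # Assumed form of Equation (3): min-max normalization.
    ii_min = 1.0 / s_max  # every criterion scored 1
    ii_max = 1.0          # every criterion scored s_max
    return (raw - ii_min) / (ii_max - ii_min)
```

For instance, with 11 equal weights, a unit scoring 2 on every criterion lands at the midpoint of the normalized scale, while a unit scoring 3 on every criterion reaches the maximum importance of one.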

Risk Assessment
Once the HI and II have been calculated, the risk of the unit can be found. Traditionally, the risk index (RI) is obtained by multiplying the HI and the II; then, a list or ranking is generated to determine which assessed units should be prioritized. However, in the context of fleet analysis, an asset manager's goal is to arrange the units in groups with similar conditions in order to implement suitable maintenance strategies. To achieve this goal, risk matrices whose axes are the HI and the II are used, and the k-means clustering technique is applied to establish the groups or clusters of PT.
K-means is considered one of the simplest unsupervised learning algorithms that solve the well-known clustering problem [36]. The procedure classifies a given data set into a certain number of clusters (assume k clusters) fixed a priori. The goal of the algorithm is to minimize an objective function known as the mean square error (MSE), which is given by Equation (4).

$$J(V) = \sum_{j=1}^{c} \sum_{i=1}^{c_j} \left( \lVert x_i - v_j \rVert \right)^2 \qquad (4)$$

where \lVert x_i - v_j \rVert is the Euclidean distance between the data point x_i and the centroid v_j, c_j corresponds to the number of data points in the jth cluster and c is the number of centroids.
In this paper, three groups or clusters are designated to assess the risk of units with similar HI and II as a means to define proper maintenance schemes. Cluster 1 contains the PT with the lowest risk, whereas Cluster 3 contains the units that should be prioritized. Figure 11a,b presents an example data set before and after the k-means clustering technique is applied.
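The clustering step can be sketched with a plain NumPy version of Lloyd's algorithm, the usual iterative procedure for k-means. This is a minimal illustration rather than the implementation used in the paper; the function name, the explicit initial centroids and the sample (HI, II) points in the usage note are assumptions made for the example.

```python
import numpy as np


def kmeans(points, init_centroids, max_iter=100):
    """Cluster points with Lloyd's algorithm.

    Returns (labels, centroids, sse), where sse is the sum of squared
    Euclidean distances between each point and its assigned centroid.
    """
    x = np.asarray(points, dtype=float)
    v = np.asarray(init_centroids, dtype=float).copy()

    def assign(centroids):
        # Distance of every point to every centroid, shape (n_points, k).
        dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        return dist.argmin(axis=1)

    for _ in range(max_iter):
        labels = assign(v)
        # Move each centroid to the mean of its points; keep empty ones fixed.
        new_v = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else v[j]
            for j in range(len(v))
        ])
        if np.allclose(new_v, v):
            break
        v = new_v

    labels = assign(v)
    sse = float(((x - v[labels]) ** 2).sum())
    return labels, v, sse
```

A call such as `kmeans([(0.1, 0.2), (0.5, 0.5), (0.9, 0.8)], [(0, 0), (0.5, 0.5), (1, 1)])` groups hypothetical (HI, II) pairs around the three starting centroids. The cluster labels returned here are arbitrary indices, so mapping them to low, medium and high risk would still require inspecting the centroid positions.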

Case Study, Results and Discussion
The proposed methodology presented in the previous sections was tested on a PT fleet composed of 19 units. Table 2 summarizes the main results of the diagnostic tests, while Table 3 presents the operative features of the whole fleet. Once the results for the HI and the II were obtained, the risk was assessed with the k-means technique. As a means of comparison, the RI and DP values were also calculated employing the HI*II approach and Chendong's equation, respectively.
Table 4 shows the scores calculated from the transformer parameters: the DP value, HI, II, cluster membership and the RI obtained with the HI*II approach. Teymouri and Vahidi [35] indicated that a good DP estimator should fall within ±20% of the measured DP value or of the DP calculated with Chendong's equation, and the proposed method generally satisfies this condition. There are four exceptions, units 3, 7, 18 and 19, whose DP values differ by more than ±20% from Chendong's estimates. For units 3 and 19, the values are greater than those of Chendong's method, which can be explained by the low furan content and the good CO2/CO ratio of the PT. For units 7 and 18, whose values are lower than Chendong's, the respective CO2/CO ratios are 1.35 and 12.49. According to [4], CO2/CO ratios lower than 3 may indicate a fault in the insulating paper, while CO2/CO ratios higher than 10 may indicate accelerated thermal degradation at low temperature. The proposed DP method includes these considerations in its algorithm, which explains the differences with Chendong's equation. It also indicates the importance of the CO2/CO ratio in determining the real state of the insulating paper.
The proposed methodology provides a more comprehensive estimate of the condition of the insulating paper. However, a certain degree of inelasticity was noted in the results. This can be attributed to the nature of the trapezoidal fuzzy membership functions (TFMF), whose output stays the same unless the input value moves into an overlap zone between membership functions or into another function altogether. According to [37], once a unit has reached DP values lower than 250, it is considered to have entered the end-of-life stage.
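The ±20% agreement criterion attributed to [35] above reduces to a relative-deviation check. A minimal sketch, assuming the deviation is measured relative to the reference value (the measured DP or Chendong's estimate); the function name and the numbers in the usage note are hypothetical.

```python
def within_tolerance(estimate: float, reference: float, tol: float = 0.20) -> bool:
    """Return True if estimate deviates from reference by at most tol (relative)."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(estimate - reference) <= tol * abs(reference)
```

For example, an estimate of 500 against a reference of 450 passes the check, whereas an estimate of 300 against the same reference does not.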
The II assessment results for the PT fleet show a certain degree of uniformity, with II values between 0.3 and 0.7. It can be noted that unit 10, which presents a high HI value, has a relatively low strategic importance. Meanwhile, unit 18, which has the best technical condition in the fleet, presents a higher II than unit 10. This can be explained by the independence of the two index calculations, as well as by the location and operative voltage of the units. PT number 10 has a voltage of 132/33 kV and operates in a rural environment. In contrast, PT number 18 works in a more demanding urban setting and has an operative voltage of 220/132 kV. Therefore, the criticality of the asset in the system is higher. Units 4 and 13, which operate at extra-high voltages (EHV) of 500/132 kV, have the highest II values.
The risk of the fleet can be assessed once the HI and II have been calculated. As stated before, three groups or clusters were defined using the k-means clustering technique. Out of the 19 units, 8 were placed in cluster number one, 6 in cluster number two and the remaining 5 in cluster number three. The units in the risk matrix are presented in Figure 12a, while the cluster arrangements are displayed in Figure 12b. The placement of the units on the risk matrix in Figure 12a allows a better understanding of the HI and II when combined. It can also be seen that a high percentage of the units are concentrated in the lower left quadrant. An objective categorization of the units into groups using traditional methods might be challenging; the proposed k-means technique is applied to overcome this obstacle. Figure 12b shows that the k-means technique defined three clusters, of which cluster number three was the easiest to separate from the rest of the units. Furthermore, cluster three represents the units with the highest risk, and hence the units that need the most thorough and immediate maintenance measures.
As for clusters number one and two, the units were closer to each other, and a manual allocation could be difficult. However, with the k-means technique, an unsupervised machine learning (ML) method, the allocation of units to a particular cluster is done automatically by comparing Euclidean distances and minimizing the MSE. This is best observed in units 1 and 5, which were close to both centroid number one and centroid number two. Unit 5 was allocated to cluster number one and unit 1 to cluster number two.
The RI using the HI*II approach is plotted in Figure 13. The units with the highest risk were PT 4, 7, 8 and 16. The method failed to categorize PT 10 as a unit with high risk; moreover, the classification of the remaining units becomes fuzzy and subjective. The main advantage of the proposed methodology is its capacity to categorize the PT into groups according to their condition. This is particularly useful when dealing with large fleets of PT. If an individual assessment is required, the use of the traditional RI calculation technique is advised. In view of these findings, it can be assumed that the proposed method is capable of estimating the risk in PT fleets and can be adopted by utility experts for asset management.
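For comparison, the traditional HI*II ranking discussed above amounts to a product and a sort. The sketch below assumes that both indices lie in [0, 1] and that a higher HI denotes a worse condition, as the discussion of unit 10 suggests; the function name and the sample values in the usage note are hypothetical and are not taken from Table 4.

```python
def rank_by_risk(units):
    """Rank units by RI = HI * II, highest risk first.

    units -- mapping of unit id to an (HI, II) pair
    Returns a list of (unit id, RI) tuples sorted by descending RI.
    """
    risk = {unit: hi * ii for unit, (hi, ii) in units.items()}
    return sorted(risk.items(), key=lambda item: item[1], reverse=True)
```

Called on `{"A": (0.8, 0.5), "B": (0.2, 0.9), "C": (0.6, 0.6)}`, the function orders the units as A, C, B. A unit with a poor condition but a low importance can rank below healthier units, which is the effect noted for PT 10.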

Conclusions
This paper proposes a new approach to assess the risk in power transformer fleets. This approach examines the technical condition and the strategic importance of the units to create clusters of PTs with similar risks. The researchers have tested a fleet of 19 units to validate the proposed method, and the results showed the method's viability.
Additionally, a new fuzzy-based approach was developed to improve the accuracy and consistency of the estimation of the transformer insulation condition. It considers the furan content and the CO2/CO ratio, which several studies have shown to play a critical role in correctly assessing the DP of the insulating paper. The results showed an accurate assessment of the condition of the insulating paper. Although the fuzzy-based approach exhibited a certain degree of inelasticity, this can be reduced by using a higher number of membership functions per criterion or by replacing the trapezoidal functions with triangular or Gaussian functions.
The technical condition was estimated using a fuzzy-based HI proposed in the literature with the DP calculation method. The strategic importance was measured with a weighting technique and taking into consideration critical operative aspects of the units, such as location, voltage, rated power, the existence of critical loads and security levels.
Finally, the assessed units were plotted in the risk matrix and classified into clusters using the k-means clustering technique. This novel approach proved superior to the classical RI calculation in supporting asset managers in the maintenance decision-making process, given the ability of the k-means technique to categorize the units not as individuals but as groups, from which joint maintenance decisions can be made. A sound risk assessment tool can improve PT reliability, extend its residual life span, and also serve as a preventive indicator of the need for replacement.
As a continuation of this research, the authors recommend that future investigations aim at the development of an asset management algorithm that combines the risk assessment method proposed in this work with those of other references, using data fusion techniques. In addition, risk assessment techniques should be combined with analysis and decision-making methodologies, for example, to define maintenance strategies based on the PT clusters found.