Application of the Gaussian Mixture Model to Classify Stages of Electrical Tree Growth in Epoxy Resin

In high-voltage (HV) insulation, electrical trees are an important degradation phenomenon strongly linked to partial discharge (PD) activity. Their initiation and development have attracted the attention of the research community and better understanding and characterization of the phenomenon are needed. They are very damaging and develop through the insulation material forming a discharge conduction path. Therefore, it is important to adequately measure and characterize tree growth before it can lead to complete failure of the system. In this paper, the Gaussian mixture model (GMM) has been applied to cluster and classify the different growth stages of electrical trees in epoxy resin insulation. First, tree growth experiments were conducted, and PD data captured from the initial to breakdown stage of the tree growth in epoxy resin insulation. Second, the GMM was applied to categorize the different electrical tree stages into clusters. The results show that PD dynamics vary with different stress voltages and tree growth stages. The electrical tree patterns with shorter breakdown times had identical clusters throughout the degradation stages. The breakdown time can be a key factor in determining the degradation levels of PD patterns emanating from trees in epoxy resin. This is important in order to determine the severity of electrical treeing degradation, and, therefore, to perform efficient asset management. The novelty of the work presented in this paper is that for the first time the GMM has been applied for electrical tree growth classification and the optimal values for the hyperparameters, i.e., the number of clusters and the appropriate covariance structure, have been determined for the different electrical tree clusters.


Introduction
Electrical treeing is a key degradation phenomenon of high-voltage polymeric insulation [1]. When electrical trees are initiated, they grow until they bridge the entire insulation material, resulting in catastrophic failure of the power system plant. Electrical trees are strongly related to partial discharge (PD) activity, which is usually characterized using techniques such as phase-resolved PD (PRPD) patterns [2,3], and, to a lesser extent, pulse sequence analysis (PSA) [4], pulse waveform analysis [5], and nonlinear time series analysis [6,7].
When PD activity is severe, there is higher dissipation of energy and greater PD amplitudes, resulting in tree growth, and serious degradation [8,9]. As mentioned in the literature [10], PD activity might be undetectable when the tree structure is forming conductive channels within the insulation system, therefore, thorough knowledge of PD behavior related to treeing degradation is needed, especially for conditioning monitoring engineers in the industry. Understanding the tree phenomenon is crucial in determining the remaining lifetime of an electrical asset.
Several studies analyze tree growth from PD activity [10][11][12][13][14]. In particular, Lv et al. [10], Bao et al. [11], Zhou et al. [14], and Alapati [13] investigated PD development during the early stages of PD degradation in epoxy resin insulation, cross-linked polyethylene (XLPE) cable, and low-density polyethylene (LDPE). These studies found that the growth rate of electric trees is strongly influenced by the increasing level of voltages. However, for electrical tree propagation in an XLPE cable, the skewness (i.e., the extent to which the distribution deviates from the normal distribution) of the maximum amplitude-phase distributions decreased with the spread of the electrical stress, and the skewness can be considered as a parameter to fully identify different levels of electrical tree propagation. Furthermore, the tree pattern feature in cross-linked polyethylene (XLPE) cable is similar to the needle-plate electrode system.
In the case of the LDPE, there was a considerable decrease in PD repetition rate and PD magnitude of LDPE filled with alumina nanocomposites compared to unfilled LDPE. There was an increase in the PD inception voltage with 3% weight of the filler loadings and then decreases when the filler loadings reached 5%. On the other hand, other researchers investigated tree propagation mechanisms in XLPE cable insulation based on a double electrical tree structure [12]. It was found that five types of electrical tree structures (branch, forest, bine-branch, pine-branch, and mixed configurations) will propagate in XLPE cable insulation due to the effect of the irregular congregating state, differences in the crystalline structure, and the presence of residual stress in the semi-crystalline polymer.
Few investigations are found in the area of pattern identification of PD characteristics from the initial stage to breakdown of the insulation due to electrical treeing. Park et al. [15] evaluated and classified PD degradation of electric trees for cable insulation. The authors of this study utilized three classification techniques: Adaptive neuro-fuzzy combination (AN-FIS), multi-layer perceptron (MLP), and principal component analysis (PCA). Compared to other defects, such as voids and metal surfaces, the findings specifically demonstrated different features of electric trees. The results clearly showed distinct characteristics of electrical trees as compared to that of other defects such as voids and metal surfaces. Among all the classification techniques, ANFIS showed higher identification potential and can be used for classifying electrical tree progression with about a 99% recognition rate. In another investigation, Salama et al. [16] applied a MLP neural network (NN) to discriminate between PD defects in voids and electrical trees. In particular, the algorithm could recognize discharge patterns from different degradation levels of the electrical trees. Although this algorithm was applied for the case of real power cable faults, it has not been applied to electrical tree faults degradation up to the breakdown stage. In another study, Park et al. [17] attempted to recognize three different electrical tree models using the adaptive network-based fuzzy inference system. The models considered samples with a needle-plane electrode, needle-void-plane electrode, and needle-metal strip-plane electrode. Statistical features extracted from the aforementioned tree models were applied as inputs to the ANFIS system. The results showed a good discrimination rate of these models up to 100%. However, this work was limited to electrical tree patterns only, without analyzing the progression and different stages of tree growth.
This paper studies the growth of electric trees in samples of epoxy resin under various voltage levels aiming to correlate PD activity with the stage of tree growth through the analysis of PRPD patterns using a Gaussian mixture-based model (GMM) clustering technique. GMM was chosen over other techniques because it is flexible and can perform hard clustering for complex data. Using this approach, it is expected to assess the remaining life of insulation subjected to electrical treeing degradation more accurately.
Section 2 details the experimental setup and the data capture procedure, describing the dataset, and their analysis. Section 3 explains the GMM model used in this work. Section 4 is the data processing technique adopted, while Section 5 describes this work's results and conclusion.

Experimental Setup
The samples were prepared using the conventional needle-to-plane configuration with a gap distance of~2 mm between the needle tip and the bottom of the sample. The needle was a hypodermic needle Terumo with an approximate curvature radius of 3 µm, the insulating material was epoxy resin (Mepox-1685/L, a Bisphenol A diglycidyl ether (epoxy system) in Santiago, Chile and the cuboid dimensions were 10 × 10 mm base and 25 mm height. Electrical treeing experiments were carried out using the test circuit shown in Figure 1. The voltage source (Vac) was a transformer fed from the grid through a variac (Variable AC Transformer). The samples were fed through a limiting resistance (R) in order to reduce disturbances and protect the instruments in case of breakdown. PD measurements were carried out using the balanced circuit shown in standard IEC 60,270 [18]. The treeing sample (N2) and the dummy sample (N1, PD free) were placed into a transparent oil container to prevent unwanted surface discharges and allow visualization of tree growth using an optical camera. The signals from the treeing and dummy samples were subtracted in the subtracting circuit (SC), whose output was fed to a commercial PD system (Acquisition System) that continuously registered PD activity. The voltage was measured using a voltage divider (Vm), which was also used by the PD measurement system. The minimum value of PD magnitude for the measurement was set to 2 pC; however, to reduce the background noise recorded, a threshold between 10 to 15 pC was used for the analysis. Before the electrical tree growth experiment itself, an incipient electrical tree needed to be created in each sample. To initiate an electrical tree, 12-16 kV 50 Hz voltage was applied to each sample until the camera optically observed a tree, and then the voltage was turned off and the sample was prepared for the tree growth experiment. By doing this, the initiation stage was separated from the propagation stage, which was the stage to be analyzed in this research. The electrical tree growth experiment was carried out in the selected samples by applying 12, 14, and 16 kVrms 50 Hz until breakdown, according to Table 1, where also the resulting time-to-breakdown (Time BD) is shown. This time is the duration of tree growth from the initial stage until final breakdown of the insulation. PD measurements were made using two simultaneous means: continuous recording with filming camera and taking pictures every 10 s. The utilization of this simultaneous registry system was required to correlate PD behavior (electrical response of the insulation) and tree propagation shape/length (physical damage).

Partial Discharge Recorded and Selected Data for Analysis
The measurements of the electrical tree growth experiment are shown in Figure 2, where the PD amplitude time series (left axis) and the tree length progression (right axis) are compressed in the same graph for each sample. Tree length was extracted from the tree images taken during the growth and was measured as the furthest tree extent from the needle tip in the direction of the plane electrode. In the graph, the length was represented in per unit values, i.e., the ratio between the length (L) and the length of the first tree branch that reached the plane electrode (L max ). Note that dielectric breakdown did not occur immediately after the tree arrived at the counter electrode. In particular, in the cases of Samples A and B, a considerable amount of experimental time passed after the tree bridged the insulation.
Although PD was recorded during the entire experiment, tree growth analysis was carried out using ten selected windows or intervals of analysis to study the parameters' evolution during tree growth. The selection of data and intervals followed the criteria previously described by Zheng et al. [19]. Each interval was selected to have at least 10,000 PD events and at least 10 s of continuous measuring time. In practice, this resulted in a total of 10,000-60,000 PD events (observations) per analysis interval for all the samples. The first analysis interval was chosen to start three minutes after the beginning of the test for a more stable PD activity, and the last interval was set to finish at least five minutes before the breakdown. The separation between intervals depended on the duration of each test. The intervals of analysis are shown as black bands in Figure 2a-e.
The results indicate that PD dynamics are different for every sample, depending on the stressing voltages and the stage of tree growth. For example, Samples A and B had irregular trends, and Sample B even had periods of no detected PD while the tree was growing. This phenomenon has been reported before and is due to the growth of 'filamentary' trees [19]. This is observed in Figure 3; though Samples B and C were both stressed at 14 kV, time series of PD amplitudes had different behavior, which was also observed when comparing Samples D and E, stressed at 16 kV. In particular, Sample D showed the highest PD amplitude values among all the samples, with a constantly increasing trend.
Images of the electrical trees of each sample at interval 6 are presented in Figure 4. It can be observed how Sample A, aged at 12 kV, presented the widest electrical tree. It is worth noting that the images are from the same interval (6th), but they do not correspond necessarily to similar stressing time; for Sample A, the 6th interval was at 130 min of aging, which is longer than any other total stressing time.

Gaussian Mixture Model Clustering Technique and Classification Model
Clustering techniques have been widely used in power system analysis and Rajabi et al. [20] have discussed a literature survey of various clustering techniques available and their application towards smart metering. Out of all the available models for unsupervised learning, the most popular is k-means clustering, which groups data according to a distance-based calculation with respect to a centroid [21]. The centroids are updated iteratively through a mean value and the clustered data will be in a circular shape. The drawback of the k-means clustering is that it fails to cluster data that are not in a circular shape, such as elliptical shape or irregular patterns. This drawback is overcome by GMM, which uses a probability density function (PDF) determining parameters by expectation-maximization (EM) technique. Compared to the k-means, the centroid formed by GMM takes into account the mean as well as the variance of the data, accommodating different sized clusters with varying correlations within them [22].
The clustering of unimodal distribution and multimodal distribution using GMM is explained in [22][23][24]. The comparison in [25] reveals that GMM takes more simulation time than k-means. Additionally, GMM can group complex patterns into similar components that match closely while k-means uses simple principles to produce only abstract information. The performance and comparison of the sampling methods used in GMM are reported in [26].
GMM can also be used for both hard and soft clustering of the dataset. In hard clustering, the GMM assigns each query data point to a particular cluster, which will maximize the posterior probability of the component given the data. In soft clustering, the GMM calculates the likelihood of the query data point belonging to a specific cluster and then assigns the query data point to a cluster, which would have maximum posterior probability, calculated using Bayes' theorem. In this study, the versatile soft clustering GMM is utilized for unsupervised learning to model unknown data distribution by multivariate normal distributions. Unfortunately, k-means clustering has no means to measure the likelihood or uncertainty of cluster assignments. On the other hand, GMM uses probability distribution functions that can model any input dataset by assigning each point a probability to belong to a certain cluster. Hence, it is used for clustering in this work.
The various steps in the GMM are explained in [22]. The Gaussian model is formulated by Equations (1) and (2). Let X = {x 1 , x 2 , · · · x n } be a set of n observations. The variable x i is distributed among a mixture of M components. The PDF of x i is written as shown in Equation (1), which is the weighted sum of Gaussian densities given by Equation (2) and the sum of weights where M is the number of Gaussian densities, x-D is the dimensional continuously valued data vector, w i , I = 1 . . . M is the weight of the mixture,g(x|µ i , Σ i ), i = 1, 2 . . . M is the component of Gaussian densities, µ is the mean vector of dimension D, Σ is the covariance matrix of dimension D×D, and In this study of clustering for insulation degradation using GMM the following covariance structure is adopted: 1.
The covariance structure of the components will determine the shape and orientation of the ellipsoid drawn over the cluster. The covariance matrix is diagonal instead of being full to avoid the over-fitting problem, and major and minor axes of the ellipsoid are parallel and perpendicular to the abscissa and the ordinate. The covariance matrix is shared among the components; hence, the ellipse of each cluster has the same size and orientation. 2.
The expectation-maximization (EM) algorithm fits the GMMs. The initial values of the parameters are set, and then the initial cluster assignments for data points are allowed to be selected randomly.

3.
Regularization is applied in order to avoid the likelihood of data point becoming ill-conditioned and starts moving towards infinity.

The Expectation Maximization (EM) Algorithm
Expectation maximization (EM) is a mathematical algorithm used to find the correct parameters for a model. The estimated parameter of mean, variance, and weight are necessary to cluster the data, but this is possible only if the Gaussian family is known. The EM algorithm starts with random parameters, and then the optimal parameters are found by iteration. This algorithm has the capability to deal with latent variables. Assuming k clusters are to be assigned, then k distributions are required with mean and covariance values of µ1, µ2, . . . , µk and Σ1, Σ2, . . . , Σk, respectively. The EM algorithm generally has two main steps, i.e., the Expectation step (E-step) and the Maximization step (M-step) [27].

The Expectation
Step (E-step) In this step, using randomly initialized parameters, for every point x j , we obtain the likelihood of belonging to a certain cluster c 1 , c 2 , ..., c k . This is achieved using Equation (3).
This value would be high if the point is allocated to the correct class, or vice versa.

The Maximization
Step (M-step) In this step, the parameter λ is updated as follows: 1.
The weight is updated using Equation (4), which is the ratio of cluster points to the overall number of points.
w c = Number o f points assigned to a cluster Total numbero f points 2.
Then, the covariance and the mean values are modified using Equations (5) and (6) in relation to the probability values for the data point and based on the values assigned to that particular distribution. Therefore, any data point having a high probability of being a member of the distribution should be contributing a higher portion.
Based on the updated parameters the E-step is repeated. These two steps are iterated until the optimal parameters are obtained using the log-likelihood function as described in [3].

Data Processing
Data pre-processing is an important step in machine learning models. To explain the data pre-processing steps, dataset A at the first interval (A1) is used. Sample A1 analysis is summarized in Table 2. The dataset has 10,139 observations with minimum values of Phi (phase angle) and Q (PD amplitude) are −1.790 × 10 −5 and −7.220 × 10 −11 , respectively. The maximum values of Phi and Q are 9.997 × 10 −1 and 8.880 × 10 −11 , respectively. In the data pre-processing step, the data is transformed using the normalized function available in MATLAB, which transforms the data with a mean of 0 and standard deviation of 1. The transformed variables Phi (Stdscale) and Q (Stdscale) are used by the clustering algorithm, which has a mean of 0 and standard deviation of 1 as shown in Table 2. GMM clustering was applied to generate clusters for every training pattern from the initial stage to the breakdown of the insulation, for each dataset. The overall aim was to be able to understand and classify different PD patterns during the stages of tree growth. The important hyperparameters in GMM are the number of clusters k and appropriate covariance structure Σ. In GMM the covariance structure includes a covariance matrix, which can be diagonal or full, and the nature of the covariance matrix, which can be shared or unshared. When a diagonal covariance matrix is chosen, the minor and major axes of the confidence ellipsoids drawn over the clusters are parallel or perpendicular to the x and y axes. When a full covariance matrix is chosen, there is no restriction to the orientation of the minor and major axis of the confidence ellipsoids drawn over the clusters. A shared covariance matrix indicates that all confidence ellipsoids have the same size and orientation, whereas an unshared covariance matrix indicates different sizes and shapes of the confidence ellipsoids. Choosing the appropriate hyperparameter is a very important task, and the method adopted is presented in the next section.

Hyperparameter Tuning of GMM
The number of cluster components k and appropriate covariance structure Σ is unknown for each stage of electrical treeing in Samples A−E. The most commonly used technique to tune the hyperparameters is by comparing the Akaike information criterion (AIC) and Bayesian information criterion (BIC). AIC is the relative distance between the unknown true likelihood function of the data and the fitted likelihood function of the model. A lower AIC means the model is closer to reality. BIC is an estimate function of the posterior probability of a model being true under certain assumptions, so a lower BIC means the model is bound to be the genuine model. A detailed discussion on the importance of AIC and BIC related to model selection is available in [28]. The procedure to choose the optimal values for the hyperparameters namely the number of clusters k and appropriate covariance structure Σ is shown in Figure 5 and was coded using MATLAB. A regularization value of 0.01 is specified and the EM iteration is specified as 10,000 in order to avoid ill-conditioning of the covariance matrix during EM iteration. The number of clusters k takes a value from 1 to 12.
The procedure shown in Figure 5 is performed for all fifty datasets. The procedure is explained for the initial stage of electrical treeing in Sample A in the first interval and the same procedure is applicable for other datasets. For sample A1, the AIC and BIC values are summarized in Tables 3 and 4, respectively. The bar plots grouped by the number of clusters are shown in Figures 6 and 7, respectively. From the bar plot it is very clear that when the number of clusters is more than 1, the AIC and BIC values decrease, and after a specific number of clusters is reached, the variation of AIC values is significantly low. The point where the change in AIC value declines the most is the elbow point. The elbow point in Figures 8 and 9 was determined by plotting a curb, and corresponds to the number of clusters k = 4 and covariance structure which is full and unshared. These are the best hyperparameters for dataset A1. The same procedure has to be repeated in all datasets and the best hyperparameter for each dataset in samples A to E are provided in Tables 5-9, respectively.           According to Tables 5-9, the minimum value of k is 4 and its maximum value is 8. The nature of the covariance matrix is mostly full with an unshared structure and rarely diagonal with an unshared structure. The values of hyperparameters selected for the GMM are shown in Table 10.
The shared covariance is false, which indicates a non-identical or unshared covariance matrix. The grid length is important in order to draw the confidence ellipsoids over the clusters. Grid length and the number of iterations for the EM algorithm are selected by trial and error.

GMM Results and Discussions
The six clusters with their centers and confidence ellipsoids as shown in Figure 10 indicate GMM models for the PRPD patterns of the initial stage (interval 1) of electrical treeing in Samples A, B, C, D and E. From Figure 10, it is clear that the ellipsoids are of different sizes and there are no restrictions to the orientation of their minor and major axis. Figure 11 shows the six clusters with their centers and confidence in the final stage (interval 10) of electrical treeing in Samples A, B, C, D and E ,which is close to the insulation breakdown. An interesting observation from Figure 11 is that none of the clusters are circular in shape and, therefore, K-means clustering could not be used here since it is not built to account for other shapes and a circular fit would be a bad fit to the data. In samples C and D, cluster overlapping can be clearly appreciated. The GMM algorithm produces the cluster centers based on the shape of the PRPD pattern and the PD activity and plots confidence ellipsoids with a 99% probability threshold as specified in the MATLAB program.
For each dataset sample, the normalized data is clustered into six groups, differentiated by color, using the GMM clustering. For each cluster in the two-dimensional (2D) plane, the midpoint of the cluster is also indicated in Figures 10 and 11. In each case, the Phi and Q are normalized to return the vector-wise Z score of all the datasets (i.e., Samples A−E) with center 0 and standard deviation 1. The midpoints of the six clusters for each sample are shown in Appendix A.
For each dataset, the normalized data is clustered into six groups using the GMM clustering. From Figures 10 and 11, it can be seen that the clusters are different, showing that the applied voltage and the stage of tree growth have a significant effect on the generated PD patterns. The initial and final stage clusters appear to be different for all the samples except for Sample A. For this sample, there is a similarity between the initial and the final stage clusters probably due to the continuous irregular trend in PD because of the formation of filamentary trees. The variance of the data in each cluster appears to be higher in the final tree stages when compared to the initial stages. This might be the PD mechanism over time and the increase in the number of PD events as the tree approaches the breakdown stage. It can be seen that the PRPD pattern clusters for all 5 samples at the initial stage appeared to be centered at different positions despite the fact that the applied voltages are close to each other. This shows that, even for the same applied voltage, the PD phenomenon in trees exhibits complex behavior and does not show any clear trend, making it difficult to evaluate in some instances. These conclusions can also be associated with the final stages of the treeing patterns in Figure 11.  As a further analysis of the GMM technique, Appendices A and B shows the cluster centers and their mean values/variances for all the samples (A−E), i.e., from the initial to the final stages of tree growth. In Appendix A, the cluster centers for the different sample datasets are shown. It can be seen that for samples A, B, and C, the positioning of the clusters from the beginning to the breakdown level varies significantly, although the first and the third cluster for all the samples A−E, fall within close range with insignificant variance in the X-axis in Appendix B. However, cluster 2 evidenced higher variance for samples A−C. This might be due to the stochastic nature of the PD mechanism of electric trees and the fact that the occurrence of the PD patterns within the first and last half of the AC power cycle are similar. Furthermore, it is interesting to note that the patterns for samples D and E have very similar pattern trends in the initial stage and the final breakdown stage, with insignificant variance among all the cluster shapes (see Appendix B, Table A6). This is a clear indication that with a lower breakdown time of insulation, treeing patterns are identical, and can easily be categorized. In the case of samples A−C, the results showed wider PD variability among the samples, showing their distinct nature. It was difficult to clearly differentiate between the initial and breakdown stage of the PD patterns. In addition, most of the pattern cluster centers show lower variance along the x-coordinates as compared to the y-coordinate, showing higher amplitude variation for the PD patterns as compared to the phase changes. The information in appendix B can serve as a statistical tool to predict and identify the applied voltages of the treeing patterns and their breakdown times for electrical trees in epoxy resin insulation. Assuming that there are new samples, they need to be classified as belonging to any of the confidence intervals of the samples in Appendix B and can be regarded as that particular sample.
In general, the results imply that the GMM can classify different degradation stages of the treeing patterns up to breakdown for samples having breakdown times higher than one hour, while it cannot effectively perform the same function for samples with shorter breakdown times, e.g., half an hour and lower. This might be because the PD mechanism has not been allowed to develop and it bridges the insulation in a shorter time, while more time before breakdown refers to a wider spread of treeing branches and more deterioration of the insulating material. To a certain extent, it can be said that the information in Appendix B almost correlates with the breakdown times but not necessarily the applied electrical stress across the sample defects. Although voltage can be regarded as one of the factors affecting the tree growth shape and structure in the cases analyzed here, the insulation breakdown time appears to be a key factor in determining the degradation levels of the PD patterns emanating from electrical trees.

Electrical Tree Pattern Recognition
As GMM is an easy and efficient technique for clustering, it has been shown, in the previous section, to aid the classification of electrical tree patterns. The fundamental flowchart for the electrical tree pattern recognition is shown in Figure 12. First, the data is captured and subsequent PRPD patterns are formed, followed by the data processing and the GMM clustering. The recognition of the tree growth clusters can be done by PD pattern judgment, i.e., comparing the clusters with the already established cluster confidence intervals in Appendix B. In this case, the level of the tree growth can be known, i.e., the initial, final, or other stages of degradation can be determined.

Conclusion
In this paper, the GMM has been utilized for clustering and classification applications in electrical trees emanating from epoxy resin insulation. Different PD samples were captured at different voltages from the initial to the final breakdown stage. The results show that PD dynamics vary with different stressing voltages and with the level of the tree growth. Depending on the sample and the applied electrical stress, there are different breakdown times. The GMM is chosen over other techniques in this work because it is robust and can perform hard clustering for complex data such as electrical tree patterns. The results clearly indicate that GMM can effectively classify patterns from the initial to the breakdown level for breakdown times above an hour, but not breakdown times of less than an hour, as the ones obtained with samples stressed at the highest stressing voltage of 16 kV. The PD patterns for shorter breakdown times possess identical clusters through the degradation stages. In this paper, the cluster centers and their confidence intervals have been developed to recognize the PD patterns in electrical trees at different stages ranging from the initial to the breakdown stage. However, the results presented in this paper can be further validated by experimenting with different samples captured at different voltages and breakdown times. Further research can also be conducted for different insulating materials such as polyethylene or cross-linked polyethylene to ascertain the efficiency of the proposed classification tool.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.