Influence Analysis and Prediction of ESDD and NSDD Based on Random Forests

Equivalent salt deposit density (ESDD) and non-soluble deposit density (NSDD) measurements are a basic requirement of power systems. In order to predict the site pollution severity (SPS) of insulators, a new method based on random forests (RFs) is proposed. Using mutual information (MI) theory and RFs, the weights of the factors related to the SPS of insulators are analyzed. The samples of contaminated insulators are taken from high voltage alternating current (HVAC) and high voltage direct current (HVDC) transmission lines. Regression models based on RFs and support vector machines (SVM) are constructed and compared, which helps to address the lack of information on NSDD prediction in previous works. The results are as follows: according to the mean decrease accuracy (MDA), mean decrease Gini (MDG), and MI, the insulator type (including surface area, surface orientation, and total length) and the hydrophobicity are the main factors affecting both ESDD and NSDD. The electrical parameters have a more significant effect on ESDD than on NSDD. For the influence factors of ESDD, the weights of the insulator type, hydrophobicity, and meteorological factors are 52.94%, 6.35%, and 21.88%, respectively. For the influence factors of NSDD, the weights of the insulator type, hydrophobicity, and meteorological factors are 55.37%, 11.04%, and 14.26%, respectively. The influence that voltage level (vl), voltage type (vt), and polarity/phases (pp) exert on ESDD is 1.5 times, 3 times, and 4.5 times that on NSDD, respectively. The influence that distance from the coastline (d), wind velocity (wv), and rainfall (rf) exert on NSDD is 1.5 times, 2 times, and 2.5 times that on ESDD, respectively. Validated against a natural contamination test and compared with the SVM regression model, the RFs regression model can effectively predict the contamination degree of insulators, and the relative errors of the predicted ESDD and NSDD are 8.31% and 9.62%, respectively.


Introduction
Research on insulator natural contamination is a basic requirement of external insulation. The contamination degree of insulators is a result of the actual operating environment, which can reflect the pollution resistance characteristics of insulators under natural conditions [1]. Due to the complex working environment of insulators, the natural contamination characteristics of the insulator are difficult to depict with mathematical expressions.
At present, machine learning algorithms are widely used in the study of the natural contamination characteristics of insulators. Jiao et al. combined particle swarm optimization (PSO) and support vector machine (SVM) to build an insulator contamination on-line monitoring system to predict the contamination degree, and the relative error was less than 10% [2]. Ahmad et al. modeled the relationship between equivalent salt deposit density (ESDD) and temperature, humidity, pressure, rainfall, and wind velocity using artificial neural networks (ANN), and the mean absolute error of the model output was found to be 3.6% [3]. Meanwhile, Karamousantas et al. provided a foundation for the routine maintenance of insulators using the same algorithm, and the relative error was less than 17% [4]. Muniraj et al. used the characteristics of leakage current as an input and predicted ESDD with an adaptive neurofuzzy inference system (ANFIS) model, whose coefficient of determination was 0.998 and root mean square error was just 0.00323 [5].
In recent years, the random forest algorithm has been widely used in the field of power systems. Hannan et al. proposed that the performance of the RFs-based space vector pulse width modulation (SVPWM) technique is superior to both ANN-SVM and ANFIS-SVM techniques in terms of damping capability, settling time, steady-state error, and transient response under different operating speeds and load conditions [6]. Samantaray et al. proposed the identification of the fault zone in flexible alternating current transmission systems (FACTS)-based transmission lines using random forests (RFs), with an accuracy and reliability of more than 99% [7]. Shah et al. proposed an RFs-based fault discrimination technique for power transformers, and the fault discrimination accuracy was more than 98% [8]. Meanwhile, Kannan et al. proposed an RFs classifier whose identification rate lies above 98% under all pollution conditions [9]. Larivière and Van den Poel found that both random forests and regression forests techniques provide a better fit for the estimation and validation samples than ordinary linear regression and logistic regression models, and the prediction accuracy was 95% [10]. Therefore, the study of random forest algorithms is essential in the field of external insulation.
However, the following shortcomings exist in the studies of site pollution severity (SPS) prediction. Firstly, the influencing factors considered differ between studies, and there is no weight analysis of the factors related to natural contamination. Secondly, only ESDD is predicted by all of the above algorithms; NSDD is ignored, although according to IEC 60815, NSDD is also an indispensable factor [11]. Lastly, RFs-based natural pollution prediction has not been compared with other algorithms, and there is no validation against an actual natural contamination test.
In this paper, 16 factors that are related to ESDD and NSDD are proposed, and their weights are analyzed using mutual information (MI) and RFs. Based on the most related factors, a new regression method that uses RFs for function estimation is established. Experiments show that the method is superior to the SVM regression model in terms of accuracy. The prediction results of this method have been verified with actual data from insulators on a transmission line on the east coast of China.

Insulator Parameters
In order to study the influence of voltage type and voltage level on the natural accumulation of contamination on composite insulators, the surrounding insulators of HVAC transmission lines are sampled. The voltage level of the transmission line and the parameters of the insulators are shown in Table 1. As the contamination degree of the insulators is not uniform [12], the parameters of each insulator sample consist of four values: ESDD of the upper surface, NSDD of the upper surface, ESDD of the lower surface, and NSDD of the lower surface. Since the insulator string of the transmission line is composed of a plurality of insulators, the potential of each insulator is different, so the insulators in each string are numbered from 1 to n according to their potential. The sampling diagrams of the suspension insulators and the long rod composite insulator are shown in Figure 1. The natural contamination data presented in this paper were sampled from transmission lines in East China. In order to reduce the measurement error, the insulators were removed from the ground, and non-woven cloth (with a small amount of ethanol solution) was used to wipe off all the contamination. The method introduced in [11] was used to measure the ESDD and NSDD of each insulator.

The Influences of the Natural Contamination
Meteorological factors are evaluated using one-year average values, such as the number of days with a wind velocity above 5.5 m/s, the number of days of medium to heavy rain, the annual average temperature, and so on.
The SPS is determined by ESDD and NSDD under various factors. At present, the difficulty in modelling natural contamination lies in the selection criteria for the related factors. According to references [12-16] and the actual working conditions, the following factors are chosen to evaluate the contamination of insulators: (1) Contamination factors: deposition time (dt), hydrophobicity (HC), particle size (ps); (2) Meteorological factors: altitude (a), distance from coastline (d), rainfall (rf), temperature (t), and wind velocity (wv); (3) Insulator type: material (m), position factor (pf), surface area (sa), surface orientation (so), total length (n); (4) Electrical factors: voltage type (vt), voltage level (vl), polarity/phases (pp). Taking the factors shown in Table 2 as inputs, RFs was used to calculate the weights.

Concept of RFs
RFs is a statistical learning algorithm that adopts the bootstrap resampling method to extract multiple samples from the original sample. Firstly, a decision tree is constructed for each bootstrap sample. Then the predictions of the multiple decision trees are combined by voting to obtain the final prediction [17]. This method has high prediction accuracy and good tolerance for abnormal values and noise, and it is not prone to over-fitting. After the influencing factors have been obtained, the ESDD and NSDD prediction model is built using the random forests regression (RFs-R) model. The modelling process is shown in Figure 2.
There are two methods for calculating the weights of RFs variables: one is based on the mean decrease accuracy (MDA); the other is based on Gini impurity and is called the mean decrease Gini (MDG). The larger the decrease in either of these two measures, the greater the weight of the corresponding factor [17].
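The resampling and combination steps described above can be sketched in a few lines. This is only an illustrative sketch using the Python standard library, not the MATLAB implementation used in this paper; the helper names (`bootstrap_indices`, `out_of_bag`, `aggregate`) are hypothetical, and the combination step is assumed here to be a simple average of the tree outputs, which is the usual choice for regression.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def out_of_bag(n_samples, in_bag):
    """Indices that were never drawn into the bootstrap sample."""
    drawn = set(in_bag)
    return [i for i in range(n_samples) if i not in drawn]

def aggregate(tree_predictions):
    """Combine per-tree regression outputs (assumption: plain average)."""
    return sum(tree_predictions) / len(tree_predictions)

rng = random.Random(0)          # fixed seed so the sketch is repeatable
n = 10                          # hypothetical number of insulator samples
bags = [bootstrap_indices(n, rng) for _ in range(3)]   # one bag per tree
oobs = [out_of_bag(n, bag) for bag in bags]            # held-out rows per tree
```

Each tree would be trained on the rows in its bag, while the out-of-bag rows serve as that tree's private test set.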

Mean Decrease in Gini
Suppose that S is a set of s data samples whose class label attribute takes m different values and thus defines m different classes $C_i$ ($i = 1, \ldots, m$). According to the class label values, S can be divided into m subsets $S_i$ ($i = 1, \ldots, m$), where $S_i$ is the set of samples belonging to class $C_i$ and $s_i$ is the number of samples in $S_i$. The Gini index of set S is:

$$\mathrm{Gini}(S) = 1 - \sum_{i=1}^{m} P_i^2$$

where $P_i$, estimated by $s_i/s$, is the probability that any sample belongs to $C_i$. When Gini(S) is 0, all the records in the set belong to the same category. When the samples in S are evenly distributed over the classes, Gini(S) reaches its maximum, indicating that the least useful information can be obtained. If the data set S is divided into n subsets $S_j$ ($j = 1, \ldots, n$) by partitioning on a certain attribute, the Gini index after splitting is:

$$\mathrm{Gini}_{split}(S) = \sum_{j=1}^{n} \frac{s_j}{s}\,\mathrm{Gini}(S_j)$$

where n is the number of child nodes, $s_j$ is the number of records at child node j, and s is the number of records at node P. After the classification method has traversed all of the attributes, the attribute whose $\mathrm{Gini}_{split}$ is the smallest is selected as the split attribute of the node.
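As a small worked illustration of the two quantities defined in this subsection, the following sketch computes the Gini index of a label set and the Gini index after a split. The function names and the toy labels are hypothetical and chosen only for the example.

```python
from collections import Counter

def gini(labels):
    """Gini index of a set: 1 minus the sum of squared class proportions."""
    s = len(labels)
    if s == 0:
        return 0.0
    return 1.0 - sum((count / s) ** 2 for count in Counter(labels).values())

def gini_split(subsets):
    """Size-weighted Gini index over the child subsets of one split."""
    s = sum(len(subset) for subset in subsets)
    return sum(len(subset) / s * gini(subset) for subset in subsets)

pure = gini(["a", "a", "a"])                          # single class
mixed = gini(["a", "b"])                              # two classes, evenly mixed
good_split = gini_split([["a", "a"], ["b", "b"]])     # children are pure
poor_split = gini_split([["a", "b"], ["a", "b"]])     # children stay mixed
```

The split with the smaller value (`good_split`) is the one that would be preferred at a node.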

Mean Decrease in Accuracy
The order of the values of each feature is shuffled, and the impact of this change of order on the accuracy of the model is measured. For unimportant variables, shuffling has little effect on the accuracy of the model, whereas for important variables it reduces the accuracy.
First, the RFs model is trained and the out-of-bag (OOB) error of each tree in the model is tested using the OOB sample data. Then, the values of the variable v in the OOB sample data are randomly shuffled and the OOB error of each tree is retested. Finally, the weight measure of a single tree for the variable v is obtained as the mean value of the difference between the OOB errors of the two tests.
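The shuffling procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: the error measure is taken to be the mean squared error, `toy_tree` is a hypothetical stand-in for one trained tree, and the rows play the role of that tree's OOB data. None of these names come from the paper.

```python
import random

def mean_squared_error(predict, rows, targets):
    """Average squared difference between predictions and targets."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(predict, rows, targets, column, rng):
    """Increase in error after shuffling one column of the held-out rows."""
    baseline = mean_squared_error(predict, rows, targets)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mean_squared_error(predict, permuted, targets) - baseline

def toy_tree(row):
    """Hypothetical trained tree that only uses feature 0."""
    return 2.0 * row[0]

rows = [[float(i), float(i % 3)] for i in range(20)]   # toy OOB rows
targets = [2.0 * r[0] for r in rows]
rng = random.Random(1)
imp_used = permutation_importance(toy_tree, rows, targets, 0, rng)
imp_unused = permutation_importance(toy_tree, rows, targets, 1, rng)
```

Averaging this per-tree difference over all trees in the forest would give the MDA-style weight of the variable; the feature the tree ignores receives an importance of exactly zero.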

RFs-R Model
The RFs-R model used here is written in MATLAB [18]. Following the research of Larivière and Van den Poel on profit evolution estimation in the field of economics, RFs can be trained to predict the SPS of insulators. RFs has two major parameters: $n_{tree}$, the number of trees, and $m_{try}$, the number of variables used in determining the best split at each node. Leo Breiman [17] explained that the optimal value of $m_{try}$ usually lies between $\log_2 v$ and $\sqrt{v}$, where v is the number of features. Larivière and Van den Poel chose a total of 30 explanatory variables and set the parameters of RFs-R as $n_{tree} = 5000$ and $m_{try} = 6$. In this paper there are 16 explanatory variables, for which $\log_2 16 = \sqrt{16} = 4$, so $m_{try}$ is set to 4.
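The bounds quoted for m try are easy to check numerically. The short sketch below evaluates both for the 16 factors used in this paper; the function name `mtry_range` is hypothetical.

```python
import math

def mtry_range(n_features):
    """Return (lower, upper) bounds given by log2(v) and sqrt(v)."""
    a = math.log2(n_features)
    b = math.sqrt(n_features)
    return min(a, b), max(a, b)

low, high = mtry_range(16)   # 16 influencing factors in this paper
```

For 16 features the two bounds coincide at 4, which matches the value of m try adopted here.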
The following indicators are used to evaluate the RFs regression model [19]: $R^2$, the goodness of fit of the model, and the mean percent standard error (MPSE), an accurate indicator of the fitting effect. Here n represents the total number of insulators involved in the prediction, $\hat{y}_i$ is the predicted output of the trees in the forest for the given input sample $x_i$, and $y_i$ is the observed output. The parameter $n_{tree}$ can be set according to the minimum MPSE [17].
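The two indicators can be illustrated with a short sketch. The coefficient of determination below follows its standard definition. The exact MPSE expression is not reproduced in this text, so the percentage error here is an assumption: it is taken to be the mean absolute relative deviation in percent, and it should be replaced by the definition in [19] where that differs. The function names and sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

def mean_percent_error(observed, predicted):
    """ASSUMED form of a percentage error: mean of |yh - y| / |y|, in percent."""
    n = len(observed)
    return 100.0 * sum(abs(yh - y) / abs(y) for y, yh in zip(observed, predicted)) / n

observed = [0.10, 0.20, 0.40]     # hypothetical measured values, mg/cm^2
predicted = [0.11, 0.18, 0.40]    # hypothetical model outputs
fit = r_squared(observed, predicted)
err = mean_percent_error(observed, predicted)
```

A forest size would then be chosen by evaluating such an error for several candidate values of n tree and keeping the one with the smallest error.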

Concept of Mutual Information
There are different correlations between the SPS of the insulators and the influencing factors, and the degree to which the same influencing factor affects the SPS of different insulator strings also differs. Using the theory of mutual information (MI), the factors that are highly correlated with ESDD and NSDD can be selected [20].
The information about a system X obtained once a system Y is known can be represented by the difference between the unconditional entropy and the conditional entropy, which is defined as the mutual information of X and Y. X is the decision attribute and Y is the condition attribute.
For discrete data,

$$I(X;Y) = H(X) - H(X \mid Y) = -\sum_{i} p(x_i)\log p(x_i) + \sum_{j} p(y_j) \sum_{i} p(x_i \mid y_j)\log p(x_i \mid y_j)$$

where $p(x_i)$ is the probability of $X = x_i$, $p(y_j)$ is the probability of $Y = y_j$, and $p(x_i \mid y_j)$ is the conditional probability of $X = x_i$ when $Y = y_j$. In order to calculate the probabilities, the sample data should be discretized. Discretization converts each specific value into an interval that can be assigned a probability; although it loses detail, the result is more statistically meaningful. The specific method for discretizing each condition attribute and decision attribute is to find the maximum and minimum values in each attribute domain and divide the distance between them into w intervals. Each value is placed in the corresponding interval, and the number of values in each interval is counted.
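The discretization and entropy difference described above can be sketched as follows. This is an illustrative sketch only: the logarithm base (2) and the helper names are assumptions, and the toy sequences are not data from this study.

```python
import math
from collections import Counter

def discretize(values, w):
    """Map each value to one of w equal-width intervals between min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / w
    return [min(int((v - lo) / width), w - 1) for v in values]

def entropy(symbols):
    """Unconditional entropy H(X) in bits (base 2 is an assumption)."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())

def conditional_entropy(x, y):
    """H(X|Y): entropy of X within each value of Y, weighted by p(y)."""
    n = len(x)
    total = 0.0
    for y_val, count in Counter(y).items():
        x_given_y = [xi for xi, yi in zip(x, y) if yi == y_val]
        total += count / n * entropy(x_given_y)
    return total

def mutual_information(x, y):
    """I(X;Y) = H(X) - H(X|Y) for already discretized sequences."""
    return entropy(x) - conditional_entropy(x, y)

bins = discretize([0.0, 5.0, 10.0], 2)
mi_dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_independent = mutual_information([0, 1, 0, 1], [0, 0, 1, 1])
```

In practice, each factor and each of ESDD and NSDD would first be passed through `discretize` with the chosen w before the mutual information is computed.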

The Weights of the Related Factors-RFs
Figure 3a,b shows the MPSE between the predicted and actual values based on the RFs when $n_{tree}$ takes different values. Overall, the MPSE of the prediction model decreases as $n_{tree}$ increases. However, when the value of $n_{tree}$ is large, an overly fine classification leads to a rapid increase in the amount of computation. Considering both the modelling speed and the prediction error, $n_{tree,ESDD} = 45$ and $n_{tree,NSDD} = 66$ are chosen as the numbers of decision trees. The RFs overcome the shortcomings of traditional variable selection methods (choosing one or two variables in a group of variables that are equally highly correlated).
As can be seen from Figure 4, the importance of the different factors varies considerably. The importance of the factors given by the RFs shows that the dominant factor is the insulator type, and the smallest influencing factor is the contamination deposition time. From the perspective of the insulators' maintenance data during spring, although the meteorological factors affect the natural pollution of the insulators, the effect is not obvious. As can be seen from Figure 4a, the related factors that affect ESDD are: the insulator surface area (sa), the position factor of the insulator (pf), the insulator voltage level (vl), the orientation of the surface (so), the hydrophobicity after contamination (HC), and the polarity/phase (pp) of the line. By comparison, the factors that affect NSDD are mainly the insulator surface area (sa), the orientation of the surface (so), and the hydrophobicity after contamination (HC), as can be seen from Figure 4b. The electrical factors of the insulators have less of an impact on NSDD.

The Importance of the Related Factors-Mutual Information
For the insulator string $G_k$ ($k = 1, \ldots, n$), it is assumed that the ESDD and NSDD data sequences of the p insulator samples constitute the data set:

$$X_D = \begin{bmatrix} x_{esdd,1} & x_{esdd,2} & \cdots & x_{esdd,p} \\ x_{nsdd,1} & x_{nsdd,2} & \cdots & x_{nsdd,p} \end{bmatrix}$$

The data sequences of the l latent correlation factors constitute the data set $Y_D = \{Y_1, Y_2, \ldots, Y_l\}$, and the mutual information between $X_D$ and each correlation factor in $Y_D$ is denoted $I_{ESDD}$ and $I_{NSDD}$, where $x_{esdd,i}, x_{nsdd,i} \in X_D$ and $Y_j \in Y_D$. Randomly selecting p = 50 samples and taking l = 16, heat maps were drawn based on $I_{ESDD}$ and $I_{NSDD}$, as shown in Figure 5. In Figure 5, the horizontal axis represents the 16 related factors listed in Table 2, and the vertical axis represents the ESDD and NSDD, respectively. Each colour block represents the magnitude of the mutual information for the corresponding factor. The greater the mutual information, the stronger the correlation.
If only the mutual information between the SPS of a single insulator and the correlation factors is analysed, it can be found that there are individual differences in the colour distribution of each row. However, when the results from many insulators are integrated, the overall colour distribution captures the common characteristics of the correlation between SPS and each factor, and the strong correlation factors that affect the SPS of the insulators can be determined.
It can be seen from Figure 5a that the hydrophobicity ($Y_3$), surface area ($Y_{11}$), and surface orientation ($Y_{12}$) have a strong correlation with ESDD. Material ($Y_9$), total length ($Y_{13}$), polarity/phase ($Y_{14}$), and voltage level ($Y_{15}$) have a significant impact on ESDD. It can be seen from Figure 5b that the influencing factors of hydrophobicity ($Y_3$), surface area ($Y_{11}$), surface orientation ($Y_{12}$), and total length ($Y_{13}$) have a strong correlation with NSDD.

Natural Contamination Tests
The regression models of RFs and SVM have been compared in this paper. Two-thirds (2/3) of the data are randomly selected as the training sample set of the RFs-R model, and the support vector machine regression (SVM-R) model is established on the same training sample set. Both regression models are tested with the remaining 1/3 of the data as the test sample set.
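The random two-thirds/one-third partition described above can be sketched as follows. The function name, the fixed seed, and the integer placeholders standing in for samples are hypothetical; the point is only that both models share one and the same split.

```python
import random

def split_two_thirds(samples, seed=0):
    """Shuffle a copy of the samples and cut it into 2/3 train, 1/3 test."""
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]

train, test = split_two_thirds(list(range(9)))   # 9 placeholder samples
```

Reusing the same `train` and `test` lists for both regressors keeps the comparison between them on equal footing.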
Because the test sample set is large, the actual ESDD and NSDD values of two randomly selected insulator strings were compared with the values forecasted by RFs-R and SVM-R. It can be seen from Figure 6 that the forecasted ESDD and NSDD trends are consistent with the actual ESDD and NSDD. Therefore, both RFs-R and SVM-R can be used for the prediction.

Discussion
In this section, the effects of electric field factors, meteorological factors, and contamination on the SPS of insulators are analysed. At the same time, RFs is used to quantitatively analyse the weights of the relevant factors.

Potential Distribution of the Insulators
Based on the finite element method, the electric field and the potential distribution of the insulators were simulated. Figure 7 shows the results of the two-dimensional simulation of the suspension porcelain insulators, where Figure 7a,b shows the potential distribution and the electric field distribution of the insulator, respectively. Since there are 60 insulators on the ±660 kV DC transmission line, the voltage drop across each insulator is about 11 kV. In the simulation, the excitation voltage of the steel pin is set to 11 kV, and the excitation voltage of the steel cap is set to 0 kV. According to Figure 7, the potential and electric field distributions of the upper and lower surfaces of the insulator are significantly different, which corresponds to the actual contamination of the suspension insulator shown in Figure 8 and reflects the importance of the surface orientation (so).
Figure 9 shows the two-dimensional electric field and potential distribution simulation of the suspension insulator string and the composite insulators. It can be seen from Figure 9b,d that the electric field strength at both ends of the insulator is high, and the field strength first decreases and then increases along the direction of the insulator string. From Figure 9a,c we can see that the potential difference near the two ends of the insulator is very large, and the potential difference in the middle part of the string is very small. Therefore, the total length (n) and the position factor (pf) of the insulator string have a large effect on the electric field strength and potential of the insulator, which indirectly affects ESDD.

Contamination Analysis
From the cumulative frequency point of view, as shown in Figure 10a, the D90 of the composite insulators is 23.26 at the anode and 21.90 at the cathode; the D90 of the porcelain insulators is 41.54 at the anode and 33.33 at the cathode; and the D90 of the insulators coated with room temperature vulcanized silicone rubber (RTV) is 32.62 at the anode and 30.99 at the cathode. The D90 of the contamination particle size follows the order: composite insulators < insulators coated with RTV < porcelain insulators. The composite insulator has a stronger ability to adsorb small particles than the porcelain insulators, and the insulator material has a great influence on the particle size of the contamination. The insulator string under the cathode adsorbs small particles more easily than the insulator string under the anode. The D90 at the cathode is lower than that at the anode by 5.85% for the composite insulators, by 19.76% for the porcelain insulators, and by 5.00% for the insulators coated with RTV.
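The percentage differences quoted above follow directly from the D90 values. The sketch below recomputes them, assuming that the first value of each pair is the anode value and the second the cathode value; the dictionary layout and function name are hypothetical.

```python
# (anode D90, cathode D90) for each insulator type, values as quoted in the text
d90 = {
    "composite": (23.26, 21.90),
    "porcelain": (41.54, 33.33),
    "RTV-coated": (32.62, 30.99),
}

def relative_reduction(anode, cathode):
    """Percentage by which the cathode value is lower than the anode value."""
    return 100.0 * (anode - cathode) / anode

reductions = {
    name: round(relative_reduction(anode, cathode), 2)
    for name, (anode, cathode) in d90.items()
}
```

With this reading, the computed reductions reproduce the 5.85%, 19.76%, and 5.00% figures stated in the text.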
From the frequency point of view, as shown in Figure 10b, the maximum particle size of the composite insulators is 7.83 µm at the cathode and 8.77 µm at the anode; the maximum particle size of the RTV-coated insulators is 14.2 µm at the cathode and 12.3 µm at the anode; and the maximum particle size of the porcelain insulators is 18.1 µm at the cathode and 22.3 µm at the anode. The maximum particle size of the contamination follows the same order as the cumulative frequency: composite insulators < insulators coated with RTV < porcelain insulators. The effect of contamination on the organic materials is greater than that on the inorganic materials, and the particle size of the contamination is smaller.
Table 3 shows the particle size distribution of the contamination, where D10 refers to the particle size corresponding to a cumulative probability of 10%, and P < 3 refers to the cumulative probability of a particle size of less than 3 µm. It can be seen from Table 3 and Figure 10 that the particle size distribution of the contamination on the composite insulators is concentrated in the range of small particles below 10 µm, while the contamination on the other two kinds of insulators also has a frequency distribution above 10 µm. The particle size distribution of the contamination is affected by the insulator type, and the suspension insulators have a wider range of particle sizes than the rod insulators. From the above analysis, the insulator polarity, type, and material all affect the contamination particle size distribution, although the effect is not pronounced.

Meteorological Factors
The following figures show the impact of the related factors on ESDD and NSDD. The trend of the SPS of the insulators is qualitatively analysed according to Figure 11. The ranges of the related factors are shown in Table 4.
From Figure 11a we can see that the distance factor (d) has a significant influence on the NSDD and little influence on the ESDD. With the increase of the distance factor (d), the NSDD of the lower surface shows an increasing trend. The NSDD within 5 km of the coastline is significantly lower than the NSDD in the inland area. The maximum values of ESDD and NSDD in all samples appear on the lower surface of the insulator string farthest from the coastline. Comparing Figure 11a-c shows that, since the distance factor (d) has a linear relationship with altitude (a) and temperature (t), their effects on SPS are basically the same.
As can be seen from Figure 11d, the larger the annual average rainfall (rf) is, the greater its impact on ESDD and NSDD. The ESDD and NSDD in regions with less rainfall are generally higher than those in heavy rainfall areas. With the increase of rf, the ESDD of the upper surface gradually decreases, indicating that the rainfall scours the upper surface of the insulator. In contrast, the ESDD of the lower surface first decreases and then increases. We believe that the samples with rf ≤ 40 days are mainly concentrated in the inland areas, where the effect of scouring is obvious, leading to a downward trend of ESDD. As rf increases further, the corresponding samples are concentrated in the coastal areas, and the residual electrolyte on the surface increases because of the rain. Since it is difficult for rain to scour the lower surface of the insulators, the electrolyte gradually accumulates, resulting in a rise in the salt density of the lower surface.
It can be seen from Figure 12a that the wind has an obvious effect on the ESDD. With the increase of wv, ESDD shows a decreasing trend. The relationship between the NSDD of the upper surface and the wind is complicated. We believe that the contamination can be easily carried by wind, causing uneven accumulation of contamination. As can be seen from Figure 12b, the SPS tends to increase first and then decrease with the increase of dt. Based on the statistical results for the insulators, the SPS first reaches saturation in the second year.

The Hydrophobicity of the Insulators
One suspension insulator and one composite insulator were chosen as samples, and the hydrophobicity classification (HC) was measured based on the method of reference [21].
Figure 13 shows the hydrophobicity of the composite insulators. For both the upper and the lower surface, the HC results of the composite insulators are mostly HC2 and HC3, as can be seen in Figure 14a,b, indicating that the composite insulators have good hydrophobicity. On the whole, the poorly hydrophobic points appear near the ends of the composite insulator, which corresponds with the distribution of the high electric field intensity in Figure 9d. It can be seen in Figure 15a that the upper surface of the suspension insulator has good hydrophobicity, concentrated in HC3-HC4. The lower surface of the suspension insulator shown in Figure 15b substantially lacks hydrophobicity. The HC of the entire string of suspension insulators is shown in Figure 16; the HC of the upper surface is maintained at HC4, and the weak hydrophobicity (HC5-HC7) of some insulators is related to the high field strength region of Figure 9b. In Figure 16b, the hydrophobicity of the lower surface is completely lost; its ESDD is 0.306 mg/cm² and its NSDD is 3.245 mg/cm², which corresponds directly to the high SPS.
According to the ratio of α to β, the influence of the material (m), position factor (pf), total length (n), polarity/phases (pp), voltage level (vl), and voltage type (vt) on ESDD is greater than that on NSDD, while the other related factors have a larger effect on NSDD. The impact of the voltage level (vl), voltage type (vt), and polarity/phases (pp) on ESDD is 1.5 times, 3 times, and 4.5 times that on NSDD, respectively. The impact of the distance from coastline (d), wind velocity (wv), and rainfall (rf) on NSDD is 1.5 times, 2 times, and 2.5 times that on ESDD, respectively. It can be seen that the electrical factors have an obvious impact on ESDD, which corresponds to the results in reference [22].
The smallest influencing factor for both ESDD and NSDD is the deposition time (dt). This is attributed to the fact that the RFs learning data are taken from annual sampling and the external influencing factors are basically the same in each fixed period, so that dt has the least impact on the SPS. On the basis of the SPS measurements, it is believed that the following inputs have high weights for the SPS forecasting of insulators: the surface area (sa), the position factor (pf), the insulator voltage level (vl), the surface orientation (so), the hydrophobicity (HC), and the polarity/phases (pp). When the NSDD is forecasted, the surface area (sa), surface orientation (so), and hydrophobicity (HC) should be given particular attention.
In order to verify the validity of the trained RFs-R model, FC160P/C170DC suspension insulators were suspended on a ±660 kV transmission line tower in 2016 for a one-year natural contamination test. A picture is shown in Figure 17. Using the trained RFs-R model, the surface area (sa), position factor (pf), insulator voltage level (vl), surface orientation (so), and polarity/phases (pp) were set as inputs to predict the ESDD, and the surface area (sa), surface orientation (so), and hydrophobicity (HC) were set as inputs to predict the NSDD. The parameters of the RFs-R model were set as $n_{tree,ESDD} = 45$, $n_{tree,NSDD} = 66$, and $m_{try} = 4$ (consistent with Section 3.1). The relative errors between the predicted and the real ESDD and NSDD are 8.31% and 9.62%, respectively, as shown in Figure 18.

Firstly, for the insulator group $G_k$ ($k = 1, \ldots, n$), a series of training samples is randomly selected from the original training sample set $S_k$ ($k = 1, \ldots, n$) by the bootstrap method. Then the test set is used to test each decision tree, the test results of the multiple decision trees are synthesized, and the final ESDD and NSDD forecast model is obtained by voting.

Figure 2. Modelling process of ESDD and NSDD forecasting based on RFs.

Figure 3. Forecasting the OOB error of RFs with different $n_{tree}$. (a) The change in MPSE of ESDD when $n_{tree}$ changes; (b) The change in MPSE of NSDD when $n_{tree}$ changes.

Figure 4. Importance analysis of influencing factors. (a) The results of ESDD; (b) The results of NSDD.

Figure 7. Two-dimensional simulation results of the suspension porcelain insulator. (a) The potential distribution of the suspension insulator; (b) The electric field of the suspension insulator.

Figure 8. The contaminated insulator. (a) The upper surface; (b) The lower surface.

Figure 9. Insulator electric field and potential distribution. (a) The potential distribution of the suspension insulators; (b) The electric field of the suspension insulators; (c) The potential distribution of the composite insulators; (d) The electric field of the composite insulators.

Figure 10. Contamination particle size. (a) The cumulative frequency of the particle size; (b) The frequency of the particle size.

Figure 11. Effects of the influencing factors on ESDD and NSDD. (a) Distance from coastline; (b) Altitude; (c) Annual average temperature; (d) Medium to heavy rain.

Figure 13. The hydrophobicity of the composite insulators. (a) The upper surface; (b) The lower surface.

Figure 14. The hydrophobicity of the composite insulator. (a) The upper surface; (b) The lower surface.

Figure 15. The hydrophobicity of the suspension insulator. (a) The upper surface; (b) The lower surface.

Figure 16. The hydrophobicity of the suspension insulators. (a) The upper surface; (b) The lower surface.

Figure 17. The diagram of the artificial natural pollution test.

Figure 18. The results of the artificial natural pollution test.

(1) The $R^2$ and MPSE of the trained ESDD RFs-R model are 0.951 and 7.98%, respectively, and the relative error of the predicted ESDD is 8.31%. The $R^2$ and MPSE of the trained NSDD RFs-R model are 0.911 and 9.04%, respectively, and the relative error of the predicted NSDD is 9.62%. Validated against the natural contamination test and compared with the SVM regression model, the RFs-R model can effectively predict the natural contamination of insulators.
(2) According to the MDA (MDG) and MI, the insulator type (including surface area, surface orientation, and total length) and the hydrophobicity are the main factors affecting both the ESDD and NSDD. The electrical parameters have a more significant effect on ESDD than on NSDD. For the influence factors of ESDD, the weights of the insulator type, hydrophobicity, and meteorological factors are 52.94%, 6.35%, and 21.88%, respectively. For the influence factors of NSDD, the weights of the insulator type, hydrophobicity, and meteorological factors are 55.37%, 11.04%, and 14.26%, respectively.
(3) The effect of electrical parameters on ESDD is greater than that on NSDD, while the other, non-electrical parameters have a significant impact on NSDD. The influence that the voltage level (vl), voltage type (vt), and polarity/phases (pp) exert on ESDD is 1.5 times, 3 times, and 4.5 times that on NSDD, respectively. The influence that the distance from coastline (d), wind velocity (wv), and rainfall (rf) exert on NSDD is 1.5 times, 2 times, and 2.5 times that on ESDD, respectively.
(4) Although considerable results have been achieved, for engineering reasons SPS has been measured only in a fixed yearly period, and on-line daily ESDD and NSDD measurements are urgently needed. Data from more locations and with higher accuracy should be collected and analysed to quantitatively reveal more robust and accurate rules.

Table 3. The particle size distribution of the contamination.

Table 4. The ranges of the related factors.

Table 5. The MDA and MDG of related factors.