A Classification Method for Transmission Line Icing Process Curve Based on Hierarchical K-Means Clustering

Icing forecasting for transmission lines is of great significance for anti-icing strategies in power grids, but existing prediction models suffer from limited applicability, weak generalization, and a lack of global prediction ability. To overcome these shortcomings, this paper proposes the concept of a segmental icing prediction model for transmission lines, in which the classification of icing processes plays a crucial role. To obtain the classification, a hierarchical K-means clustering method is utilized and 11 characteristic parameters are proposed. Based on this method, 97 icing processes derived from the Icing Monitoring System in China Southern Power Grid are clustered into six categories according to their curve shape, and the abstracted icing evolution curves are drawn based on the clustering centroids. Results show that the processes of ice events probably differ and that an icing process can be considered a combination of several segments and nodes, which reinforces the suggested concept of the segmental icing prediction model. Because they are based on monitoring data and clustering, the obtained types of icing evolution are more comprehensive and specific; the work lays the foundation for the model construction and contributes to other fields.


Introduction
Loads induced by atmospheric icing on transmission lines may lead to ice flashover, conductor galloping, even conductor breakage, tower collapse, etc., seriously threatening the reliability of the power supply and integrity of the power transmission infrastructure [1,2]. As a major hazard, ice storms have caused great damage to electric power systems and huge economic losses to citizens at home and abroad, such as China [3,4], America [5], Canada [6,7], Germany [8], Japan [9], Iceland [10], Sweden [11], Czech Republic [12], and so on, for over 50 years. As an increasing number of ultra-high voltage transmission lines pass through heavy icing areas in China [13], the online monitoring on transmission lines alone could not satisfy the demands of anti-icing in power grids.
The prediction of ice accretion on transmission lines is of great significance for electric power companies at risk to get prepared and take appropriate preventative measures [14]. Additionally, the investigation on icing evolution makes sense to the anti-icing design of transmission lines, anti-icing deployment, and emergency strategies. The purpose of icing forecasting is to predict ice thickness following some certain methods or rules based on existing information for grasping the course of the ice event in advance. less than 4.61% and 3.74%, respectively. Liu [26] divides ice thickness into three states including light, medium, and heavy icing, then applies a fuzzy Markov chain to calculate the nth state based on the transition probability matrix derived from the former n-1 data. The verification of prediction effect suggests an accuracy of 80% based on the practical measured data of annual maximum ice thickness from 1940 to 1999 in the Czech Republic, where the previous 54 years are taken as the training sample and the last five years as the testing sample. In a study by Huang [23], an autoregressive moving average model is improved and the prediction effect in the next three hours on two lines is adequate with an error of only 1.59% and 2.33%, respectively, but it gets worse when the error exceeds 5% as time goes on. A combined model of genetic algorithm and Kalman filtering is proposed in a study by Huang et al. [27], and the prediction of ice thickness during the 10th and 15th day suggests an error of under 2.58% based on the previous 10 days of data as the training sample. However, the time-dependent model cannot apply to the beginning of an event unless it has proceeded for a while. Moreover, as it depends on the previous sequence, it lacks accuracy when the ice suddenly increases or decreases due to an abrupt change of temperature. For example, it is difficult to track the downward trend after the ice reaches the maximum. 
In brief, the model is appropriate for short-term prediction but inapplicable for long-term prediction.
As mentioned above, even though many prediction models have been studied, some problems remain. The main problem is the lack of an overall view of the icing process, since the existing models mainly focus on short-term predictions. Wang [28] has pointed out that the entire icing phase needs to be considered in further research, as the safe operation of the power grid will also be affected during the melting of ice. Studying the icing growth and shedding mechanisms in different conditions, Wang [29] also agrees with the great significance of wire icing evolution for ice disaster recognition and prediction. Although the research objects differ, icing evolution is also emphasized in other fields. For example, the ice accumulated on the structures of wind turbines may result in the reduction of turbine power output [30,31]. Gantasala [32] suggests that the wind turbines need to be roughly designed to withstand blade icing by evaluating their designs with various ice masses for safe operation, which requires the study of the entire icing process for obtaining the possible maximum ice. In the aircraft industry, the degradation of aircraft performance due to ice accretion on wings is a major concern, so the study of icing evolution is relevant to detection and de-icing systems [33,34]. In the field of atmospheric icing, researchers also concentrate on the entire icing process, because it can be used as a base for the development of more detailed prediction models, and the icing load over a certain period of time can provide a vital input for designers to improve the safety of structures [35-37].
To sum up, it is important for icing forecasting models to take into account the entire icing evolution and to have a long-term prediction ability. With the development of icing monitoring technology, massive icing data make it feasible to study the entire icing evolution of transmission lines based on data mining, which contributes to icing prediction models. For this purpose, we propose the concept of a segmental icing prediction model which can solve some of these problems. The steps of the suggested model are roughly as follows: (1) consider the icing process as an integral whole which consists of several segments, and then summarize the types of icing evolution according to their curve shape; (2) propose dedicated prediction methods for each segment, such as ascending segments, descending segments, and inflection points; (3) study the internal relationship between types and micro-meteorology; (4) for a certain terminal, estimate the type of the following ice event based on the above-mentioned relationship and the weather forecast, and subsequently perform the prediction for each segment. There is no doubt that the model requires a great deal of work and is difficult to elaborate in a single paper. Although the suggested model is still in embryonic form, it combines the advantages of the mathematical meteorological model and the time-dependent model. Meteorology is used to infer a few evolution types rather than to predict an uncertainty range of ice thickness, and because the segments are separated, the time-dependent algorithm can avoid the problem of abrupt change, which gives the model a global prediction ability. The type of icing evolution is crucial for this model, because only when we grasp the course of an incoming ice event can we perform the segment prediction. Thus, the goal of this paper is to obtain the types of icing evolution, which is the first step mentioned above.
With regard to this model, the icing evolution refers to the different icing processes based on their ice thickness curve shape. Thus, in this paper, we characterize the ice thickness curves with mathematical parameters and utilize the clustering algorithm to classify ice events according to their curve shape. Eventually, we try to abstract the representative curves and summarize the icing evolution. This work may contribute to the further understanding of the entire icing process and lay a foundation for the model construction.
This paper is organized as follows. Section 2 describes the classification method, including the data source, the parameter definitions, and the clustering principle. Some specific problems of curve clustering are also discussed there, together with their solutions. Section 3 describes the clustering procedure in detail and shows the preliminary clustering results. It also elaborates on how to select parameters for each layer, to give readers a clearer understanding of hierarchical clustering. Section 4 builds on the preliminary clustering result: it abstracts the icing evolution curves from the clusters and provides an extended comparison with other references. Section 5 is a brief conclusion.

Icing Process
The Overhead Transmission Line Online Icing Monitoring and Early Warning System (abbr. Icing Monitoring System), comprised of stations, communication network and terminals, was built up in China Southern Power Grid after an ice storm in 2008 [38]. With vigorous development, the terminals have been distributed throughout the heavy icing area in four provinces and have accumulated massive data up to now. The stations include one primary station (i.e., China Southern Power Grid) and five secondary stations (i.e., Guangdong, Guangxi, Yunnan, Guizhou, and EHV (Extra High Voltage) Company Station). Every terminal installed on a transmission tower contains a mechanical sensor, meteorological sensor, and camera; it can collect mechanics parameters (e.g., wind deflection angle, conductor inclination angle, tension), micrometeorological parameters (e.g., temperature, humidity, wind speed, wind direction, atmospheric pressure, rainfall, sunlight intensity), and icing image at an interval of about 10 min, 10 min, and 3 h, respectively, but the collection frequency can be adjusted according to actual requirements. The terminals are also equipped with a solar panel and storage battery to keep the uninterrupted logging active. The raw data collected by the terminal are transmitted through the network GPRS/CDMA (General Packet Radio Service and Code Division Multiple Access) to the corresponding secondary stations where mechanics parameters are used to calculate ice thickness with a mechanics-based model [39,40]. Subsequently, the raw data and ice thickness will both be transmitted to the primary station where the data will be aggregated and stored. The primary station will provide diagnosis and early warning information based on the monitoring data for the operators. In addition, the visual chart can be generated and historical data can be exported from the system for scientific research, thus it is the data source of the icing process used in this paper.
The term 'icing process' used in this paper has effectively the same meaning as 'ice accretion process', 'icing event', or 'ice event' in other references. It refers to the entire process from the appearance to the disappearance of ice accretion observed by a terminal on a transmission line. The evolution curves of temperature, T, humidity, H, and ice thickness, d, of two typical icing processes are depicted in Figure 1.
Generally, we are more concerned with the major part of the curve that is larger than 0. It is worth noting that the stage of 0 located on the left/right of the major part may greatly affect the shape of the curve. Thus, in this paper, the icing process used for analysis only extracts the major part and one nearest 0 value on the left/right of the major part, but ignores the redundant 0 values.

Data Preprocessing
In order to improve the quality of data, data preprocessing is indispensable. Firstly, delete the missing value as well as the outlier and then replace them with local polynomial interpolation. Secondly, fit the ice thickness curve using the locally weighted scatter plot smoothing to eliminate the noise [41]. Thirdly, normalize the duration and amplitude of each icing process to (0, 1) for comparison.
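The first and third preprocessing steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: linear interpolation is swapped in for the local polynomial interpolation named in the text, the smoothing step is omitted, min-max scaling is assumed for the normalization, and the function names `fill_missing_linear` and `normalize_process` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def fill_missing_linear(values: Sequence[Optional[float]]) -> List[float]:
    """Fill None entries by linear interpolation between known neighbours.

    Assumption: linear interpolation stands in for the local polynomial
    interpolation described in the text. Gaps at either end are filled
    with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid samples to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


def normalize_process(
    times: Sequence[float], thickness: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Scale duration and amplitude of one icing process to the unit range.

    Assumption: min-max scaling on both axes.
    """
    t0, t1 = times[0], times[-1]
    d_min, d_max = min(thickness), max(thickness)
    if t1 == t0 or d_max == d_min:
        raise ValueError("degenerate icing process")
    t_norm = [(t - t0) / (t1 - t0) for t in times]
    d_norm = [(d - d_min) / (d_max - d_min) for d in thickness]
    return t_norm, d_norm
```

After this step every icing process shares the same axes, so curves with very different durations and maximum ice thickness can be compared by shape alone.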

Characteristic Parameters
Taking the icing process in Figure 1b as an example, the characteristic parameters of a normalized icing process are depicted in Figure 2 and their mathematical definitions are as follows.
(1) Peak time, tmax, denotes the moment corresponding to the maximum ice thickness.
(2) Area under curve, A, denotes the sum of the trapezoidal numerical integration of the curve.
(3) Peak, $P(P_t, P_d)$, and valley, $Q(Q_t, Q_d)$, whose subscripts $t$ and $d$ represent the time and ice thickness. For a given ice thickness sequence, if it contains a local maximum $d_i$ (in addition to the global maximum $(t_{max}, d_{max})$) as well as a local minimum $d_j$ which satisfy $d_i - d_j > 0.1$, then $P_t = t_i$, $P_d = d_i$, $Q_t = t_j$, $Q_d = d_j$. If the two local extrema mentioned above do not exist, then $P_t = P_d = Q_t = Q_d = 0$.
(4) Maximum growing rate $U(U_t, U_d)$ and maximum melting rate $V(V_t, V_d)$. Calculate the average of ice thickness with a normalized duration taken at 0.01 to obtain an average sequence,
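The first three parameters can be illustrated with a short Python sketch on a normalized curve. It is only a sketch under assumptions: the text does not say how a secondary peak is paired with a valley, so the highest local maximum other than the global one and the lowest interior local minimum are used here, and strict comparisons are assumed to be sufficient because the curve has already been smoothed. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def peak_time(t: Sequence[float], d: Sequence[float]) -> float:
    """Time of the global maximum ice thickness (parameter tmax)."""
    i_max = max(range(len(d)), key=d.__getitem__)
    return t[i_max]


def area_under_curve(t: Sequence[float], d: Sequence[float]) -> float:
    """Trapezoidal numerical integration of the curve (parameter A)."""
    return sum(
        0.5 * (d[i] + d[i + 1]) * (t[i + 1] - t[i]) for i in range(len(t) - 1)
    )


def secondary_peak_and_valley(
    t: Sequence[float], d: Sequence[float], threshold: float = 0.1
) -> Tuple[Point, Point]:
    """Return (P, Q); both are (0, 0) when no qualifying extrema exist.

    Assumed pairing rule: highest non-global local maximum with the
    lowest interior local minimum.
    """
    i_glob = max(range(len(d)), key=d.__getitem__)
    inner = range(1, len(d) - 1)
    maxima: List[int] = [
        i for i in inner if d[i - 1] < d[i] > d[i + 1] and i != i_glob
    ]
    minima: List[int] = [i for i in inner if d[i - 1] > d[i] < d[i + 1]]
    if maxima and minima:
        i = max(maxima, key=d.__getitem__)
        j = min(minima, key=d.__getitem__)
        if d[i] - d[j] > threshold:
            return (t[i], d[i]), (t[j], d[j])
    return (0.0, 0.0), (0.0, 0.0)
```

For a single-peak curve the function returns zeros for both P and Q, which is the property the first clustering layer relies on.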


Hierarchical K-Means Clustering Method
A clustering algorithm is a classification system that partitions data into a certain number of clusters (groups, subsets, or categories) with different features according to some criteria or rules [42]. K-means clustering is one of the most classic clustering algorithms, whose steps are as follows:
(1) Initialize k centroids to represent k clusters, either randomly or based on prior knowledge.
(2) Assign each data point to the nearest cluster after calculating its distance to each centroid. The distance measure could be Euclidean distance, Minkowski distance, city-block distance, Hamming distance, etc. This paper utilizes Euclidean distance.
(3) Recompute each centroid as the mean of the data points assigned to its cluster.
(4) Repeat steps (2) and (3) until the assignments no longer change.
As the initial centroids are randomly selected, convergence to the global optimum cannot be guaranteed. To overcome this shortcoming, a general strategy is to run the algorithm repeatedly with random initial centroids and finally select the solution with the minimum within-cluster sum of squared errors, SSE.
$$SSE = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - u_i \rVert^2$$
where $(C_1, C_2, \ldots, C_k)$ denotes the k clusters, $u_i$ denotes the centroid of the ith cluster, and $\lVert x - u_i \rVert$ denotes the modulus of the vector $x - u_i$.
There are some problems to be discussed before performing clustering for curve classification. Firstly, the ranges of the characteristic parameters may differ. As K-means clustering essentially distinguishes data by distance, normalization of the characteristic parameters is necessary. Secondly, in terms of curve shape, the influence of each characteristic parameter may also differ. Taking the two icing processes in Figure 1 as an example, the most obvious difference is that there is a repeating growing stage after the first ice-shedding in the second icing process but not in the first; thus the parameters P and Q appear more effective than the other parameters at highlighting the difference between these two curves. However, the effects of the parameters may change when processing other curves, so the choice depends on the situation. Thirdly, the optimal cluster number k is uncertain because the number of categories cannot be determined in advance.
To solve the latter two problems, a weighting factor which highlights the effective parameters and a hierarchical clustering method which divides the data into optimal clusters by progressively increasing layers can be considered. That is to say, when performing the current layer clustering, we select a few strong parameters with corresponding weight coefficients to classify the curves into two categories, and then select other strong parameters for the following layer clustering until a good result is achieved. This process may require appropriate parameter adjustment according to the curve shape of each clustering result, but it will contribute to the overall clustering effect.
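The restart strategy described above (run K-means several times from random initial centroids and keep the run with the smallest SSE) can be sketched in plain Python. This is an illustrative sketch, not the authors' code; the function names, the number of restarts, and the fixed seed are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple

Result = Tuple[List[int], List[List[float]], float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k: int, rng: random.Random, max_iter: int = 100) -> Result:
    """One K-means run from k randomly chosen data points."""
    centroids = [list(p) for p in rng.sample(points, k)]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse


def kmeans(points, k: int, restarts: int = 20, seed: int = 0) -> Result:
    """Repeat K-means and keep the solution with the minimum SSE."""
    pts = [tuple(p) for p in points]
    rng = random.Random(seed)
    runs = [kmeans_once(pts, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[2])
```

Because each restart is cheap for a data set of this size, a few dozen restarts are usually an inexpensive safeguard against a poor random initialization.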

Data Set Setup
A high-quality icing process should have a relatively long duration and a relatively large amplitude to avoid the impact of measurement errors. Therefore, in this paper, the criteria for selecting an icing process were a duration greater than 24 h and an amplitude greater than 2 mm. Based on these criteria, 97 icing processes were randomly selected from the Icing Monitoring System and the waterfall curve is depicted in Figure 3. As for the specific information, the data set contains 27 terminals, 18 transmission lines (including six 110 kV, five 220 kV, and seven 500 kV), eight power supply bureaus, and three provinces. The duration varies from 24 h to 10 days and the amplitude varies from 2 mm to 50 mm. The data size of each icing process varies from 44 to 2123 and the total data size is 49,479. It is pertinent to mention that the locations of the selected terminals have complex plateau, hilly, and mountainous terrain. The period of the selected ice events is the heavy ice months with considerable transitions from December to March. Each icing process is further checked for validity, for example for data breaks and data exceptions. In brief, the data set has extensive sources and validity.
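The selection criteria above amount to a simple filter, sketched below. The function name is hypothetical, and the sketch assumes that the amplitude of an icing process is its maximum ice thickness, since each process starts and ends at 0.

```python
from typing import Sequence


def is_high_quality(
    times_h: Sequence[float],
    thickness_mm: Sequence[float],
    min_duration_h: float = 24.0,
    min_amplitude_mm: float = 2.0,
) -> bool:
    """Check the two selection criteria for one icing process.

    Assumption: amplitude is taken as the maximum ice thickness.
    """
    duration = times_h[-1] - times_h[0]
    amplitude = max(thickness_mm)
    return duration > min_duration_h and amplitude > min_amplitude_mm
```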
Energies 2020, 13, x FOR PEER REVIEW 7 of 14

At first glance, the curves in Figure 3 appear irregular and chaotic. Individually, however, the curves have some distinguishable features. The normalized curves after data preprocessing are depicted in Figure 4a and the detailed clustering procedure is as follows.

First Layer Clustering
The existence of a repeating growing stage is an easily identifiable feature, which appears as a curve forming multiple peaks. The purpose of the first layer clustering on ice thickness curves is to distinguish single-peak and multi-peak ice events. For single-peak ice thickness curves, the characteristic parameters P and Q are 0. Therefore, P and Q are selected in the first layer clustering, and their weight coefficients are shown in Equation (8).

Multiply the elements in R1 by the corresponding coefficients in coeff1 to form an n × 4 matrix composed of P and Q for the first layer clustering (n denotes the total number of curves), and then perform K-means clustering on the matrix. The result of the first layer clustering on 97 icing processes is depicted in Figure 4b,c, and the clustering centroid is in Equation (9). It can be seen from Figure 4b,c that the first layer clustering divides the 97 curves into 75 single-peak curves and 22 multi-peak curves.
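The weighting step of one clustering layer can be sketched as follows. The coefficient values of Equation (8) are not reproduced here, so the unit weights used in the usage example are hypothetical placeholders, as are the function names and the deterministic initialization. A compact two-cluster K-means is included only so that the block runs on its own.

```python
from typing import Dict, List, Sequence


def weighted_matrix(
    params: Sequence[Dict[str, float]],
    names: Sequence[str],
    coeff: Dict[str, float],
) -> List[List[float]]:
    """Select the layer's parameters and scale each by its weight coefficient."""
    return [[row[name] * coeff[name] for name in names] for row in params]


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(rows: Sequence[Sequence[float]], max_iter: int = 100) -> List[int]:
    """Two-cluster K-means with a deterministic start (an assumption):
    the row nearest the origin and the row farthest from it."""
    norms = [sum(x * x for x in r) for r in rows]
    centroids = [
        list(rows[norms.index(min(norms))]),
        list(rows[norms.index(max(norms))]),
    ]
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            0 if _sq_dist(r, centroids[0]) <= _sq_dist(r, centroids[1]) else 1
            for r in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

A possible usage, with made-up parameter values for five curves (three single-peak, two multi-peak):

```python
names = ["Pt", "Pd", "Qt", "Qd"]
coeff1 = {name: 1.0 for name in names}  # hypothetical weights
curves = [
    {"Pt": 0.0, "Pd": 0.0, "Qt": 0.0, "Qd": 0.0},
    {"Pt": 0.0, "Pd": 0.0, "Qt": 0.0, "Qd": 0.0},
    {"Pt": 0.6, "Pd": 0.7, "Qt": 0.4, "Qd": 0.3},
    {"Pt": 0.0, "Pd": 0.0, "Qt": 0.0, "Qd": 0.0},
    {"Pt": 0.7, "Pd": 0.5, "Qt": 0.5, "Qd": 0.2},
]
labels = two_means(weighted_matrix(curves, names, coeff1))
```

Later layers would reuse the same pattern on one branch at a time, with a different parameter subset and different coefficients.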

Second Layer Clustering
For single-peak curves, the difference of saturation period tends to cause a different curve shape. A short saturation period means that the curve has a sharp peak, whereas a long saturation period makes the curve flat. Meanwhile, under the premise of normalization, the longer saturation period means that the area under the curve may be larger and vice versa. Thus, the selected characteristic parameters and the weight coefficients of the second layer clustering are shown in Equation (10).
For multi-peak curves, although they all have a repeating descending-ascending process, their extent of ice-shedding might be different. Actually, the large-scale ice-shedding is usually caused by artificial ice melting, so the curve may decline by a big margin, possibly as low as 0. On the contrary, the small-scale ice-shedding may be caused by a temporary temperature rise, strong wind, mechanical vibration, and so on, so it turns to the ascending stage soon after a small decline and forms an oscillation curve. Thus, the parameters P and Q are essential for this layer and their corresponding weight coefficients are shown in Equation (11).
The result of the second layer clustering on 75 single-peak curves is depicted in Figure 4d,e, and the clustering centroid in Equation (12). The 75 single-peak curves are divided into 24 saturation curves (denoted A-a) and 51 unsaturation curves (denoted A-b).
The result of the second layer clustering on 22 multi-peak curves is depicted in Figure 4f,g, and the clustering centroid in Equation (13). The 22 multi-peak curves are divided into 16 melting curves (denoted B-a) and six oscillation curves (denoted B-b).

Third Layer Clustering
As for the third layer clustering, the unsaturation curves may require further classification to make the curve shape clearer. Although the unsaturation curves all have a sharp peak, the period of the formation of the peak may be different, perhaps in the early, middle, or late stages of the event.
If the peak appears in the early stage, it means that the curve may have experienced a rapid rising in the early stages to reach the maximum and will experience a long time of ice-shedding. Similarly, the conclusion can be drawn when the peak appears in the middle or late stage. Thus, the selected characteristic parameters and weight coefficients are shown in Equation (14).
The results of the third layer clustering on unsaturation curves are depicted in Figure 4h,i,j, and the clustering centroid in Equation (15). As shown in Figure 4h,i,j, 51 unsaturation curves are divided into three clusters including two early rising (denoted A-b1), nine middle protruding (denoted A-b2), and 40 late descending (denoted A-b3).

Icing Evolution Clustering Centroid Curves
As shown in Section 3, 97 icing processes in China Southern Power Grid are clustered into six categories according to their curve shape. Although the curves in the same category have a similar outline, further abstraction into a representative curve is necessary for the icing evolution. The centroid can describe the nature of a cluster of points, but all centroids that we obtained above only contain a portion of parameters. It is worth noting that the centroid is approximated to the mean of points of the cluster in K-means clustering, so the integrated centroids containing all 11 parameters can be obtained from each category by averaging the characteristic parameters. The integrated centroids are listed in Equation (16). As the integrated centroids contain full detail of the icing processes, we can draw the corresponding representative curves of icing evolution, which are depicted in Figure 5.
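The averaging that produces the integrated centroids can be sketched as below. The function name is hypothetical, and the example in the usage note uses two made-up parameters per curve instead of the 11 characteristic parameters, purely to keep it short.

```python
from typing import Dict, Hashable, List, Sequence


def integrated_centroids(
    rows: Sequence[Sequence[float]], labels: Sequence[Hashable]
) -> Dict[Hashable, List[float]]:
    """Average every characteristic parameter over the curves of each category."""
    groups: Dict[Hashable, List[Sequence[float]]] = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }
```

For instance, `integrated_centroids([[0.2, 0.4], [0.4, 0.6], [0.9, 0.1]], ["A-a", "A-a", "B-b"])` averages the first two rows into one centroid for category A-a and leaves the single B-b row as its own centroid.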
From an intuitive point of view, the abstracted representative curves are similar to the original clusters of curves, but clearer for understanding the entire icing process. As in Section 3, the representative curves further indicate that the icing evolution of ice events does probably differ, which is meaningful for icing forecasting.
Figure 5. Icing evolution clustering centroid curves: (I) single-peak saturation; (II) single-peak early rising; (III) single-peak middle protruding; (IV) single-peak late descending; (V) multi-peak melting; (VI) multi-peak oscillation.

All six clustering centroid curves consist of several segments, and we marked serial numbers on the inflection points, also known as nodes. For example, curve I consists of four nodes and three segments, namely the ascending, saturated, and descending segments. Similarly, curves II, III, and IV are composed of three nodes and two segments, namely the ascending and descending segments. Curves V and VI are composed of five nodes and four segments. The obtained types of icing evolution thus further reinforce the suggested concept of a segmental icing prediction model.
Compared with other references, the results in the present paper may be more comprehensive and specific. For example, Wang [29] studies the wire icing evolution using the simulation of a wet snow ice accumulation model and shows that the growth rate of ice accretion may be affected under different meteorological conditions. To some extent, the types of icing evolution in the present paper also reflect the same characteristic, but the simulation treats growth and shedding separately while the present paper studies the entire icing evolution based on big data. Darge [36] measures an ice event by an ice scale and analyzes its development in different processes, which is consistent with type IV (single-peak late descending) in the present paper. However, it does not mention other types of icing evolution, while we achieve a more detailed and complete result in this work. Rashid [37] studies the atmospheric icing in a complex terrain and plots an evolution curve of the daily average ice load growth based on the records within a period of 31 days. However, the measurement object may not be fixed, as it considers the average ice load in a region, while the present paper focuses on the wire. Moreover, some researchers [4,15-21] were more concerned about the icing mechanisms and paid little attention to further summarizing what kinds of entire icing evolution happened. To sum up, the work is important for further understanding of the entire icing evolution of transmission line icing events and it can provide a methodology for obtaining icing evolution for other fields.
With regard to the prediction, the work also reveals that it is feasible to divide an incoming ice event into several segments and subsequently make specific predictions at different stages. For example, if the incoming ice event is known to be curve I, we can try to determine the position of the nodes and then utilize the time-dependent model to predict the rising segment (nodes 1 to 2), the saturated segment (nodes 2 to 3), and the descending segment (nodes 3 to 4), respectively, without worrying about the error caused by the abrupt change of curve. In terms of the time-dependent model, it also makes sense because it improves the global prediction ability. As for how to determine the type of ice event and the location of nodes, it may require some meteorological parameters as they are utilized in the mathematical meteorological model. These are the contents of future work rather than the purpose of this paper, so we will not discuss them specifically.

Conclusions
In this paper, in order to overcome the problems of existing models, we put forward a new conception of a segmental icing prediction model and focused on the first step of model construction, namely, studying the types of icing evolution. This work sought to obtain the classification of transmission line icing processes according to the curve shape. A hierarchical K-means clustering method was utilized and 11 characteristic parameters were proposed; a detailed clustering procedure on 97 icing processes was introduced. Eventually, the types of icing evolution are summarized based on the clustering result. The main conclusions are as follows.
(1) In total, 97 icing processes derived from the Icing Monitoring System were clustered into six categories, which can be summarized as single-peak saturation, single-peak early rising, single-peak middle protruding, single-peak late descending, multi-peak melting, and multi-peak oscillation. This indicates that the processes of ice events probably differ.
(2) The abstracted representative curves based on the centroids of clusters provided a visualized result showing that an entire icing process can be considered as a combination of several segments and nodes, which further reinforces the concept of a segmental icing prediction model.
(3) The types of icing evolution were obtained based on the monitoring data and clustering. Compared with other research, the work is more comprehensive and specific, which contributes to further understanding of the wire icing evolution and provides a methodology for obtaining types of icing evolution in other fields.
This work lays the foundation for the model construction. Future work may include the selection strategy for different segments based on time-dependent algorithms in order to improve the prediction accuracy. Additional work may also entail data mining in order to establish the relationship between types of icing evolution and meteorology. Once this groundwork is complete, the suggested model will be implemented and compared with other models.