The Application of Hierarchical Clustering to Power Quality Measurements in an Electrical Power Network with Distributed Generation

This article presents the application of data mining (DM) to long-term power quality (PQ) measurements. The Ward algorithm was selected as the cluster analysis (CA) technique to achieve an automatic division of the PQ measurement data. The measurements were conducted in an electrical power network (EPN) of the mining industry with distributed generation (DG). The obtained results indicate that applying the Ward algorithm to PQ data provides a division of the data with regard to the operation of the distributed generation, and also to other important working conditions (e.g., reconfiguration or high harmonic pollution). The presented analysis follows the area-related approach: the data of all measurement points are combined at an initial stage. An importance rate was proposed in order to indicate the parameters that have a high impact on the classification of the data. Another element of the article is the reduction of the size of the input database. Reducing the input data by 57% gave a classification with a 95% agreement when compared to the complete database classification.


Introduction
A smart grid can be seen as the future of electrical power systems [1][2][3]. A smart grid requires the monitoring and cooperation of more and more elements, devices, and systems, and so introduces the need to analyze an increasing amount of data. Single-parameter analysis conducted by humans has become a thing of the past in the operation of an electrical power system (EPS). Thus, tools that support long-term assessment have become necessary [4][5][6][7].
This research is a continuation of previous work [8], which proposed a method for analyzing long-term power quality (PQ) data using non-hierarchical clustering, and of its assessment using global indices in [9]. The results presented in Jasiński et al. [8] were based on 72 cases of clustering, which differ in terms of both the number of clusters (2/25) and the distance definition of the items in the database (Euclidean, Chebyshev) for the K-means algorithm. Different constructions of the database were discussed. The direct impact of the distributed generation (DG) on the PQ conditions was obtained when clustering with the K-means algorithm and the Euclidean distance for non-standardized data extended by power consumption, using database C: frequency (f), voltage variations (U), short-term flicker severity (Pst), asymmetry (ku2), total harmonic distortion in voltage (THDu), and active power (P). Thus, in this article, the same input database was selected. However, this research presents the Ward algorithm, which represents the hierarchical approach. Additionally, this work contains an analysis of the importance rate in order to indicate which parameters have an impact on the final classification. The comparison of the automatically obtained clusters, which represent different working conditions of the electrical power network (EPN), was conducted only for the parameters indicated as having a high importance rate, and not using a global index that includes all the parameters, as in [9]. A further novelty of this work is the proposal to reduce the input database without losing data features. The proposed reduction to one value, instead of three phase-to-phase parameters, gave a classification with a 95% agreement when compared to the complete database classification.
The article is organized into four sections. Section 2 presents a state-of-the-art literature review. Section 3 describes the definitions and techniques of cluster analysis (CA), with special consideration for the Ward algorithm. Section 3 also describes the research object, the EPN of the mining industry with gas-steam units, and the conducted long-term PQ measurements. Additionally, Section 3 contains the application of the Ward algorithm to PQ data and the results of the analysis with regard to the different working conditions of the EPN. The final element of Section 3 presents a discussion of the obtained results. Section 4 highlights the conclusions.

Cluster Analysis-Ward Algorithm
Generally, the definition of data mining in the literature concerns the extraction of knowledge from big databases. Possible algorithms and techniques are well-known and described in the literature. Examples of data mining techniques are given in [68][69][70][71][72]. One of the described techniques is cluster analysis, also known as clustering [73]. The main aim of cluster analysis is to achieve homogeneous groups (clusters) of data, as defined by Witten et al. and Wu et al. in [74,75]. The homogeneity of a group is defined by the similarity or dissimilarity level of the data in the same cluster. Many data similarity/dissimilarity conditions can be selected. However, in terms of the grouping process, two basic methods of division are known: hierarchical and non-hierarchical. In this article, the hierarchical method is presented. Hierarchical approaches are agglomerative or divisive techniques. This article presents the agglomerative approach. In agglomerative techniques, each piece of data in a set of observations is treated as a separate cluster at the beginning. Then, the data are aggregated into a smaller number of clusters until one single cluster is established, which represents all the data [73]. Several methods for connecting data into clusters are possible [73,76].
The hierarchical method is selected because the agglomerative sequence is presented on a dendrogram. It is, therefore, possible to analyze whether a connection is better realized by single data or by a group of similar data (achieved in a previous agglomeration) to get a final classification. The authors selected the Ward algorithm due to its features. Clustering is carried out in order to connect data concentrated around an average value until the data have a similar value (range). The hierarchical cluster analysis algorithm using the Ward method of minimal variance is presented in Figure 1.
In this paper, the hierarchical Ward method and non-hierarchical method based on the K-means algorithm are proposed for the power quality data analysis. The indicated step of "finding the pair of clusters which have the smallest sum of squares of distance between each object and the cluster center to which this object belongs" is calculated as presented in Equation (1) [77]:

$$D_{pr} = \frac{n_p + n_r}{n_p + n_q + n_r}\, d_{pr} + \frac{n_q + n_r}{n_p + n_q + n_r}\, d_{qr} - \frac{n_r}{n_p + n_q + n_r}\, d_{pq} \qquad (1)$$

where:
• $D_{pr}$: distance of the new cluster to the cluster of number "r"
• $r$: number of each remaining cluster, other than "p" and "q"
• $d_{pr}$: distance of primary cluster "p" from cluster "r"
• $d_{qr}$: distance of primary cluster "q" from cluster "r"
• $d_{pq}$: common distance of primary clusters "p" and "q"
• $n$: number of single objects in each cluster ($n_p$, $n_q$, $n_r$)

Figure 1. Cluster analysis using the Ward method of minimum variance [73,78]. The flowchart consists of the following steps:
• initiating the agglomerative cluster analysis: division into m clusters from m data
• calculating the distance between each pair of clusters
• creating a symmetrical matrix D consisting of the distances
• finding the pair of clusters which have the smallest sum of squares of distance between each object and the cluster center to which this object belongs
• creating a new cluster connecting the indicated two clusters
• updating matrix D by the distance between the new cluster and the other clusters
• removing the elements related to the previously connected clusters
• checking whether the number of clusters is equal to 1: if NO, the search for the closest pair is repeated; if YES, one cluster containing m data is obtained
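The loop of Figure 1 and the update of Equation (1) can be illustrated with a minimal sketch. It is not the Statistica implementation used by the authors. It assumes that the initial distances are squared Euclidean distances, and the function name `ward_linkage` and the sample points are hypothetical.

```python
from itertools import combinations


def ward_linkage(points):
    """Agglomerate points with the distance update of Equation (1).

    Assumption: initial distances are squared Euclidean distances.
    Returns a list of merges: (members of p, members of q, distance d_pq).
    """
    clusters = {i: [i] for i in range(len(points))}  # cluster id -> member indices
    dist = {}
    for i, j in combinations(range(len(points)), 2):
        dist[frozenset((i, j))] = sum(
            (a - b) ** 2 for a, b in zip(points[i], points[j])
        )
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        p, q = sorted(pair)
        d_pq = dist[pair]
        n_p, n_q = len(clusters[p]), len(clusters[q])
        for r in clusters:
            if r in (p, q):
                continue
            n_r = len(clusters[r])
            total = n_p + n_q + n_r
            d_pr = dist.pop(frozenset((p, r)))
            d_qr = dist.pop(frozenset((q, r)))
            # Equation (1): distance of the new cluster to cluster r
            dist[frozenset((next_id, r))] = (
                (n_p + n_r) * d_pr + (n_q + n_r) * d_qr - n_r * d_pq
            ) / total
        del dist[pair]
        merges.append((sorted(clusters[p]), sorted(clusters[q]), d_pq))
        clusters[next_id] = clusters.pop(p) + clusters.pop(q)
        next_id += 1
    return merges


# Hypothetical one-dimensional data: two tight pairs far from each other.
for left, right, distance in ward_linkage([(0.0,), (1.0,), (10.0,), (11.0,)]):
    print(left, right, round(distance, 3))
```

In this toy example the two close pairs are connected first, and the last connection happens at a much larger distance, which is the kind of jump that a dendrogram makes visible.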
Additionally, an advantage of the Ward algorithm is that it can be stopped at any moment; it can therefore deliver a classification with the expected number of clusters. Thus, the final number of clusters should be selected in accordance with the aim of the classification. Many approaches that support the selection of the final number of clusters have been proposed in the literature. The best known are [79]:
• the dendrogram is analyzed in terms of the difference in distance between successive clusters. A large difference means that the data in the cluster are varied. Thus, the division ends where the difference in distance is maximal
• if a clear flattening (long vertical line) can be observed on the dendrogram, the clusters are distant at this point and it is the best point for division
• an approach based on the root-mean-square standard deviation
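The first criterion, the maximal difference in distance between successive connections, can be sketched as follows. The function name and the distance values are hypothetical, and the sketch assumes that the connection distances are listed in the order in which the clusters were connected.

```python
def clusters_by_largest_gap(merge_distances, n_items):
    """Number of clusters left when the dendrogram is cut inside the
    largest jump between successive connection distances."""
    if len(merge_distances) != n_items - 1:
        raise ValueError("an agglomeration of m items has m - 1 connections")
    gaps = [b - a for a, b in zip(merge_distances, merge_distances[1:])]
    k = max(range(len(gaps)), key=gaps.__getitem__)  # cut after connection k (0-based)
    return n_items - (k + 1)


# Hypothetical connection distances for six items.
print(clusters_by_largest_gap([1.0, 2.0, 3.0, 50.0, 60.0], 6))
```

Here the largest jump lies between the third and fourth connections, so the cut leaves three clusters.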

An Electrical Power Network of the Mining Industry and the Source of the PQ Data
The PQ data used in the investigation concern real measurements made in substations of the copper industry's electrical power network. The 110 kV substation of the mining industry works in a four-section system in cooperation with four transformers (T1, T2, T3, T4). Normally, each transformer is supplied from a different 110 kV section. However, during the measurements, the T4 transformer was not loaded. Thus:
• substation R-1 works independently
• substation R-2 works independently
• substations R-3 and R-4 are coupled
The presented PQ data concern four weeks of measurements, from 27 April to 25 May. The measurements were conducted synchronously with class A PQ recorders [80]. This is longer than the classical one week of observation time, and therefore the PQ data may cover different working conditions of the analyzed electrical power network of the mining industry [77]. These working conditions may be connected to:
MAIN LOADS:
• welding machines
• conveyor belts
• drainage pumps
DISTRIBUTED GENERATION:
• combined heat and power (CHP)
• gas-steam units
Thus, the PQ measurements include the analysis of the PQ level with regard to the impact of the DG and the main load (welding machine) on the medium voltage (MV) network. The simplified scheme of the copper industry network, showing the location of the power quality recorders installed in selected bays and the location of the DG, is presented in Figure 2. The PQ recorders measure the transformers on the 6 kV side (T1, T2, T3) and an outgoing feeder to a welding machine (WM).
It is important to note that the local generation is connected at the 6 kV level and that it consists of a heat and power plant (G1: 10 MW CHP) and gas-steam generation units (G3: 15 MW gas unit and G2: 13.5 MW steam unit). During the measurements, G1 was out of order and the level of generation of G2 and G3 was changing. The level of DG power (G1, G2, G3) and the active power of the transformers and the welding machine feeder (T1, T2, T3, WM) at the MV level are presented in Figure 3.

Parameters Included to the Input Database
For the implementation of hierarchical cluster analysis, the Ward algorithm was used. The reason is that the data assigned to clusters are characterized by the smallest variation of results (minimum variance of data in clusters). The data set for clustering consisted of the following PQ parameters:
• frequency (f)
• voltage variations (U)
• short-term flicker severity (Pst)
• asymmetry (ku2)
• total harmonic distortion in voltage (THDu)
• active power level (P)
The indicated database consists of parameters that are considered in the classical PQ assessment in accordance with the standard EN 50160 [81], extended by the active power in the measuring points. The noticeable change was the use of short-term flicker severity in place of long-term flicker severity. This change is connected with the aggregation times of the parameters: the long-term severity is aggregated over 2 h, and the short-term one over 10 min [82,83]. Thus, the application of short-term flicker severity enables a database to be built from parameters that are all aggregated in 10 min intervals, as is demanded in the International Electrotechnical Commission (IEC) standard 61000-4-30 [80]. The analyzed measurement data were divided into flagged and unflagged data in accordance with the flagging concept of the standard IEC 61000-4-30 [80]. The data that were input to the CA were free of voltage events.
Additionally, because the Ward algorithm connects data concentrated around an average value until the data have similar values (range), a standardization process was proposed. The standardization of the parameters aims to obtain unified values by dividing each value of a particular time series by the maximum value of that series. Standardizing the data in this way reduces the problem of the different ranges and units of the PQ parameters. The standardized range of 0-1 makes it possible to compare the changeability of the parameters.
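The standardization described above, dividing each value of a time series by the maximum of that series, can be sketched as follows. The function name and the sample values are hypothetical, and the sketch assumes non-negative values with a positive maximum.

```python
def scale_by_max(series):
    """Divide every value of a time series by the maximum of the series.

    Assumption: values are non-negative and the maximum is positive,
    so the result lies in the range 0-1.
    """
    peak = max(series)
    if peak <= 0:
        raise ValueError("the maximum of the series must be positive")
    return [value / peak for value in series]


# Hypothetical 10 min values of one parameter.
print(scale_by_max([2.0, 4.0, 8.0]))
```

Each parameter is scaled separately, so parameters with different units (for example THDu in percent and active power in MW) end up on the same 0-1 range before clustering.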


Clustering to Indicate Different Working Conditions of the EPN
For the defined input database, clustering with the Ward algorithm was carried out using the Statistica 13 program (StatSoft Polska, Kraków, Poland). Figure 4 presents the CA dendrogram. The time results of clustering are presented in Figures 5-9, which show a defined final number of clusters equal to 2, 3, 4, 5, and 6. This selection of the number of clusters was made using the dendrogram (Figure 4). The authors decided to indicate the clusters that have a connection distance greater than 100. Thus, numbers of clusters equal to or less than 6 were investigated. In the figures, the "virtual" cluster 0 was defined, which represents the data that were flagged in the initial stage. Using knowledge about the object, different working conditions, which may affect the data classification, were defined:
• working or non-working of distributed generation (G2, G3); this knowledge was obtained from a monitoring system of the gas-steam units
• reconfiguration of the EPN connection
• maintenance breaks
Table 1 shows a summary of the analyzed working conditions and the assignment of clusters. The three conditions previously mentioned (DG working, reconfiguration, maintenance breaks) were indicated, and one unknown condition was observed. This unknown condition was indicated for a final number of clusters equal to at least 4. The reconfiguration of the EPN connection was indicated for a final number of clusters equal to 5. The impact of the DG and maintenance breaks was observed for all the presented classifications.
An obvious question is which of the input parameters were important for the obtained final classification. Thus, a predictor importance analysis using the Statistica 13 software (in accordance with the guidelines of StatSoft Polska [78] and Breiman et al. [84]) was carried out for the classification into 6 clusters. The results are presented in Figure 10. The results show that the highest impact (importance rate > 0.7) is for:
• the active power level for the transformers T1, T2, and T3
• the total harmonic distortion in the voltage for transformer T3 and the welding machine (WM)
• the short-term flicker severity for the transformers T2, T3, and the welding machine (WM)
Figure 10. Importance rate of the factors to the output of the cluster analysis results for the final number of clusters equal to 6.
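The selection of the investigated numbers of clusters from a connection distance threshold can be sketched as follows. The threshold of 100 comes from the text, while the function name and the list of connection distances are hypothetical and are not read from Figure 4.

```python
def max_clusters_above_threshold(merge_distances, threshold):
    """Largest number of clusters for which every remaining connection
    has a distance greater than the threshold."""
    connections_above = sum(1 for d in merge_distances if d > threshold)
    return connections_above + 1


# Hypothetical connection distances (not the values of the dendrogram in Figure 4).
distances = [5.0, 20.0, 80.0, 120.0, 150.0, 210.0, 300.0, 450.0]
print(max_clusters_above_threshold(distances, 100.0))
```

With five connections above the threshold, the dendrogram can be cut into at most six clusters, so divisions into 2 to 6 clusters would be the ones examined.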

Qualitative Assessment of Clusters
A comparison of all the measurement points for each parameter in the database would lead to the analysis of the changeability of 48 parameters for each of the six clusters. Therefore, the authors suggest only analyzing those PQ parameters that were indicated as important with regards to the obtained classification (according to the predictor importance rate). Table 2 contains the comparison of the selected PQ parameters for each cluster in terms of the mean, minimal, maximal, and standard deviation values.

where:
• minimal: the minimal value of the parameter found for the observed cluster
• maximal: the maximal value of the parameter found for the observed cluster
• mean: the mean value calculated from all the data for the observed cluster
• standard deviation: the standard deviation calculated from all the data for the observed cluster
A comparison of the level of the PQ parameters for different clusters is equivalent to a comparison of the different working conditions of an electrical power network. Examples of such a comparison are as follows:
• (c1 with c2) and (c4 with c5): comparison of periods that differ in the character of the company's work (exploitation vs. maintenance break). It could be observed that the mean value of Pst for T3 and WM is lower during the maintenance break. Therefore, in terms of flicker severity, the time of maintenance is better.
• (c1 with c2) and (c4 with c5): comparison of periods that differ in the character of the company's work (exploitation vs. maintenance break). It could be observed that the mean value of THDu for T3 and WM is lower during the maintenance break. Therefore, in terms of the harmonic content, the time of maintenance is better.
• (c1 with c4) and (c2 with c5): comparison of periods that differ in the operation of the DG. It could be observed that Pst for T3 and WM is lower when the DG is working (c1, c2) than when the DG is switched off (c4, c5). Therefore, in terms of flicker severity, the time when the DG is working is better.
• c3 with all other clusters: this unknown working condition represents the time when the THDu level for T3 and WM is higher than for the other clusters.
• c6 with all other clusters: the reconfiguration, which represents the time when Pst for T2 is very low. This is in agreement with the fact that T2 was underloaded, and therefore the flicker is small.
The presented examples of comparing the level of the PQ parameters for different clusters provide simplified information concerning the differences between working conditions. The working condition that defines cluster c3 is unknown, but the indicated analysis makes it possible to state that during this time there was a higher than normal level of harmonics for T3 and WM. Thanks to this, attention could be paid to this time in order to find the reason for such high harmonic content and to reduce it in the future. Additionally, after automatic classification of the data, it is possible to show the impact of DG on the level of power quality in the electrical power network of the mining industry.
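The per-cluster statistics of Table 2 (minimal, maximal, mean, and standard deviation of a parameter) can be sketched as follows. The function name and the sample data are hypothetical. The source does not state which standard deviation estimator was used, so the sample standard deviation is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_cluster(values, labels):
    """Minimal, maximal, mean, and standard deviation of one parameter
    for each cluster label."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return {
        label: {
            "min": min(group),
            "max": max(group),
            "mean": mean(group),
            # Assumption: sample standard deviation; 0.0 for a single value.
            "std": stdev(group) if len(group) > 1 else 0.0,
        }
        for label, group in groups.items()
    }


# Hypothetical 10 min values of one parameter and their cluster labels.
summary = summarize_by_cluster([1.0, 3.0, 10.0, 14.0], [1, 1, 2, 2])
for cluster, stats in sorted(summary.items()):
    print(cluster, stats)
```

Comparing the "mean" entries of two clusters corresponds to the pairwise comparisons listed above, for example c1 with c4 for the effect of the DG.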

Reduction of the Input Database Size-Case Study
The natural question is whether it is possible to reduce or change the structure of the input database without losing the most important information. The first idea is simply to exclude some parameters. However, the proposed complete database includes all the important classical PQ parameters. Thus, excluding any of them would not seem adequate from the technical point of view.
In this research, the objects are represented by similar phase-to-phase values. Thus, an analysis using only one "new-multiphase" value was conducted. This value may be obtained in different ways: the minimal, maximal, mean, or median of the three phase-to-phase values may be selected. In this research, the authors decided to use the mean value. Thus, for each 10 min value of:
• short-term flicker severity
• total harmonic distortion in voltage
• active power
the mean of all three phase-to-phase values was calculated.
After such a reduction, from 14 input parameters (complete database) for each measurement point to six input parameters (reduced database), clustering was conducted. The result of the clustering obtained using the six-parameter database, in comparison to the 14-parameter clustering, is presented in Table 3. Generally, the results of this reduction in terms of indicating the same working conditions are positive for more than two clusters. The obtained classification gives the same result for at least 94.9% of the data. The only negative result was obtained for two clusters: the averaged data, when divided into two clusters, were not sensitive to the DG impact. Additionally, the predictor importance for six clusters was determined. Figure 11 presents the importance rate for both classifications: (a) reduced input database, (b) complete input database. Generally, regarding the 0.7 importance rate level (noticeable importance rate), the same parameters were indicated:
• transformer T1: active power
• transformer T2: active power
• transformer T3: active power, total harmonic distortion in voltage, short-term flicker severity
• welding machine WM: active power, total harmonic distortion in voltage, short-term flicker severity
The only excluded parameter is the short-term flicker severity for transformer T2. However, its importance rate is close to 0.7.
To summarize, the size of the database has been reduced from 14 parameters to six parameters, and the obtained results are generally similar.
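The reduction and its assessment can be sketched as follows: the three phase-to-phase series are replaced by their mean, the two classifications are compared record by record, and the relative size reduction is computed. All function names and sample values are hypothetical, and the agreement measure assumes that the cluster labels of both classifications have already been matched to the same working conditions.

```python
def mean_of_phases(l12, l23, l31):
    """Replace three phase-to-phase series by one series of their means."""
    return [(a + b + c) / 3 for a, b, c in zip(l12, l23, l31)]


def agreement_percent(labels_complete, labels_reduced):
    """Share of records assigned to the same cluster in both classifications.

    Assumption: cluster labels of both classifications are already matched.
    """
    if len(labels_complete) != len(labels_reduced):
        raise ValueError("both classifications must cover the same records")
    same = sum(a == b for a, b in zip(labels_complete, labels_reduced))
    return 100.0 * same / len(labels_complete)


def size_reduction_percent(n_complete, n_reduced):
    """Relative reduction of the number of input parameters."""
    return 100.0 * (n_complete - n_reduced) / n_complete


print(mean_of_phases([1.0], [2.0], [3.0]))
print(agreement_percent([1, 1, 2, 2], [1, 1, 2, 3]))
print(round(size_reduction_percent(14, 6)))
```

For 14 and six parameters the last call gives the reduction of about 57% quoted in the article.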

Discussion
The data mining technique presented in the article is cluster analysis. The Ward algorithm was selected as an example of the hierarchical approach. During the data preparation stage, it was necessary to unify the data aggregation time (using Pst in place of Plt), as well as to standardize the parameter values. For the prepared data set containing both the PQ parameters and the active power level, cluster analysis was conducted.
As a result of the cluster analysis, a dendrogram was obtained, which was illegible for the initial stages of agglomeration due to a large amount of input data. This is an unquestionable disadvantage of the hierarchical approach, but it is worth noting that it provides a division of data regardless of the final number of the obtained clusters. Additionally, on the dendrogram, there is a simple possibility of selecting the final number of clusters using methods indicated in the literature, e.g., Aggarwal [79].
Another important element of the article was to indicate the conditions that influenced the data division. On the basis of knowledge about the object, the conditions of distributed generation working, reconfiguration, and maintenance breaks were known. However, the obtained classification indicated that, in terms of the PQ level, the relevant condition was not known. It is worth highlighting the fact that the Ward algorithm is sensitive to the impact of the distributed generation on the technical conditions of the electrical power network, which confirms that the research aim was specified correctly.
The next element of the article was the analyses of the parameters that have a higher impact on the data classification. The obtained results indicated the importance of an active power level, as well as the harmonic level and flicker. The voltage variations, voltage, and frequency levels had a small impact on the classification.
After the importance ranking was obtained, the clusters were compared in terms of the selected PQ parameters. The results showed the impact of DG on the EPN, which was positive with regard to PQ. The previously unknown working condition was identified as a period of high total harmonic distortion in voltage. Thus, analyzing only this selected period of time may help to reduce the problem of harmonic pollution.
The last part of the research concerned the possibility of reducing the input database without losing the information obtained from the clustering. The authors proposed reducing the three phase-to-phase values to one mean value. The classification obtained from the reduced input database was then compared with that from the complete one. The obtained classifications were similar: around 95% of the data were assigned to the same clusters for both input databases when classifying into more than two groups. The presented approach decreased the size of the input database by 57% (from fourteen to six parameters) without losing any data features.
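The reduction and the agreement check can be illustrated with a short sketch. It assumes that agreement is measured as the share of items placed in the same cluster after the best one-to-one renaming of cluster labels, which may differ from the measure used in the article; the function names and sample values are hypothetical.

```python
from itertools import permutations


def phase_mean(u12, u23, u31):
    """Replace three phase-to-phase series by their pointwise mean."""
    return [(a + b + c) / 3 for a, b, c in zip(u12, u23, u31)]


def agreement(labels_full, labels_reduced):
    """Share of items in matching clusters under the best label renaming."""
    ids_full = sorted(set(labels_full))
    ids_reduced = sorted(set(labels_reduced))
    if len(ids_full) != len(ids_reduced):
        raise ValueError("both classifications must use the same number of clusters")
    best_hits = 0
    # Cluster numbers are arbitrary, so try every one-to-one renaming.
    for perm in permutations(ids_reduced):
        mapping = dict(zip(ids_full, perm))
        hits = sum(mapping[a] == b for a, b in zip(labels_full, labels_reduced))
        best_hits = max(best_hits, hits)
    return best_hits / len(labels_full)


# Illustrative values only.
reduced_series = phase_mean([400.0, 401.0], [402.0, 401.0], [404.0, 401.0])
full = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
reduced = [2, 2, 2, 1, 1, 3, 3, 3, 3, 3]
share = agreement(full, reduced)

# Size reduction quoted in the text: fourteen parameters down to six.
size_reduction = (14 - 6) / 14
```

The exhaustive renaming search is only practical for a small number of clusters, which matches the handful of groups considered here.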

The object presented in the article is a symmetrical network, although the method may also be applied successfully to highly asymmetrical grids. If any of the phase-to-phase values changes, the mean value of the parameter also changes. The CA is sensitive to such differences, so this situation would also be indicated. The only disadvantage of this method is that it gives no information about which phase caused the situation; thus, an analysis of the raw data, limited to the indicated period of time, is desirable.

Conclusions
The article presents the application of cluster analysis to long-term power quality measurements obtained in an electrical power network of the mining industry with distributed generation. The Ward algorithm was selected because of its sensitivity to data dissimilarity. The article contains a discussion of the pros and cons of the hierarchical approach.
The article also analyzes the sensitivity of the obtained classification to different (known) working conditions of an electrical power network of the mining industry. Conditions such as the impact of distributed generation, reconfiguration, or the character of the object schedule (exploitation or maintenance breaks) are indicated. Additionally, the impact of the parameters on the classification was ranked using predictor analysis. This analysis indicated that the level of active power, harmonic pollution, and flicker are important with regard to the obtained classification.
The obtained classification revealed a previously unknown working condition. After comparison with the other groups, this condition was identified as a period of high harmonic pollution. Thanks to this, it is possible to analyze a short period of time to find the source of the harmonic pollution problem in an electrical power network of the mining industry.
The article proposes reducing the database by calculating one value that represents the three phase-to-phase values. The resulting classifications agreed in close to 95% of cases, while the size of the input database was reduced by 57%.
The presented approach of obtaining an automatic data classification with regard to different working conditions (especially distributed generation or the harmonic pollution problem) is an important element of a smart grid. It is worth noting that the presented approach is conducted as an area-related analysis: four different measuring points are treated as common input data.