A Case Study on a Hierarchical Clustering Application in a Virtual Power Plant: Detection of Specific Working Conditions from Power Quality Data

Abstract: The integration of virtual power plants (VPPs) has become more popular, so research on different VPP issues is highly desirable. This article addresses power quality (PQ) issues. The presented investigation is based on multipoint, synchronous measurements obtained from five points related to the VPP. The article proposes and discusses the use of one global index in place of the classical PQ parameters, and it also proposes one new global power quality index. The PQ measurements and the global indexes were then used to prepare input databases for cluster analysis. The cluster analysis aimed to detect short-term working conditions of the VPP that were specific from the point of view of power quality. To this end, hierarchical clustering with the Ward algorithm was performed. The article also presents the application of the cubic clustering criterion to support the cluster analysis. The detected condition was then assessed using the global index to obtain general information about the cause of its occurrence. Furthermore, the application of the global index reduced the database size by around 74% without losing the features of the data.


Introduction
The concept of renewable energy sources (RES) and energy storage systems (ESS) integration into virtual power plants (VPP) includes different areas. This investigation concerns power quality (PQ) and data mining (DM) issues in VPP.
Article [1] deals with the IEC 61850 standard. It proposes an extension to this standard as a step towards enhancing the interaction between the controllers of RESs and VPPs. Issues concerning power quality recorders are included as one of the elements, and the requirements placed on them from the point of view of IEC 61850 are discussed. Finally, the proposed methodology was verified in a virtual power plant that consists of a hydropower plant (HPP), a photovoltaic (PV) plant, and a wind power plant, as well as storage systems. That VPP operates at the medium voltage (MV) level. In Pudjianto et al. [2], the virtual power plant is treated as an instrument that enables cost-efficient integration of RES with present power systems. The article includes a performance analysis of a VPP system with respect to different indicators such as energy efficiency, power quality, and security. The analyzed case consists of fuel cells, a …
… points include the HPP, the BESS, the associated MV line, and two LV loads. The measurements were conducted for 182 days, from 1 May 2020 to 28 October 2020. They therefore span 26 weeks from the point of view of classical PQ assessment [14].
Analyzing each parameter separately at every measurement point over such a period would be very time-consuming. Thus, the concept of a global index was introduced. In the literature, this approach is known under different names, e.g., global power quality index [15,16], unified power quality index [17,18], total power quality index [19,20], or synthetic power quality index [21,22]. This article applies the ADI index proposed in [16] and a newly proposed power quality pollution index (PQPI). Both the classic PQ parameters and the global indexes were then used to define datasets for cluster analysis. The global indexes were applied to reduce the size of the input database without losing the features of the PQ data. Hierarchical clustering was then performed on those databases. The cluster analysis aimed to detect short-term working conditions of the VPP that are specific from the point of view of power quality. To this end, the cubic clustering criterion was applied to assess the results of hierarchical cluster analysis with the Ward algorithm for the indicated databases. Finally, PQPI was used to highlight the differences in PQ level between clusters.
The contributions of this research are as follows:
• The data source was multipoint, synchronous, long-term power quality measurements obtained from a real VPP.
• The global index approach to PQ issues is discussed, and a new index is proposed.
• The proposed input databases for cluster analysis comprise raw PQ data and global indexes. The global indexes were proposed to reduce the size of the input databases while maintaining the existing features of the PQ data.
• The cubic clustering criterion was applied to the results of hierarchical cluster analysis to detect short-term working conditions of the VPP that were specific in terms of power quality.
• The global index was used for comparative assessment between clusters.
To realize those contributions, the article is organized into five sections. In Section 2 the virtual power plant description, global index proposition, clustering methods, and input databases are presented. Section 3 presents the results of specific working conditions detection in view of power quality using hierarchical clustering. Section 4 presents a discussion of the results. Section 5 concludes the article.

Methodology and Research Object Description
This section is based on four elements. The first is a description of the real VPP that operates in Poland and that served as the source of long-term, synchronous power quality measurements. Next, the global index approach is discussed. The long-term measurements and the global index are then combined into different datasets, which consist of classic power quality parameters and global power quality indexes. Those datasets are used as input for hierarchical clustering. The cluster assignment for the measurement data is assessed using the cubic clustering criterion (CCC), which was applied to select an adequate number of clusters indicating short-term specific working conditions from a power quality point of view. The proposed approach is summarized in the simplified methodology scheme in Figure 1.

Virtual Power Plant That Operates in Poland as a Source of Power Quality Measurements
This article deals with a real VPP that operates in Poland, in the region of Lower Silesia. The virtual power plant consists of a fragment of the distribution network at both medium voltage (MV) and low voltage (LV) [23]. Two 110/20 kV substations serve as points of connection to the 110 kV Polish grid. For this investigation, however, one MV network was selected. The network fed from the 110/20 kV station is an overhead cable network with earth fault current compensation [24]. The main distributed energy resources integrated into the virtual power plant are a 1.25 MW HPP and a 0.5 MW battery ESS. Both are connected at the medium voltage level.
The scheme of the investigated fragment of the VPP is presented in Figure 1. The analyzed fragment consists of a 20 kV distribution network with a 1.25 MW hydropower plant (HPP_MV) and a 0.5 MW battery ESS (ESS_MV). Those energy sources are connected to the HV/MV substation by a 20 kV line (Line_MV). Additionally, two representative low voltage loads are indicated: LOAD1_LV and LOAD2_LV. LOAD1_LV is connected to the indicated medium voltage line (Line_MV). LOAD2_LV is connected to the node of the HPP_MV and ESS_MV. This fragment of the VPP is monitored by power quality recorders, indicated as "R". The location of these recorders is also included in Figure 2. The HPP_MV and ESS_MV are connected to one node, and their PQ recorders use the same voltage transformer. Thus, in this research, they are treated as one point for the PQ level (HPP and ESS_MV) and as two separate points for the active power level (HPP_MV and ESS_MV).

Global Power Quality Index
Recent research on power quality covers different areas. One of them is the simplification of the assessment using global values. This article therefore applies a global index, the aggregated data index (ADI) [16], and proposes a new power quality pollution index (PQPI). ADI consists of seven 10-min PQ parameters: frequency (f), voltage (U), an envelope of voltage, short-term flicker severity (Pst), unbalance (ku2), total harmonic distortion in voltage (THDu), and maximum harmonic distortion. However, frequency was excluded as a step in customizing the index to VPP issues, as proposed in [16]. Thus, the ADI index corresponds to:
• a voltage (U) indicator,
• an envelope of voltage deviation, obtained as the difference between the maximum and minimum of the 200-ms U values identified during the 10-min aggregation interval,
• a short-term flicker severity indicator,
• an asymmetry indicator,
• a total harmonic distortion in voltage indicator,
• the maximum 200-ms value of the total harmonic distortion of voltage, identified in the 10-min aggregation interval [16].
These indicators correspond to the 10-min aggregation interval defined in standard IEC 61000-4-30 [25]. Each uses the mean of the three phase values as a single representative value. The ADI factors express the difference between the measured 10-min data and the recommended limit as a quotient. The limits may be chosen differently depending on the object. For a VPP that operates in the distribution network, standard EN 50160 [26] was selected, and the applied limits are based on it.
ADI responds to both the voltage level and the envelope of voltage, and likewise to both total harmonic distortion and maximum total harmonic distortion. In terms of data features, the members of each pair are similar. Thus, in this article, the power quality pollution index (PQPI) is derived from ADI with a reduced set of voltage and harmonic distortion indicators. The reduction retains the data features of those parameters through the envelope of voltage and the maximal THDu. PQPI thus includes the following indicators:
• voltage distortion, which responds to the envelope of voltage,
• unbalance distortion, which responds to the asymmetry of voltage,
• flicker distortion, which responds to short-term flicker severity,
• harmonic distortion, which responds to the maximal total harmonic distortion in voltage.
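The limit-normalized indicators described above can be illustrated with a minimal sketch. The exact formulas of ADI [16] and PQPI are not reproduced in this section, so the sketch assumes that each indicator is simply the three-phase mean of a 10-min parameter divided by its limit. The limit values, the dictionary keys, and the function names are hypothetical placeholders chosen for illustration, not values taken from the article or quoted from EN 50160.

```python
from statistics import mean

# Hypothetical limits, for illustration only (not taken from the article).
ASSUMED_LIMITS = {
    "voltage_envelope": 10.0,  # % of nominal voltage
    "unbalance": 2.0,          # %
    "flicker": 1.0,            # Pst, dimensionless
    "harmonic": 8.0,           # maximal THDu, %
}


def indicator(values, limit):
    """Mean of the per-phase values divided by the assumed limit."""
    return mean(values) / limit


def pqpi_indicators(sample):
    """Return the four normalized indicators for one 10-min interval."""
    return {
        name: indicator(sample[name], ASSUMED_LIMITS[name])
        for name in ASSUMED_LIMITS
    }


# One made-up 10-min interval; unbalance is a single value, not per phase.
example = {
    "voltage_envelope": [2.0, 3.0, 4.0],
    "unbalance": [0.5],
    "flicker": [0.2, 0.4, 0.6],
    "harmonic": [2.0, 2.0, 2.0],
}
result = pqpi_indicators(example)
```

Under this assumption, a value below 1 means that the parameter stays within its limit for the given interval, which makes indicators of different physical quantities directly comparable.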

Input Databases Description
During the investigation, three different databases were analyzed. The data for each parameter or indicator were used at a 10-min aggregation interval. The applied variables represent classical PQ parameters, global indexes, and the active power level (P). The databases are presented in Table 1, and a simplified scheme of their construction is presented in Figure 3.

Hierarchical Clustering
Clustering is a data mining (DM) technique that aims to divide data into groups with similar features [27]. It may follow one of two approaches: hierarchical or nonhierarchical [28].
Nonhierarchical algorithms assign all observations to a number of clusters that is known in advance [29]. The most commonly used methods include k-means, k-medians, and expectation maximization [30].
Hierarchical methods constitute x classes of y observations and are themselves realized by two approaches: agglomerative or divisive. In this research, the agglomerative approach was selected. It starts from a set of observations in which each piece of data is treated as a separate cluster. The clusters are then successively merged into new clusters until one single cluster containing all the data is established [31]. The agglomerative linkage methods are single linkage, complete linkage, average linkage, weighted pair-group average linkage, unweighted pair-group centroid linkage, and the Ward method [31,32].
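The agglomerative procedure described above can be sketched in a few lines. This is a naive illustration only, assuming the Ward linkage (the increase in the within-cluster sum of squares) as the merging criterion and stopping at a requested number of clusters instead of building the full hierarchy. The function names and the toy data are hypothetical, and the quadratic pair search would not scale to the dataset sizes discussed in the article.

```python
def centroid(points):
    """Component-wise mean of a list of equally sized tuples."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    squared = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared


def ward_agglomerate(points, n_clusters):
    """Start from singleton clusters and merge the cheapest pair each round."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy data: two tight groups and one isolated observation.
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
groups = ward_agglomerate(toy, 3)
```

With three clusters requested, the isolated observation remains a cluster of its own, which is the behaviour exploited later when looking for rare working conditions.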
In this article, hierarchical clustering is realized using the Ward algorithm. At each agglomeration, it determines whether a connection is better made with a single data point or with a group of similar data (formed in a previous agglomeration) to reach the final classification. Ward cluster analysis connects data concentrated around an average value, so that the data within a cluster have similar values (range). The hierarchical cluster analysis (CA) algorithm that uses the Ward method of minimal variance is based on six steps [31,33].
• Step 1: Initiate the agglomerative clustering: treat each of the x data points as its own cluster (x clusters from x data), calculate the distance between each pair of clusters, and create the symmetric matrix Dis that consists of these distances.
• Step 2: Find the pair of clusters that has the smallest sum of squared distances between each object and the center of the cluster to which that object belongs. In the Ward method, the distance from the cluster formed by merging such a pair to any other cluster is obtained using Equation (1) [31,33].

$$ Dis_{(ij)k} = \frac{n_i + n_k}{n_i + n_j + n_k}\, dis_{ik} + \frac{n_j + n_k}{n_i + n_j + n_k}\, dis_{jk} - \frac{n_k}{n_i + n_j + n_k}\, dis_{ij}, \qquad (1) $$

where [31,33]:
$Dis_{(ij)k}$: the distance of the new cluster, formed from clusters "i" and "j", to cluster "k",
$k$: the index of each remaining cluster other than "i" and "j",
$dis_{ik}$: the distance of primary cluster "i" from cluster "k",
$dis_{jk}$: the distance of primary cluster "j" from cluster "k",
$dis_{ij}$: the mutual distance of primary clusters "i" and "j",
$n_i$, $n_j$, $n_k$: the number of single objects inside clusters "i", "j", and "k", respectively.
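Equation (1) can be checked numerically with a short sketch. It assumes that the distances entering the update are Ward distances based on squared Euclidean separation of the cluster means, which is the usual setting for this recurrence; the article does not state the distance measure at this point. The function names and the one-dimensional toy clusters are hypothetical.

```python
def ward_update(dis_ik, dis_jk, dis_ij, n_i, n_j, n_k):
    """Distance from the merged cluster (i, j) to cluster k, per Equation (1)."""
    total = n_i + n_j + n_k
    return ((n_i + n_k) * dis_ik + (n_j + n_k) * dis_jk - n_k * dis_ij) / total


def ward_distance(a, b):
    """Ward distance of two 1-D clusters from their sizes and means (assumed form)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return len(a) * len(b) / (len(a) + len(b)) * (mean_a - mean_b) ** 2


# Three singleton clusters on a line.
cluster_i, cluster_j, cluster_k = [0.0], [2.0], [5.0]

# Distance of the merged cluster to k obtained from the recurrence ...
updated = ward_update(
    ward_distance(cluster_i, cluster_k),
    ward_distance(cluster_j, cluster_k),
    ward_distance(cluster_i, cluster_j),
    len(cluster_i), len(cluster_j), len(cluster_k),
)
# ... and the same distance recomputed directly from the merged data.
direct = ward_distance(cluster_i + cluster_j, cluster_k)
```

Both routes give the same value, which is why the distance matrix can be updated after each merge without revisiting the raw observations.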
A major problem in cluster analysis, irrespective of the method, is determining the final number of clusters [34]. A solution given in the literature for the Ward method is the cubic clustering criterion (CCC). This criterion is obtained by comparing the observed coefficient of determination (R²) with the approximate expected R², using an approximate variance-stabilizing transformation [35]. Positive values of the CCC indicate that the obtained R² is greater than would be expected if the data were sampled from a uniform distribution and therefore indicate the possible presence of clusters. The features of the cubic clustering criterion are [35]:
• An extremum of the CCC for a number of clusters greater than two or three indicates good clustering.
• The CCC can have several local extrema if the data have a hierarchical structure.
• If CCC values are negative for at least two clusters, the distribution is probably unimodal or long-tailed.
• Very negative values of the CCC (e.g., −30) could be caused by outliers.
The last feature motivated the use of the CCC to detect short-term working conditions of the VPP that are specific (outliers) in terms of PQ.
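As a partial illustration of the criterion, the sketch below computes only the observed R² of a given cluster assignment, taken here as one minus the ratio of the within-cluster sum of squares to the total sum of squares. The expected R² under the uniform reference distribution and the variance-stabilizing transformation from [35] are not reproduced in this section, so the sketch does not compute the CCC itself. The function name and the toy data are hypothetical.

```python
def r_squared(points, labels):
    """Observed R^2 = 1 - (within-cluster sum of squares) / (total sum of squares)."""

    def sum_of_squares(group):
        dim = len(group[0])
        center = [sum(p[d] for p in group) / len(group) for d in range(dim)]
        return sum((p[d] - center[d]) ** 2 for p in group for d in range(dim))

    total = sum_of_squares(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    within = sum(sum_of_squares(group) for group in clusters.values())
    return 1.0 - within / total


# Two well-separated pairs on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
r2_two_clusters = r_squared(points, [0, 0, 1, 1])
r2_one_cluster = r_squared(points, [0, 0, 0, 0])
```

A partition that separates the two pairs explains almost all of the variance, while a single cluster explains none; the CCC then judges whether such an R² is larger than a structureless reference would produce.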

Hierarchical Clustering of Power Quality Measurement Obtained from the Virtual Power Plant
In this section, the databases are compared on the basis of hierarchical clustering with the Ward algorithm. The cluster assignment was assessed using the cubic clustering criterion. Then, for the selected number of clusters, the PQ comparison between clusters was performed using PQPI.

Comparison between Databases Using Cubic Clustering Criterion
The PQ measurements ran from 1 May 2020 to 29 October 2020. Five PQ points were analyzed: Line_MV, HPP_MV, ESS_MV, LOAD1_LV, and LOAD2_LV. However, because the HPP_MV and ESS_MV PQ records were obtained from the same voltage transformer, they were treated as the same point (HPP and ESS_MV). The only difference was the active power level, so for HPP and ESS_MV there is one more active power parameter than the number of measurement points. The observed period of 26 weeks corresponds to 26,208 single 10-min data points. However, the data coverage is equal to 97.7% due to measurement device problems, so 25,069 10-min data points were accessible [24]. In addition, as a preprocessing step, the 10-min data connected with voltage events were excluded, as suggested in [36]. Thus, the final number of 10-min data points was 24,612.
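The nominal number of 10-min data points and the number of points removed with the voltage events follow from simple arithmetic, shown here as a short check. Only the figures quoted above are used; the variable names are illustrative.

```python
INTERVALS_PER_DAY = 24 * 60 // 10  # ten-minute intervals in one day
DAYS = 26 * 7                      # 26 weeks of measurements

# Nominal number of 10-min data points in the observed period.
expected_points = DAYS * INTERVALS_PER_DAY

available_points = 25_069  # accessible data points, as quoted in the text
final_points = 24_612      # data points left after excluding voltage events

# Data points removed in preprocessing because of voltage events.
removed_with_events = available_points - final_points
```

The nominal count reproduces the 26,208 points stated above, and 457 accessible points were dropped in the voltage-event preprocessing step.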
The input databases for hierarchical clustering were then prepared from this measurement dataset. Each database is a matrix with 24,612 rows (10-min aggregated data) and a different number of columns, which depends on the number of measurement points and the number of variables in the database (see Figure 2).
The investigated datasets concern long-term measurements. During the measurement time, specific working conditions (such as high or low harmonic content, high or low voltage level, or asymmetry) occur. Such states are mostly short and may therefore be treated as anomalies in view of the general assessment. Selecting such anomalies (specific short-term working conditions) by analyzing every 10-min value of each PQ parameter separately may be time-consuming. Thus, in this article the application of the cubic clustering criterion (CCC) is proposed.
The CCC was used on databases A, B, and C for hierarchical clustering with the Ward algorithm. The number of clusters investigated ranged from a minimum of two to a maximum of ten. Ten was selected as the maximum on the basis of the justification presented in, e.g., [37] or [38]. The results are presented in Table 2. The CCC values differed between databases and between numbers of clusters, but for each database the extremal value occurred for a final number of clusters equal to four. This means that this division yields clusters whose data are very different from one another. Thus, four clusters were selected as the most appropriate to detect the specific short-term working conditions in the investigated measurements. Furthermore, because all databases indicated the CCC extremum at the same number of clusters, the smallest input database (database C), which uses the PQPI indicators and the active power level, was selected for further investigation.
Energies 2021, 14, 907

Results of Hierarchical Clustering
As indicated in the previous subsection, the optimal choice in terms of input database size is the database that consists of the PQPI indicators and the active power level (database C). Thus, in this section, that database was used for hierarchical clustering with the Ward algorithm.
The main result of the hierarchical clustering is a dendrogram. The dendrogram for the selected database for the 26 measurement weeks is presented in Figure 4. The vertical axis shows the connection distance between clusters, and the horizontal axis indicates each of the 24,612 single 10-min data points. As indicated in the previous subsection, the division into four clusters gives an extremum of the cubic clustering criterion. The dendrogram shows that, of the four clusters, one contains only a small number of 10-min data points. The number of points in each cluster and the cluster assignment in the time domain are presented in Figure 5. It can be observed that cluster 4 is a short-term condition represented by only 165 10-min data points, which is around 0.7% of the measurement time.
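The share of measurement time covered by a cluster, and the selection of rare clusters as candidate short-term conditions, can be sketched as follows. The 1% threshold, the function names, and the toy label sequence are assumptions made for illustration; only the counts of 165 and 24,612 data points come from the text.

```python
from collections import Counter


def cluster_shares(labels):
    """Fraction of all 10-min data points assigned to each cluster."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


def rare_clusters(labels, max_share=0.01):
    """Clusters covering at most max_share of the data (hypothetical threshold)."""
    shares = cluster_shares(labels)
    return sorted(label for label, share in shares.items() if share <= max_share)


# Share of cluster 4 from the counts reported in the text.
share_cluster_4 = 165 / 24_612

# Toy assignment: three large clusters and one rare cluster.
toy_labels = [1] * 600 + [2] * 300 + [3] * 95 + [4] * 5
rare = rare_clusters(toy_labels)
```

For the reported counts, the share rounds to 0.7% of the measurement time, which is consistent with treating cluster 4 as a short-term condition.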

Qualitative Assessment of Hierarchical Clustering Results Using the Global Index
In this subsection, a qualitative assessment of the indicated clusters is performed using PQPI. The comparison of the voltage indicator (a), unbalance indicator (b), flicker indicator (c), and harmonic distortion indicator (d) is presented in Figure 6. The assessment aims to establish which PQ parameters, and at which measurement points, were the reason for indicating this short-term working condition of the VPP. Based on this comparison it may be concluded that cluster 4 represents:

Discussion
This article is concerned with a virtual power plant that operates in Poland. The presented research is based on synchronous measurements from five PQ recorders, performed at both medium and low voltage levels. The PQ data cover 182 days (26 weeks), from 1 May 2020 to 28 October 2020, and thus represent long-term data during which different working conditions may occur. In [24], this dataset was used to compare working conditions that were defined a priori; those states were connected with the HPP and ESS working conditions and the level of active power. In this work, by contrast, the conditions were obtained without any predefinition, based only on data features.
The long-term measurements were used as the input to define different PQ databases. The proposed databases were based on classical PQ parameters, PQ global indexes, and the level of active power. Database A represents classical PQ parameters and active power, with 20 variables for each measurement point. Database B uses the ADI components separately together with the active power level, with seven variables for each measurement point. Database C consists of PQPI and the active power level, with five variables for each measurement point. It is important to note that both global indexes (ADI and PQPI) aim to reduce the amount of data that is analyzed. However, this reduction should not cause the data features to be lost.
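The effect of the global indexes on database size can be estimated from the variable counts above. The sketch assumes that each database has one column per variable at each of the four PQ points, plus one additional active power column for the shared HPP and ESS node; this counting rule is inferred from the description of the measurement points and is not stated explicitly in the article. The constant and function names are illustrative.

```python
MEASUREMENT_POINTS = 4          # HPP_MV and ESS_MV share one PQ point
EXTRA_ACTIVE_POWER_COLUMNS = 1  # assumed: second active power of the shared node
VARIABLES_PER_POINT = {"A": 20, "B": 7, "C": 5}


def n_columns(variables_per_point):
    """Assumed column count of one input database."""
    return variables_per_point * MEASUREMENT_POINTS + EXTRA_ACTIVE_POWER_COLUMNS


columns = {name: n_columns(v) for name, v in VARIABLES_PER_POINT.items()}

# Relative size reduction when database C replaces database A.
reduction_a_to_c = 1 - columns["C"] / columns["A"]
```

Under these assumptions, databases A, B, and C have 81, 29, and 21 columns, and replacing A with C reduces the number of columns by about 74%, in line with the reduction reported in the article.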
The research aimed to detect short-term working conditions in terms of power quality. This task was realized using hierarchical clustering with the Ward algorithm. For the three databases, the clustering process was assessed qualitatively using the cubic clustering criterion. As is known from the literature, a negative CCC value may mean that the clustered data contain anomalies. For all databases, the CCC was negative for every final number of clusters in the range from 2 to 10. Additionally, the extremum occurred at the same number of clusters, four, for each database. Thus, the database with the smallest number of variables (database C) was selected for further investigation, and the final number of clusters was set to four.
Then, for the indicated circumstances, the clusters were compared for each measurement point. The PQPI indicators, namely the voltage indicator, unbalance indicator, flicker indicator, and harmonic distortion indicator, were used. Based on a comparison of these parameters, it was concluded that this short-term condition (anomaly) was connected with problems of voltage, unbalance, and flicker. Harmonic distortion does not have a significant impact on this condition. It is very important to note that this short-term condition was not connected with a voltage event, because all data that contained voltage events were excluded during the preprocessing of the measurements.
The application of the PQPI index and hierarchical clustering indicated the short-term condition and the measurement points at which it occurs. Thus, general information about the resulting PQ problems was also obtained. However, even with all this information, it is impossible to define the reason for this condition. The proposed solution therefore seems essential as a first step of an investigation, defining the time period in which the specific working condition occurs.
Furthermore, to generalize the discussion of the results, it is important to note that:
• The investigation was carried out in a real VPP that operates in Poland, but it may also be applied to other VPPs, provided that long-term power quality data are available.
• The investigation was based on four measurement points, but it may also be conducted for other numbers of points. The minimum number is one, and the maximum is limited only by the available computing capabilities.
• In the investigation, the extremal value of the CCC occurred for four clusters. If other measurement data were applied to this methodology, another number could be obtained. The most crucial aspect is to select the division at which the CCC has an extremum. The results should therefore be treated both as a proposed methodology and as an investigation of a real case study.
• The proposed global index addressed only selected voltage issues (voltage level, flicker, unbalance, and harmonic distortion) and the active power level. However, other parameters, such as current parameters or reactive power, can be added to the global index to make the division more sensitive to other phenomena.

Conclusions
The article proposes the application of clustering to long-term power quality data obtained from a virtual power plant. The synchronous, multipoint measurements were used as common input to prepare different databases. The databases were based on both classic PQ parameters and global indexes, as well as the active power level. The selected PQ global indexes (ADI and PQPI) made it possible to reduce the size of the input databases and retain the features of the PQ data.
The application of the global indexes for clustering reduced the size of the input dataset by around 74%. For the 26 weeks of data, the cubic clustering criterion had an extremum at the same number of clusters for every database and indicated specific short-term working conditions of the virtual power plant in terms of PQ.
Additionally, the proposed global index PQPI helped to decide which measurement points and which group of parameters had caused the specific working condition. However, it is not possible to define the reason for such a situation. Thus, a single-parameter assessment at a single measurement point is still needed, even though the time period of this short condition is strictly defined. The application of the global index and hierarchical clustering may therefore be treated as a first step towards deeper analysis.