In this section, the proposed PMU data-compression process is applied to real-world wide-area power systems.
5.1. Evaluation of Proposed Method by Application
Figure 5 shows the DBSCAN-based clustering results for the event-state frequency signals of Figure 1. The original signals contain event information, and seven of them include bad data, as shown in Figure 1a. DBSCAN excluded 11 signals as outliers, and all seven signals containing bad data were successfully removed (Figure 5b). The remaining 183 signals formed six clusters, Clusters 1–6, as can be seen in Figure 5c–h. A notable point is that 163 PMU signals were grouped into Cluster 1, because frequency is a global parameter. A high compression ratio was expected for Cluster 1 because of its large number of correlated PMU signals. In contrast, PMU signals in some local areas deviated from the main grid and responded to the event individually, so only a few mutually correlated signals were grouped into each of Clusters 2–6. Low compression ratios were expected for these clusters because they offer few dimensions to compress. However, because these clusters contain only 20 of the 183 clustered signals, their low compression ratios do not significantly affect the overall compression performance.
In the ambient state, one signal contained bad data. This PMU signal was successfully excluded as an outlier and was not compressed. The other 193 PMU signals were aggregated into Cluster 1, which implies that a high compression ratio could be expected.
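The clustering step described above can be illustrated with a minimal DBSCAN sketch. It is not the authors' MATLAB implementation: the Euclidean distance between signals, the toy three-sample signals, and the values `eps=0.05` and `min_pts=3` are all assumptions chosen for illustration (the actual density parameters are those discussed in Section 4).

```python
import math

def dbscan(signals, eps, min_pts):
    """Label each signal with a cluster id (1, 2, ...) or 0 for an outlier."""
    n = len(signals)
    # Neighborhood of each signal: all signals (itself included) within eps.
    neighbors = [
        [j for j in range(n) if math.dist(signals[i], signals[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n              # None = not visited yet
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = 0            # provisional outlier; may become a border point
            continue
        cluster += 1                 # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == 0:       # outlier reached from a core point: border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:   # j is also a core point: keep expanding
                queue.extend(neighbors[j])
    return labels

# Hypothetical toy data: five signals following the main grid, three signals
# sharing a deeper local dip, and one signal with a bad (zero) sample.
main = [[60.0 + 0.001 * k, 59.9 + 0.001 * k, 59.95 + 0.001 * k] for k in range(5)]
local = [[60.0 + 0.001 * k, 59.5 + 0.001 * k, 59.7 + 0.001 * k] for k in range(3)]
bad = [[60.0, 0.0, 59.95]]

labels = dbscan(main + local + bad, eps=0.05, min_pts=3)
print(labels)  # [1, 1, 1, 1, 1, 2, 2, 2, 0]
```

In this toy case the bad-data signal is labeled 0 and would be left out of compression, while each cluster would be passed to the compression stage as a separate subdataset.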
Each clustered subdataset was decomposed by wavelet transform, and PCA was applied to the resulting detail matrices. The mother wavelet and decomposition level used for the multiscale decomposition of the data matrix were set to db2 and 5, respectively. In Reference [5], the db2 wavelet with decomposition level 5 was shown to give the optimal result in terms of the maximum wavelet energy, which is used as an indicator of the information in PMU data.
The numerical results of all clusters for voltage and frequency signals are summarized in Table 2 and Table 3. In the ambient period, signals containing bad data were excluded as outliers, and the voltage and frequency datasets were compressed with compression ratios of 18.22 and 15.37, respectively. In the event period, bad data were also successfully removed, and there were five clusters for voltage and six clusters for frequency. The large number of clusters in the event period resulted from PMU signals being uncorrelated because of the unique responses of local areas. Voltage signals formed more dispersed clusters than frequency signals (see Clusters 1 and 3) because voltage is a local variable, as discussed in Section 3. In addition, the compression ratios of the clusters in the event state were low, as anticipated from Figure 5. This is because a large number of PCs were selected for both types of data to capture the transient phenomena. Note that a high compression ratio in the ambient state and a low compression ratio in the event state match the compression strategy presented in Section 2.
To allow visual interpretation, the dimensionality reduction and reconstruction process of a detail matrix is depicted in Figure 6. Figure 6a shows the original detail matrix of the Cluster 1 frequency dataset. The global influence of the event was well captured by the large detail coefficients. In addition to these global characteristics, the detail coefficients of individual PMU signals showed their own characteristics. Consequently, ten PCs were needed to account for 99.99% of the total variance. By selecting these PCs, the original dimensionality of 165 was reduced to 10. The selected PCs and corresponding scores are shown in Figure 6b,c. The reconstructed detail matrix is shown in Figure 6d. It can be seen that the information of the original matrix was well retained. The reconstructed data matrix can then be obtained in the time domain through IDWT.
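The PC selection and reconstruction of a detail matrix can be sketched as follows. This is only an illustration under stated assumptions: the detail matrix is taken to have coefficients in rows and PMU signals in columns, PCA is computed through an SVD of the column-centered matrix, and the synthetic matrix `D` (two shared components plus small noise) is hypothetical. Only the 99.99% variance criterion comes from the text.

```python
import numpy as np

def pca_compress(D, variance_target=0.9999):
    """Keep the fewest PCs whose cumulative explained variance reaches the target."""
    mean = D.mean(axis=0)
    U, s, Vt = np.linalg.svd(D - mean, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    # Index of the first cumulative ratio that reaches the target, plus one.
    k = int(np.searchsorted(explained, variance_target)) + 1
    scores = U[:, :k] * s[:k]     # (coefficients x k) score matrix
    loadings = Vt[:k]             # (k x signals) matrix of selected PCs
    return scores, loadings, mean

def pca_reconstruct(scores, loadings, mean):
    """Rebuild the detail matrix from the retained scores and PCs."""
    return scores @ loadings + mean

# Hypothetical detail matrix: 200 coefficients for 30 signals that share
# two underlying components, plus a small amount of noise.
rng = np.random.default_rng(0)
common = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 30))
D = common @ mixing + 1e-4 * rng.normal(size=(200, 30))

scores, loadings, mean = pca_compress(D)
D_hat = pca_reconstruct(scores, loadings, mean)
rel_error = np.linalg.norm(D - D_hat) / np.linalg.norm(D)
print(loadings.shape[0], rel_error)
```

Only `scores`, `loadings`, and `mean` need to be stored, so the fewer PCs the variance criterion selects, the higher the compression of that detail matrix.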
To evaluate the DBSCAN-based procedure, other clustering methods, namely k-means clustering and fuzzy k-means clustering, were also analyzed. Figure 7 shows the Dunn index (DI) of the frequency signals, an indicator of clustering performance [29], for different numbers of clusters. A higher DI implies that a dataset is better clustered. As shown in Figure 7a, the optimal number of clusters for the ambient dataset is two. In the event period, on the other hand, five clusters are optimal, as shown in Figure 7b. Whereas both k-means and fuzzy k-means require numerous iterations to find the optimal number of clusters, DBSCAN automatically provided the optimal number of clusters, as summarized in Table 3, using the preset density parameters discussed in Section 4.
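For reference, the Dunn index can be sketched as below. Several variants of the index exist; this sketch assumes the common one (smallest distance between points of different clusters divided by the largest cluster diameter) with Euclidean distance, which may differ from the exact definition used in Reference [29]. The one-dimensional points are hypothetical.

```python
import math
from itertools import combinations

def dunn_index(clusters):
    """clusters: list of clusters, each a list of points (lists of floats)."""
    # Largest distance between two members of the same cluster (diameter).
    max_diameter = max(
        math.dist(p, q) for c in clusters for p, q in combinations(c, 2)
    )
    # Smallest distance between members of two different clusters.
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    return min_separation / max_diameter

# The same four points, partitioned well and partitioned badly.
good = [[[0.0], [1.0]], [[10.0], [11.0]]]
bad = [[[0.0], [10.0]], [[1.0], [11.0]]]
print(dunn_index(good))  # 9.0
print(dunn_index(bad))   # 0.1
```

With k-means or fuzzy k-means, this index would have to be recomputed for every candidate number of clusters, which is the repeated search the comparison above refers to.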
Reconstruction results with and without clustering analysis are depicted in Figure 8, which shows the reconstruction-error values of every PMU signal in the event period. Without clustering, large distortions (green circles) were observed in some PMU signals, visible as peaks (black circles). This implies that, although MSPCA first extracts individual characteristics by wavelet decomposition, the linear PCA step can ignore individual event information in the high-frequency sub-bands. With clustering analysis, these distortions were significantly reduced because the original dataset was partitioned into correlated subdatasets. As a result, the error values with clustering (red dots) were low and relatively even compared with the results without clustering. Therefore, it is confirmed that clustering analysis before compression can improve reconstruction accuracy and preserve local phenomena in wide-area power systems.
5.2. Comparison with Existing Approaches by Case Studies
To verify the efficiency and robustness of the proposed method, it was compared with existing individual- and comprehensive-compression methods by application to real-world data. For individual compression, the DWT-based compression presented in Reference [5] was analyzed to confirm whether MSPCA distorts the unique characteristics of a PMU signal. The combined PCA–DWT compression method in Reference [11] was compared to show that MSPCA accurately extracts the hidden dimensionalities of PMU data in large-scale power systems. Figure 9 and Figure 10 provide examples of voltage and frequency signals, respectively, reconstructed by the DWT, PCA–DWT, and MSPCA compression methods.
As shown in Figure 9 and Figure 10, DWT almost entirely ignored transient phenomena such as voltage fluctuation and frequency oscillation. The reason is that DWT compression thresholded almost all detail coefficients related to these variations, so the signal was reconstructed mainly from low-pass-filtered data, that is, the approximation coefficients. Signals reconstructed by PCA–DWT, on the other hand, preserved more transient information than those of DWT compression. However, only the proposed MSPCA provided near-perfect reconstruction, as shown in Figure 9d and Figure 10d. This result implies that, although PCA–DWT is an efficient way to compress PMU data, simply discarding coefficients below a threshold can distort the transient phenomena of a local area. MSPCA does not simply discard low-valued coefficients, but also extracts the hidden dimensionality at each scale, as shown in Figure 6.
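The effect of thresholding detail coefficients can be seen in a small sketch. For brevity it uses a single-level Haar transform in place of the db2, level-5 decomposition of the paper, and the frequency samples and the threshold of 0.2 are hypothetical; it illustrates the mechanism only, not the method of Reference [5].

```python
import math

def haar_dwt(x):
    """Single-level Haar DWT of an even-length signal."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x += [(a + d) / s, (a - d) / s]
    return x

def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude is below thr."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]

# Hypothetical frequency samples with a short local dip at index 3.
signal = [60.0, 60.0, 60.0, 59.8, 60.0, 60.0, 60.0, 60.0]
approx, detail = haar_dwt(signal)

exact = haar_idwt(approx, detail)                          # no thresholding
smoothed = haar_idwt(approx, hard_threshold(detail, 0.2))  # dip coefficient removed
print(round(exact[3], 6), round(smoothed[3], 6))
```

The dip produces a detail coefficient of about 0.14, which falls below the threshold, so the reconstruction replaces the pair (60.0, 59.8) by its average and the transient is flattened.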
The numerical results in Table 4 and Table 5 also show the efficacy of the proposed method. Both DWT and PCA–DWT provided a better compression ratio for the voltage and frequency event data. However, MSPCA yielded a much lower reconstruction error for both types of data. The maximum value of the reconstruction error, in particular, implies that MSPCA can avoid significant distortions of PMU signals in a local area.
Most of the time, a power system operates in an ambient state, so the overall compression ratio can be expected to be higher than that of the event cases studied in this paper, since most of the PMU dataset comes from ambient periods. To verify the overall performance of the proposed method, the PMU data collected during the 24 h encompassing the discussed cases were compressed. Over the 24 h, the four events in the utility data log were successfully detected from both voltage and frequency data. A further four voltage-only events were detected, and frequency-only events were also detected.
Overall compression results are analyzed in Figure 11 and Figure 12. Figure 11 shows the compression-ratio distribution for the interval-selected datasets. The compression-ratio distribution of multiscale compression is broader and has a higher median value than those of DWT and PCA–DWT. This adaptive compression ratio results from multiscale compression selecting PCs according to the time-varying characteristics of the PMU signals. The PMU data for 24 h were compressed with compression ratios of 14.41 for voltage and 15.11 for frequency. Multiscale compression also has a narrower distribution with lower reconstruction errors than DWT for both voltage and frequency, as shown in Figure 12. Considering compression ratio and accuracy together, the proposed method is shown to provide efficient and robust results, because DBSCAN automatically clustered correlated subdatasets, and MSPCA efficiently reduced dimensionality while preserving individual information.
The computation time of the proposed technique was measured in MATLAB to assess its suitability for implementation and real application. Table 6 shows the computation time averaged over 24 h for each data type and power-system condition. Run times for DBSCAN are generally longer than those of MSPCA because DBSCAN requires calculating the distances between all signals in a dataset. However, the total computation time in every case does not exceed the window length of the ambient (1 min) or event (4 s) condition. Therefore, the proposed technique can compress PMU data without delaying the compression of subsequent windowed data.