Complementary Feature Extractions for Event Identification in Power Systems Using Multi-Channel Convolutional Neural Network

Abstract: This paper presents an event identification process that uses complementary feature extractions with convolutional neural network (CNN)-based event classification. The CNN is a suitable deep learning technique for two-dimensional power system data because it derives information directly from a database of measurement signals instead of modeling transient phenomena; the measured synchrophasor data in a power system are arranged along the time and space domains. The dynamic signatures in phasor measurement unit (PMU) signals are analyzed based on the starting point of the subtransient signals, as well as the fluctuation signature in the transient signal. For fast decisions and protective operations, a narrow time window is recommended to reduce the acquisition delay, whereas a wide time window provides high accuracy because it uses a larger amount of data. In this study, two separate data preprocessing methods and multichannel CNN structures are constructed to provide validation as well as fast decisions under successive event conditions. The decision result includes information on event types and locations at various time delays for the protective operation. Finally, this work verifies the event identification method through a case study and analyzes the effects of successive events in addition to the classification accuracy.


Introduction
Efficient use of limited energy resources is one of the most important tasks for the future power industry, which faces social and economic issues such as the construction of new transmission lines, the acquisition of large-scale renewable energy sources, and the concentration of loads in metropolitan areas. In addition, the risk of disturbances or outages is increasing owing to the significantly increased scale and density of modern power grids. One of the important technical issues in addressing the reliability of national power grids is the development of wide-area monitoring, protection, and control (WAMPAC) systems. A WAMPAC system monitors a wide-area power grid using various status information of the power system, including real-time data acquired through phasor measurement units (PMUs), and supports the static/dynamic stability of the power grid. It promotes stable system operation by protecting and controlling the system. Whereas the traditional wide-area monitoring system (WAMS) monitors and controls the power grid using status information acquired in units of seconds, a PMU-based WAMPAC system can use status information acquired in milliseconds, rendering it more precise and accurate for monitoring and controlling the power grid [1][2][3][4]. In this regard, PMUs are being widely installed, and their data are building up power big data. However, PMUs have not yet been incorporated into the application technologies that were expected at the PMU design stage, such as automatic system protection and wide-area power outage prevention. Overall, PMU-based systems can improve recovery times by providing high-precision measurements to identify errors, and these unique characteristics of PMUs are likely to be applied successfully to preventive awareness schemes [5,6].
PMU data are applied to achieve fast decisions and high accuracy in system operations, which is consistent with the purpose of introducing PMUs for power system analysis. In this regard, researchers in the field of situational awareness plan to apply data-driven approaches based on the development of PMU applications. Examples of relevant previous studies are as follows: A CNN application with PCA-based preprocessing is proposed for the monitoring of power quality [7]. The extraction of features for parameter identification using phasor measurement data is proposed in [8]. For dynamic event analysis, general details regarding the real-time monitoring of events are well described in [9]. In addition, correlation analysis is performed for the online monitoring of anomalies and their locations [10]. These studies provide methods for obtaining meaningful information from non-stationary synchrophasor data, as well as guidelines for system operation, and they can be improved further by accounting for implementation conditions in real-world operation.
Because PMUs measure the real-time voltage, current, and frequency in the form of phasor data, they can identify faults or events from dynamic signatures in almost real time [11]. In this regard, this study aims to deal with the variability of the power system, achieve rapid decisions, and analyze successive events relevant to an actual power system implementation. As data acquisition time and accuracy exhibit a trade-off relationship, the main contribution of our study is a complementary sequencing decision process, together with a fast decision process for successive event conditions. The design process of event identification involves a complementary sequence to provide fast identification and validation, and the proposal includes a fast-decision accuracy analysis and an output category design. For the analysis of power information in the form of an event signal, the dynamic signals of the power grid measurements can be recognized by applying a convolutional neural network (CNN) [12,13]. In other words, the contribution of this work is a proposal of two separate data preprocessing methods and multichannel CNN structures that provide validation as well as fast decisions under successive event conditions using a single-event database. Additionally, the category design covers the types of event phenomena in power systems, considering voltage, frequency, and real and reactive power disturbances, as well as classified groups of transmission lines and electrical buses. Finally, this work presents a method for fast event identification using the subtransient features and high-accuracy event identification using fluctuation signatures.
The remainder of this paper is organized as follows: Section 2 provides a state-of-the-art review of deep learning applications for PMU analysis and discusses the challenges arising from the real-time cognitive process of power system events. In Section 3, data preprocessing and classification-based event identification are described. The results and discussion of event identification are provided in Section 4. Finally, the conclusion is presented at the end of the paper.

State-of-the-Art
In recent years, driven by the vast amount of data associated with PMU-based WAMS operations, data-driven methods such as machine learning have begun to utilize the information provided and extract valuable knowledge for system operators [14][15][16]. Instead of declaring a complex analytic model, learning how to recognize patterns and identify features appears to be an alternative solution for overcoming the challenges imposed by processing vast amounts of raw data associated with WAMS operations. As a class of machine learning algorithms, deep learning is used to model high-level abstractions hierarchically from raw data. Previous studies on deep learning applications for power system signals include the following: Generative-adversarial-network-based dynamic security assessments have been developed considering missing data [17]. An artificial neural network has been developed to predict the system behavior and to classify contingencies or disturbances [18]. For deep-learning-based event identification, pseudo-continuous quadrature-wavelet-transform-based feature extraction provides information regarding classified power system phenomena [19].
Among the deep learning methods, the CNN is a technique specialized for image identification owing to its ability to learn the distinctive features that represent the type of an object, and it is widely applied to automatically extract target information from an image [20]. Because wide-area power system information can be arranged as an image along the time and location domains, the CNN is suitable for two-dimensional power system data. In this regard, examples of CNN-based power system analysis are as follows: a one-dimensional CNN is used to process PMU signals and calibrate model parameters [21], and a convolutional autoencoder is used for load-feature analysis [22]. For CNN-based event identification in particular, examples include CNN-classification-based fault line localization using voltage signals [23], successive event identification by predicting the first event signal with a shallow CNN [24], and CNN classification of false data injection attacks as well as power system events [25]. On the other hand, the CNN is vulnerable to changes in the viewpoint from which the object is observed. When the angle or size of the image information changes, the image cannot be recognized properly. Thus, accurate event detection and a preprocessing stage are required for the CNN application. In addition, the learning process of the CNN cannot handle the order relation of incoming information in real time, because it attends only to the features and not to the order in which the information arrives. Thus, this work addresses this limitation of CNN-based real-time monitoring at the data preprocessing stage.
In summary, deep learning applied to PMU systems provides analytic tools for complicated power systems and their operation. In addition, fault- or event-based situational awareness can improve system reliability by supporting suitable protective operations that isolate and minimize the affected area. On the other hand, a data-driven approach requires training data, which are an important factor in determining the performance and reliability of the proposed method. However, it is difficult to generate a database using a real-world test model while assuming successive events. Therefore, a study should be conducted to improve the classification accuracy for successive fault signals that can be analyzed using a single-event database. In this regard, this work improves deep-learning-based data processing by establishing preprocessing and complementary classification for fast and accurate identification using a single-event dataset. Finally, real-world application is considered by presenting an event analysis method that uses a short window and complementary sequencing while considering various failures.

Problem Formulation
The wide-area PMU measurement devices are located at the main power plants and nodes of the power system to acquire voltage, current, and frequency information in the form of time-series phasor data [26]. When a power system event or fault occurs, its dynamic characteristics appear in the PMU signals according to the type and location of the event. For example, Figure 1 shows typical event signals of load and generation losses, where the time-series signals are acquired from multiple PMU sensors. Because these sensors are distributed in space, the signal sets exhibit different pattern characteristics based on the event types and locations. Hence, the set of measurement devices is kept identical across training and testing. The decision time steps of the event incorporate the data acquisition time, the fast decision, and the validation with high accuracy. The window frame of the data acquisition is set to $T_w$, where $T_s$ and $T_e$ are the start and end times of the event measurement ($T_w = T_e - T_s$), respectively, and $T_d$ is the time difference between the events shown in Figure 1. Subsequently, the measured dataset is represented as follows:
$$V_{PMU_i} = [v_{PMU_i}(t_1), \ldots, v_{PMU_i}(t_n), \ldots, v_{PMU_i}(t_{N_L})], \qquad F_{PMU_i} = [f_{PMU_i}(t_1), \ldots, f_{PMU_i}(t_n), \ldots, f_{PMU_i}(t_{N_L})]$$
where $V_{PMU_i}$ and $F_{PMU_i}$ are the measured voltage and frequency signals from the i-th PMU at the n-th sample time $t_n$, respectively, and $N_L$ is the number of samples in the window. The combined voltage and frequency signal sets are represented as follows:
$$V_{set} = [V_{PMU_1}; V_{PMU_2}; \ldots; V_{PMU_{\#PMU}}], \qquad F_{set} = [F_{PMU_1}; F_{PMU_2}; \ldots; F_{PMU_{\#PMU}}]$$
where $V_{set}$ and $F_{set}$ compose the channel data from the voltage magnitude and frequency, which are matched with the two-dimensional (2-D) colormap in Figure 1. When an event is recognized through a narrow window of a waveform, the accuracy may be inferior to that obtained by analyzing the entire signal between instrument points. Therefore, the process may be inefficient because of the low accuracy in a single-event situation, where a single event is considered to exhibit a large $T_d$ value.
Meanwhile, when $T_d$ is small, the event signals influence each other across the data acquisition windows. Hence, flexibility in the window size is considered: short-time windowing benefits successive event analysis, whereas long-time windowing provides high accuracy. In particular, events in an operating power system can occur in succession; if the event information can be determined by acquiring a signal for only a short time after an event occurs, then the method can be applied to an operating power system with successive events and variability. This serves as the analysis technique for the complementary method.
The bottom of Figure 1 shows a temporal process that recognizes information after an event occurs and the data acquisition time. The additional data acquisition time windows support the validation stage with improved accuracy. Furthermore, it is designed as a complementary sequence to improve the accuracy of the instant classification and the detection of single events that have sufficient time difference, as well as to classify continuous events. In addition, the successive event classifier categorizes information that can be recognized through transient signals.
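The windowing described above can be illustrated with a minimal sketch. It assumes that each PMU signal is stored as a list of uniformly sampled values; the function name `extract_window` and the parameters `fs`, `t_s`, and `t_w` are hypothetical names chosen here for illustration and do not come from the original implementation.

```python
def extract_window(signals, fs, t_s, t_w):
    """Cut the acquisition window [t_s, t_s + t_w) out of every PMU signal.

    signals: list of rows, one row of samples per PMU (assumed layout)
    fs:      reporting rate in samples per second (assumed uniform)
    t_s:     window start time in seconds
    t_w:     window length in seconds (T_w = T_e - T_s)
    """
    start = round(t_s * fs)
    length = round(t_w * fs)
    if start < 0 or length <= 0:
        raise ValueError("window must start at t_s >= 0 and have positive length")
    window = [row[start:start + length] for row in signals]
    # Every row must contain the full window, otherwise the snapshot is ragged.
    if any(len(row) != length for row in window):
        raise ValueError("window exceeds the recorded signal")
    return window


# Two toy PMU signals sampled at 10 samples per second.
v_set = extract_window([list(range(10)), list(range(10, 20))], fs=10, t_s=0.2, t_w=0.3)
```

A short `t_w` corresponds to the fast decision window, and a longer `t_w` applied to the same signals corresponds to the validation window.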

CNN Classifier
To use the 2-D information shown in the colormap of Figure 1, the CNN can be used to perform image identification, where images are received through convolutional operations and filtering techniques are applied during classification. The basic CNN structure is composed of several layers, as shown in Figure 2. It includes a convolution layer that receives an image as input data and applies an image filter, which is convolved across the width and height of the input volume to compute the dot product between the filter entries and the input, thereby yielding an activation map of that filter. The CNN architecture generally also includes a non-linear activation function and a pooling layer. The purpose of the pooling layer is to reduce the amount of data or the image size following the convolution layer. The convolution and pooling processes are repeated based on the CNN design. After the convolution and pooling layers are repeated, a flatten layer converts the data into the vector form required by a fully connected (FC) neural network, in which the neurons are connected to all neurons in the immediately preceding layer. The FC layer identifies larger patterns by combining the features learned from the image by the previous layers. Thereafter, the output of the FC layer is normalized by the Softmax activation function, which enables classification into multiple classes. The final FC layer classifies the image based on the designed categories: it assigns each input to one of the mutually exclusive classes and computes the loss using the probabilities returned by the Softmax activation function.
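The layer sequence described above (convolution, non-linear activation, pooling, and Softmax) can be sketched in plain Python on a toy image. This is only an illustration under simplifying assumptions: there is a single filter and no training, and the helper names `conv2d`, `relu`, `max_pool`, and `softmax` are chosen here rather than taken from the original work.

```python
import math


def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image and take the dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    return [[sum(image[r * stride + i][c * stride + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(fmap):
    """Rectified linear unit applied element-wise."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling that shrinks the activation map."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


def softmax(scores):
    """Normalize scores into positive values that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


image = [[1.0] * 4 for _ in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]
pooled = max_pool(relu(conv2d(image, kernel)))
probabilities = softmax([0.0, 0.0])
```

In a trained network, the pooled maps would be flattened and passed through FC layers before the Softmax; the sketch skips that step to stay short.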
Depending on the target system and database, the design of hyperparameters determines the CNN performance. The first argument of the convolutional layer is the height and width of the filter used by the training function when scanning along with the input data with the filter size and number of filters, as well as the number of neurons connected to the same area. In the padding step, name-value pair arguments are used to add padding to the input feature map, including the stride. The designed output size parameter of the last FC layer is equivalent to the number of classes in the target data. In this example, the output size is 10, corresponding to 10 classes. The additional principle of a well-known CNN, which is available in previous studies, is not described in detail herein [27]. In addition, the hyperparameters used in the proposed CNN classifier and case study are described in Sections 3 and 4, respectively.

Data Preprocessing
Before the data are fed into the CNN structure, preprocessing is required, in which the acquired data are sorted along the measurement time and distributed space domains. The original windowed data are represented in the PMU and time sample domains for the voltage and frequency data. Based on the time-spatial analysis of PMU-based power information, a snapshot configuration is acquired during preprocessing to capture the dynamic signature. The design of the data size and domain during preprocessing improves the performance and construction of the deep neural network. Using transient-period signals that include the signatures of signal fluctuation improves the accuracy by increasing the amount of data, whereas using only short-time subtransient signals provides a fast decision by reducing the data acquisition time. Overall, in our approach, separate preprocessing domains are used for short-time and long-time data acquisition throughout the process. Two approaches can be used to design the data domain.
(1) PMU vs. time domain: The PMU-time matrix snapshot captures the signal fluctuation in each PMU signal and is composed of multiple 1-D PMU signals for the voltage and frequency. Owing to the time-varying conditions in power system operations, the operation and bias conditions are ambiguous. Hence, it is crucial to capture the dynamic signature in the subtransient period when the event occurs. The wavelet technique is useful for the analysis of non-stationary PMU signals and is used to acquire the short-term characteristics of a signal during preprocessing [28]. The detail coefficients from the wavelet analysis represent the abrupt changes in time-localized signals, and the wavelet transform is applied to each time series. The discrete wavelet transform (DWT) obtains the level-1 detail coefficients of the original discrete signal $x[k]$ as follows:
$$DWT(x)[m] = \sum_{k} x[k]\,\phi[2m - k]$$
where the designed wavelet basis $\phi$ serves as a high-pass filter, and $m$ is the shift index of the inner product. The constructed snapshots are expressed as follows:
$$IMG1_{V} = [DWT(v_{PMU_1}); \ldots; DWT(v_{PMU_{\#PMU}})], \qquad IMG1_{F} = [DWT(f_{PMU_1}); \ldots; DWT(f_{PMU_{\#PMU}})]$$
where the 1-D $DWT(v_{PMU_i})$ sets construct the i-th row of $IMG1_{V,F}$ for the voltage and frequency channels. A wavelet basis must be designed to shift the inner product of the signal. Lists of widely used wavelet bases are available for this design, such as the Daubechies wavelet basis. Because it is difficult to directly compare the accuracy of different wavelet bases in a CNN with the same structure, the selection of the wavelet basis follows a previous study that evaluated the performance of wavelet bases for PMU signals [29]. See Figure 3c.
(2) PMU vs. PMU domain: The complementary preprocessing data are derived via correlation analysis between PMU time-series signals, where the correlation matrix snapshot shows the relationship pertaining to the distributed dynamics. The matrix size depends on the usable PMU installation status and the construction of the symmetric correlation matrix. The correlation operation between 1-D signals is represented as follows:
$$corr(c_n, c_m) = \frac{\sum_{k=1}^{N_L} (c_n[k] - \bar{c}_n)(c_m[k] - \bar{c}_m)}{\sqrt{\sum_{k=1}^{N_L} (c_n[k] - \bar{c}_n)^2}\,\sqrt{\sum_{k=1}^{N_L} (c_m[k] - \bar{c}_m)^2}}$$
where $corr(c_n, c_m)$ denotes the correlation coefficient between the windowed signals $c_n$ and $c_m$ of length $N_L$, $\bar{c}_n$ and $\bar{c}_m$ are their mean values, and the correlation matrix captures the similarity between the event signals from each pair of sensor positions. The constructed snapshots are expressed as follows:
$$IMG2_{V,F} = [corr(c_n, c_m)]_{n,m = 1, \ldots, \#PMU}$$
where $IMG2_{V,F}$ is a #PMU × #PMU symmetric matrix composed of correlation coefficients for the voltage and frequency channels, respectively. Furthermore, a correlation matrix is formed through the structure of multiple sensors, as shown in Figure 3e.
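The two preprocessing domains can be sketched as follows. As a loudly stated assumption, the sketch uses the Haar wavelet (the simplest member of the Daubechies family, db1) instead of whichever Daubechies basis the authors selected, and it is written with the standard library only; the names `haar_detail`, `pearson`, `build_img1`, and `build_img2` are hypothetical.

```python
import math


def haar_detail(signal):
    """Level-1 Haar detail coefficients: high-pass filtering then downsampling by 2."""
    return [(signal[2 * m] - signal[2 * m + 1]) / math.sqrt(2)
            for m in range(len(signal) // 2)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length windowed signals."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a flat signal is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def build_img1(signal_set):
    """PMU vs. time snapshot: row i holds the detail coefficients of PMU i."""
    return [haar_detail(row) for row in signal_set]


def build_img2(signal_set):
    """PMU vs. PMU snapshot: symmetric matrix of pairwise correlation coefficients."""
    return [[pearson(a, b) for b in signal_set] for a in signal_set]


v_set = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
img1_v = build_img1(v_set)  # 3 PMUs x 2 coefficients
img2_v = build_img2(v_set)  # 3 x 3 correlation matrix
```

The same two functions would be applied to the frequency set to obtain the second channel of each snapshot.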

Multi-Channel CNN Classifier
The simultaneous use of voltage and frequency information requires a CNN with a two-channel input structure. Hence, the proposed CNN is designed with a two-channel input, as shown in Figure 4, and convolution is performed on both channels through the same filter structure. The outputs of the convolutional layers, $s_p$, and of the following pooling layers are determined by the filter size, number of filters, and input size. For the proposed CNN structures, the image size is specified in the image input layer. The size of the input image depends on the PMU measurement structure of the target power system, and based on this size, appropriate hyperparameters are designed for each target power system. The filter size and number are designed using the grid search approach; the resulting CNN hyperparameter structure is described in the case study. In addition to the convolution step, the activation function used in the placement and FC layer steps of the proposed CNN structure is the rectified linear unit (ReLU). The batch normalization layer is followed by this non-linear activation function, which is the most frequently used activation function in CNN operation. When the perceived information is processed, the Softmax activation function normalizes the output of the FC layer, so that the output of the Softmax layer comprises positive numbers with a sum of one. These values are used as the classification probabilities in the classification layer, and classification over multiple output classes is performed using the Softmax function. In addition to the multichannel input, the output categories are designed for different event types and locations. However, the output categories do not contain multiple channels. Two separate CNNs are constructed to accommodate the types and locations as the C1 and C2 classifications.
Whereas the C1 and C2 classifications use different domains of the input data, the type categories comprise the events that occur in general power systems, and the location categories depend on the target power system.
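A two-channel convolution of the kind described above can be sketched as a sum of per-channel filter responses. This is the generic multi-channel formulation used in most CNN frameworks and is assumed here, not taken from the original equation; `cross_correlate`, `two_channel_conv`, and the `bias` parameter are hypothetical names.

```python
def cross_correlate(image, kernel):
    """Valid-mode sliding dot product of one channel with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(len(image[0]) - kw + 1)]
            for r in range(len(image) - kh + 1)]


def two_channel_conv(img_v, img_f, kernel_v, kernel_f, bias=0.0):
    """One output map: voltage and frequency responses are summed element-wise."""
    map_v = cross_correlate(img_v, kernel_v)
    map_f = cross_correlate(img_f, kernel_f)
    return [[v + f + bias for v, f in zip(row_v, row_f)]
            for row_v, row_f in zip(map_v, map_f)]


# Toy 3 x 3 voltage and frequency snapshots with 3 x 3 kernels of the same size.
img_v = [[1.0] * 3 for _ in range(3)]
img_f = [[2.0] * 3 for _ in range(3)]
ones = [[1.0] * 3 for _ in range(3)]
feature_map = two_channel_conv(img_v, img_f, ones, ones)
```

Both kernels share the same size, which mirrors the statement that the channels pass through the same filter structure, while their weights remain separate.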

Event Identification Process
The entire process includes preprocessing, CNN-based power information recognition, and multichannel CNN design. The flowchart of the overall design process is shown in Figure 5. In the design of the preprocessing stage, the Daubechies wavelet basis is selected for the synchrophasor voltage and frequency signals, because the Daubechies basis has been tested on power system signals and showed high performance [30]. The PMU dataset is not a design factor in this work but a condition of the target power system. In other words, the input data sizes are determined by the target system conditions. The input data are considered in the design of the CNN classifier, where the CNN hyperparameters are determined from the input data size using the grid search method. The details of the designed CNN are described in the case study. During online monitoring, the dynamic signature in the PMU signal indicates a detectable event. After the data acquisition delay, the C1 classification decision is made for a fast protective operation, and validation is performed via C2 classification. In the case of successive events, the second event identification is performed after an additional data acquisition time. The second event acquisition time depends on the first event and its subtransient period. The first test and validation networks are separated because a connected four-channel neural network degraded the detection performance. Hence, two separate classifiers are designed for event identification, where the identification information pertains to the event types and locations for system protection.
The target category might be classified as the real and reactive power effects of the voltage and frequency signal. In this study, the output categories are designed as statuses in generation and load variance, line faults, etc. In addition, for location identification in wide-area power systems, the grouping of the power system area for classification is key. This is because it is difficult to acquire the training data from every point or estimate the location using a regression method in a wide-area power system. In addition, the system exhibits varying power system conditions; hence, classification-based location estimation can be considered a suitable method. The grouping method involves network connection and node grouping. The details of the categories are described in the case study. Finally, using the designed event detection process, the system event information can be identified under additional successive event conditions.
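The complementary sequence (a fast C1 decision followed by C2 validation) can be sketched as simple decision logic. This is an illustrative assumption about how the two classifier outputs might be combined, not the authors' flowchart; the function `identify`, the `c2_window` default of 2.8 s (borrowed from the validation time reported in the numerical results), and the returned dictionary keys are all hypothetical.

```python
def identify(c1_probs, c2_probs, labels, t_d=None, c2_window=2.8):
    """Combine a fast C1 decision with an optional C2 validation.

    c1_probs, c2_probs: Softmax outputs of the two classifiers (assumed inputs)
    labels:             class names in the same order as the probabilities
    t_d:                time difference to a following event in seconds, if any
    c2_window:          assumed acquisition time needed before C2 can validate
    """
    fast = labels[max(range(len(labels)), key=c1_probs.__getitem__)]
    # If a second event arrives before the C2 window closes, the long window
    # is disturbed, so only the fast decision is reported.
    if c2_probs is None or (t_d is not None and t_d < c2_window):
        return {"fast": fast, "validated": None, "confirmed": False}
    validated = labels[max(range(len(labels)), key=c2_probs.__getitem__)]
    return {"fast": fast, "validated": validated, "confirmed": fast == validated}


event_types = ["BT", "VR", "GL", "LL", "LG", "RL"]
c1 = [0.05, 0.05, 0.70, 0.10, 0.05, 0.05]
c2 = [0.02, 0.03, 0.80, 0.05, 0.05, 0.05]
single_event = identify(c1, c2, event_types)
successive_event = identify(c1, c2, event_types, t_d=1.0)
```

For a single event, both stages are available and can be compared; for a closely following second event, the sketch falls back to the fast decision alone.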

Case Study
A database must be secured for the application stage of data-driven approaches in power systems; however, it is difficult to perform random tests on actual power systems. An alternative is to build an abnormal signal database using an IEEE standard model and to study the application of a CNN-based rapid event identification method together with real-world operation data acquisition. In this study, the target power system is the IEEE standard 68-bus model, which represents a reduced-order equivalent of the New England test system linked with the New York power system. The simulations of various power system events are performed using the PSS/E software. The model is composed of 16 machines and 5 areas, and its details, including the network, equipment, and parameters, can be found in [31]. The target categories for the classification, namely the event types and locations, are calibrated to the target system; a more detailed partition of the location groups provides a higher resolution but requires more training data. The details of the information and the event identification results are described in the following subsections.

Target System Analysis
Python is used to execute the PSS/E simulations of the IEEE test system iteratively. In this study, the analytic event categories are organized into six events, and 4908 event cases are generated through repeated execution while changing the event conditions, such as the type, cycle, location, and rate of change. In our previous study on event location identification [28], the event location categories were analyzed using a clustering method. This is advantageous because similar dynamics within a group support the estimation of the group location. However, because similarity-based clustering cannot guarantee a uniform distribution of the group area range, uniform location grouping should be adapted to the frequency of occurrence.
In this regard, the categories in the target system are defined as follows:
(1) Event types: The events are categorized into six types: branch trip (BT), voltage reference change (VR), generation loss (GL), real power load loss (LL), line-to-ground fault (LG), and reactive power loss (RL).
(2) Event locations: In this work, the electrical bus groups are uniformly partitioned based on connectivity, and an event located between partitioned areas is assigned to the starting area.
The data channels used are the voltage magnitude and frequency; the current data are excluded. In addition, a scenario for the PMU channel installation is configured so that a PMU channel set is used instead of monitoring every node. Studies regarding the optimal PMU location in the IEEE 68-bus system have been conducted [26], where one of the PMU monitoring methods is to monitor each generator output stage [32] for the efficient monitoring of the target system.
Hence, the PMU set is configured as 10-, 16-, and 23-PMU measurement sets, where the 16-PMU set monitors the generation input data. Taking the 16-PMU dataset as an example, the time-sequence matrix and the correlation matrix are (16 × 20) and (16 × 16) matrices, corresponding to the number of PMUs by the number of time samples and the number of PMUs by the number of PMUs, respectively. Typical input data of the CNN classifier are shown in Figure 6a-e, where each image contains the features reflecting the event types in IMG1 and IMG2. Note that these figures are not representative images of each event type but single examples of each event type, which are affected by the scale, location, and type. In addition, these image features are used to train the CNN classifier for the second event as well as the single event. The simulation for the analysis of the second event is performed using the same event. To implement the CNN-based event classification technique, an optimal CNN structure is designed using the grid search approach. The CNN structure comprises two convolution layers, a pooling layer, and a filter comprising eight (3 × 3) matrices with a stride depth of 3. Using this final CNN design, the accuracies of classification and failure area estimation are analyzed as a function of the data acquisition time.
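The grid search over filter size and number can be sketched as below, together with the output-size arithmetic for the (16 × 20) input. The candidate values and the scoring function are assumptions made for illustration: `score_fn` is a hypothetical stand-in for training a CNN and returning its validation accuracy, and `conv_output_size` uses the standard formula for a convolution without dilation.

```python
from itertools import product


def conv_output_size(n, kernel, stride=1, padding=0):
    """Standard output length of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def grid_search(input_shape, kernel_sizes, filter_counts, strides, score_fn):
    """Try every combination and keep the best-scoring valid configuration."""
    best = None
    for kernel, filters, stride in product(kernel_sizes, filter_counts, strides):
        out_h = conv_output_size(input_shape[0], kernel, stride)
        out_w = conv_output_size(input_shape[1], kernel, stride)
        if out_h < 1 or out_w < 1:
            continue  # the filter does not fit the input image
        score = score_fn(kernel, filters, stride)
        if best is None or score > best[0]:
            best = (score, {"kernel": kernel, "filters": filters,
                            "stride": stride, "output": (out_h, out_w)})
    return best


def toy_score(kernel, filters, stride):
    """Hypothetical score that peaks at a 3 x 3 kernel, 8 filters, stride 1."""
    return -abs(kernel - 3) - abs(filters - 8) - abs(stride - 1)


best_score, best_config = grid_search((16, 20), [3, 5, 17], [4, 8, 16], [1, 3], toy_score)
```

With a (16 × 20) input and a (3 × 3) filter, the sketch gives a (14 × 18) map for a stride of 1 and a (5 × 6) map for a stride of 3, which shows how strongly the stride choice shrinks the feature map.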

Numerical Result
The results of single-event classification for the event types and location groups are shown in Figure 7, where the accuracy rates are obtained under three different PMU set conditions. As expected, the C1 classifier can capture the event information after a short data acquisition time. In Figure 7, the type and location estimates of the C1 classifier show high accuracy between 0.12 and 0.5 s. However, with a fixed feature size, the longer the data acquisition time in C1 classification, the lower the accuracy, because uninformative data are included. Meanwhile, the C2 validation stage is slower, with classification available only after 2.8 s; however, a stable, high accuracy is guaranteed. For single-event identification, C1 classification can be used for fast identification and C2 classification for validation, as summarized in Table 1. In particular, for the location group estimation, the C2 classifier performs better than the C1 classifier because the cross-correlation matrix captures the effect between each pair of PMUs.
For the successive event analysis, three event types (GL, LL, LG) are used as the first event, and the subsequent events are analyzed in terms of the second event type and the time difference ($T_d$) between the first and second events. Figure 8 shows the trend of the identification accuracy for three types of second events as a function of the time composed of the event time delay and data acquisition time, where the windowing delay is considered. The trend in Figure 8 shows unconfirmed results for several second events during the subtransient period, whereas the second event identification is confirmed even in the transient period. In addition, the duration of the subtransient period is affected by the first event type. In Figure 8a,b, the first events GL, LL, and LG result in different fluctuations, where the GL+LL successive events require an additional delay time to distinguish the second event. Meanwhile, as shown in Figure 8c,d, the second event analysis using the C2 classifier is not applicable to a VR second event because of the low fluctuation in the VR event. For the event location estimation, Figure 9 shows the tendency of the C1 and C2 classification accuracy with and without the VR event. Hence, based on the second-event test results, the flowchart of the design can be constructed, as shown in Figure 5.

Discussion
The numerical results can be used to construct the proposed monitoring scheme as well as to validate the event identification in the C1 and C2 classifications. During C1 classification, variability is detected immediately when a failure occurs; hence, the decision can be made within a short data acquisition time. Meanwhile, when C2 classification is used for validation, additional data acquisition time is required to derive a meaningful correlation coefficient even after the transient signal is generated. In the accuracy analysis, it is interpreted that the C1 classification indicated high performance in the time period after the correlation coefficient value is obtained. By verifying the second event for various time differences, information can be analyzed not only for successive events but also under fluctuating system conditions. Based on the test results from the target system, the decision time factor can be designed in the flowchart for applications. Table 2 shows the expected accuracy of the second event for each combination of first and second events. The results in the table show the relationship between the event types and their identification delay: the VR second event is not applicable in the validation stage, and the GL first event shows an additional time delay in the subtransient signal. For the complementary use of the CNN algorithm, a method that separates each data acquisition time period is considered appropriate for the typical subtransient signal and the second event signal, based on the time difference $T_d$. The results of the C1 and C2 classifications show that the data acquisition time and accuracy exhibit a trade-off relationship. Accordingly, the design process of event identification involves a complementary sequence to provide fast identification and validation, and the proposal includes a fast-decision accuracy analysis and an output category design in each complementary stage.
Finally, data-analysis-based studies do not require model information, except when it is difficult to acquire a real system database.

Conclusions
In this paper, the use of PMU data for monitoring the transmission network of a power system is presented. Through a data-driven approach, a complementary decision process corresponding to event identification based on types and locations is achieved. Algorithms are developed for the actual power system implementation, and the proposed structure of the multichannel CNN classifiers yielded almost real-time event identification and detection conditions under successive event occurrences. As the feature combination of successive events is difficult to obtain, the single-event features are extended to successive event identification. The results confirmed the performance in terms of accuracy and time delay in various conditions, although the required time difference between the events is a restriction. In addition, a categorical proposal is provided for successive event identification. In the future, a technical use plan for PMUs can be proposed using the big data analysis method and the application technology presented herein. In addition, through cooperation between PMU data and device control facilities based on real-time system operation data, the base technology for the integrated monitoring and control of future power grids can be secured. Finally, a monitoring system that combines and processes real-time and real-world data of power grids should be implemented in the future.