Harmonic Loads Classiﬁcation by Means of Currents’ Physical Components

: Electric load identiﬁcation and classiﬁcation for smart grid environment can improve the power service for both consumers and producers. The main concept of electric load identiﬁcation and classiﬁcation is to disaggregate various loads and categorize them. In this paper, a new practical method for electric load identiﬁcation and classiﬁcation is presented. The method is based on using a power monitor to analyze a real measured current waveform of a grid-connected device. A set number of features is extracted using the currents’ physical components-based power theory decomposition. Using currents’ physical components ensures a constant number of features, which maintains the signal’s characteristics regardless of the harmonic content. These features are used to train a supervised classiﬁer based on two techniques: artiﬁcial neural network and nearest neighbor search. The theory is outlined, and experimental results are shown. This paper demonstrates high accuracy performance in identifying an electric load from a designated database. Furthermore, the results show a deﬁnite classiﬁcation of an untrained operation state of a device to the closest trained operation state, for example, the excitation angle of a dimmer. In a comparative study, the method is shown to outperform other state-of-the-art techniques, which are based on harmonic components.


Introduction
In the last few years, the field of power grid technologies and monitoring has attracted a lot of attention. The need for energy conservation, responsible management of the demand response of the grid and the ability to implement advanced technologies in this new smart grid has resulted in emerging research and investment in power grids all over the world [1]. Smart grid is a general term that refers to the combination of various communication technologies, smart meters and control abilities to create a well-managed electrical grid. This new grid will have innovative abilities, such as optimizing energy production and consumption, while taking into account renewable energy as well as electric cars. Smart meters as well as smart monitors (which also measure and calculate power quality parameters) are an essential part of this new system [2].
Classification is considered an important decision-making task for many real-world problems in various fields, including smart grid technologies. Classification will be used when an object needs to be classified into a predefined class or group, based on the attributes of that object. Load data in a smart grid contain much valuable knowledge; therefore, electric load identification and classification play important roles in decision making regarding power systems. Since the electricity consumption patterns are unique for each user and device, classification is a possible task. Electric load identification and classification can improve the production planning and increase personalized power service for electricity consumers and producers [3]. By analyzing a load's classification, users can understand consumption patterns and adjust their electricity behavior to be more economical and optimal. Thus, electricity consumption will be used more efficiently, and the cost will be reduced significantly [4]. Therefore, the extraction of precious information from electric load data for load classification is an important research direction. Load classification is used for widespread applications such as load reliability [5], bad data identification [6], marketing polices and tariff setting [7] and load forecasting [8][9][10].
The main concept of electric load identification and classification is to separate various loads, which are measured by smart meters, to groups of similar patterns. When the accuracy of the classification is high, the analysis and decisions can lead to a better optimization. A considerable number of works focus on load classification based on various algorithms and different patterns have been suggested in the last two decades. For example, harmonic components are used to classify load patterns in [11,12]. The authors of [11] suggested using the first 25 harmonics of the current to classify loads using artificial neural network as a classifier, while, in [12], the load profiles of harmonics sources are estimated by using independent component analysis. It should be noted that the number of harmonics components are not constant and therefore it might affect the performance of the classification. In [13], the authors presented an application of support vector clustering to electrical load pattern classification. The authors of [14] tried to identify the load characteristics of nine appliances in three different operation states (In use, Standby, and Off) by support vector machines as an application of a smart home electric energy saving system. In [15], the authors presented a new method for extracting features in the wavelet domain and used them for classification of washing machines vibration transient signals. The method is tested on specific device vibration/states, but without other devices included in the system. More methods are follow-the-leader and self-organizing maps [16], their disadvantage being that the number of clusters must be allocated in advance. In [17], an improved model for k-means is used to solve the problem of randomness of the solution. More common clustering algorithms are: iterative self-organizing data-analysis technique [18] and hierarchical clustering method [19]. Additional approaches for electric load identification can be found in [20].
In this light, this paper describes and implements a new algorithm that functions as a power device classification tool for smart grid applications, based on currents' physical components (CPC) based power theory [21]. The calculated CPC decomposition elements per device are used as features which define a unique signature per waveform. These features are used as inputs to an artificial neural network (ANN) and nearest neighbor classifier.
The integration of CPC theory and supervised classifier for device identification has created a new powerful classifier due to:

1.
A power theory yields a unique harmonic decomposition which has physical meaning. Namely, the features are not simple classifiers but also enable accurate calculation and measurement of the various currents and powers of the load.

2.
CPC enables the generation of a fixed number of features, regardless of the harmonic content of the load. Other methods (e.g., [11,12]) use many features, which imply complex search and computing time.

3.
The method enables identifying various modes of operation of the same device. Other methods show good results when there is significant difference; CPC can classify minor load changes such as excitation angle of a dimmer.
The objective of this paper is to present the theory as well as experimental results of supervised classifier technique combining the CPC theory with the ability to:

1.
Identify a device from known database 2.
Identify a specific operation state of a device The results, obtained by real measurements from a power monitor, show high classification accuracy.
The paper is organized as follows. Section 2 is a theoretical background, which reviews the theory of currents' physical component based power theory and the theory of artificial neural networks. Then, Section 3 is devoted to feature extraction by means of CPC and signal classification using ANN and nearest neighbor. Section 4 contains simulated as well as experimental results, based on the proposed theory. Finally, Section 5 contains some discussion and conclusions.

Currents' Physical Component Theory Overview
The Currents' Physical Component based power theory deals with the calculations of various powers in a single and three phase systems. The concept is applied usually for the calculation and definition of non-active powers in the presence of non-sinusoidal currents and voltages. The spectra of each waveform are used to define the waveform origin by means of CPC theory. CPC theory was developed by L. S. Czarnecki [21]. Since it was first publicized, there have been developments and adaptations of this power theory [22][23][24][25]. In this section, a short overview of the method is presented with emphasis on the components and the powers which are used in this study as features for uniquely identifying an electric load.
Fourier series is the most general way to present time domain non-sinusoidal harmonic voltage [V] where U 0 is the DC component (which is ignored in the rest of the paper since it usually does not exist) and U n is the complex RMS value of the voltage at the nth harmonic, where n ≤ N, and N is an integer. The harmonic Y n admittance for the nth harmonic can be expressed by the complex expression, Multiplying the voltage and admittance yields the resulting current harmonic Fourier series [A] as follows The active power of such a system is derived from the scalar product of Equations (1) and (3) and therefore only the harmonic components which have the same index in both the voltage and the current will contribute to this power [W], where U n and I n are the voltage and current amplitudes at the nth harmonic, and φ n is the angle between the voltage and the current at the same harmonic. After the active power is calculated, the equivalent active conductance of the system can be defined as where u is the RMS value of the voltage, which can be expressed as The active current i a can now be calculated as G e U n e jnw 1 t ).
Subtracting this active current from the general current yields The imaginary part of Equation (8) can be considered as i r , the reactive current, which is π/2 shifted relative to the same voltage harmonic [26,27], (jB n U n e jnw 1 t ).
The remaining part of Equation (8) is called the scattered current i s . This current appears if G n = G e and it is a measure at voltage u(t) of the source current increase due to a scattering of conductance G n around the equivalent conductance G e [26,27] as shown next Nonlinear loads can be considered as harmonic generating loads. In the presence of such loads the direction of energy flow between the source and the load can be decided by investigating the angle between the current and the voltage φ n of the nth harmonic. If |φ n | < π/2, then there is an average energy flow from the source to the load and, if |φ n | > π/2, then the energy flow is from the load towards the source. In this case, the harmonics can be sorted into two groups: (1) N D is the set of harmonics which are originated at the source; and (2) N C is the set of harmonics which are originated at the load. Therefore, the system can be divided into two subnetworks at the measuring point. The first is the source network, denoted by D and include the N D set of harmonics. The second is the load subnetwork, C, which includes the N C set of harmonics. Each subnetwork can be considered to have its own current and voltage, as noted below.
The current which is associated with the N C group is called the load generated current, i c . The total current can now be written as This current can be rewritten with respect to the N D and N C disaggregation as follows: where N is the set of all harmonic content in the decomposed total current. This set is divided into the source and load sets N D and N C , accordingly [28]: Note that the load generated current can be also disaggregated to the i aC , i rC , and i sC components; however, in this work, we limit the amount of features and take into consideration the aggregated i C .
It can be shown that all components in CPC theory are orthogonal and therefore The same procedure can be implemented to the voltage. Thus, the expression for the voltage, taking into account the direction of the energy flow, is: In [28], a detailed discussion is presented regarding the decomposition and analysis of various powers in a non-sinosoidal system. There, the apparent power S can be expressed as: where S 0D and S 0C are the apparent powers of the subnetworks D and C when they are considered as sourceless loads, and S F is the forced apparent power, which is the voltage of each subnetwork (D and C) combined multiplied by the current of the other subnetwork. Namely, the voltage u D is multiplied by i C and vice versa. Now, the powers S 0D and S 0C can be decomposed into the active, reactive and scattered powers. S 0D can be written as: where P aD is the active power, Q rD is the reactive power and D sD is the scattered power of subnetwork D. For the purpose of this paper, we do not decompose S 0C to limit the number of features. Therefore, the total apparent power can be expressed as: There has been a debate about the applicability of this method for measuring power, and whether these components are mathematical or physical [24,29]. However, this paper focuses on the classification and recognition method, and, for these purposes, CPC is a promising method for extracting a fixed, small amount of descriptive and quantity features from the signals, as opposed to using the FFT harmonics, which will have a different number of features for each signal depending on its nonlinearity extent.

Artificial Neural Network
The artificial neural network (ANN) is a technique inspired by the neural structure of the brain that mimics the learning capability from experiences. It means that, if a neural network is trained using past data, it will be able to generate outputs based on the knowledge extracted from the training. Neural networks, a machine-learning method, have been used to solve a wide variety of tasks that are hard to solve using ordinary rule-based programming. When there is a sufficient amount of data to train the ANN, classification produces good results. As mentioned in [30], there are many advantages to using ANN for the power and energy field. In this study, we used the back-propagation algorithm for the neural network [31]. The network contains three layers: input layer, referred to by the index i; hidden layer, referred to by the index j; and output layer, referred to by the index k.
In the training phase, as shown in Figure 1, the training data, features with known labels, are fed into the input layer. In this stage, appropriate weights between the nodes are adjusted iteratively for improving the network until it can perform the task for which it is being trained. After the weights are updated, the network best maps a set of inputs to their correct output.
One of the central ideas in ANN is the transfer function, which is a monotonically increasing, continuous function, applied to the weighted input of a neuron to produce its output. A well-known transfer function is the sigmoid which defined as An exponential function of the derivative is simply Another exponential function with simple derivative is softmax, which is defined as Due to the simplicity of derivative, these functions were chosen as the transfer function of the nodes in the training network. The sigmoid is mainly used in the hidden layer while the softmax is mainly used in the output layer. Next, simplified back-propagation algorithm is describing shortly: 1.
Stage 1: The features are fed into the first layer. The results of the output of the last layer y k are calculated by using initial random weights as follows: where f k is the transfer functions for calculating the result of the output layer nodes and f j is the transfer functions for calculating the output results of the hidden layer nodes. w j,i is the weight from the input layer of node i to the hidden layer of node j and w k,j is the weight from the hidden layer of node j to the output layer node k. These weights are initially randomly chosen and are updated in each iteration.

2.
Stage 2: For each output, we calculate the error signal for a neuron. The idea is to propagate iteratively an error signal δ back to all neurons. The error signal in the output neurons can be expressed as: where E is the total network error function. The error function in classic back-propagation is the mean squared error (MSE) where y k is the target value of the input andŷ k is the computed output of the network on the input.

3.
Stage 3: Using the error signal δ k , the new error signal for the hidden layer δ j is calculated as: Then, the addition to the weight can be calculated and added to the last computed weight: The same process is done for the error signal in the input layer. Now, the new output can be calculated by Equation (22) with the updated weights 4. Stage 4: MSE condition is formulated for keeping the results in a bounded region by choosing a threshold such that E > Threshold. As long as the condition is maintained, the back-propagation algorithm is repeated from Stage 2 to Stage 4.

CPC Features Extraction
The theory of CPC as presented in Section 2 enables a unique representation of currents and voltages in a non-sinusoidal environment. In that aspect, the various currents and powers form a finger print of the waveforms and therefore of their origin (the electric load or any other harmonic anomaly in grid). The input current and voltage waveforms are in the time domain.
First, FFT is used for extracting the various harmonics of the wave forms. Using these harmonics as features of the waveform will result in processing a lot of data, especially in highly nonlinear loads. Therefore, transforming the input data into sets of CPC features reduces the amount of data which represent the currents and voltages waves. The various current harmonics are then sorted into their physical components and the powers are calculated using the CPC theory [23], as presented in Section 2. All values of the currents and voltages are normalized so the measurements and simulations are not voltage level-dependent. The feature extraction module is presented in Figure 2. In Equations (14) and (18), the various currents' and powers' components are presented. For load identification, we limit ourselves to only 10 features, five features of currents and five of powers. The chosen extracted features, as described in Figure 2, are all the five normalized RMS currents' as follows: i 2 , i aD 2 denoted as i a 2 , i rD 2 denoted as i r 2 , i sD 2 denoted as i s 2 and i C 2 . The five power components that were chosen are: P aD denoted as P, Q rD denoted as Q, D sD denoted as D s , S 0C denoted as D C and S the total apparent power. The process of feature generation can be seen in Figure 2 and is summarized as follows.

Classification Using Artificial Neural Network
In this work, the neural network objective is to classify the CPC features into a set of target categories which presents the type of the device. As mentioned in the previous section, there are two important stages in ANN operation:

1.
The training stage with the back-propagation algorithm adjusts the weight of neurons by calculating the gradient of the loss function on known inputs [31]. In this study, the loss function is the mean square error and the transfer function is the sigmoid for the hidden layer and softmax for the output layer.

2.
The inference stage, in which the trained network is used in real time for classify new and unknown single device. Figure 3a shows the complete training process; the process includes a database of voltage and current waveforms of known measured devices. Each waveform is an input to the CPC feature extraction procedure, as explained above and summarized in Figure 2. Each signal is now represented by its CPC components (features). These features are the input of the neural network. Figure 3b shows the inference/classify process on unknown machine waveform. The CPC procedure is obtained in the same manner as was described in training process. These features are the inputs to the trained network, as shown in

Classification Based Nearest Neighbor Search
Nearest neighbor search [32] is considered to be one of the simplest classification algorithms. The algorithm assigns an unlabeled point to the nearest (most similar) sample in a known set by using a distance metric (such as Euclidean distance).
Formally, the nearest neighbor search problem is defined as follows: A set of n known pairs is given (x 1 , y 1 ), (x 2 , y 2 ), . . . , (x n , y n ) such that x i is a known variable in space X and the category y i is assigned to the x i sample. The nearest neighbor goal is to classify a new measurement x , to y based on the closest known y i using a defined measurement metric dist.
x NN is defined as the nearest neighbor of x , In the case dist is chosen to be the Euclidean distance and X is m-dimension space, Equation (29) can be written as In this work, the nearest neighbor search classification process is used on the CPC features. A set of n known CPC vectors with 10 dimensions is given such that each vector is assigned to a device category that is known in advance. When trying to classify an un-labeled waveform, the CPC procedure is activated for feature extraction. The features are used to classify the waveform to the closest known label based on Euclidean distance.

General
For the verification of the proposed method, all the above theory was implemented in Matlab and the Neural Network Pattern Recognition Toolbox (patternnet) [33] was used for the ANN implementation. The results of the simulations are presented in two categories:

1.
Identifying a single device out of a variety of devices 2.
Identifying the operation state of single device For the first category, nine devices were tested: 1.
Power filter with inter-harmonics 2.
Air conditioner 8. Toaster 9. Dimmer The devices were either simulated by MATLAB and SIMULINK or were extracted from real measurements by power meter (SATEC EM720) with 128-1024 samples per period, as presented in Figure 5.
The second category was designed to show the ability of detecting the status of a device. It is well-known that continuously variable device is a problematic device to classify. Therefore, in this paper, we consider the light dimmer device as five different devices. This is done by considering various excitation angles of the dimmer as a separate device and therefore the classification of such a device is possible as shown ahead. The five excitation angles that were used are: 30 • , 60 • , 90 • , 120 • and 150 • .

CPC Feature Extraction and ANN Training Procedure
In Tables 1-3, the devices CPC feature values are summarized. These properties are the inputs of the ANN for identifying and classifying a device. In Tables 1 and 2, the CPC features of eight devices are presented. For each machine, the normalized values of the currents and powers by means of CPC are calculated. The ANN was trained with various excitation angles of the dimmer, which represent different operation modes of the same device. The CPC features of the dimmer in the various excitation angles are summarized in Table 3. It is evident in Tables 1 and 2 that the CPC features are clearly different for each device. Moreover, the CPC features of two operation states of the same dimmer also show noticeable differences based on Table 3.   Next, examples of current waveforms and their CPC features are presented. For each figure, the left part is a simulated current waveform of a nonlinear load and the right part is the CPC features of this device. In Figures 6 and 7, the current waveform of a filter and power diode are presented, respectively, and, in Figures 8 and 9, the current waveform of a dimmer with excitation angle of 60 • and 120 • are shown as representing examples of various operation modes of the same device.     Figure 10 presents the network's architecture and size in MATLAB. In this work, a two-layer feed-forward network was used. The network contains a hidden layer with sigmoid and output layer with softmax as transfer functions. The 10 CPC features are the input to classifier and the 13 devices are the outputs. The devices include the mentioned nine devices, while the dimmer is considered to be five devices, since five different excitation angles are recorded as separate waveforms (note that the dimmer is also considered as one of the nine devices). In addition, the number of neurons in the hidden layer was chosen to be 16 empirically by trial and error. To enable proper training of the ANN, all signals were also altered by adding noise. Each signal was recorded 20 times with various levels of additive white Gaussian noise. The total success of the training process was 99.2%, as can be seen from the generated confusion matrix [34] in Figure 11. In the confusion matrix, each cell on the diagonal of the matrix contains the number of machines that were correctly classified. For example, the number in the second cell on the diagonal indicates that, out of the 20 examples, 19 were correctly classified. The off diagonal cells indicate all misclassification. The cell in Row 12 and Column 3 has a value of 1, namely Machine 12 was classified as being Machine 3. In this case, a dimmer with 120 • excitation angle is misclassified as being a power thyristor. The last cell in the diagonal shows the total success of the training process, in this case 99.2%, which is very good result.

Evaluation Metrics
Most of the evaluation metrics formulate as binary classification tasks, and have four possible outcomes: true positive (TP) is the number of times a device is correctly detected as ON, true negative (TN) is the number of times a device is correctly detected as OFF, false positive (FP) is the number of times a device is wrongly detected as ON, and false negative (FN) is the number of times a device is wrongly detected as OFF. In this work, we use the F-measure metric, which is common in the information retrieval domain. The F-measure [35] is the harmonic mean of precision and recall, which is given by where Here, precision is defined as the correct classification in all positive estimations, and recall is defined as the percentage of ON appliances that are classified correctly.

Simulation Results
Each device was tested 30 times with different noise. To verify the feasibility of the CPC classifier, 11 extra state operations of dimmer excitation angles were used, for which the network was not trained.  Table 4 shows a classification of each untrained angle to the closest trained one. From a database that includes various devices and various dimmer excitation angles, the device with the closest operating mode was chosen.  Figure 12 is an example of a noised waveform of dimmer with excitation angle of 70 • . When comparing the dimmer CPC features with the ones of a dimmer with excitation angle of 60 • excitation angle, as shown in Figure 8, and the features resulting in Table 3, the CPC features of the dimmer with 70 • are very close to the ones of a dimmer with excitation angle of 60 • , and also close to but less than the dimmer with excitation angle of 90 • . Therefore, there is a high probability that the net will classify this dimmer as the dimmer with excitation angle of 60 • option. For validation, Table 5 summarizes the F-measure results based on two methods: the suggested CPC features and 25 harmonic components [11]. The F-measure results based on CPC features is 0.87 for ANN and 0.86 for nearest neighbor, while the the F-measure results based on harmonic components is 0.83 for ANN and 0.84 for nearest neighbor. There are two devices in the CPC methods for which their classification is less accurate when compared to the harmonics components: the power thyristor, which is classified as single phase inverter and vice versa, and also the dimmer with excitation angle of 60 • , which is classified as dimmer with excitation angle of 90 • and vice versa. For power diode, microwave, computer, air conditioner, toaster and dimmer with excitation angle of 30 • the CPC method results outperform the harmonics components method. Even though the number of CPC features is smaller than the harmonics, the accuracy is better and contains more information. Each device has a different number of harmonics, depending on its nonlinearity extent. When more harmonics are calculated per device, there might be over-fitting due to noise. If fewer harmonics are calculated than needed, then some information might be lost. Therefore, some of the device classifications are not accurate for the harmonic components. In addition, we believe that the reason the nearest neighbor classifier is better than ANN classifier in the harmonic components methods is due to over-fitting of the data.

Conclusions
In this paper, CPC-based supervised classifiers are used for electric load identification and classification. The classifier is based on two different techniques: artificial neural network and nearest neighbor. The theoretical background of CPC and the chosen classifier procedure are explained and demonstrated. The experimental results of the simulations are presented in two categories: (1) identifying a single device out of a variety of devices; and (2) identifying the operation state of a single device.
Using CPC implies a constant and small number of features, which keeps the signal's property regardless of the harmonic content. This study has shown high success in identifying a device activation out of 13 input devices stored in the database when compared to harmonic components features. Moreover, an example if using varying excitation angles of a dimmer is proposed to demonstrate the ability of the system to identify the closest operation mode of a device. The results show that the trained classifier identified each untrained excitation angle as the closest to it.
Other methods in the literature (e.g., [11,12]) did not test the classification of an untrained operation. In contrast, the authors of [15] classified a specific device (washing machine) operation, but without including other devices in the data. Future research will include CPC features as electric load signature for classification in larger device databases, power quality tasks and later in non-intrusive load monitoring problems [36,37].