Identification of the State of Electrical Appliances with the Use of a Pulse Signal Generator

The paper presents the novel HF-GEN method for determining the characteristics of Electrical Appliances (EAs) operating in the end-user environment. The method includes a measurement system that uses a pulse signal generator to improve the quality of EA identification. Its structure and the principles of operation are presented. A method for determining the characteristics of the current signals' transients using cross-correlation is described. Its result is the appliance signature, a set of features characterizing the appliance's state of operation. The quality of the obtained signature is evaluated in a standard classification task, in which three independent algorithms identify the particular appliance's state from the features. Experimental results for 15 EA categories show the usefulness of the proposed approach.


Introduction
Non-Intrusive Appliance Load Monitoring (NIALM or NILM) [1] addresses the problem of collecting electrical energy consumption data more accurately than typical electricity meters alone allow. The methodology (also known as energy disaggregation [2]) is used in the analysis of power systems, in which demand for energy continuously increases. The purpose of appliance load identification is to provide information about the energy consumption of individual devices. This may lead to a decrease in electricity consumption and a reduction in environmental pollution [3]. According to [4], the application of NIALM approaches might reduce household energy consumption by at least 12%. Another potential application is the diagnostics of electrical appliances [5], such as monitoring device degradation or detecting the state of the supply network in the presence of external disturbances (voltage spikes, insulation degradation, etc.). In the NILM architecture, measurements are taken close to the energy meter, in contrast to intrusive systems, where every socket or device is equipped with a suitable sensor [6]. When new appliances are plugged into a NILM system, the measurement hardware is not expanded. Acquired values are typically aggregated currents and voltages [7]. Characteristic features allowing for the identification of a particular Electrical Appliance (EA) are obtained individually during training in the specific deployment location.
The taxonomy of NILM methods considers multiple criteria. Firstly, they can be classified based on the frequency of the measured signals [7,8]. In [16] four types of frequency-based methods were identified: LF (Low Frequency), MF (Medium Frequency),

HF-GEN Method for Determining the Characteristics of EA
Known methods exploiting the analysis of the transient currents and voltages during the appliance's state change are still a minority. Most EA identification algorithms rely on the characteristic features determined in steady states of operation. The transient signal is a result of changing the state of the device (for instance, by turning it on). Electrical signals recorded at the moment of the transient state change must be analyzed. The key to detecting the EA state change is to find proper features of the impulse signal. They should clearly distinguish impulse signals appearing as a result of changes in the states of various EAs. Two problems emerge that significantly limit the applicability of such approaches. First, EAs are switched on with a random voltage phase, so the transient states of the examined EAs also have the random voltage phase. Secondly, EAs in the background influence parameters of the transient signals. Both factors affect the shape of the analyzed impulse and make assigning the specific transient to a device difficult.
In the HF-GEN method, the generated current pulse signal is introduced into the tested power network circuit. The analyzed impulse is therefore the effect of a deliberately created transition state, not related to any EA. The pulse is generated many times at regular intervals. When the EA state changes, the pulse shape also changes, because it is characteristic of the particular EA. Detection of the EA state change consists of observing corresponding changes in the pulse features, which form the EA signature. The latter should unequivocally identify the specific EA. The principle of the HF-GEN method is illustrated in Figure 1. The pulse shape changes between the appliance's "on" and "off" states. The following were the experiments' assumptions:
• the analysis covers impulse signals generated by the same source,
• the source of transients is a pulse signal generator connected to the tested power network,
• the pulse signal appears in a specific phase of the supply voltage,
• a pulse signal is produced at regular intervals,
• the maximum frequency of measured signals is 15 MHz.
The purpose is to find changes in the pulse signal caused by the load change in the supply circuit. The load on the power circuit depends on the set of EAs connected to it. Characteristics of the impulse signal are related to the specific EA, which enables identification of the moment when the particular device is turned on. The block diagram of the HF-GEN method is shown in Figure 2. The first step is the generation of the impulse signal. The generator detects the supply voltage phase and then injects the pulse signal into the LV (Low-Voltage) circuit. In the second step, the pulse current is measured with a sampling frequency of 30 MS/s. The acquired samples are processed to select the subset acquired during 4 ms after the pulse detection. Next, cross-correlations between the sample vector and the transient patterns stored in the dictionary are calculated. A signature is then determined from the maxima of these cross-correlations.
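As a quick consistency check of the figures quoted above, the length of the analyzed window follows from the sampling rate and the window duration. The symbol $T_{SEL}$ for the 4 ms window is introduced here only for illustration, and reading the 15 MHz limit as the Nyquist frequency of the 30 MS/s acquisition is an assumption, not a statement from the text.

```latex
N_{SEL} = f_S \cdot T_{SEL}
        = 30 \cdot 10^{6}\ \mathrm{S/s} \cdot 4 \cdot 10^{-3}\ \mathrm{s}
        = 120\,000 \ \text{samples},
\qquad
f_{max} = \frac{f_S}{2} = 15\ \mathrm{MHz}
```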

Pulse Signal Generation
The block diagram of the pulse signal generator with two connectors/ports is shown in Figure 3. The first one, the input and output port (I/O), is connected to the tested circuit of the power network with a voltage of 230 V and a frequency of 50 Hz. This port is marked as input (I) because it is used to supply the generator with voltage. It is also treated as the output port (O), because it provides the impulse current signal to the power network. The O-SYN port makes the synchronization signal available outside the generator. This signal determines the time instants of the pulse signal generation. The synchronization output is used to control the acquisition system. When designing the generator, the following parameters were assumed:
• the maximum value of the current pulse signal is 10 A,
• the rise time of the pulse is 60 μs,
• total pulse duration is less than 1 ms,
• interval between successive impulse triggers is less than 1 s.

The pulse signal generator consists of a matching circuit (MC-GEN), an Analog-to-Digital converter (AD-GEN), a Digital Output (DO), a Relay (RE), and an Attached Load (AL). The measurement and generation system is connected to a computer (PC-GEN) on which the Control Software (CS) is running.
The pulse amplitude depends primarily on the voltage phase at which the Attached Load (AL) is connected to the power grid. The pulse amplitude is proportional to the voltage value at the moment of turning the AL on. Setting a constant voltage phase (the same each time) is the biggest challenge: the switching instant must be synchronized with the phase of the supply voltage. The process is as follows. The mains voltage $u(t)$ is applied to the MC-GEN, which converts it into the voltage $u_{AD-GEN}(t)$, whose amplitude matches the dynamic range of the AD-GEN input. The AD-GEN converts $u_{AD-GEN}(t)$ to samples $u_n$ at a rate of 250 kS/s. Based on the voltage samples $u_n$, the CS detects the supply voltage phase. As a result, the logic signal $o_n$ is passed to the input of the DO; it takes a high value when the impulse signal is to be generated. The DO converts the logic signal $o_n$ to the voltage $u_{SYN}(t)$, which is available at the O-SYN synchronization output and triggers the relay at the right moment with a high voltage level. The main function of the RE is to apply the supply voltage to the AL when a high voltage level appears in $u_{SYN}(t)$.
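The phase synchronization described above can be sketched as follows. This is a minimal illustration, not the authors' Control Software: it assumes that the phase reference is the rising zero crossing of the sampled voltage $u_n$ and that the pulse is requested at a fixed phase (90° here, an arbitrary choice). The function name `find_trigger_indices`, the test signal, and the 325 V amplitude are hypothetical.

```python
import math

def find_trigger_indices(u, fs, f_mains=50.0, phase_deg=90.0):
    """Return sample indices at which the supply voltage reaches phase_deg.

    Assumption: phase 0 deg is the rising zero crossing of the samples u.
    """
    # Number of samples between the zero crossing and the requested phase.
    delay = round(phase_deg / 360.0 * fs / f_mains)
    triggers = []
    for n in range(1, len(u)):
        if u[n - 1] < 0.0 <= u[n]:      # rising zero crossing detected
            k = n + delay
            if k < len(u):
                triggers.append(k)
    return triggers

# Synthetic 50 Hz mains voltage sampled at 250 kS/s (three periods).
fs = 250_000
u = [325.0 * math.sin(2 * math.pi * 50.0 * n / fs - 0.3) for n in range(15_000)]

triggers = find_trigger_indices(u, fs)

# Logic signal o_n: high only at the instants where a pulse should be generated.
o = [0] * len(u)
for k in triggers:
    o[k] = 1
```

Because every trigger falls at the same phase, the voltage at the switching instant, and hence the pulse amplitude, is nearly identical from pulse to pulse.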
The pulse shape parameters are determined by the AL. The rise time of the pulse and its total duration depend on the AL impedance. Contrary to the tested EA, the AL is a known load with a specific transmittance, temporarily connected to the supply network to change the parameters of the network. In practice, any appliance approved for use in the LV grid, for example, an energy-saving light bulb, can be used as the AL. In such a situation, the pulse generator is no different from other appliances in the network. It is a typical load, connected to the network at specified intervals (e.g., 1 s) for a specific time (e.g., 40 ms).

Measurement Method
In the HF-GEN method, the measured signal is the impulse in the current introduced to the tested circuit by the signal generator. The parameters of the signal change with the load of the tested network after introducing the specific EA. This fact is used to detect the change in the EA state.
The measurement system from Figure 4 consists of a transient generator (GEN), an electrical appliance energy receiver (EA), a Current-Voltage Converter (CVC), an Acquisition Card (AC), a computer (PC), software (SW), and memory (MM). The tested EA and GEN are powered from the network with an RMS voltage of 230 V and frequency of 50 Hz.
The supply network voltage $u(t)$ is provided to GEN through the I/O terminals connected to the phase conductor L1 and the neutral conductor N. The synchronization voltage $u_{SYN}(t)$ is supplied from the synchronization output O-SYN of GEN to the synchronization input of the Analog-to-Digital Converter. High levels of $u_{SYN}(t)$ determine the time instants of pulse generation. The current $i(t)$ is converted by the CVC into the voltage $u_{AD}(t)$ with a level adjusted to the dynamic range of the analog input of the acquisition card (AC) converter, which provides the samples $i_n$.
The voltage $u_{SYN}(t)$ also triggers the acquisition of current samples when a pulse is generated. The SW running on the PC controls the AC operation and collects the current samples $i_n$, storing them in MM for further analysis. Because the acquisition of the AC converter is triggered, the amount of data for processing is significantly reduced.

Selection of Current Samples
The result of data acquisition is the current vector $\mathbf{i} = [i_1 \ldots i_N]$ (see Figure 5). It contains current samples recorded around (before and after) the pulse manifestation.
To keep further calculations efficient, only a selected fragment of the current vector is analyzed. This is because some fragments of the obtained current data do not contain useful information. Specifically, the current vector contains data measured prior to generating the current pulse (e.g., current vector samples from 1 to 125,000 in Figure 5). The data in this fragment of the current vector bear no information characteristic of the tested EA.
The most relevant fragment of the current vector is the one near the largest pulse peak. Therefore, only part of the original vector (i.e., $\mathbf{i}_{SEL}$) is extracted for analysis. The vector $\mathbf{i}$ is filtered by a high-pass filter with a cut-off frequency of 1 kHz, which enables effective suppression of the 50 Hz component and its harmonics (100 Hz, 150 Hz, and so on). Then, the maximum of the high-frequency components (i.e., above 10 kHz) is found. The vector $\mathbf{i}_{SEL}$ contains 2700 selected samples around the maximum of the high-frequency components. Figure 6 shows an example of the $\mathbf{i}_{SEL}$ vector.
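The selection step can be illustrated with the following sketch. It is a simplification under stated assumptions: the filter type is not given in the text, so a first-order RC high-pass filter is used; a single filter stands in for both the 1 kHz suppression and the search for the high-frequency maximum; and the sampling rate, window sizes, and synthetic signal are chosen only to keep the example small. The names `high_pass` and `select_window` are hypothetical.

```python
import math

def high_pass(x, fs, fc):
    """First-order RC high-pass filter (assumed filter type)."""
    rc = 1.0 / (2.0 * math.pi * fc)
    alpha = rc / (rc + 1.0 / fs)
    y = [0.0] * len(x)
    for n in range(1, len(x)):
        y[n] = alpha * (y[n - 1] + x[n] - x[n - 1])
    return y

def select_window(i, fs, fc=1_000.0, before=100, after=200):
    """Return the index of the largest high-frequency peak and the
    fragment of the original current vector around it."""
    hp = high_pass(i, fs, fc)
    peak = max(range(len(hp)), key=lambda n: abs(hp[n]))
    start = max(0, peak - before)
    stop = min(len(i), peak + after)
    return peak, i[start:stop]

# Synthetic current: 50 Hz component plus a short decaying pulse at n = 3000.
fs = 1_000_000  # illustrative rate, lower than the 30 MS/s used in the paper
i = [5.0 * math.sin(2 * math.pi * 50.0 * n / fs) for n in range(5000)]
for n in range(3000, 5000):
    i[n] += 2.0 * math.exp(-(n - 3000) / 20.0)

peak, i_sel = select_window(i, fs)
```

The high-pass filter attenuates the 50 Hz component strongly enough that the pulse edge dominates the filtered signal, so the window is centered on the pulse rather than on the mains peak.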

Figure 6. Current vector $\mathbf{i}_{SEL}$ for sample measurement data.

Preparation of a Dictionary of Transients
The dictionary of transients $D$ is a set of selected fragments of the current vectors containing pulses for various appliances:
$$D = \left\{ \mathbf{i}_{DIC}^{(1)}, \ldots, \mathbf{i}_{DIC}^{(L_{DIC})} \right\} \tag{1}$$
where $\mathbf{i}_{DIC}$ are the most interesting fragments of vectors $\mathbf{i}$ describing the pulse and $L_{DIC}$ is the number of examples. Figure 7 shows the method of preparing the dictionary. Samples from vector $\mathbf{i}$ are selected as in Section 2.3. Then, the initial and terminal indexes of the transition are marked.
The marking process is performed by specifying the initial $n_{START}$ and terminal $n_{STOP}$ indexes. The fragment $\mathbf{i}_{DIC}$ is then extracted as follows:
$$\mathbf{i}_{DIC} = \left[ i_{SEL,n_{START}} \ldots i_{SEL,n_{STOP}} \right] \tag{2}$$
where $N_{DIC} = n_{STOP} - n_{START} + 1$ denotes the number of samples in $\mathbf{i}_{DIC}$. Figure 8 shows an example of $\mathbf{i}_{SEL}$ with the marked indices $n_{START}$ and $n_{STOP}$ (a) and the extracted $\mathbf{i}_{DIC}$ (b).
For each considered EA, 10 examples of transition states were added to the dictionary. They differ in amplitude and shape. The selected number is a compromise between the variety of stored data and the computational effort required to obtain examples. An example is a current vector and the corresponding category from the set $D_{CAT}$ (whose cardinality determines the number of identified appliances $N_{EA}$). Therefore, the number of vectors $\mathbf{i}_{DIC}$ in the dictionary is $L_{DIC} = 10 \cdot N_{EA}$.
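One possible in-memory layout of the dictionary is sketched below. The `DictEntry` structure, the helper names, and the category strings are hypothetical; only the inclusive index convention $N_{DIC} = n_{STOP} - n_{START} + 1$ and the count of 10 examples per appliance are taken from the text.

```python
from dataclasses import dataclass

@dataclass
class DictEntry:
    category: str   # element of D_CAT (appliance category)
    example: str    # example label, 'A'..'J'
    samples: list   # the fragment i_DIC

def extract_fragment(i_sel, n_start, n_stop):
    """Return i_DIC = i_SEL[n_START..n_STOP] for 1-based, inclusive indices."""
    if not 1 <= n_start <= n_stop <= len(i_sel):
        raise ValueError("n_start and n_stop must satisfy 1 <= n_start <= n_stop <= N")
    return i_sel[n_start - 1:n_stop]

def build_dictionary(marked):
    """marked: iterable of (category, example, i_sel, n_start, n_stop)."""
    return [DictEntry(cat, ex, extract_fragment(i_sel, a, b))
            for cat, ex, i_sel, a, b in marked]

# Toy data: two appliance categories, ten marked examples each.
i_sel = list(range(1, 21))
marked = [(cat, ex, i_sel, 5, 9)
          for cat in ("kettle", "laptop")
          for ex in "ABCDEFGHIJ"]
dictionary = build_dictionary(marked)
fragment = extract_fragment(i_sel, 5, 9)
```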

Determining the Cross-Correlation
In this stage, the maximum correlation between the measured signal $\mathbf{i}_{SEL}$ and subsequent dictionary entries $\mathbf{i}_{DIC}$ is found. The vector $\mathbf{i}_{SEL}$ is longer than the current vector from the dictionary $\mathbf{i}_{DIC}$, so the correlation is calculated for all possible shifts between $\mathbf{i}_{DIC}$ and $\mathbf{i}_{SEL}$.
The vector $\mathbf{i}_{SEL}$ has $N_{SEL} = 120{,}000$ samples (representing the duration of 4 ms for the sampling frequency $f_S = 30$ MHz). The correlation is determined many times for each transition state. Therefore, the method of determining the cross-correlation should be computationally efficient. Cross-correlation without normalization was considered first because of the simplicity and efficiency of its calculation. However, it cannot be used in the discussed problem, because the elements of the current vectors mainly contain the fundamental component of the current signal with a frequency of 50 Hz, whereas the pattern vectors only contain components with frequencies at least 200 times greater than the fundamental component. The 50 Hz component significantly changes the average value of the current vector and, as a result, significantly affects the value of the cross-correlation without normalization. Therefore, a measure of similarity between sample vectors based on the Pearson correlation coefficient was used. It requires the mean and standard deviation of each fragment of the vector $\mathbf{i}_{SEL}$, which demands significant computational effort, so the optimized calculation method [36] was used.
As a result, vectors of correlations $\mathbf{r}$ and shifts $\mathbf{c}$ were obtained. Figures 9-11 illustrate the procedure. Figure 12a shows the cross-correlation vector $\mathbf{r}$ as a function of delay $\mathbf{c}$ for the example of measurement data. Figure 12b shows the same relationship for the fragment of the vector $\mathbf{r}$ with the highest correlation values.
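The normalized cross-correlation over all shifts can be sketched as below. This is not the optimized method of [36]; it is one common way of reducing the cost, assuming that prefix sums are used to obtain the mean and variance of every window of $\mathbf{i}_{SEL}$ in constant time. The function name `sliding_pearson`, the tolerance `1e-12`, and the toy signal are hypothetical.

```python
import math

def sliding_pearson(i_sel, i_dic):
    """Pearson correlation between i_dic and every window of i_sel.

    Returns (r, c): correlation values and the corresponding shifts.
    """
    n_dic = len(i_dic)
    mean_d = sum(i_dic) / n_dic
    d = [v - mean_d for v in i_dic]              # zero-mean pattern
    norm_d = math.sqrt(sum(v * v for v in d))

    # Prefix sums of x and x^2 give each window's sum and energy in O(1).
    s1, s2 = [0.0], [0.0]
    for v in i_sel:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    r, c = [], []
    for shift in range(len(i_sel) - n_dic + 1):
        sum_x = s1[shift + n_dic] - s1[shift]
        sum_x2 = s2[shift + n_dic] - s2[shift]
        var_term = sum_x2 - sum_x * sum_x / n_dic    # sum of (x - mean_x)^2
        # Since d has zero mean, sum(d * (x - mean_x)) equals sum(d * x).
        num = sum(d[k] * i_sel[shift + k] for k in range(n_dic))
        if var_term > 1e-12 and norm_d > 0.0:
            r.append(num / (norm_d * math.sqrt(var_term)))
        else:
            r.append(0.0)                            # flat window: undefined, use 0
        c.append(shift)
    return r, c

# Toy check: a scaled pattern on top of a slowly varying background.
pattern = [0.0, 1.0, 3.0, 1.0, 0.0, -1.0, -2.0, -1.0]
i_sel = [10.0 + 0.01 * n for n in range(100)]
for k, p in enumerate(pattern):
    i_sel[40 + k] += 2.0 * p

r, c = sliding_pearson(i_sel, pattern)
best = max(range(len(r)), key=lambda k: abs(r[k]))
```

The large constant offset of the background does not affect the result, which is the reason the normalized measure is preferred over the plain cross-correlation here.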

Signature Calculation
The signature parameters are the maximum cross-correlations determined between the current vector $\mathbf{i}_{SEL}$ and all current vectors $\mathbf{i}_{DIC}^{(l_D)}$ from the dictionary of transients. Correlation vectors for successive current vectors $\mathbf{i}_{DIC}^{(l_D)}$ are denoted as $\mathbf{r}_{l_D}$. The set of categories $D_{CAT}$ from the dictionary of transients is used to name successive signature features. The idea is presented in Figure 13.
The EA signature contains maximum values of the cross-correlation between the analyzed current vector and the individual dictionary elements. Signature features are determined as the maximum absolute value of the cross-correlation $\mathbf{r}_{l_D}$ between the analyzed current vector $\mathbf{i}_{SEL}$ and the stored current vector $\mathbf{i}_{DIC}^{(l_D)}$, i.e., $\max \left| \mathbf{r}_{l_D} \right|$, with features labeled by $x \in \{1, \ldots, N_{EA}\}$ and $y \in \{A, B, C, D, E, F, G, H, I, J\}$.
The computed cross-correlation with the marked maximum value for sample measurement data is presented in Figure 14.
A signature $\mathbf{s}_l$ consists of $P_{HF-COR} = 10 \cdot N_{EA}$ features, arranged in a specific order. Names of features and their acronyms are listed in Table 1. The signatures $\mathbf{s}_l$, $l = 1, \ldots, L_{SP}$, form the signature set $S$, where $L_{SP}$ is the total number of transients processed.
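A signature can then be assembled as in the sketch below, which uses a direct (unoptimized) Pearson correlation for brevity. The feature names of the form `EA1-A` are hypothetical placeholders for the acronyms of Table 1, and the toy patterns are invented for illustration.

```python
import math

def max_abs_correlation(i_sel, i_dic):
    """Maximum |Pearson r| over all shifts of i_dic within i_sel."""
    n = len(i_dic)
    mean_d = sum(i_dic) / n
    d = [v - mean_d for v in i_dic]
    norm_d = math.sqrt(sum(v * v for v in d))
    best = 0.0
    for shift in range(len(i_sel) - n + 1):
        w = i_sel[shift:shift + n]
        mean_w = sum(w) / n
        x = [v - mean_w for v in w]
        norm_x = math.sqrt(sum(v * v for v in x))
        if norm_x > 0.0 and norm_d > 0.0:
            r = sum(a * b for a, b in zip(x, d)) / (norm_x * norm_d)
            best = max(best, abs(r))
    return best

def signature(i_sel, dictionary):
    """dictionary: list of (feature_name, i_dic) pairs in a fixed order."""
    return {name: max_abs_correlation(i_sel, i_dic) for name, i_dic in dictionary}

# Toy dictionary with one example for each of two hypothetical appliances.
p1 = [0.0, 1.0, 3.0, 1.0, 0.0, -1.0, -2.0, -1.0]
p2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
dictionary = [("EA1-A", p1), ("EA2-A", p2)]

# Measured window: inverted, scaled copy of p1 on a slow ramp.
i_sel = [5.0 + 0.01 * n for n in range(60)]
for k, p in enumerate(p1):
    i_sel[20 + k] -= 2.0 * p

s = signature(i_sel, dictionary)
```

Taking the absolute value means that an inverted pulse still yields a feature close to 1, which matches the definition of the features as maxima of the absolute cross-correlation.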

Signature Quality Assessment Method
A signature describes devices well if its features allow them to be distinguished. Feature vectors for the same appliance should be similar to each other. The purpose of the signature quality assessment is to verify whether the signatures can be used to identify appliances.
The process is presented in Figure 15. The division of available data into training and testing sets is important. K-fold Cross-Validation (CV) with $K = 10$ was used here. The data set is split $K$ times into training and testing subsets (with the ratio of 9:1) in such a way that each EA is represented by the single signature in the testing set. The training sets were used to extract knowledge for the intelligent classifier, while the testing ones were applied to verify its generalization abilities. The classification accuracy was averaged over all trials. For each round of the CV, each classifier is trained and tested separately (see Figure 16). This way, all approaches can be compared. Their fusion may also be applied if necessary. Each algorithm has specific advantages and hyperparameters. For instance, the Decision Tree (DT) selects, during training, the features on which its rules are constructed. This is a problem for the k-Nearest Neighbors (kNN) classifier, where the subset of signature values must be manually selected or weighted. Also, the number of neighbors influences diagnostic accuracy. One CV round produces four vectors:
• actual appliance identifiers in the testing set, $\mathbf{y}$
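A 9:1 split that keeps every appliance represented in each testing subset can be sketched as below. The round-robin assignment without shuffling is an assumption made for brevity, and the function names are hypothetical; the counts (15 appliances, 150 signatures each) follow the experimental setup described in the paper.

```python
def stratified_folds(labels, k=10):
    """Assign example indices to k folds, spreading each category evenly."""
    folds = [[] for _ in range(k)]
    seen = {}                       # how many examples of each label so far
    for idx, label in enumerate(labels):
        j = seen.get(label, 0)
        folds[j % k].append(idx)    # round-robin within each category
        seen[label] = j + 1
    return folds

def cv_rounds(labels, k=10):
    """Yield (train_indices, test_indices) for each of the k rounds."""
    folds = stratified_folds(labels, k)
    for j in range(k):
        test = folds[j]
        train = [i for f, fold in enumerate(folds) if f != j for i in fold]
        yield train, test

# 15 appliances with 150 signatures each.
labels = [ea for ea in range(15) for _ in range(150)]
rounds = list(cv_rounds(labels, k=10))
```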

Decision Tree
The Decision Tree (DT) is a tool storing knowledge in the form of a tree (Figure 17). Nodes indicated by circles represent tests on the selected feature (in our case, one of the signature parameters) and its threshold value (like $x_1 > 15$). The result of the test redirects the analyzed vector of features to the node one level below until the terminal node (leaf) is reached. The leaves (rectangles) represent appliance categories. Classification of the example is then based on exploring the tree from the root (yellow node) to one of the leaves. Tests performed at each node indicate which way to take next. Generation of the DT is done using one of the machine learning algorithms like C4.5 or CART, which differ in the method of selecting tests for nodes.
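The traversal described above can be illustrated with a hand-written tree. The nested-dictionary representation, the feature names, the thresholds, and the category labels are all hypothetical; a trained tree produced by C4.5 or CART would be explored in the same way.

```python
def classify(node, features):
    """Walk from the root to a leaf, applying one threshold test per node."""
    while "label" not in node:
        passed = features[node["feature"]] > node["threshold"]
        node = node["yes"] if passed else node["no"]
    return node["label"]

# Root tests x1 > 15; its "no" branch tests x2 > 0.5.
tree = {
    "feature": "x1", "threshold": 15,
    "yes": {"label": "kettle"},
    "no": {
        "feature": "x2", "threshold": 0.5,
        "yes": {"label": "laptop"},
        "no": {"label": "iron"},
    },
}

result = classify(tree, {"x1": 20, "x2": 0.0})
```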

Neural Network
The Artificial Neural Network (ANN) is widely used in classification. The feed-forward structures, like multilayered perceptrons or RBF networks, are the most popular. Their hyperparameters include the number of hidden layers and the number of neurons in them, $s_{HL}$. The output layer category coding is also important and depends on the activation functions (like sigmoidal ones or softmax). The optimal structure of the ANN is found so as to maximize the classification accuracy for the minimum number of neurons. Knowledge extraction is performed using gradient-based algorithms.

K-Nearest Neighbors
The kNN classifier is one of the simplest non-parametric classification methods. Using the distance measure, the $k$ examples from the dictionary closest to the classified feature vector are found. The analyzed example is assigned to the category supported by the majority of the $k$ voting vectors. The hyperparameters include the value of $k$, the voting strategy, and the distance measure. This is the only one of the applied classifiers that does not extract knowledge from data during the machine learning process. The problem here is determining the significance of available features, for example, by using the information capacity or correlation methods. In the presented research, the DT was used to preselect them for the calculation of the Euclidean measure between each pair of examples $l_1$ and $l_2$:
$$d(l_1, l_2) = \sqrt{\sum_{p=1}^{p_{DT}} \left( S_{l_1,p} - S_{l_2,p} \right)^2}$$
where $S$ denotes the signature array and $p_{DT}$ is the number of the signature features selected by the DT.
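A minimal kNN classifier using the Euclidean measure over a preselected feature subset might look as follows. Simple majority voting is assumed, the `selected` indices stand in for the features chosen by the DT, and the toy signatures and labels are invented for illustration.

```python
import math
from collections import Counter

def euclidean(a, b, selected):
    """Euclidean distance restricted to the selected feature indices."""
    return math.sqrt(sum((a[p] - b[p]) ** 2 for p in selected))

def knn_classify(x, train_S, train_y, selected, k=3):
    """Assign x to the category supported by most of its k nearest examples."""
    order = sorted(range(len(train_S)),
                   key=lambda l: euclidean(x, train_S[l], selected))
    votes = Counter(train_y[l] for l in order[:k])
    return votes.most_common(1)[0][0]

# Toy signature array: rows are examples, columns are signature features.
train_S = [
    [0.9, 0.1, 0.5],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.7],
    [0.2, 0.8, 0.3],
]
train_y = ["kettle", "kettle", "laptop", "laptop"]
selected = [0, 1]          # hypothetical features preselected by the DT

predicted = knn_classify([0.85, 0.15, 0.9], train_S, train_y, selected, k=3)
```

Restricting the distance to the selected features keeps an uninformative feature (the third column here) from dominating the neighbor search.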

Classification Accuracy
The standard method of evaluating a classifier in a multi-category identification problem is the confusion matrix. To determine the overall quality, the accuracy should be calculated as the ratio of correctly identified examples to all examples in the testing set. This can be done for each category $n_{EA}$ separately or on the whole set (of $N_{EA}$ categories).
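As an illustration, both accuracy variants can be computed directly from a confusion matrix. The sketch below assumes that rows hold the true category and columns the predicted one; this convention, the function names, and the 3-category matrix are assumptions made for the example only.

```python
def per_category_accuracy(conf):
    """Fraction of correctly identified examples for each true category (row)."""
    return [row[n] / sum(row) if sum(row) else float("nan")
            for n, row in enumerate(conf)]

def overall_accuracy(conf):
    """Fraction of correctly identified examples over the whole testing set."""
    correct = sum(conf[n][n] for n in range(len(conf)))
    total = sum(sum(row) for row in conf)
    return correct / total

# Hypothetical confusion matrix for three categories with 150 examples each.
conf = [[150, 0, 0],
        [3, 145, 2],
        [0, 10, 140]]
acc_each = per_category_accuracy(conf)
acc_all = overall_accuracy(conf)
```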

Experimental Results
The following section discusses details of experiments, including the laboratory test stand, collected data, and classification results.
The HF-GEN method was tested in laboratory conditions on a fixed set of 15 appliances. For each of them, 150 current pulses were recorded. From the vector $i$ in the transient state $l$, a signature vector $s_l$ was obtained. The signature set $S$ contains all $L_{SP}$ signature vectors.
The EAs used included a vacuum cleaner, a slow juicer, an "Osram" light bulb, a "Philips" light bulb, an "Omega" light bulb, a "Lexman" lamp with four bulbs, a laptop, an iron, a sharpener, a grinder, a kettle, a jigsaw, a coffee machine, an air conditioner, and a planer.

Laboratory Test Stand
The measurement system consists of the single analyzed electrical appliance (EA), a current-voltage converter (CVC) of the SCT-013-020 type, the Advantech PCIE-1744 data acquisition card (AC), the signal generator (GEN), and a computer (PC) with the LabVIEW-based virtual instrument (SW) installed. The EA was connected to the power network. The CVC was installed on the L1 cable supplying the EA through the resistor R = 47 Ω. The GEN input-output (I/O) connectors were connected to the L1 and N power cables. The AC was configured so that a high level of the sync voltage $u_{SYN}(t)$ applied to the synchronization input triggered the acquisition of the signal $u_{AD}(t)$ fed to an analog input. The signal $u_{AC}(t)$ was recorded for 10 ms from the occurrence of the high level of the synchronization voltage $u_{SYN}(t)$. The AC sampling rate was 30 MS/s. The data stream $i_n$ containing the samples was captured by the SW running on the PC and saved in the *.tdms file format.
The pulse signal generator consisted of AL, i.e., a lamp with an "Osram" LED bulb (type AB30526), a "Relpol" relay (RE, type RM699V-3011-85-1005-RE), a voltage transformer (MC-GEN), an Advantech PCIE-1816H acquisition card containing an analog-to-digital converter (AD-GEN) and a digital output (DO), and a computer (PC-GEN) running the virtual instrument (CS). AL was connected to the supply network via the RE relay, while MC-GEN was connected to the supply network via the L1 and N conductors. The CVC of the SCT-013-020 type converts voltage $u(t)$ to $u_{AD-GEN}(t)$. Its measuring range is about 120 A. Laboratory tests proved that the SCT-013-020 allows for accurate measurements of signals with frequencies up to 400 kHz, which is enough for the presented HF-GEN method. The voltage $u_{AD-GEN}(t)$ was fed to analog input no. 0 (AD-GEN) of the acquisition card.
AD-GEN samples the voltage $u_{AD-GEN}(t)$ at 250 kS/s. Based on these samples, CS detects the voltage phase and signals it by actuating the logic signal $o_n$. The AL is switched on when the voltage $u(t)$ reaches the value of 300 V. DO converts the logic signal $o_n$ to the voltage $u_{SYN}(t)$. The RE closes when the high state appears on $u_{SYN}(t)$.
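The triggering rule can be pictured as a rising-edge threshold detector on the sampled voltage. The sketch below is only an illustration: it works on the mains voltage directly (ignoring the transformer scaling), and it assumes a 50 Hz supply with a peak of about 325 V, neither of which is stated in the text. Only the 250 kS/s rate and the 300 V level come from the description above; all names are hypothetical.

```python
import math

FS = 250_000        # AD-GEN sampling rate from the text, in samples per second
U_TRIGGER = 300.0   # trigger level from the text, in volts
F_MAINS = 50.0      # assumption: 50 Hz supply network
U_PEAK = 325.0      # assumption: peak value of a 230 V RMS supply

def mains_sample(n):
    """Idealized mains voltage at sample index n (a pure sine wave)."""
    return U_PEAK * math.sin(2 * math.pi * F_MAINS * n / FS)

def first_trigger_index(samples, level=U_TRIGGER):
    """Index of the first sample at which the voltage rises to the trigger level."""
    for n in range(1, len(samples)):
        if samples[n - 1] < level <= samples[n]:
            return n
    return None

samples = [mains_sample(n) for n in range(FS // 50)]  # one assumed mains period
n_on = first_trigger_index(samples)                   # sample at which o_n would go high
```

Triggering on the rising edge at a fixed level is what would make every pulse start at the same voltage phase under these assumptions.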

Measurement Procedure
During experiments, the following measurement procedure was implemented:
1. Connecting to the power grid and switching on the EA under test;
2. Setting CS so that the GEN generates a pulse signal 150 times (in the case of its acquisition for quality evaluation) or at least 10 times (for the transients' dictionary);
3. Setting the SW to acquire all current pulses;
4. Starting the impulse generation and acquisition process;
5. Switching off the tested device.
These steps are performed for each tested EA. A separate series of measurements is carried out with no EA connected (only steps 2-4 are then taken).

Analysis of Measured Current Vectors
As a result of measurements for 16 categories (15 types of EA and no EA), 150 × 16 = 2400 vectors of current samples were collected. Details of the recorded vectors are in Table 2. Each current vector $i$ has 300,000 samples (representing a duration of 10 ms).
The average value of the current $i^{(l)}_{SEL}$ depends on the analyzed EA. For instance, examples $l \in \{151, 1201, 1651, 2251\}$, representing the vacuum cleaner, iron, kettle, and planer, are characterized by relatively high power. The pulses are generated at the voltage $u$ = 300 V, when the instantaneous current levels of the EAs are close to the maximum value. Therefore, a high average current value is observed here.
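The bookkeeping behind these numbers can be checked with a short sketch. It assumes that the examples are numbered contiguously by category, 150 per category and starting from category 0 (no EA), which is consistent with the example indices quoted in the text but is not stated explicitly; the function names are hypothetical.

```python
PULSES_PER_CATEGORY = 150      # pulses recorded per category (from the text)
N_CATEGORIES = 16              # 15 EAs plus the no-EA case (from the text)
FS_AC = 30_000_000             # AC sampling rate, samples per second (from the text)
RECORD_MS = 10                 # recording duration in milliseconds (from the text)

def samples_per_vector():
    """Number of samples in one recorded current vector."""
    return FS_AC * RECORD_MS // 1000

def category_of(l):
    """Category index of example l (1-based), assuming contiguous numbering."""
    return (l - 1) // PULSES_PER_CATEGORY

def first_example_of(category):
    """Index of the first example belonging to the given category."""
    return category * PULSES_PER_CATEGORY + 1

total_vectors = PULSES_PER_CATEGORY * N_CATEGORIES
```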
The direction of the pulse current is always the same. It results from forcing the voltage phase at the moment of generating the pulse.
All waveforms presented in Figures 18 and 19 are characterized by a rise in the average current value starting at approx. $n_{SEL}$ = 30,000. In turn, for $n_{SEL}$ = 34,000 ... 50,000, the current values drop until they reach the level from before the pulse appearance.
For examples $l \in \{1, 1201, 1501\}$, the multiple-contact phenomenon of the RE is visible in the form of many similar oscillations, which quickly converge to the average current value $i^{(l)}_{SEL}$. For category 0 (no EA), this oscillation is visible at $n_{SEL}$ = 29,400, while for category 8 (iron) it is at $n_{SEL}$ = 30,000, and for category 10 (grinder), four such oscillations are visible at $n_{SEL}$ ∈ {28,800, 29,500, 30,000, 30,500}.
The starting point for further analysis is the current vector obtained for category 0, when no EA is connected. The shape of the pulse for the example $l = 1$ (Figure 18a) is the impulse response of AL after switching on the supply voltage. All other current vectors are the impulse response of the system in which two electricity receivers are simultaneously connected to the power supply: AL and the tested EA. The change in the shape of the current waveform $i^{(l)}_{SEL}$ is proportional to the influence of the tested EA on the total impedance of these two loads connected in parallel to the supply network.
The example $l = 1951$, recorded for the coffee machine, has a characteristic shape, especially in the range $n_{SEL}$ = 30,000 ... 31,000, where rapid changes in the instantaneous current values are visible and the quasi-periodic oscillations characteristic of many other examples cannot be found.
A vacuum cleaner ($l = 151$), slow juicer ($l = 301$), jigsaw ($l = 1801$), and planer ($l = 2251$) reduce the frequency of current oscillations and increase the number of visible oscillations, which is unique for each EA. Specifically, for the example $l = 151$, five oscillations have a period of approximately 940 samples, corresponding to a frequency of 31.9 kHz. For the example $l = 301$ (slow juicer), four periods exist (923 samples each), which corresponds to a frequency of 32.5 kHz. For the example $l = 1801$ (jigsaw), five periods of 500 samples correspond to a frequency of 60 kHz. For the example $l = 2251$ (planer), there are six periods, each 610 samples long, which corresponds to a frequency of 49.2 kHz. All these categories have motors, which may shape the current pulse.
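The quoted frequencies follow from dividing the 30 MS/s sampling rate by the oscillation period expressed in samples. A minimal sketch of this conversion is given below; the function name is hypothetical.

```python
FS_AC = 30_000_000  # AC sampling rate from the text, samples per second

def oscillation_frequency_khz(period_samples, fs=FS_AC):
    """Convert an oscillation period given in samples to a frequency in kHz."""
    return fs / period_samples / 1000.0

# Periods (in samples) quoted in the text for the four motor-driven appliances.
periods = {"vacuum cleaner": 940, "slow juicer": 923, "jigsaw": 500, "planer": 610}
frequencies = {name: round(oscillation_frequency_khz(p), 1)
               for name, p in periods.items()}
```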

Dictionary of Transients
The measurement data for the transient dictionary do not overlap with the measurement data used to train and test the classification algorithms. The data set used in the transient dictionary was prepared independently of the data set described in Section 4.3. Therefore, the selection of current vectors for the dictionary does not disturb the obtained classification results.
To prepare the dictionary of transients, the procedure presented in Section 2.4 was used. For each of the sixteen categories, 10 examples of the transient current $i$ were collected, from which the current vectors $i_{SEL}$ were obtained. The resulting dictionary of transients is presented in Table 3. The most important fragments of the current vectors $i_{DIC}$ are shown in Figures 20-23.
In all waveforms for category 0 (no EA, Figure 20), a quasi-periodic oscillation is present, disappearing after about three periods. A characteristic of these vectors is a rising edge on which the oscillation is located. In the example $l_D = 7$, the slope is visible for $n_{DIC}$ = 1 ... 2000. The rising edge is a characteristic feature of the applied AL.
For the dictionary examples representing the vacuum cleaner (Figure 21), three types of waveforms can be distinguished. They differ mainly in the shape of the initial part of the vector $i_{DIC}$ ($n_{DIC}$ = 1 ... 1000). The first type is present in examples $l_D = 11$ and $l_D = 19$. The second type is visible in examples $l_D \in \{14, 15, 18\}$, while the third one is visible in examples $l_D \in \{12, 13, 16, 17, 20\}$. The oscillation frequency in the second part of the vector $i_{DIC}$ is lower than in the no-EA case (category 0 in Figure 20). The duration of the oscillation between samples $n_{DIC}$ = 1000 and $n_{DIC}$ = 4000 is the same for all vectors in this category, with a period of about 940 samples, which corresponds to a frequency of 31.9 kHz.
Waveforms in Figure 22 represent the lamp with the "Osram" bulb. Here, the multiple-contact phenomenon of the relay is visible, especially for the example $l_D = 40$, where four similar oscillations are present in the first part of the current vector.
Vectors for the planer (Figure 23) differ from those of other appliances and, at the same time, are similar to each other. Their distinguishing feature is the shape of the first part of the vector $i_{DIC}$ ($n_{DIC}$ = 1 ... 300). Here, examples $l_D \in \{151, 153, 154, 155\}$ have one maximum above the slope of the oscillation. It is present around the sample $n_{DIC}$ = 100.

Signature Parameters
The signature features were determined from the correlation vectors calculated for each pair of a transient ($l$ = 1 ... 2400) and a dictionary example. The COR_1_A feature (Figure 24) can be used to distinguish three categories from the rest: 1 (vacuum cleaner), 2 (slow juicer), and 13 (coffee machine). A majority of examples belonging to these categories have COR_1_A values in the range between 0.97 and 1.0. Almost all observations of the remaining categories assume values of this feature in the range 0.85-0.97, so it is not suitable for distinguishing between them.
The COR_9_F feature (Figure 25) is characterized by high values for almost all examples. Only three vectors from category 3 (lamp with the "Osram" bulb) have COR_9_F values below 0.85. Observations for all other categories assume values of this feature in the range 0.85-1. Even though the COR_9_F feature was determined for the dictionary examples belonging to the grinder category, the value distribution of this feature is similar for each category. Therefore, it is not useful in most cases.
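To give an intuition for why such features stay close to 1 for similar waveforms, the sketch below computes the peak of a globally normalized cross-correlation between two vectors. This is one common definition and is only an assumption here: the exact definition of the COR features is given in the method description and is not reproduced in this section. The function name and the synthetic damped oscillations (with periods of 940 and 500 samples, echoing the values quoted earlier) are hypothetical.

```python
import numpy as np

def peak_normalized_xcorr(i_sel, i_dic):
    """Return the peak of the normalized cross-correlation and the lag at which it occurs.

    Both vectors are mean-centered and the correlation is divided by the product
    of their Euclidean norms, so the result lies in [-1, 1].
    Constant (zero-variance) inputs are not handled in this sketch.
    """
    a = np.asarray(i_sel, dtype=float)
    b = np.asarray(i_dic, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    xc = np.correlate(a, b, mode="full") / (np.linalg.norm(a) * np.linalg.norm(b))
    k = int(np.argmax(xc))
    return float(xc[k]), k - (len(b) - 1)   # index len(b) - 1 corresponds to zero lag

# Synthetic damped oscillations standing in for measured current vectors.
n = np.arange(2000)
x = np.exp(-n / 600.0) * np.sin(2 * np.pi * n / 940.0)
y = np.exp(-n / 600.0) * np.sin(2 * np.pi * n / 500.0)

peak_same, lag_same = peak_normalized_xcorr(x, x)
peak_other, lag_other = peak_normalized_xcorr(x, y)
```

In this sketch, identical waveforms give a peak of 1 at zero lag, while a waveform with a different oscillation period gives a lower peak.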

Classification Results
This section presents the results obtained by the three classifiers for the available data partitioned using K-fold cross-validation (where K = 10).
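A 10-fold partition of the 2400 examples can be sketched as follows. The text does not say whether the folds were shuffled or stratified by category, so the shuffled, non-stratified split below is an assumption, and the function name and seed are hypothetical.

```python
import random

def k_fold_indices(n_examples, k=10, seed=0):
    """Yield (train, test) index lists for K-fold cross-validation.

    Assumption: examples are shuffled once and dealt into k folds without
    stratification; each fold serves as the testing set exactly once.
    """
    idx = list(range(n_examples))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, test

splits = list(k_fold_indices(2400, k=10))
```

With 2400 examples and K = 10, each round trains on 2160 examples and tests on the remaining 240.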

Neural Network
The application of the ANN required selecting the optimal number of neurons in the hidden layer. For that purpose, the network was trained many times, with the number of neurons in the hidden layer ranging from 1 to 24. The classification error as a function of the number of neurons in the hidden layer is shown in Figure 26.

The total classification error is minimal for 15 neurons in the hidden layer. Further increase of this parameter does not significantly affect the classification error.
The confusion matrix $C_{NN}$ and the classification accuracy for each category $n_{EA}$ of the 16 examined categories in the optimal ANN structure are presented in Table 4. The overall classification accuracy was as follows: