New Time-Frequency Transient Features for Nonintrusive Load Monitoring

A crucial step in nonintrusive load monitoring (NILM) is feature extraction, which consists of signal processing techniques to extract features from voltage and current signals. This paper presents a new time-frequency feature based on the Stockwell transform. The extracted features aim to describe the shape of the current transient signal by applying an energy measure to the fundamental and harmonic frequency voices. To validate the proposed methodology, classical machine learning tools (k-NN and decision tree classifiers) are applied to two existing datasets: the Controlled On/Off Loads Library (COOLL) and the Home Equipment Laboratory Dataset (HELD1). The classification rates achieved are higher than those reported in related studies in the literature, at 99.52% and 96.92% for the COOLL and HELD1 datasets, respectively.


Introduction
The use of electrical appliances in industrial, residential, and commercial buildings continues to increase. This requires a continuous increase in energy production, with the growth in electric energy needs over the last decade reaching 3.4% per year [1,2]. In addition, the diversity of the devices in use has increased, driven by the remarkable evolution of technology in the last few decades. This creates an urgent need to monitor the energy consumption of these loads more closely and to act when anomalies occur. Some studies estimate that 20% of the energy consumed could be saved if consumption were monitored in real time so that anomalies can be acted upon, most notably in households [3,4]. According to [5], the best way to optimize energy savings is to monitor the consumption per device [1]. This essentially requires load disaggregation techniques, which can be performed by intrusive or nonintrusive load monitoring methods (ILM or NILM) [6].
Nonintrusive load monitoring (NILM) consists of determining the individual energy consumption of appliances connected to the electrical grid [7]. This can be done by measuring current and voltage signals at a single measurement point and then applying signal processing and machine learning methods to disaggregate the appliances. The nonintrusive mode is therefore simpler to set up than the intrusive mode, in which measuring equipment must be installed next to each load and a large number of measuring points is required. One crucial step in NILM is feature extraction from voltage and current signals [8]. The objective is to obtain relevant features that can discriminate between the different types of loads (e.g., linear, nonlinear, and multi-state loads) and that at the same time have a physical meaning. The extracted NILM features can be divided into two categories: steady-state features and transient-state features.
The authors in [8] gave a summary of the NILM features proposed in the literature. Classical NILM features, such as the step changes in real power (P) and reactive power (Q) in the steady state [7,9], quickly showed their limits for nonlinear loads and for the detection of low-power loads. Harmonic-based features, such as macroscopic transients combined with harmonic descriptors [10][11][12], were extensively explored to remedy these limitations. These also faced certain limitations, in particular that they do not discriminate efficiently between nonlinear and multi-state loads [8].
On the other hand, time-frequency tools have proved useful for studying nonstationary signals, that is, signals such as current transients whose statistical characteristics change with time.
In the past, time-frequency tools [13][14][15][16][17] have been used in NILM. These studies extracted statistical descriptors of the time-frequency representation, either by describing the spectral envelope via the short-time Fourier transform (STFT) [18] or by extracting the energy of the wavelet coefficients at each discrete wavelet transform (DWT) level, but none of them focused on the shape of the transient at targeted frequencies, which can be directly related to the physical nature of the load. Other studies were interested in the shape of the raw current signals [19] or in the use of V-I trajectories to represent the shape of the current and voltage signals more robustly [20]. However, these approaches still suffer from limitations because their descriptors are based on the time domain, which can be very sensitive to noise.
Therefore, the aim of this study is to propose qualitative features from the time-frequency domain that describe the shape of the electrical transient current signals over a specified frequency range. In other words, we focus on targeted features based on the transient current signals, as opposed to traditional approaches that extract descriptors massively [21] or to new blind approaches based on deep learning [22][23][24]. Moreover, traditional classifiers are used to validate the proposed features, since the objective of this paper is not to bring originality to the classification tools but rather to evaluate the new proposed features.
Concerning the available data, different datasets have been released in the last few years, most of them containing signals sampled at low frequencies (1 Hz or less). A low sampling frequency limits the study of descriptors based on harmonics. However, other datasets with higher sampling frequencies exist in the literature [25][26][27][28][29][30] and allow the transient state to be studied. In this paper, in order to validate the proposed approach, we test the proposed features on two datasets: the Controlled On/Off Loads Library (COOLL) [26] and the Home Equipment Laboratory Dataset (HELD1) [29]. A description of these two datasets is given later.
This paper is organized as follows: Section 2 presents the proposed time-frequency features, and Section 3 describes the classification tools used in this study to evaluate the proposed features. Section 4 is devoted to the application of the proposed features to two existing datasets: the COOLL [26] and HELD1 [29] datasets. The results obtained are compared with related results in the literature, [31] for the COOLL dataset and [29] for the HELD1 dataset. Finally, Section 5 presents the conclusion and some perspectives for future work.

The Proposed Time-Frequency Features
In this section, we give a brief overview of the Stockwell transform before presenting the proposed features, which are based on the Shannon energy applied to the time-frequency representation.

The Stockwell Transform (ST)
The proposed time-frequency features are based on a time-frequency analysis obtained with the S-transform [32], which can be viewed as a hybrid between the continuous wavelet transform (CWT) and the short-time Fourier transform (STFT) [33]. It improves frequency resolution at low frequencies and time resolution at high frequencies in the time-frequency representation. For a given signal $x(t) \in L^2(\mathbb{R})$, the S-transform can be defined as follows:

$$S(\tau, f) = \int_{-\infty}^{+\infty} x(t)\, w(\tau - t, f)\, e^{-i 2\pi f t}\, dt \tag{1}$$

where $\tau \in \mathbb{R}$ is a time translation and $w$ is a Gaussian window function of time and frequency. It is chosen as follows:

$$w(t, f) = \frac{1}{\sigma(f)\sqrt{2\pi}}\, e^{-\frac{t^2}{2\sigma(f)^2}} \tag{2}$$

The S-transform is a multi-resolution transform where the dilation $\sigma$ is a function of frequency and is defined as follows:

$$\sigma(f) = \frac{1}{|f|} \tag{3}$$

Because the window width varies inversely with frequency, the window is narrower at high frequencies, which favors temporal resolution, and wider at low frequencies, which favors frequency resolution.
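As a minimal sketch of this definition, the following snippet evaluates the S-transform at a single time-frequency point by a direct Riemann sum, using a Gaussian window whose standard deviation is the inverse of the frequency (the standard S-transform choice, assumed here). The function names `gaussian_window` and `s_transform_point`, the 5 kHz sampling rate, and the synthetic 50 Hz cosine are illustrative assumptions, not part of the original study.

```python
import cmath
import math

def gaussian_window(t, f):
    """Unit-area Gaussian window with standard deviation sigma(f) = 1/|f| (assumed form)."""
    sigma = 1.0 / abs(f)
    return math.exp(-t * t / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def s_transform_point(x, fs, tau, f):
    """Riemann-sum approximation of S(tau, f) for samples x taken at rate fs (Hz)."""
    dt = 1.0 / fs
    total = 0j
    for k, xk in enumerate(x):
        t = k * dt
        total += xk * gaussian_window(tau - t, f) * cmath.exp(-2j * math.pi * f * t)
    return total * dt

# Synthetic test signal (hypothetical): 0.4 s of a unit-amplitude 50 Hz cosine.
fs = 5000.0
x = [math.cos(2.0 * math.pi * 50.0 * k / fs) for k in range(2000)]

for f in (50.0, 150.0):
    magnitude = abs(s_transform_point(x, fs, 0.2, f))
    print(f"f = {f:5.1f} Hz, window std = {1.0 / f:.4f} s, |S| = {magnitude:.4f}")
```

For a pure cosine, the magnitude at the signal frequency is close to half the amplitude and is negligible at the third harmonic, while the printed window widths show the window narrowing as the frequency increases.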

The Proposed Time-Frequency Features
We propose a set of features, denoted $\beta_{f_n}$, that characterize the transient signal.
The values $\beta_{f_n}$ depend on the shape of the transient signal, which is directly related to the physical nature of the load. Therefore, $\beta_{f_n}$ is calculated on the frequency voices denoted by $f_n$ (the fundamental and harmonic frequencies), which can be given as $f_n = \{50, 150, 250, 350, \ldots, n \times 50\}$ Hz. The proposed method is summarized in Figure 1. The voice can be seen as the local instantaneous frequency of a signal [32]. By rewriting Equation (1) as a convolution product between the signal $x(t)$ and the Gaussian window $w(t)$ for a particular frequency $f_n$, the voice calculated on a signal $x(t)$ can be written as follows:

$$S(\tau, f_n) = \int_{-\infty}^{+\infty} X(\alpha + f_n)\, e^{-\frac{2\pi^2 \alpha^2}{f_n^2}}\, e^{i 2\pi \alpha \tau}\, d\alpha \tag{4}$$

where $\alpha$ is related to the frequency translation of the spectrum of the signal $x(t)$ and $X(f)$ is the Fourier transform of $x(t)$. This rewriting allows us to take advantage of the low computational complexity of the Fast Fourier Transform (FFT) algorithm to generate the time-frequency coefficients. For each localized time and frequency region in the time-frequency plane, the corresponding complex coefficient is given by Equation (5). Before computing the features $\beta_{f_n}$, the voice related to $f_n$ is normalized according to Equation (6). The $\beta_{f_n}$ feature at a specified frequency $f_n$ is then computed by applying the Shannon energy to the modulus of the corresponding voice (Equation (7)). The logarithmic term in Equation (7) aims to reduce the impact of very high and very low variations, which usually correspond to noise, and allows us to reduce the intraclass variations. Figure 2 shows the variation in the applied energy measure (Shannon energy) on a linear normalized signal (amplitude varying linearly between 0 and 1). The amplitude of the signal is emphasized in the transformed signal between amplitudes 0.3 and 0.8 and is attenuated elsewhere.
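To make the normalization and energy steps concrete, one plausible form of Equations (5) to (7) is sketched here, writing $S(\tau, f_n)$ for the voice at frequency $f_n$. The polar decomposition, the normalization by the maximum modulus, the natural logarithm, and the averaging over the $N$ samples of the analysis window $[\tau_1, \tau_2]$ are assumptions made for illustration; they follow the usual definition of the Shannon energy and may differ from the authors' exact expressions.

```latex
\begin{align}
% (5) assumed polar form of the complex coefficient
S(\tau, f_n) &= A(\tau, f_n)\, e^{i \phi(\tau, f_n)}, \qquad A(\tau, f_n) = |S(\tau, f_n)| \\
% (6) assumed normalization by the maximum modulus over the window
\tilde{A}(\tau, f_n) &= \frac{A(\tau, f_n)}{\max_{\tau' \in [\tau_1, \tau_2]} A(\tau', f_n)} \\
% (7) assumed Shannon energy, averaged over the N samples of the window
\beta_{f_n} &= -\frac{1}{N} \sum_{\tau = \tau_1}^{\tau_2} \tilde{A}(\tau, f_n)^2 \, \ln \tilde{A}(\tau, f_n)^2 \\
% Peak of the pointwise measure g(a) = -a^2 \ln a^2 on (0, 1]
g'(a) &= -2a\,(2 \ln a + 1) = 0 \;\Longrightarrow\; a = e^{-1/2} \approx 0.607, \qquad g(e^{-1/2}) = e^{-1} \approx 0.368
\end{align}
```

Under these assumptions the pointwise measure takes the values $g(0.1) \approx 0.046$, $g(0.3) \approx 0.217$, $g(0.8) \approx 0.286$, and $g(1) = 0$, which agrees with the emphasis of mid-range amplitudes and the attenuation of very small and very large ones described for Figure 2.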
The advantage of this approach is the rigorous summary of the shape of each envelope of the transient current at the fundamental frequency and its harmonics by a single discriminant feature instead of using an exhaustive feature extraction process.
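The FFT-based computation of a voice and the resulting feature can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function names `st_voice` and `shannon_beta`, the normalization by the maximum modulus, the averaging over all samples, and the synthetic inrush-like current are all assumptions.

```python
import numpy as np

def st_voice(x, fs, f_n):
    """One S-transform voice S(tau, f_n), computed in the frequency domain with the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    alpha = np.fft.fftfreq(n, d=1.0 / fs)        # frequency offsets alpha in Hz
    shift = int(round(f_n * n / fs))             # index of f_n on the FFT grid
    shifted = np.roll(spectrum, -shift)          # samples of X(alpha + f_n)
    gauss = np.exp(-2.0 * np.pi**2 * alpha**2 / f_n**2)  # Fourier transform of the window
    return np.fft.ifft(shifted * gauss)

def shannon_beta(voice, eps=1e-12):
    """Mean Shannon energy of the max-normalized modulus (assumed normalization)."""
    a = np.abs(voice)
    a = a / a.max()
    p = np.maximum(a**2, eps)                    # floor avoids log(0)
    return float(np.mean(-p * np.log(p)))

# Synthetic turn-on transient (hypothetical): inrush decaying to a steady 50 Hz current.
fs = 4000.0
t = np.arange(2000) / fs
envelope = np.where(t >= 0.1, 1.0 + 3.0 * np.exp(-(t - 0.1) / 0.05), 0.0)
current = envelope * np.cos(2.0 * np.pi * 50.0 * t)

features = {f: shannon_beta(st_voice(current, fs, f)) for f in (50.0, 150.0, 250.0)}
for f, beta in features.items():
    print(f"beta_{int(f)} = {beta:.4f}")
```

Each voice costs one forward FFT of the signal (which can be shared across frequencies) and one inverse FFT, which is what makes the frequency-domain formulation attractive. Note that the FFT implies a circular convolution, so samples near the ends of the record are affected by wrap-around.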

Electrical Load Classification Methodology
To test the performance of the proposed features, we applied them to two distinct databases: COOLL [26] and HELD1 [29]. A brief description of each database and the corresponding results are given in the following sections. The calculated time-frequency features are $\beta_{50}$, $\beta_{150}$, and $\beta_{250}$. Taking into account the nature of the loads in the two databases, it is not necessary to go further in the harmonic features, but this can be adapted depending on the data and the nature of the loads used. We added to the set of $\beta$ features the $P_{\max}$ feature, which is the maximum value of active power in the transient period. Therefore, the total set of transient features $F$ used for evaluation in this paper can be given as follows:

$$F = \{P_{\max}, \beta_{50}, \beta_{150}, \beta_{250}\}$$

To evaluate the proposed features, two classical supervised classifiers are used: the k-NN classifier with the Euclidean distance and the decision tree with the Gini splitting criterion. For the k-NN classifier, four values of K are tested (K = 3, 5, 7, and 9). To avoid depending on a single arbitrary split into training and validation sets, a cross validation strategy is applied. More precisely, a 10-fold cross validation is used for both classifiers to calculate the classification rate. Each model is trained using 9 of the folds as training data, and the remaining fold is used for the test. The procedure loops so that each fold is used once as test data. This validation is repeated 100 times for each feature combination, and the corresponding results are presented in Section 4.
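The evaluation protocol described above can be sketched without external dependencies as follows. This is only an illustration: the helper names `knn_predict` and `repeated_kfold_accuracy`, the fixed seeds, and the synthetic two-dimensional clusters standing in for feature vectors are assumptions, and the decision tree classifier is omitted for brevity (it would be evaluated with the same fold loop).

```python
import math
import random
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def repeated_kfold_accuracy(X, y, k, n_folds=10, n_repeats=100, seed=0):
    """Mean classification rate over n_repeats reshuffled n_folds-fold cross validations."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rates = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        folds = [indices[i::n_folds] for i in range(n_folds)]
        correct = 0
        for test_idx in folds:                    # each fold is the test set exactly once
            held_out = set(test_idx)
            train_X = [X[i] for i in indices if i not in held_out]
            train_y = [y[i] for i in indices if i not in held_out]
            correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        rates.append(correct / len(X))
    return sum(rates) / len(rates)

# Synthetic stand-in for feature vectors: three well-separated hypothetical load classes.
data_rng = random.Random(1)
centers = {"A": (0.0, 0.0), "B": (3.0, 3.0), "C": (0.0, 4.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(30):
        X.append((data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5)))
        y.append(label)

for k in (3, 5, 7, 9):
    rate = repeated_kfold_accuracy(X, y, k, n_repeats=5)  # 100 repeats in the described protocol
    print(f"K = {k}: classification rate = {100.0 * rate:.2f}%")
```

With real features, $P_{\max}$ (in watts) and the $\beta$ values live on very different scales, so the Euclidean distance would be dominated by the power unless the features are rescaled; whether any rescaling was applied in the study is not stated.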

Applied Analyzing Time-Window
To determine the time window over which the proposed features are calculated, a threshold on the active power is applied. The value of the threshold is set empirically to 1.3 W. This value can be adjusted depending on the power of the tested appliances. The threshold defines the start and end instants $\tau_1$ and $\tau_2$ of the time window, which correspond to the beginning and the end of the transient period. Figure 3 shows the transient current envelope extracted at the fundamental frequency of 50 Hz, based on a theoretical model presented in [5], with the instants $\tau_1$ and $\tau_2$.
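The text does not detail how the threshold is turned into the two instants, so the following is only one possible reading, given as a sketch. It assumes that the active power is averaged over each mains cycle, that the window starts at the first cycle whose power exceeds the threshold, and that it ends at the first later cycle whose cycle-to-cycle power change falls below the threshold. The function names `cycle_power` and `transient_window` and the synthetic 325 V, 50 Hz signals are hypothetical.

```python
import math

def cycle_power(voltage, current, samples_per_cycle):
    """Active power (W) averaged over each complete mains cycle."""
    powers = []
    for start in range(0, len(voltage) - samples_per_cycle + 1, samples_per_cycle):
        stop = start + samples_per_cycle
        energy = sum(v * i for v, i in zip(voltage[start:stop], current[start:stop]))
        powers.append(energy / samples_per_cycle)
    return powers

def transient_window(powers, threshold=1.3):
    """Return (start_cycle, end_cycle) under the assumed criterion, or None if no turn-on."""
    start = next((k for k, p in enumerate(powers) if p > threshold), None)
    if start is None:
        return None
    for k in range(start + 1, len(powers)):
        if abs(powers[k] - powers[k - 1]) < threshold:  # power has settled (assumed end rule)
            return start, k
    return start, len(powers) - 1

# Synthetic turn-on event (hypothetical): load switched on at the start of cycle 5.
fs, mains = 4000, 50
spc = fs // mains                 # samples per mains cycle
onset = 5 * spc
voltage, current = [], []
for k in range(25 * spc):
    phase = 2.0 * math.pi * mains * k / fs
    env = 0.0 if k < onset else 1.0 + 3.0 * math.exp(-(k - onset) / (0.05 * fs))
    voltage.append(325.0 * math.cos(phase))
    current.append(env * math.cos(phase))

powers = cycle_power(voltage, current, spc)
window = transient_window(powers)
p_max = max(powers[window[0]:window[1] + 1])  # maximum active power inside the window
print("window (cycles):", window, "P_max (W):", round(p_max, 1))
```

The same window also yields the $P_{\max}$ feature, taken here as the largest per-cycle active power between the two instants.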

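Figure 3. Transient current envelope at the 50 Hz fundamental frequency, with the instants τ1 and τ2 (axes: time in s, amplitude in A).
We highlight here that the proposed features in this study do not require any model to estimate them. They are based only on the measured current signal.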

COOLL Dataset
The Controlled On/Off Loads Library (COOLL) is a dataset of current and voltage measurements, sampled at a high rate, for individual appliance consumption. In total, there are 42 appliances of 12 types, measured at a 100 kHz sampling frequency. The COOLL dataset contains 840 current signals in total: for each appliance, 20 controlled measurements were made. Each measurement corresponds to a specific action delay ranging from 0 to 19 ms with a step of 1 ms in order to cover the whole time-cycle duration of the 50 Hz mains voltage [26]. This dataset is very useful for testing the robustness of the proposed features against the initial conditions that might affect the waveform of turn-on transients.

HELD1 Dataset
The Home Equipment Laboratory Dataset (HELD1) contains current and voltage signals of turn-on and turn-off events from 14 different appliances corresponding to 18 different consumers [29]. The sampling rate is 4 kHz, and a total of 1365 transient current signals are used in this paper, corresponding to 100 on/off events for each device (except for the white refrigerator, for which only 65 events were measured).

COOLL
The classification rate for the COOLL dataset increases when combining the $\beta$ features (see Figure 5). $P_{\max}$ alone already gives a high classification rate (89.88%) with the k-NN classifier for K = 9; see Table 1. It is important to point out that power-based features are inherently very discriminating (except for loads with very close power levels). The combination of the $\beta$ features alone, without power information, shows comparable performance (89.88% with the decision tree). This shows their efficiency in describing the transient waveform, which is completely decoupled from the power information. The k-NN performance decreases when K increases for the combined $\beta_{50}$, $\beta_{150}$, and $\beta_{250}$ features (see Figure 5). The reverse phenomenon occurs for individual features ($P_{\max}$ or $\beta_{50}$ alone). Combining both types of information ($P_{\max}$, $\beta_{50}$) reaches the highest performance, 99.52%, which is higher than the classification rate reported in the literature (98.57%) for the same dataset, in which the authors apply features from a transient model and use the k-NN classifier to classify loads [31]. The confusion matrix for a selected iteration of the K-fold cross validation process on the COOLL dataset shows errors where the Drill load is classified as a Hedge trimmer (see Figure 6). Adding the $\beta$ features corresponding to higher harmonics does not improve the classification rate; this is strongly linked to the nature of the loads in the dataset.

HELD1
As in the case of the COOLL dataset, the classification rate for HELD1 shows an increasing trend when aggregating the $\beta$ features. The highest classification rate, 96.92%, is reached when combining $P_{\max}$ with the $\beta_{50}$ and $\beta_{150}$ features and using the decision tree classifier. For individual features ($P_{\max}$ and $\beta_{50}$) and for pairs of combined features ($\beta_{50}$ with $\beta_{150}$, and $P_{\max}$ with $\beta_{50}$), the k-NN classifier gives better results than the decision tree. The influence of K on the k-NN classifier is not very significant except for $\beta_{50}$, where the performance of the classifier improves when K is greater than 5 (see Figure 5). For very small values of K, the classifier tends to overfit by trying to classify isolated points. The same phenomenon can be observed for the COOLL data and for individual features. The errors obtained on the HELD1 data are due to the fact that the white refrigerator is classified 7 times as a blue refrigerator and that the latter is classified twice as a white refrigerator (see Figure 6).

On the Performance of the Proposed Features
For both datasets, the proposed features, when tested alone, show performance that improves progressively as they are combined (see Figure 5). As mentioned before, statistically describing the frequency voices of the transient phases tells us about the physical nature of the load, which is completely disconnected from the energy consumption information expressed by the power. Since the objective of this paper is to explore the advantage of extracting transient phase features, the power information is calculated as the maximum active power (called $P_{\max}$) reached in the transient phase. In both cases (COOLL and HELD1), adding the $P_{\max}$ information to the feature vector significantly enhances the classification rate. The improvement is on the order of 10% for the COOLL dataset and 13% for the HELD1 dataset. The confusion matrices (see Figure 6) clearly show the contribution of the $\beta_{f_n}$ features depending on the nature of the loads. It seems that, for the HELD1 dataset, it is necessary to go up to $\beta_{150}$. The classification rate obtained with ($P_{\max}$, $\beta_{50}$) is 95.38%, while the rate obtained when adding $\beta_{150}$ ($P_{\max}$, $\beta_{50}$, $\beta_{150}$) is 96.92%, which is the highest classification rate for this dataset. In contrast, for the COOLL dataset, $\beta_{50}$ is already sufficient: adding $\beta_{150}$ and $\beta_{250}$ does not improve the results and even deteriorates them slightly, from 99.52% for ($P_{\max}$, $\beta_{50}$) to 99.17% for both ($P_{\max}$, $\beta_{50}$, $\beta_{150}$) and ($P_{\max}$, $\beta_{50}$, $\beta_{150}$, $\beta_{250}$). This means that the information provided by the higher-order $\beta$ features is too noisy and scattered, which may be due essentially to the nature of the loads, whose harmonics are not high enough.

Conclusions and Perspectives
This paper presented new transient features based on the time-frequency domain. The objective of the proposed features was to characterize the turn-on electrical load transients by describing the shape of the harmonic frequency voices. The Stockwell transform was applied in this study to generate the frequency voices. Other transforms could be used and compared, but the purpose of this paper was to validate the relevance of the proposed approach independently of the time-frequency transform used. By combining the proposed features with $P_{\max}$, which is the maximum active power in the window applied to the transient phase, very high classification rates can be achieved. To validate the proposed approach, the feature set $F$ was applied to two public datasets, COOLL and HELD1, reaching classification rates of 99.52% and 96.92%, respectively. It turns out that, for the COOLL data, the first two features ($P_{\max}$ and $\beta_{50}$) are sufficient to reach high performance; for the HELD1 data, the feature $\beta_{150}$ must be added to these to significantly enhance the classification rate. The $\beta$ features and their number depend on the nature of the data and on the contribution of the harmonics in the loads. Therefore, the number of harmonic features can be adapted according to the data. The results obtained outperform the existing studies in the literature based on the same data [29,31]. It is important to highlight that the presented methodology does not require a model to estimate the transient parameters, as in [31], where the model may change depending on the nature of the electrical loads. The tested data consist of controlled turn-on loads for both datasets. As future work, more datasets will be tested and the proposed features will be implemented in an embedded system in order to test the classification accuracy in real time.