Automatic Implementation of a Self-Adaption Non-Intrusive Load Monitoring Method Based on the Convolutional Neural Network

Abstract: Non-intrusive load monitoring (NILM) is an effective way to achieve demand-side measurement and energy efficiency optimization. This paper studies a method of non-intrusive on-line load monitoring under a high-frequency mode of electric data acquisition, which enables NILM to run automatically and in real time. The method comprises the short-term construction of a dynamic signature library and continuous on-line load identification. Firstly, in the short initial operation phase, load separation and category determination are carried out to construct the load waveform library of the monitored user. Then, the continuous load monitoring phase begins. Based on the data in each user's signature library, the decomposed waveforms are classified by convolutional neural network models built specifically for each signature library in order to realize load identification, so that the real-time power consumption status of each load can be obtained continuously. The electricity data of actual users are collected and used in the experiments, which show that the proposed method can construct the load signature library adaptively for different users. Meanwhile, classification by a convolutional neural network model trained on a library constructed in actual operation ensures both the real-time performance and the accuracy of load monitoring.


Introduction
The study of demand side management (DSM) is significant for the rational allocation of power resources and the improvement of terminal power efficiency [1]. As an important means of relieving power supply shortages and supporting energy development strategies, the long-term and effective implementation of power DSM can reduce environmental pollution and enterprise costs, and is an effective way to promote the sustainable development of society and the economy.
DSM focuses on improving the efficiency of terminal power consumption and adjusting the mode of power consumption to reduce the dependence on power supply. Recently, intelligent DSM has attracted increasing attention [2]. Effective monitoring of user loads can provide data support for intelligent DSM, capture the power consumption situation and user behavior in real time, and guide users to schedule their power consumption reasonably, so as to improve the efficiency of energy utilization [3].
Traditional load monitoring adopts an intrusive method via the installation of an information-collecting device to each user's electrical equipment. Although the collected data are accurate and reliable, the intrusive method has poor achievability due to the high-cost hardware, complicated installation in the early stage and low acceptance of users. In order to overcome this issue, non-intrusive load monitoring (NILM) [4] technology was proposed by Professor Hart in the

Processes 2020, 8, x FOR PEER REVIEW  3 of 21 process. Then, the load identification is realized by the neural network based on the two-dimensional current data of the load in the library.
In addition, to address these problems in NILM research, the contributions of the proposed method are as follows:
• Automatic execution of the monitoring process. After the load signature library of an individual user is constructed in the short initial running phase, a user-specific neural network model is trained on the data in that user's library, and online load identification is then realized for each user. This provides a feasible scheme for the automatic implementation of NILM. Moreover, it addresses the weak universality and unsatisfactory identification accuracy of a pre-constructed signature library.
• More stable signatures. The convolutional neural network is used to extract two-dimensional signatures of the load current, so as to reduce the influence of noise and harmonics on the one-dimensional signature data of a load, strengthen the stability of the extracted signatures, and further improve the accuracy of load identification.

Principle and Implementation Structure of Non-Intrusive Load Monitoring
Non-intrusive load monitoring collects a user's power consumption data at the residential power entrance, outside the user's premises. All power consumption information of the user is concentrated in the collected mixed signal, so load identification is essential to obtain the detailed power consumption of each load. The signal of each switched load must first be separated from the collected mixed signal, and load identification is then carried out with the support of the information in the signature library.
The signature library is the premise of effective identification. In the actual monitoring process, it is difficult to obtain the independent data of a load directly, because the load cannot be made to run on its own without disturbing the user. Furthermore, the variety of load brands and circuit environments may cause a variety of changes in load waveforms, as shown in Figure 1. It is impossible to realize load identification with an a priori signature library containing all the varied waveforms of different users. Moreover, the classification of separated loads seriously affects monitoring efficiency. Therefore, the library construction method should be universal for users with different loads. At the same time, complex pre-collection and intervention should be avoided to make the NILM process automatic and effective. The identification accuracy of unsupervised algorithms is unsatisfactory, while supervised algorithms depend on labeled training data. In existing NILM studies, prior knowledge of the load (which is used as the training data) is required in advance, so the methods are feasible for a specific user but lack universality with respect to the changes of load data across different users. Thus, a method that can adapt to these changes is required.
A waveform-based identification method with high-frequency sampling is recommended because of its high accuracy and processing efficiency. Complete waveform and signature information can be obtained in the high-frequency acquisition mode. However, in actual acquisition processes, the large amount of data is difficult to store, and the noise and harmonics on the grid side affect the load waveform and signature information, which degrades the accuracy of identification based on one-dimensional data. Considering the limitations of communication capability and economic cost, the identification process is best executed online and, thus, the identification accuracy depends on real-time processing.
In response to the abovementioned problems in NILM, this paper studies a method that includes two stages. The first stage analyzes the collected mixed signals, classifies and determines unknown loads without supervision, and adaptively builds a user-specific load signature library during a short-term process. The second stage trains the convolutional neural network model on the data of the library constructed in the first stage to form an identification model suited to each independent user, so as to realize load identification. The implementation structure of this paper is shown in Figure 2.

Principle of Electrical Signal Separation
The collected current signal is the sum of the current signals on each load branch in operation. It is assumed that the collected current data are denoted by $I(t)$. When $M$ loads operate simultaneously, the mixed current in non-intrusive mode can be expressed as Formula (1).

$$I(t) = \sum_{i=1}^{M} I_i(t) + n(t) \tag{1}$$

where $I_i(t)$ is the current signal when load $i$ operates separately at time $t$, and $n(t)$ is the noise in the circuit of the user. The waveform separation is carried out on the electric data collected at the power entrance of the independent user, so the collected data include the load signals of only one user. Two load switching actions cannot be completed at exactly the same time, as there is always a certain time difference between them; thus, the current separation model can be established from the current signals at two different moments (i.e., before and after the moment of load switching). Before the switching moment, the load current in the circuit is recorded as $I(t)$. When load $k$ is switched on, the mixed current $I_{new}(t)$ is the superposition of $I(t)$ and the current $I_k(t)$ of load $k$ running alone, as shown in Formula (2).

$$I_{new}(t) = I(t) + I_k(t) \tag{2}$$

where $I_k(t)$ represents the current when the most recently switched load $k$ runs independently. The circuit can therefore be treated as containing only two signals at this time: the circuit signal $I(t)$ before the moment of load switching, and the circuit signal $I_{new}(t)$ after the moment of load switching.
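As a minimal sketch of the separation principle in Formula (2), the current of the switched load can be estimated by subtracting one aligned period before the switching event from one aligned period after it. The function name, the 50 Hz mains frequency, and the synthetic waveforms below are illustrative assumptions, not details taken from the experiments.

```python
import numpy as np


def separate_switched_load(i_before, i_after):
    """Estimate I_k(t) = I_new(t) - I(t) from two phase-aligned periods."""
    i_before = np.asarray(i_before, dtype=float)
    i_after = np.asarray(i_after, dtype=float)
    if i_before.shape != i_after.shape:
        raise ValueError("both periods must have the same number of samples")
    return i_after - i_before


# Synthetic example (assumed values): one 50 Hz period sampled at 200 points.
t = np.arange(200) / 200 * 0.02
base = 2.0 * np.sin(2 * np.pi * 50 * t)                # I(t) before switching
load_k = (1.0 * np.sin(2 * np.pi * 50 * t)
          + 0.3 * np.sin(2 * np.pi * 150 * t))         # true I_k(t)
estimate = separate_switched_load(base, base + load_k)  # recovered I_k(t)
```

The subtraction is only meaningful when both periods are sampled at the same phase of the voltage, and the noise term of Formula (1) remains in the estimate.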

Construction Principle of Load Signature Library
After load separation, the independent waveforms $I_k(t)$ and $U_k(t)$ can be obtained for load $k$, but the categories of the waveforms are unknown. The information in the library should include the waveform values and the corresponding category label. The category of load $k$ must therefore be determined before a dynamic load signature library can be constructed. No prior data are available if the library is constructed automatically without user intervention. Moreover, for loads of the same type, changes of model, brand, and even operating environment cause variations in the load waveform. Theoretically, the load waveform has an infinite variety of forms, so the load labeling problem amounts to unsupervised classification with an infinite number of classes, which is very difficult to solve.
Although the waveforms of appliances are variable, the common appliance categories are enumerable. If loads belong to the same category, their varying waveforms share some common signatures. For an independent user, the physical model and waveform of the same load type are fixed, and the operating environment and usage habits are relatively stable, so the infinite-category classification problem is reduced to a finite-category problem. It is therefore appropriate to construct a signature library for each independent user, which is the focus of this paper. To ensure that the library construction method is universal across users, the signatures extracted from the unknown load are used as the criteria for load category labeling, so as to realize the classification. The load labeling problem thus becomes a supervised classification problem.
The independent load waveforms and signatures of unknown categories can be obtained for each user through the method proposed in Section 2.2. The load classifier determines the category of an unknown load from the separated waveform and the extracted signatures only. Thus, without prior knowledge, the category classification problem is transformed into a posteriori inference, in which the load category must be determined given that the load waveform and signatures are known. Here, the Bayesian classification model is a suitable method. Given known sample characteristics, the Bayesian classification model quantifies the probability that a sample comes from each category and then selects the category with the largest posterior probability as the classification result. In addition, because the independent load waveforms and signatures extracted from different users vary widely, generalization is very important for the classification model. Considering the limited load categories and quantities of one user, the data scale is limited, and the Bayesian model generalizes well on data of limited scale. Therefore, the Bayesian classification model established for load classification in [20] is used. As for the a posteriori knowledge, the signatures are used to update the prior probability of each load category. It is assumed that $F_k$ is the signature calculated from the signals $U_k$ and $I_k$. The probability that load $k$ belongs to the category $\omega_n$ can be obtained by Formula (3).

$$P(\omega_n \mid F_k) = \frac{P(F_k \mid \omega_n)\,P(\omega_n)}{P(F_k)}, \quad n = 1, 2, \ldots, N \tag{3}$$

where $N$ is the number of categories of the user. Formula (3) shows that the prior probability $P(\omega_n)$ is converted to the posterior probability $P(\omega_n \mid F_k)$ by the obtained stable signature vector $F_k$, that is, the probability that load $k$ belongs to category $\omega_n$ given $F_k$. The most probable category is taken as the label of load $k$, as shown in Formula (4).

$$L_k = \arg\max_{\omega_n} P(\omega_n \mid F_k) \tag{4}$$

where $L_k$ represents the classification result, which is the category label of load $k$. In this way, the loads of unknown category that are separated in succession can be labeled. Then, the waveforms, signatures and categories are recorded in the library to complete the adaptive library construction for an independent user.
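The labeling rule of Formulas (3) and (4) can be sketched as follows. The sketch takes the likelihoods $P(F_k \mid \omega_n)$ and priors $P(\omega_n)$ as given inputs; how the likelihoods are computed from the signatures is left to the model in [20]. The function name and the numerical values are hypothetical.

```python
import numpy as np


def classify_load(likelihoods, priors):
    """Return (label index, posterior vector) for one separated load."""
    likelihoods = np.asarray(likelihoods, dtype=float)  # P(F_k | w_n)
    priors = np.asarray(priors, dtype=float)            # P(w_n)
    joint = likelihoods * priors
    evidence = joint.sum()                              # P(F_k)
    if evidence <= 0:
        raise ValueError("signature has zero probability under every category")
    posterior = joint / evidence                        # Formula (3)
    return int(np.argmax(posterior)), posterior         # Formula (4)


# Assumed example with N = 3 categories.
label, posterior = classify_load([0.05, 0.60, 0.10], [0.5, 0.25, 0.25])
```

In this example the second category wins even though its prior is lower, because its likelihood dominates the product in the numerator of Formula (3).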

Convolutional Neural Network Identification Model
After the library of a user is formed, the independent load waveforms of that user are continuously identified based on the data in the library, so as to determine the load operation status of the user at the current moment. The $N$ kinds of loads stored in the library and the corresponding record information are expressed as follows.
where the information includes the separated signals of voltage $\hat{U}$, current $\hat{I}$, and label $L$. At this point the library has been fully constructed, and real-time identification becomes a supervised classification problem. The data in the library are specific to the independent user, and every load to be classified is a load in the library. Thus, the information in the library serves as the training data for supervised training of the classification model, which greatly reduces the number of invalid training samples, yields a classification model tailored to the independent user, and reduces the impact of over-fitting on the identification results.
Because of the noise and harmonics on the power grid side, the signature information extracted from the one-dimensional data of the load waveform fluctuates greatly, which affects the identification process. However, because load circuits contain non-linear components such as diodes, thyristors, transistors and motors, the loads themselves also distort the current waveform. The harmonics and distortion of the current caused by the non-linear components in the circuit can also be regarded as typical signatures for load identification [6-9,20,21]. It is therefore difficult to determine accurately whether a distortion in the current is caused by grid-side noise and harmonics or by the load itself, and direct filtering may destroy the original load signatures that are useful for identification. Considering that, under the high-frequency acquisition mode, the steady-state waveform of the same independent load is relatively stable and its overall shape is less disturbed by noise and grid-side harmonics, the one-dimensional waveform data of a load current stored in the dynamic library are transformed into two-dimensional image data. The image keeps the basic shape and outline of the original current waveform, and the amplitude values of the current waveform are mapped to pixel positions in the image. The waveform distortion caused by noise or harmonics alters the position of pixel points in the original waveform only locally and slightly (i.e., it changes a few local pixel values of the image), rather than the shape and outline of the current waveform. In the image recognition process, two-dimensional data are recognized mainly on the basis of image features, including contour, shape, contrast and the relative position of the marked features. Therefore, converting the waveform to two dimensions for recognition reduces the influence of noise or harmonics on the recognition results.
Since the one-dimensional current data are transformed into two-dimensional image data, the load identification problem becomes a two-dimensional image identification problem. Convolutional neural networks perform very well on two-dimensional image data. Images can be input into the network directly, which avoids the complexity of data reconstruction in the signature extraction and classification processes. Convolutional neural networks automatically extract multiple image signatures through multiple convolution kernels, approximate complex mapping functions through multi-layer non-linear transformations, and then classify the current waveform images to realize real-time load identification. Besides, the distortion of the current waveform caused by noise or harmonics only alters the position of pixels in the original waveform, resulting in local translation, rotation and scaling of the waveform. These influences can be weakened by the translation invariance of the convolutional neural network. (Invariance means that when the input data are changed locally and slightly, most of the outputs after the pooling function do not change. This is especially valuable when the question is whether a feature appears in the two-dimensional data rather than where exactly it appears.) The pooling layer, an important layer of the convolutional neural network, replaces the output of the network at a certain location in the image with a summary statistic of the pixel values in the surrounding area of that location. Specifically, in the constructed convolutional neural network, the pooling layer extracts the average value of the surrounding pixels, which weakens the influence of the pixel points affected by noise and harmonics in the image.
Furthermore, multiple pooling layers in the convolutional neural network gradually reduce the influence of noise and harmonics. Moreover, in the convolution operation, parameter sharing means that it is unnecessary to learn a separate set of parameters for each position in the two-dimensional data, which reduces the computational complexity, the training time and the storage space of the parameters.
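The effect of the pooling statistic on an isolated disturbed pixel can be illustrated with a small sketch. The helper `pool2d`, the 4 × 4 image and the 2 × 2 window are hypothetical choices for illustration; they are not the layer sizes of the constructed network.

```python
import numpy as np


def pool2d(image, size, reducer):
    """Non-overlapping pooling: apply `reducer` over each size x size block."""
    h_out, w_out = image.shape[0] // size, image.shape[1] // size
    blocks = image[:h_out * size, :w_out * size].reshape(h_out, size, w_out, size)
    return reducer(blocks, axis=(1, 3))


clean = np.zeros((4, 4))
noisy = clean.copy()
noisy[0, 0] = 1.0  # one spurious pixel caused by noise or harmonics

# Largest change in the pooled output caused by the single disturbed pixel.
max_shift = np.abs(pool2d(noisy, 2, np.max) - pool2d(clean, 2, np.max)).max()
avg_shift = np.abs(pool2d(noisy, 2, np.mean) - pool2d(clean, 2, np.mean)).max()
```

With maximum pooling the disturbed pixel passes through at full strength, whereas average pooling divides its contribution by the number of pixels in the window.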
In this paper, the one-dimensional data in the labeled library are converted into two-dimensional waveform images to form the training set, and the convolutional neural network model is trained by supervised learning. The test data consist of the two-dimensional waveform images converted from the one-dimensional data of the separated current signal, which is obtained from the mixed signal in real time. Finally, the load can be identified online by the convolutional neural network, whose model structure includes convolutional layers, pooling layers, a fully connected layer and non-linear functions, as shown in Figure 3.
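A forward pass through the layer types named above (convolution, non-linear function, average pooling, fully connected layer, Softmax output) can be sketched in plain NumPy. All sizes, the ReLU activation and the random weights are assumptions made for illustration; the actual architecture is the one shown in Figure 3.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Valid-mode 2-D cross-correlation, as used by most CNN libraries."""
    kh, kw = kernel.shape
    h_out, w_out = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((h_out, w_out))
    for r in range(h_out):
        for c in range(w_out):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


def average_pool(feature_map, size):
    """Non-overlapping average pooling."""
    h_out, w_out = feature_map.shape[0] // size, feature_map.shape[1] // size
    trimmed = feature_map[:h_out * size, :w_out * size]
    return trimmed.reshape(h_out, size, w_out, size).mean(axis=(1, 3))


def softmax(scores):
    exp = np.exp(scores - np.max(scores))  # shift for numerical stability
    return exp / exp.sum()


def forward(image, kernels, biases, fc_weights, fc_bias, pool_size=2):
    """Convolution -> ReLU -> average pooling -> fully connected -> Softmax."""
    maps = [average_pool(np.maximum(conv2d_valid(image, k) + b, 0.0), pool_size)
            for k, b in zip(kernels, biases)]
    features = np.concatenate([m.ravel() for m in maps])
    return softmax(fc_weights @ features + fc_bias)


rng = np.random.default_rng(0)
image = rng.random((16, 16))                   # stand-in for a waveform image
kernels = rng.standard_normal((4, 3, 3))       # 4 kernels of size 3 x 3
biases = np.zeros(4)
# 16 - 3 + 1 = 14 after convolution, 7 after 2 x 2 pooling, 4 maps in total
fc_weights = rng.standard_normal((3, 4 * 7 * 7)) * 0.1
fc_bias = np.zeros(3)
probs = forward(image, kernels, biases, fc_weights, fc_bias)  # 3 class scores
```

The weights here are untrained, so the output only demonstrates the data flow and the shape of the probability vector, not a meaningful classification.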

Methodology
To realize the above idea, this paper introduces the following two stages. In the first stage, switching events are detected and load signals are separated from the collected mixed data. The categories of the separated loads are then determined to form a load signature library. In the second stage, the convolutional neural network is trained on the data in the library. After short-term model training, a classification model suited to the load signature library is formed automatically, so that the separated load signals can be identified in real time.

Event Detection and Load Separation
The constructed signature library needs to include the waveform information of each load. However, in the non-intrusive mode, the collected electrical signal of a user is the sum of the signals of all electrical appliances that are switched on at the same time. Thus, the mixed signal requires signal separation. Load switching events can be detected from the current intensity.
In [20], if the current intensity of one period differs obviously from that of the previous period, a load switching event is considered to have occurred.
Then, because loads are switched into the user circuit at different phases, direct extraction of a switched load signal may produce erroneous information about the independent load. The method in [20] extracts the mixed signal at sample points of the same voltage value in different electrical periods, before and after the switching event. Signal separation can then be realized with one-dimensional data processing. This load signal separation method is simple and effective. Therefore, this paper adopts the method in [20] to separate the load electrical signal quickly.
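The period-to-period comparison used for event detection can be sketched as below. Measuring the current intensity by the RMS value of each period and using a fixed threshold are assumptions made here for illustration; the exact criterion is the one defined in [20].

```python
import numpy as np


def detect_switching_events(current, samples_per_period, threshold):
    """Return indices of periods whose RMS differs from the previous period."""
    current = np.asarray(current, dtype=float)
    n_periods = current.size // samples_per_period
    periods = current[:n_periods * samples_per_period].reshape(
        n_periods, samples_per_period)
    rms = np.sqrt(np.mean(periods ** 2, axis=1))   # intensity of each period
    jumps = np.abs(np.diff(rms))                   # change between neighbors
    return (np.flatnonzero(jumps > threshold) + 1).tolist()


# Synthetic example (assumed values): a 1 A load is switched on at period 3.
samples_per_period = 200
t = np.arange(6 * samples_per_period) / samples_per_period  # time in periods
base = 2.0 * np.sin(2 * np.pi * t)
extra = np.where(t >= 3, 1.0, 0.0) * np.sin(2 * np.pi * t)
events = detect_switching_events(base + extra, samples_per_period, threshold=0.2)
```

The periods immediately before and after a detected index are the natural inputs for the subsequent waveform separation.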

Category Determination of Unknown Load Waveform
After the separated waveform of load k is obtained, the load waveforms require pre-classification. All the unknown loads separated from the mixed signal of an independent user are clustered rapidly by their load signatures. This process not only prevents the same load waveform from being recorded repeatedly when the user switches the load several times, but also narrows the range of load categories, so as to reduce the computational burden of the subsequent load category determination.
In the initial stage of library construction, the number of load categories and operation modes of each user is unknown. Correspondingly, the number of different categories of waveforms is also unknown. However, for the same user, the waveform of the same load is relatively fixed, and the signatures extracted from the waveform of the same load differ little between switching events, while the waveforms of different loads differ considerably. Thus, the unknown load is clustered quickly by the inherent signatures extracted from the separated waveforms, which greatly reduces the number of duplicated load waveforms.
After load signature normalization, the clustering can be achieved by a discriminant function, as shown in Formula (6).
where $F_k^*$ is the normalized signature and $\delta_\omega^*$ is the signature of the $\omega$-th load stored in the signature library at the current time. $D_{k,\omega}$ is the discriminant distance between the signature of load $k$ and that of category $\omega$. If the minimum value of $D_{k,\omega}$ is less than the threshold $\delta$, the type of load $k$ has already been stored in the library. If the minimum value of $D_{k,\omega}$ is greater than $\delta$, a new load has been found and its waveform and signatures are recorded in the library. This ensures that the library contains no duplicated load information. At this point, the category of the unknown load waveforms needs to be determined, so as to complete the load information in the signature library. In practice, actual users own loads of many brands and models, and the resulting variety of waveforms makes the signature values fluctuate to different degrees. The method in [20] takes the fluctuating load signatures into account and calculates the probability of different load signatures. Then, using the Bayesian model and multiple signatures, the probability that a load belongs to each load category is obtained. Finally, the most probable load category is selected by Formula (4) to solve the category determination problem for unknown load waveforms.
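The duplicate check described above can be sketched as a nearest-signature test against the current library. The Euclidean distance used for $D_{k,\omega}$, the function name and the signature values are assumptions for illustration only; the discriminant function of Formula (6) may differ.

```python
import numpy as np


def match_or_register(signature, library, delta):
    """Return (library index, is_new) for a normalized signature vector."""
    signature = np.asarray(signature, dtype=float)
    if library:
        # Assumed discriminant: Euclidean distance to every stored signature.
        distances = [float(np.linalg.norm(signature - stored))
                     for stored in library]
        nearest = int(np.argmin(distances))
        if distances[nearest] < delta:
            return nearest, False          # load already in the library
    library.append(signature)              # new load: record its signature
    return len(library) - 1, True


library = []
first = match_or_register([0.20, 0.80], library, delta=0.1)
repeat = match_or_register([0.22, 0.79], library, delta=0.1)
other = match_or_register([0.90, 0.10], library, delta=0.1)
```

Here the second signature lies close to the first and is treated as a repeated switching of the same load, while the third is far from every stored entry and is registered as a new load.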

Load Identification of the Convolutional Neural Network Based on the Signature Library
After the first stage, the short-term adaptive library building process, the second stage of continuous load monitoring begins. Based on the data in the library, this paper transforms the one-dimensional load current signal into two-dimensional image data of the periodic current. The load current vector $\hat{I}_k$ of one steady-state period is taken as the basic data, and the process is divided into the following four steps:
• The sampling point serial number and the current amplitude are taken as the abscissa and ordinate, respectively. The data points are connected in turn to draw the binary image of the load current waveform;
• A unified range of coordinate axes is selected for the binary image, ensuring that the waveform images of different loads are displayed in the same range;
• The axes are hidden in the binary image;
• An appropriate image resolution is selected to display the image clearly. The resolution of the binary image is adjusted to match the input of the convolutional neural network.
Then, the input sample of the convolutional neural network can be obtained. In this way, one-dimensional current waveforms can be transformed into two-dimensional image data. Meanwhile, the contour and shape of the waveform can be preserved completely, meaning that they can be directly input into the constructed convolutional neural network for identification.
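The dimension conversion described above can be sketched as a direct rasterization with NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `waveform_to_image`, the 64 × 64 resolution, the 200-sample period and the amplitude limit passed in the usage lines are all assumed values, and dense linear interpolation is used as an approximation of "connecting each data point in turn".

```python
import numpy as np

def waveform_to_image(current, amp_limit, size=64):
    """Hypothetical helper: rasterize one steady-state current period
    into a size x size binary image (1 = waveform pixel, 0 = background)."""
    current = np.asarray(current, dtype=float)
    n = current.size
    # Interpolate densely so that neighbouring samples are joined by
    # (approximate) line segments instead of isolated dots.
    t_dense = np.linspace(0, n - 1, num=size * 8)
    i_dense = np.interp(t_dense, np.arange(n), current)
    # Unified vertical range [-amp_limit, +amp_limit] for every load.
    i_dense = np.clip(i_dense, -amp_limit, amp_limit)
    cols = np.round(t_dense / (n - 1) * (size - 1)).astype(int)
    rows = np.round((amp_limit - i_dense) / (2 * amp_limit) * (size - 1)).astype(int)
    # No axes are drawn: the image contains only the waveform itself.
    img = np.zeros((size, size), dtype=np.uint8)
    img[rows, cols] = 1
    return img

# Illustrative input: one sinusoidal period with 200 samples and 5 A amplitude.
samples = 5.0 * np.sin(2 * np.pi * np.arange(200) / 200)
image = waveform_to_image(samples, amp_limit=11.0, size=64)
print(image.shape, int(image.sum()))
```

Fixing `amp_limit` across all loads is what keeps a small-current appliance visually distinct from a large-current one with a similar shape; very steep edges may still leave small vertical gaps at this interpolation density.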
The training process of the convolutional neural network includes forward and backward propagation. Forward propagation performs signature extraction and sample classification, and backward propagation performs classification error calculation and weight updating.
In forward propagation, signature extraction is realized by convolution and pooling. The output of the previous layer is taken as the input of the current layer. After the convolution with the kernel, the output of the layer is obtained by applying the non-linear activation function, as shown in Formula (7).
where $X_j^{l-1}$ represents the j-th signature graph of layer l − 1, $\kappa_{ij}^{l}$ represents the convolutional kernel that maps the j-th signature graph from the (l − 1)-th layer to the l-th layer, f(·) is the activation function, $b_j^{l}$ is the bias parameter, and * denotes convolution. The pooling layer calculation is shown in Formula (8).
where $X_j^{l}$ represents the j-th signature graph of layer l, $w_j^{l}$ and $b_j^{l}$ are the weight and bias parameters, sample(·) is the pooling function, and f is the activation function. The pooling layer aims to map the signatures to a smaller range and reduce the dimension of the convolutional signature map. The signature information of load signals with relatively weak power is susceptible to noise and harmonic interference from the power grid. In this case, max pooling tends to extract only the disturbed signatures while ignoring the actual signature information of the signal, which in turn degrades the identification. Therefore, this effect is weakened by replacing the max pooling layer with an average pooling layer. Softmax layers are usually used as output layers in multi-classification problems, as they output classification results directly in the form of probability vectors. The calculation is shown in Formula (9).
where j denotes the classification result for class j and N represents the number of load categories.
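The argument for average pooling and the softmax output can be made concrete with a small NumPy sketch. It covers only the sample(·) step of the pooling layer (the weight, bias and activation of Formula (8) are omitted), and the function names and the toy feature map are assumptions chosen for illustration.

```python
import numpy as np

def pool2d(x, k, mode="avg"):
    """Non-overlapping k x k pooling of a 2-D feature map (sketch)."""
    h, w = x.shape
    x = x[: h - h % k, : w - w % k]  # drop rows/columns that do not fill a block
    blocks = x.reshape(x.shape[0] // k, k, x.shape[1] // k, k)
    if mode == "avg":
        return blocks.mean(axis=(1, 3))
    return blocks.max(axis=(1, 3))

def softmax(z):
    """Turn N class scores into a probability vector (numerically stable)."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy feature map of a weak load (0.1 everywhere) with one noise spike.
fmap = np.full((4, 4), 0.1)
fmap[0, 1] = 1.0
print(pool2d(fmap, 2, mode="max"))  # the spike dominates its whole block
print(pool2d(fmap, 2, mode="avg"))  # the spike is diluted by its neighbours
print(softmax([2.0, 1.0, 0.1]))     # probabilities over three illustrative classes
```

In the max-pooled map the disturbed block reports the spike value itself, whereas in the average-pooled map it reports the block mean, which stays closer to the underlying weak signal.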
Back propagation depends on the error between the classification result of forward propagation and the given sample label. According to the chain rule, the weight and the error in each layer are updated, as shown in Formulas (10) and (11).
$$\frac{\partial E_{total}}{\partial w} = \frac{\partial E_{total}}{\partial out}\,\frac{\partial out}{\partial net}\,\frac{\partial net}{\partial w} \tag{10}$$
where $\partial E_{total}/\partial w$ represents the partial derivative of the loss function $E_{total}$ with respect to the parameter w, which is updated in each iteration, and ξ represents the learning rate of the convolutional neural network, which determines the magnitude of each adjustment. The model is trained iteratively with the labeled data in the library, and the connection weights of each layer and the parameter matrices are adjusted until the data in the database are exhausted. Once the model is fully trained, load identification can be realized in real-time. The separated waveforms of independent loads are acquired successively in real-time according to the load separation method in Section 3.1.1. The waveform data are converted into two-dimensional images and identified online by the convolutional neural network as test data.
The implementation process of the steps in this chapter is shown in Figure 4.
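The chain-rule factorization of Formula (10) can be checked on a single neuron. The sketch below assumes a sigmoid activation, a squared-error loss, and the standard gradient descent update w ← w − ξ·∂E_total/∂w; these choices and all names are illustrative assumptions, since the paper's Formula (11) is not reproduced in this text.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, target):
    """Squared error of a single sigmoid neuron (illustrative loss)."""
    out = sigmoid(w * x + b)
    return 0.5 * (out - target) ** 2

def grad_w(w, b, x, target):
    """dE/dw as the product of the three chain-rule factors."""
    net = w * x + b
    out = sigmoid(net)
    dE_dout = out - target          # dE_total / d out
    dout_dnet = out * (1.0 - out)   # d out / d net for the sigmoid
    dnet_dw = x                     # d net / d w
    return dE_dout * dout_dnet * dnet_dw

def train_step(w, b, x, target, xi):
    """ASSUMED update rule: plain gradient descent with learning rate xi."""
    return w - xi * grad_w(w, b, x, target)

w, b, x, target = 0.3, 0.1, 1.5, 1.0
w_new = train_step(w, b, x, target, xi=0.5)
print(loss(w, b, x, target), loss(w_new, b, x, target))
```

Comparing `grad_w` with a finite-difference estimate of the loss is a quick way to confirm that the three factors have been multiplied correctly.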

Experiment and Analysis
Actual user data are collected and used for validity verification in this paper. Figure 5 shows the schematic diagram of the experimental system, which is designed for non-intrusive data acquisition from the actual user. The collected data are processed by the method proposed in Section 3 to realize effective load monitoring.
The specific experimental parameters are as follows: the access voltage of the acquisition device is 220 V, and the sampling frequency is 10 kHz. The identification objects include a rice cooker (EC); an electric kettle (EK); a water heater (WH); a water dispenser (WD); a laptop computer (LA); a television (TV); air-conditioning systems A (AC-A), B (AC-B), C (AC-C); a vacuum cleaner (VC); a refrigerator (RE); a microwave oven (MO). Table 1 lists the values of the thresholds involved in the experiment.
Figure 6 shows the separated current signals of the loads in the experimental environment, together with the template currents for comparison. Blue lines denote the separated current signals, which are the steady-state periodic current waveforms separated by the method in Section 3.1.1. Red lines denote the standard currents, which are obtained by switching on only a single electrical appliance in the experiment. The high coincidence between the separated signals and the standard currents indicates that the load separation has high accuracy. Since the categories of the separated load waveforms are unknown before load labeling, the letters a-l are used to represent the load waveforms in this paper.
After effective load separation and clustering of the independent load waveforms, the categories of the loads are judged, and the library of each independent user is constructed. Table 2 shows the categories, numbers and actual pre-classification of the electrical appliances involved in this paper. The load number is used as the category discrimination label of the load.
The probability distribution of several unknown loads obtained from load separation is shown in Figure 7. It shows the label probability of a load belonging to each category of electrical appliance in the experiment, and quantifies the possibility of each load category by the label probability. The label corresponding to the maximum probability is taken as the category label of the load. The threshold ν of the unknown class is denoted by the red straight line. If the maximum label probability is still lower than the threshold line, the load is placed directly in the "unknown load" category. Figure 7 shows the probability of the possible load labels within each load's pre-classification; the probability outside the pre-classification is 0. In this paper, the label with the highest probability within the pre-classification of an unknown load is regarded as its category. The waveforms and label probability results are shown in Figure 8, which compares the labels and their probabilities for the unknown loads with their real labels. It can be seen that the waveforms are labeled correctly.
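The decision rule with the unknown-class threshold ν can be sketched as follows. The function name `decide_label`, the probability values and the threshold value are hypothetical; only the rule itself (maximum label probability, rejected when below ν) comes from the text.

```python
def decide_label(label_probs, nu):
    """Hypothetical helper: pick the most probable label or reject as unknown.

    label_probs maps each candidate label to its label probability; labels
    outside the load's pre-classification are expected to carry probability 0.
    """
    best = max(label_probs, key=label_probs.get)
    if label_probs[best] < nu:
        # Even the best candidate falls below the threshold line.
        return "unknown load"
    return best

# Illustrative probabilities only; they are not taken from Figure 7.
print(decide_label({"EK": 0.70, "WH": 0.20, "EC": 0.10}, nu=0.5))
print(decide_label({"EK": 0.40, "WH": 0.35, "EC": 0.25}, nu=0.5))
```

Keeping the rejection branch separate from the argmax makes it explicit that a load is never forced into a known category when the evidence for every category is weak.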

Effectiveness Verification of Load Identification Based on the Convolutional Neural Network
Before the real-time load identification, the convolutional neural network model is trained with the category-labeled data from the established library. The one-dimensional current data (as shown in Section 4.1) are transformed into two-dimensional image data through the data dimension conversion method described in Section 3.2. Because the maximum current of most common household sockets is limited to 10 A, the maximum operating current of most electrical appliances is usually less than or close to 10 A. Therefore, the vertical axis range selected in this paper is from −11 A to +11 A, and the horizontal axis range is the sampling points of one current period when the load is in steady-state operation. For the few electrical appliances whose maximum current exceeds 10 A, the same operation is applied, and the identification result is not affected.

The labeled appliances in the library are re-numbered to represent the categories of the load in the process of convolutional neural network identification. The load re-numbering of the library established in Section 4.2 is shown in Table 3.
The numbers of convolutional layers and pooling layers are extremely important for the classification accuracy of the model. Figure 9 shows the classification accuracy under different numbers of convolutional and pooling layers. When there are too few convolutional and pooling layers, the parameters are insufficient for the accurate classification of the samples. With an increasing number of layers, the effectiveness of the model's classification is clearly improved. However, when the layers continue to increase, the additional training parameters raise the difficulty and time of model training. Limited by the current training methods, a further increase in layers is more likely to make the classification results fall into a local optimum, leading to over-fitting and other problems. As shown in the figure, when the numbers of convolutional and pooling layers are both set to 3, the model is most effective, and the classification accuracy on the test samples reaches 96.73%. Therefore, considering the accuracy and training time of the model, the number of both the convolutional and pooling layers in the convolutional neural network model is set to 3. In addition, under this optimal layer structure, kernels 1, 2, and 3 are assigned to convolutional layers 1, 2, and 3, respectively, as shown in Table 4.
Table 4. Kernel sizes and numbers in this paper.
Kernel    Size       Number
1         3 × 3      20
2         5 × 5      20
3         12 × 12    100
In general, 3 × 3 is a popular choice of kernel size in convolutional neural networks, and it is determined here by the empirical value in the experiment. Specifically, the kernel size is set to be larger than 1 × 1 to enlarge the receptive field. A kernel with an even size cannot ensure that the input and output feature maps have the same size. For the same receptive field, the required parameters and computation increase as the convolutional kernel grows. Thus, a kernel size of 3 × 3 is used for the first convolutional layer. In order to further extract features from the output of the previous convolutional layer, the kernel size increases gradually, so the kernel size of the second convolutional layer is set to 5 × 5. The third layer is the last convolutional layer and is followed by the fully connected layer. The input of the fully connected layer needs to be one-dimensional data with the same dimension as the output of the third convolutional layer. However, the output dimension of a convolutional layer depends on its input data dimension and kernel size. Since the input data dimension of the third layer is 12 × 12, the kernel size of the third convolutional layer is set to 12 × 12. In addition, the size of 12 × 12 significantly reduces the number of parameters of the fully connected layer.
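The size arguments in this paragraph follow from standard convolution arithmetic, which the short sketch below works through. It assumes stride 1 and no padding unless stated, which the paper does not specify; the helper names are illustrative.

```python
def conv_output_size(n_in, k, stride=1, padding=0):
    """Side length of the output feature map of a square convolution."""
    return (n_in + 2 * padding - k) // stride + 1

def same_padding_total(k):
    """Total padding (both sides together) that keeps the size at stride 1."""
    return k - 1

def stacked_receptive_field(k, n_layers):
    """Receptive field of n_layers stacked k x k convolutions at stride 1."""
    return n_layers * (k - 1) + 1

# A 12 x 12 kernel on a 12 x 12 input yields a 1 x 1 map per kernel, so the
# 100 kernels of the third layer give a 100-element vector for the
# fully connected layer (ASSUMING stride 1 and no padding).
print(conv_output_size(12, 12))

# Odd kernels split the size-preserving padding evenly (3 -> 1 + 1);
# even kernels cannot (4 -> 3 in total), so the map cannot stay centred.
print(same_padding_total(3), same_padding_total(4))

# Two stacked 3 x 3 layers see a 5 x 5 region with 2 * 9 = 18 weights,
# compared with 25 weights for a single 5 x 5 kernel.
print(stacked_receptive_field(3, 2), 2 * 3 * 3, 5 * 5)
```

With these assumptions, the third layer acts like a dense projection of each 12 × 12 map onto a single value, which is why it can feed the fully connected layer directly.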
As for kernel numbers, if there are too few kernels in the convolutional layers, the extracted image features are not sufficient for identification, and the model struggles to achieve the desired performance. On the contrary, if the kernel number is too large, the number of model parameters and the training time increase significantly, and serious over-fitting problems arise. Thus, the kernel numbers are empirical values obtained by repeated experiments.
After the model structure is determined, the model parameters become significant factors in the training process. The learning rate and the number of epochs are related to the convergence and training speed of the model. The learning rate µ represents the size of the weight update in each iteration. If the learning rate is set too high, the loss function and the model will struggle to converge. On the contrary, if the learning rate is too small, the weight update and the change in the model cost will be very small each time, resulting in significantly more epochs. The number of epochs is the number of training passes over all sample data. Figure 10 shows the cost value of the convolutional neural network model training under different learning rates. It can be seen that the identification model tends to converge, and that the convergence is fastest, at a learning rate of 0.05. When the number of epochs is 500 (i.e., epoch = 500), the loss values in the model are all below 0.003.
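The qualitative effect of the learning rate can be reproduced on a toy problem. The sketch below runs gradient descent on the one-dimensional cost E(w) = w², which is an assumption chosen purely for illustration and is unrelated to the CNN cost curves of Figure 10; the function name, tolerance and epoch limit are likewise hypothetical.

```python
def epochs_to_converge(lr, w0=1.0, tol=1e-3, max_epochs=1000):
    """Gradient descent on the toy cost E(w) = w**2 (gradient 2*w).

    Returns (epochs, final_abs_w); epochs is None when |w| never drops
    below tol within max_epochs.
    """
    w = w0
    for epoch in range(1, max_epochs + 1):
        w = w - lr * 2.0 * w  # one weight update
        if abs(w) < tol:
            return epoch, abs(w)
    return None, abs(w)

for lr in (0.001, 0.05, 1.1):
    print(lr, epochs_to_converge(lr))
```

On this toy cost, the smallest rate fails to reach the tolerance within the epoch limit, the moderate rate converges within the limit, and the largest rate makes |w| grow at every step, which mirrors the trade-off described above.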
Figure 10. The convergence process of cost under different learning rates.
Figure 11 illustrates the convergence process of the parameters in the model. Six parameters are selected for display. The parameters kernel_c1 and bias_c1 are one of the weights and one of the biases in the second convolution layer, respectively. The parameters kernel_f1 and bias_f1 are one of the weights and one of the biases in the third convolution layer, respectively. Besides, weight_f1 is one of the weight parameters between the third convolution layer and the fully connected layer. The parameter weight output is one of the weights between the fully connected layer and the softmax layer. The detailed values of the above six parameters at different numbers of epochs are shown in Table 5.
It can be seen that the parameters show a gradually increasing trend as the number of epochs increases. There is no significant change in the above parameters when the number of epochs is greater than 500. The model can therefore be considered to have converged when the number of epochs reaches 500. Thus, this paper sets the number of epochs to 500 to ensure the training efficiency of the model.
In order to display the model identification results, the 12 separated current waveforms in Figure 6 are identified by the convolutional neural network model. Figure 12 shows the model input data after dimension conversion by the method proposed in Section 3.2.
In the process of identification using the convolutional neural network model, the signature maps extracted from the input data after the first convolutional operation are shown in Figure 13. It can be seen from the figure that the contour edges and other features of each load current waveform in Figure 12 are strengthened and extracted by the kernels in the convolutional layer.
After further processing by pooling and activation, the signature maps are shown in Figure 14. This step reduces the feature dimensions of the images in Figure 13 and applies a non-linear mapping to the feature images, so as to extract the advanced features for identification.
Table 6 gives the classification confusion matrix of the algorithm. The columns of the confusion matrix represent the identification label of each load category, the rows represent the real label, and the diagonal values represent the accuracy of the correct classification of each load. A deeper background color indicates a higher identification accuracy.
In order to present the accuracy of the proposed method, the collected data of day1-day3 are identified by the proposed method, and the power consumption ratio of the different loads is presented in Figure 15. As a comparison, smart sockets are installed on the monitored appliances to obtain the true power consumption, which is shown in the right part of the figure. It can be seen that the difference between the calculated total consumption and the true one is less than 0.3 kW. The consumption ratio of each load is nearly the same as the true value given by the socket, and each load has the correct label.
In addition, the algorithms proposed by Chao et al. [19], Srinivasan et al. [21], and Ahmadi et al. [22], as well as the genetic algorithm, are selected for comparison with the proposed method. These typical algorithms show good performance on NILM, and the feasibility of the proposed method is demonstrated by the comparison. In [21], Srinivasan et al. also apply a neural network to NILM. A typical neural network is used to verify the effectiveness of load identification through model training. Different from our work, the neural networks are trained to extract harmonic signatures from the current for load identification. Ahmadi et al. [22] propose a graph signal processing (GSP) approach for NILM. The graph is also formed by steady-state signatures of loads. It poses the load disaggregation problem as a single-channel blind source separation problem to perform low-complexity classification for load identification. It shows that a load can be identified by processing the signal of a load graph, but its graph signal transformation differs from that of the proposed method. Similarly to our method, the convolutional neural network is also applied to NILM in Chao's work [19] to form a three-step non-intrusive load monitoring system (TNILM). Owing to differences in the purpose, dimension, structure, and input and output data of the convolutional neural network, the proposed algorithm outperforms that in Chao's work. In addition, traditional intelligent algorithms are widely used in non-intrusive load identification. Genetic algorithm optimization is a conventional intelligent algorithm.
Thus, as a supplement, genetic algorithm optimization is used as another method of load identification after the library construction, replacing the convolutional neural network in this paper for comparison. Figure 16 shows the performance comparison curves for the above-mentioned methods. The comparison of the algorithms' load identification accuracy is shown in Figure 16a. An increasing number of load categories has less influence on the algorithm in this paper than on the others. The operational efficiency curves are shown in Figure 16b. In the actual stage of load identification, the proposed method has higher operational efficiency and a stable load identification time. Represented by the violet line, the TNILM in Chao's work [19] includes the convolutional neural network and a multi-label classifier, so it has a relatively long operation time. Denoted by the blue line, the running time of the algorithm in Srinivasan's work [21] increases rapidly with a rising load number. Represented by the green and orange lines, respectively, the GSP algorithm and the genetic algorithm optimization have more stable operational efficiency, but are overall slower than the algorithm in this paper.

Conclusions
Considering the accuracy and real-time performance of NILM during actual operation, this paper studies an effective identification method based on the convolutional neural network. Under the high-frequency data acquisition mode, this paper adopts the load separation model to obtain the current and voltage waveforms of independent loads and records the corresponding label information using the Bayesian classification model. Then, the convolutional neural network model is briefly trained with the two-dimensional load data in the library to form a classification model suitable for each signature library, realizing long-term load identification in real-time.
In this paper, the two-dimensional image load data are used for identification. This type of data can preserve the contour and shape signature of a waveform completely and avoid the complexity of data reconstruction as in signature extraction and classification. Furthermore, the contour and shape signatures extracted by the convolutional neural network reduce the influence of noise or harmonics on the identification results. The measured data are used to verify the algorithm proposed in our work. The method performs better than the other compared algorithms. With the increase in load categories and number of users, the advantages of the proposed algorithm are clear. The overall accuracy is higher than 92% and the operation time is less than 1.25 s. Thus, the proposed method can identify the switching load effectively using the convolutional neural network and the obtained corresponding power consumption of each load can be calculated accurately. The whole process provides a complete implementation idea for NILM, which can be automatically executed without intervention.
In future work, dynamic loads with various transients should be considered. Additionally, the influence of load phase information requires further research, and the resolution and accuracy required for measurement may be another research point.

Conflicts of Interest:
The authors declare no conflict of interest.