The core idea of this paper is to utilize features extracted by convolutional neural networks (CNNs), and then to eliminate the influence of background loads with a concatenate network. The interference to recognition is divided into two parts: voltage fluctuation and multiform background loads. When the voltage fluctuation is regarded as noise, the estimation result on the UK-DALE dataset shows that the signal-to-noise ratio (SNR) of voltage waveforms is 53 dB. For the measurement error, current and voltage waveforms have an SNR of 90 dB [9]. On the other hand, background loads may impose strong noise on the target appliance. For example, for a 1000 W appliance, the SNR is 10 dB when the background loads draw only 100 W, and 0 dB when they draw 1000 W. Clearly, the influence of background loads is much stronger than that of the voltage fluctuation. Therefore, the main purpose of this paper is to eliminate background loads.
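For concreteness, these two figures follow directly from the decibel definition of SNR, treating the target appliance's power as the signal and the background loads' power as the noise:

$$\mathrm{SNR} = 10\log_{10}\frac{P_\mathrm{target}}{P_\mathrm{background}}, \qquad 10\log_{10}\frac{1000}{100} = 10\ \mathrm{dB}, \qquad 10\log_{10}\frac{1000}{1000} = 0\ \mathrm{dB}.$$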
3.1. Problem Statement
Originally, the inspiration for this paper comes from the parallel circuit shown in Figure 1. Assume that the main circuit current is $i$, the branch currents are $i_1$ and $i_2$, respectively, and the background load and the target load are $Z_1$ and $Z_2$, respectively. When the switch K is turned off, we have the equation $i = i_1$, and $i_1$ is known. After the switch K is turned on, we have the equation $i = i_1' + i_2$ according to Kirchhoff's current law (KCL), and $i$ is known. In the ideal case, the effect of $Z_2$ on $Z_1$ and the fluctuation of the $i_1$ branch are ignored, thus $i_1' = i_1$, and further $i_2$ can be expressed by:

$$i_2 = i - i_1. \tag{1}$$
However, the experiments shown in Figure 2 prove that Equation (1) does not strictly hold. Figure 2a shows the microwave spectrogram without background loads, measured in the laboratory. Figure 2b shows the microwave spectrogram in the UK-DALE dataset, and Figure 2c shows the spectrogram calculated by spectral estimation based on the hypothesis of Equation (1). Although there is a strong correlation between Figure 2a and Figure 2c, spectral estimation does not eliminate the background load precisely: intermittent spectral lines remain before the appliance is turned on. Besides, compared with Figure 2a, some detail components are eliminated in Figure 2c after the appliance is turned on. These phenomena prove that the background load has a certain degree of stationarity, but this does not mean it is exactly unchanged, and $i_1'$ is not equal to $i_1$. Accordingly, the background load needs to be estimated more reasonably.
On this issue, this paper proposes a novel approach, the concatenate deep neural network, to estimate the feature of $i_2$ indirectly, which can be represented as:

$$\hat{X}_t = f_d\big(X_m,\, f_s(X_m, X_b)\big), \tag{2}$$

where $f_s$ is the function to extract the similar part of two features, $f_d$ is the function to extract the different part of two features, and $X$ with different subscripts denotes the feature of the spectrogram computed from the corresponding current (the subscripts $m$, $b$, and $t$ mark the mixed load, the background load, and the target load, respectively), which can be calculated by:

$$X = f_e\big(S(i)\big), \tag{3}$$

where $S(\cdot)$ is the spectrogram. The function $f_e$ is used to extract the features of the mixed load spectrogram and the background spectrogram.
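The $X = f_e(S(i))$ pipeline of Equation (3) can be illustrated with a short Python sketch; the sampling rate, STFT parameters, and random stand-in waveform below are placeholders, not the paper's settings:

```python
import numpy as np
from scipy import signal

fs = 16_000                          # assumed sampling rate of the current i(t)
i_t = np.random.randn(fs * 10)       # stand-in for a measured current waveform

# S(i): time-frequency spectrogram of the current
f, t, S = signal.spectrogram(i_t, fs=fs, nperseg=512, noverlap=256)
S_db = 10 * np.log10(S + 1e-12)      # log scale, the usual view of a spectrogram

# f_e would be a CNN that maps the spectrogram image S_db to a feature vector X.
```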
The inspiration for the functions $f_s$ and $f_d$ is borrowed from Code Division Multiple Access (CDMA), which allows multiple users to transmit independent information within the same bandwidth simultaneously; an orthogonal spreading code is used to distinguish and extract the signals of different users [31].
In the circuit model, the main circuit current $i$ is given by:

$$i = \sum_{k=1}^{K} i_k, \tag{4}$$

where $K$ is the number of branches. The branch current $i_k$ is expressed as:

$$i_k = I_k\, n_k(t), \tag{5}$$

where $I_k$ is the rated current of the load on the k-th branch, which is a constant, and $n_k(t)$ represents the time-varying noise function of the load. Since appliances are relatively independent in construction and operation, their noise functions are almost uncorrelated with each other and can act as the spreading code in CDMA. The i-th branch current is recovered by multiplying the main current by the corresponding noise function:

$$i\, n_i(t) = \sum_{k=1}^{K} I_k\, n_k(t)\, n_i(t). \tag{6}$$

As with the spreading code designed in CDMA, the noise functions are assumed to be orthogonal in the ideal case, where

$$\int_{T} n_k(t)\, n_i(t)\, \mathrm{d}t = \begin{cases} 1, & k = i, \\ 0, & k \neq i. \end{cases} \tag{7}$$

Therefore the branch current is restored as:

$$\int_{T} i\, n_i(t)\, \mathrm{d}t = \sum_{k=1}^{K} I_k \int_{T} n_k(t)\, n_i(t)\, \mathrm{d}t = I_i, \tag{8}$$

which, together with Equation (5), recovers $i_i = I_i\, n_i(t)$.
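As a toy illustration of this despreading idea (not the paper's code), the following NumPy sketch recovers one branch's rated current from the mixed main current using orthogonal codes; the Walsh-style codes and rated currents are invented for the example:

```python
import numpy as np

# Hypothetical orthogonal "noise functions" (Walsh-style codes), one per branch.
n = np.array([
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
], dtype=float)
I_rated = np.array([5.0, 2.0, 1.0])          # assumed rated currents I_k

i_main = (I_rated[:, None] * n).sum(axis=0)  # Equation (4): i = sum_k I_k n_k(t)

# Despread branch 1: multiply by its code and average over time
# (Equations (6)-(8)); orthogonality cancels the other branches.
I_1 = (i_main * n[1]).mean()
print(I_1)   # 2.0 -> the branch current i_1(t) = I_1 * n_1(t) is recovered
```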
Unfortunately, unlike CDMA, it is difficult to obtain the precise “spreading code” of each appliance in practice, and background loads usually consist of multiple loads. Therefore, network fitting is used to recover the branch current: the first step is to extract the spreading code, and the second step is to reconstruct the feature of the branch current. The simplified form of Equation (4) is then:

$$i = i_b + i_t, \tag{9}$$

where $i_b$ and $i_t$ represent the branch currents of the background loads and of the load to be recognized (i.e., the target load), respectively, and their noise functions $n_b(t)$ and $n_t(t)$ are weakly correlated. In practice, it is easy to obtain the previous background loads $i_{b'}$ before the target load is turned on. On the hypothesis that background loads are stationary over a short time, the relationship between the noise functions $n_{b'}(t)$, $n_b(t)$, and $n_t(t)$ is stated as:

$$n_{b'}(t) \approx n_b(t), \qquad \int_{T} n_b(t)\, n_t(t)\, \mathrm{d}t \approx 0. \tag{10}$$
In fact, such an estimation is not rigorous, because stationarity does not mean complete equality, and weak correlation does not mean strict independence. Thus the similarity learning module $f_s$ is equipped to fit Equation (10), and the feature of the background loads is obtained by:

$$\hat{X}_b = f_s(X_m, X_{b'}). \tag{11}$$

For the branch current, $i_t$ can be calculated by subtraction through Equation (9). For the corresponding feature, the difference learning module $f_d$ is used to fit the subtraction operation and obtain the feature of $i_t$ by:

$$\hat{X}_t = f_d(X_m, \hat{X}_b). \tag{12}$$
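The whole estimation chain of Equations (2), (11), and (12) can be sketched compactly in Python, treating the modules as opaque callables (their concrete layers are specified in Section 3.2):

```python
def estimate_target_feature(f_e, f_s, f_d, spec_mixed, spec_background):
    """Equations (2), (11), (12): estimate the target feature X_t_hat."""
    x_m = f_e(spec_mixed)            # feature of the mixed-load spectrogram
    x_b_prev = f_e(spec_background)  # feature of the explicit (previous) background
    x_b_hat = f_s(x_m, x_b_prev)     # Equation (11): implicit background feature
    return f_d(x_m, x_b_hat)         # Equation (12): target feature
```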
In summary, the model consists of an embedding module, a similarity learning module, a difference learning module and a classifier, which realizes the complete process of feature extraction, feature selection, and classification.
3.2. Model Architecture
Based on the previous hypothesis, each spectrogram image is split into two blocks representing the background load and the mixed load, as shown in Figure 3. The two blocks are input into the network simultaneously, as shown in Figure 4.
The CNNs in the embedding module are placed at the front end of the network to convert the image matrix into a vector; this is $f_e$ in Equation (3), and here the module comprises two networks with the same structure and different parameters. The similarity learning module $f_s$ is used to generate the similar part of the concatenated feature and obtain the background feature hidden behind the mixed feature; we refer to this as the “implicit background”, distinct from the “explicit background” extracted from the background-only load. The concatenation is channel-wise. The difference learning module $f_d$ converts the features of the mixed load and the “implicit background” into the target feature $\hat{X}_t$. The final classifier determines the label of the target load from the target feature maps. The loss function is the cross-entropy function:

$$L = -\sum_{i=1}^{C} y_i \log z_i, \tag{13}$$
where $y_i$ is the i-th bit of the one-hot label $y$ of $C$ classes, and $z_i$ is the i-th element of the network output $Z$ with the softmax activation, which can be represented as:

$$Z = \mathrm{softmax}\big(f_c(\hat{X}_t)\big), \tag{14}$$

where $f_c$ is the classifier that maps the obtained target spectrogram feature to a C-dimensional vector, and softmax is the activation.
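In PyTorch terms, Equations (13) and (14) amount to a small fully connected classifier followed by softmax cross-entropy; `nn.CrossEntropyLoss` fuses the softmax of Equation (14) with the log-loss of Equation (13), so the sketch below feeds it raw logits. The 64- and C-dimensional FC layers follow Section 3.2; the 256-dimensional input, batch size, and feature values are placeholders:

```python
import torch
import torch.nn as nn

C = 10                              # number of appliance classes (placeholder)
classifier = nn.Sequential(         # f_c in Equation (14)
    nn.Linear(256, 64),
    nn.ReLU(),
    nn.Linear(64, C),               # C-dimensional logits Z (pre-softmax)
)

x_t_hat = torch.randn(8, 256)       # a batch of estimated target features
logits = classifier(x_t_hat)
labels = torch.randint(0, C, (8,))  # class indices instead of one-hot y

# Equation (13): cross-entropy; softmax is applied inside the loss.
loss = nn.CrossEntropyLoss()(logits, labels)
```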
Figure 5 presents the detailed architecture of the proposed network with the embedding module omitted. The similarity learning module resembles a residual block, even though it is not identical to one; the shortcut connection here aims to convey information about the mixed feature. Each convolutional layer (Conv) comprises a 256-filter convolution with stride 1, whose kernel size is determined by the embedding vector computed by the CNN. In all experiments, the background feature and the mixed feature are embedding vectors of the same fixed size. The similarity learning module is followed by an average-pooling layer (AvgPool) that compresses the feature map in height and width. The output size of the two fully connected (FC) layers in the difference learning module is 256. For the classifier, the two FC layers are 64- and C-dimensional, respectively.
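The following PyTorch sketch assembles the Figure 5 pieces described above: channel-wise concatenation, a residual-style similarity block whose shortcut carries the mixed feature, average pooling, and the 256-dimensional FC layers of the difference module. Kernel and pooling sizes are placeholders, since the text ties them to the embedding shape rather than fixing them here:

```python
import torch
import torch.nn as nn

class SimilarityBlock(nn.Module):
    """Residual-style similarity learning module (sketch of Figure 5)."""
    def __init__(self, ch: int, k: int = 3, pool: int = 2):
        super().__init__()
        # Conv layers: 256 filters, stride 1; kernel size k is a placeholder.
        self.conv1 = nn.Conv2d(2 * ch, 256, kernel_size=k, stride=1, padding=k // 2)
        self.conv2 = nn.Conv2d(256, ch, kernel_size=k, stride=1, padding=k // 2)
        self.relu = nn.ReLU()
        self.pool = nn.AvgPool2d(pool)   # compress the map in height and width

    def forward(self, x_m, x_b):
        concat = torch.cat([x_m, x_b], dim=1)   # channel-wise concatenation
        out = self.conv2(self.relu(self.conv1(concat)))
        # Shortcut: convey the mixed feature past the convolutions.
        return self.pool(self.relu(out + x_m))

# Difference learning module: two FC layers with 256-dimensional outputs.
difference = nn.Sequential(
    nn.Flatten(),
    nn.LazyLinear(256), nn.ReLU(),   # input size depends on the pooled map
    nn.Linear(256, 256),
)
```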