Smart Non-Intrusive Appliance Load-Monitoring System Based on Phase Diagram Analysis

Abstract: Much of today’s power grid was designed and built using technologies and organizational principles developed decades ago. Scarce energy resources and the limitations of classic power networks are the main drivers behind the development of the smart grid, which aims to use energy resources efficiently while operating stably and safely. In such a network, one of the fundamental priorities is non-intrusive appliance load monitoring (NIALM), which analyzes, recognizes and determines the electricity consumption of each consumer. In this paper, we propose a new smart system approach for the characterization of the appliance load signature based on a data-driven method, namely the phase diagram. Our aim is to use the non-intrusive load monitoring of appliances in order to recognize different types of consumers that can exist within a smart building.


Introduction
Power grid systems have grown from simple, localized networks to large, physically widespread networks. Despite its importance to modern society, the energy sector has been slower than other industries to adapt to digital technology due to its size and need for high system availability. Due to the need for more efficiency, digital technology is becoming more prevalent within these systems [1].
Public awareness of the energy sector has grown rapidly in the last few years. Many countries are making efforts to develop a smart grid to efficiently use energy resources, with stable operation of the power grid. In the implementation of the smart grid, demand response plays a key role because, based on this concept, customers can schedule the energy consumption and operation of their appliances to reduce their electricity bill [2]. The introduction of smart meters as part of smart grids offers many opportunities for understanding energy consumption patterns [3].
Nowadays, household appliances account for a large share of residential and building energy consumption [4,5]. With the help of smart meters, knowing the exact time an appliance is used can be useful in a smart system to optimize energy consumption. In a home, there are many appliances, each of which has its own energy consumption and operating characteristics. By understanding the consumption patterns of individual loads, energy demand management strategies can be implemented [6].
By having access to the energy consumed by each individual appliance, both the electric utility and the consumer can better manage their energy resources. This aspect promoted considerable interest in non-intrusive appliance load-monitoring (NIALM) research. NIALM is a technique for analyzing energy consumption data monitored from a single-point source such as a smart meter [7,8]. The main purpose of NIALM is to identify the specific signature of each appliance, which is useful from the perspective of energy demand management. Many NIALM approaches have been proposed so far, and most of them rely on signal-processing and machine-learning (ML) techniques to extract individual load features from the signature of the analyzed appliance [9,10].
Refs. [11,12] discussed the use of conventional electrical parameters such as the real power, reactive power and current waveform for appliance recognition. It was concluded that this approach has the drawback that low-power consumer appliances have similar power consumption characteristics, making the recognition task very challenging.
The voltage-current (V-I) plot, obtained by plotting the one-cycle steady-state voltage and current, is another conventional method of handling the recognition process of appliances [13][14][15]. In ref. [15], the authors used the V-I information on the plug load appliance identification dataset (PLAID) [16], consisting of current values and voltage values measured from 11 different electrical appliances, to train a CNN. Because the V-I information had a high degree of similarity, the resulting F-score was only 78.16%. Improving on this approach, the authors of [13,14] used the binary trajectory mapping of the V-I plots in order to transfer the trajectory information to a matrix with little sacrifice in terms of computing complexity. Thus, for the PLAID dataset, they managed to obtain an accuracy of 97.5% [13] and over 93% for a private database [14].
Besides these conventional approaches, different analysis tools, such as the short-time Fourier transform (STFT) [17,18], wavelet transform (WT) [19,20] and statistical approach [21], have recently been used to obtain the appliance features. In [17], the approach based on STFT was combined with harmonic power and the SVM algorithm to classify data from 11 different loads, obtaining accuracy results above 80% after a training step of several minutes, without any additional information. However, this approach is limited to a small number of consumers and it can induce problems when their frequency is similar. A different approach, based on a customized deep neural network in the time-frequency domain in order to effectively extract the flexible portion of loads, was discussed in [18]. This method of analysis for load disaggregation to extract flexible loads offered satisfactory results for numerical simulations of residential and commercial buildings. The computational time is medium, approximately 15 min for the training part and 7 s for the testing part.
The authors of [19] applied a multiscale wavelet packet tree to collect comprehensive energy consumption features. The proposed non-intrusive load-monitoring system accomplishes very satisfying appliance recognition performances in terms of accuracy, where up to 29 domestic appliances are well identified, with an average accuracy of 97.87% in the case of the GREEND dataset [22]. A limitation of the proposed system is the identification of unknown appliances that do not belong to any class in the reference dataset. Another wavelet-based approach is used in [20] for testing a system composed of two dynamic and two static three-phase loads. The features are extracted by the wavelet transform, with several types of wavelet being analyzed. The obtained results showed that higher order wavelets result in higher accuracy. The Haar wavelet was found to be the least accurate at 75%, and the Daubechies wavelets increased in accuracy with an increase in order, as the accuracies were found to be 79.60%, 84.56%, 96.51% and 99.26% for db2, db3, db4 and db5, respectively. However, the high number of possible wavelets to be used in the analysis represents a major drawback.
A comparative approach from the perspective of machine-learning classification algorithms is presented in [21]. The experiments were carried out using the COOLL public NIALM dataset [23], consisting of 42 devices of different brands and power ratings. The authors used the vectors from the current envelopes to extract seven statistical features: interquartile range, crest factor, variance, kurtosis, mean absolute deviation, skewness and form factor. They used the following classification algorithms: Naive Bayes (NB), multi-class support vector machine (SVM), ensemble, discriminant analysis (DA), binary decision tree (DT) and k-nearest neighbors (kNN). The worst results were obtained using DA and SVM, with 80.75% and 80.95% accuracy, and the best result was obtained with kNN, 98.41%.
The current development trend within NIALM is focused on two perspectives. The first is a scientific perspective, based on ways to optimize the results with artificial intelligence elements. For example, in [24], in order to ensure better energy efficiency in buildings, the authors used the autoregressive moving average with eXternal inputs (ARMAX) model to predict the energy consumption. This approach offers promising results from the perspective of the management, control, metering and billing system of consumption. The other perspective is the commercial one, where the implementation of low-cost monitoring devices based on simple processing is attempted. For example, in [25], the authors proposed a versatile monitoring device (DAQ) for NILM consisting of four voltage and four current inputs of up to 1700 V and 100 A, respectively, with frequencies of up to 64 kHz, based on a Raspberry Pi board. For the analysis part, this system uses simple characteristics of the recorded signals, such as the average values of V_RMS, I_RMS and power factor. The results obtained are promising for the three targeted environments: residential, commercial and industrial.
Based on these considerations, we conclude that the accuracy result depends on the degree of separation of the features, the classification algorithm and the type of appliance. Therefore, in this work, we propose a data-driven analysis method for extracting features, namely the phase diagram analysis [26,27]. This approach allows highlighting of the particularities of each appliance by transposing the analyzed current signal into a new representation space, the phase diagram. This approach proves to be very useful in the analysis of energy transport and distribution networks, especially in the detection and classification of faults or external signals that appear in the power cables [28], as well as for the location of partial discharge sources [29].
This paper is structured as follows. In Section 2, the theoretical apparatus of the proposed approach is presented. Section 3 presents the experiment carried out and the results obtained using the phase diagram, which highlight the interest of this approach. Section 4 discusses the advantages and limitations of the proposed approach, and Section 5 presents the conclusions and perspectives of this work.

Signal-Processing Methods
In this section, we detail the theoretical aspects of the proposed smart system based on the phase diagram approach.

The Phase Diagram Analysis
The phase diagram is a method initially introduced for non-linear data analysis [30] in order to characterize a dynamical system where certain non-linear properties can indicate changes in system behavior. Analysis in the time domain involves finding statistical information related to signal parameters, the frequency domain requires the Fourier transform of the signal, while analysis in the phase diagram requires embedding of the data in a multidimensional space.
The strategy for the phase diagram design is based on moving from the initial values of the time series, as defined in Equation (1), to a vector that defines the new representation space, as shown in Equation (2):

$$x = \{x[1], x[2], \ldots, x[N]\} \tag{1}$$

$$\vec{s}_i = \sum_{k=1}^{m} x[i + (k-1)d]\,\vec{e}_k, \quad i = 1, 2, \ldots, M \tag{2}$$

In the definition of the phase diagram vectors, $x[i]$ are the time series samples, $m$ is the embedding dimension, $d$ is the delay between the samples, $N$ is the length of the signal expressed as a time series, $\vec{e}_k$ are the axis unit vectors and $M = N - (m-1)d$. Usually, the delay is computed using the mutual information method [31] or the multi-lag phase-space analysis [32]. The embedding dimension is chosen using the false nearest neighbor method [31]. An example of transposing the time series to the phase diagram is shown in Figure 1.

The quantification of the information from the phase diagram domain is a current research topic [33]. One of the most used methods is the recurrence plot analysis (RPA), which is defined as follows. From the representation obtained using (2), we quantify the distances between the vectors in the phase diagram, using the distance matrix (DM), denoted in Equation (3) with $D_{i,j}$:

$$D_{i,j} = D(\vec{s}_i, \vec{s}_j), \quad i, j = 1, 2, \ldots, M \tag{3}$$

where $D(\cdot,\cdot)$ is a distance applied on the phase diagram vectors (Euclidean distance, L1 norm, angular distance, etc. [34]). A representation of this transformation is shown in Figure 2, where the colors of the new transformation represent the magnitude of the distances between the phase diagram vectors.

When the DM is compared with a threshold, the recurrence matrix (RM) is obtained as:

$$R_{i,j} = \Theta\left(\varepsilon(i) - D_{i,j}\right) \tag{4}$$

where $R_{i,j}$ is the recurrence matrix, $\Theta(\cdot)$ is the Heaviside function and $\varepsilon(i)$ is the threshold considered for recurrence. The term recurrence means that the system under study returns to a state previously visited. A representation of this transformation is shown in Figure 3, where after comparing with the threshold we get only two states, represented by black and white.

This recurrence approach represents an interesting way of characterizing the appliance, as shown in ref. [35], where it was highlighted that different consumers have different recurrence matrices. What we propose to do next is to generalize the quantification of the information from the phase diagram. Identifying a recurrence is not the final goal we can achieve using this approach. We aim to quantify the entire distribution of the vectors, which gives access to much more information.
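The embedding, distance matrix and recurrence matrix described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the Euclidean distance is only one of the admissible choices, the threshold is taken as a constant instead of the more general ε(i), and the Heaviside function is assumed to return 1 at zero.

```python
import math

def phase_diagram(x, m=2, d=2):
    """Embed the series x into m-dimensional vectors with delay d."""
    M = len(x) - (m - 1) * d              # number of phase diagram vectors
    if M <= 0:
        raise ValueError("signal too short for the chosen m and d")
    # Component k of vector i is the sample x[i + k*d] (0-based indices).
    return [tuple(x[i + k * d] for k in range(m)) for i in range(M)]

def distance_matrix(vectors):
    """Pairwise Euclidean distances between phase diagram vectors."""
    return [[math.dist(u, v) for v in vectors] for u in vectors]

def recurrence_matrix(dist, eps):
    """1 where the distance does not exceed eps, 0 otherwise (Heaviside step)."""
    return [[1 if eps - dij >= 0 else 0 for dij in row] for row in dist]

# Illustrative input: one cycle of a unit 50 Hz sinusoid sampled at 5.7 kHz.
x = [math.sin(2 * math.pi * 50 * n / 5700) for n in range(114)]
vectors = phase_diagram(x, m=2, d=2)
D = distance_matrix(vectors)
R = recurrence_matrix(D, eps=0.1)         # eps = 0.1 is an arbitrary example value
```

With N = 114 samples, m = 2 and d = 2, the sketch produces M = 112 vectors, in agreement with M = N − (m − 1)d.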

Phase Diagram Metrics Design for Non-Intrusive Load-Monitoring of Appliances
The phase diagram representation has very high potential for quantifying the information about the analyzed signal, as shown in [36], because each load signature has a unique shape in the phase diagram. To illustrate this, we describe three phase diagram metrics used in our analysis for separation and identification purposes. In their definition, we consider a two-dimensional phase diagram representation of a time series, obtained for d = 2 and m = 2, as shown in Figure 4.

The first phase diagram feature that we consider is the phase diagram area (PDA), as shown in Figure 5a. As can be seen, the phase diagram can be inscribed in an ellipse. We can determine the surface that includes the signal representation in the phase diagram with the help of the two semi-axes of the ellipse, using Equation (5):

$$PDA = \pi a b \tag{5}$$

where a is the major semi-axis and b is the minor semi-axis of the ellipse. An important parameter of this feature is provided by the eccentricity e of the ellipse, as defined by Equation (6):

$$e = \sqrt{1 - \frac{b^2}{a^2}} \tag{6}$$
For this feature to provide a suitable result, the eccentricity of the ellipse must be chosen so that the ellipse best inscribes the phase diagram representation.This feature also shows us the degree of scattering of the vectors that make up the representation along the two semi-axes of the ellipse.
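As a rough illustration of the PDA feature, the following Python sketch computes the ellipse area and eccentricity from the two semi-axes. The text does not specify how the ellipse is fitted, so the helper `semi_axes_45deg` is an assumption: it presumes a zero-mean, two-dimensional phase diagram whose enclosing ellipse is aligned with the two diagonals, and takes the largest extents along them as semi-axes. All function names are hypothetical.

```python
import math

def semi_axes_45deg(points):
    """Estimate (a, b) from the extents along the two diagonals (assumed fit)."""
    r = math.sqrt(2.0)
    u = max(abs(x + y) / r for x, y in points)   # extent along the main diagonal
    v = max(abs(y - x) / r for x, y in points)   # extent along the anti-diagonal
    return max(u, v), min(u, v)                  # a >= b by construction

def phase_diagram_area(a, b):
    """Area of an ellipse with semi-axes a and b."""
    return math.pi * a * b

def eccentricity(a, b):
    """Eccentricity of an ellipse with major semi-axis a and minor semi-axis b."""
    if a <= 0:
        raise ValueError("major semi-axis must be positive")
    return math.sqrt(1.0 - (b * b) / (a * a))

# Illustrative input: one cycle of a unit 50 Hz sinusoid sampled at 5.7 kHz, d = 2.
w = 2 * math.pi * 50 / 5700
x = [math.sin(w * n) for n in range(114)]
points = [(x[i], x[i + 2]) for i in range(len(x) - 2)]
a, b = semi_axes_45deg(points)
pda = phase_diagram_area(a, b)
e = eccentricity(a, b)
```

For a pure sinusoid and a small delay, the phase diagram is a thin ellipse stretched along the main diagonal, so the eccentricity is close to 1 and the area is small; distorted current waveforms would move both values.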
The second phase diagram feature is the angular mean (AM). Two successive vectors of the representation in the phase diagram create an angle. By summing all the angular values in the representation and dividing by their number, we can determine the angular mean value of the entire phase diagram. Considering two successive vectors, $\vec{s}_i$ and $\vec{s}_{i+1}$, the angle $\alpha_i$ created by these two vectors can be determined with Equation (7):

$$\alpha_i = \arccos\left(\frac{\vec{s}_i \cdot \vec{s}_{i+1}}{\|\vec{s}_i\|\,\|\vec{s}_{i+1}\|}\right) \tag{7}$$

The angular mean (AM) quantification is shown in Figure 5b and can be determined with Equation (8), where M is the number of the representation points, which give M − 1 successive pairs:

$$AM = \frac{1}{M-1}\sum_{i=1}^{M-1} \alpha_i \tag{8}$$
This feature provides us with information about the trajectory of the representation in the phase diagram.Also, based on the angle's values, we can highlight the moments when the trajectory will pass through a previous point.
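The angular mean can be sketched as follows. This is an illustrative reading of the definition above, not the authors' code: the angle between two successive vectors is obtained from their normalized dot product, the cosine is clamped to [-1, 1] to absorb rounding errors, and pairs containing a zero vector are skipped because their angle is undefined (an assumption of this sketch). Function names are hypothetical.

```python
import math

def successive_angles(points):
    """Angles (in radians) between each pair of successive vectors."""
    angles = []
    for p, q in zip(points, points[1:]):
        norm = math.hypot(*p) * math.hypot(*q)
        if norm == 0.0:
            continue                      # undefined angle, skipped by assumption
        dot = sum(a * b for a, b in zip(p, q))
        cos_alpha = max(-1.0, min(1.0, dot / norm))
        angles.append(math.acos(cos_alpha))
    return angles

def angular_mean(points):
    """Mean of the angles between successive phase diagram vectors."""
    angles = successive_angles(points)
    if not angles:
        raise ValueError("at least two non-zero vectors are required")
    return sum(angles) / len(angles)

# Sanity check: points rotating on the unit circle in steps of 0.1 rad.
circle = [(math.cos(0.1 * k), math.sin(0.1 * k)) for k in range(20)]
am = angular_mean(circle)
```

On this synthetic circle every successive angle equals the rotation step, so the angular mean recovers 0.1 rad up to rounding.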
The third phase diagram feature is the spatial point distribution (SPD). The points of the phase diagram are generally concentrated in particular parts of the phase diagram domain. For instance, in the case of the two-dimensional representation shown in Figure 5c, the points are distributed in quadrants Q1 and Q3. The SPD parameter is defined to quantify this property.

More information about the phase diagram metrics can be found in [36]. All these previously discussed aspects are implemented in the mathematical algorithm used by the smart system for acquiring and processing the extracted data. For data classification, we used ML classifiers, namely support vector machine (SVM), k-nearest neighbors (kNN) and Naive Bayes (NB). A diagram of the steps in the algorithm is shown in Figure 6.
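To make the quadrant observation behind the SPD feature concrete, the sketch below counts the fraction of two-dimensional phase diagram points that fall in each quadrant. It is only an illustration of the underlying idea and is not the SPD formula, which is defined in [36]; the function name and the convention of assigning points on the axes to the non-negative side are assumptions.

```python
import math

def quadrant_fractions(points):
    """Fraction of 2-D points in each quadrant (axes count as non-negative)."""
    counts = {"Q1": 0, "Q2": 0, "Q3": 0, "Q4": 0}
    for x, y in points:
        if x >= 0 and y >= 0:
            counts["Q1"] += 1
        elif x < 0 and y >= 0:
            counts["Q2"] += 1
        elif x < 0 and y < 0:
            counts["Q3"] += 1
        else:
            counts["Q4"] += 1
    total = len(points)
    return {q: c / total for q, c in counts.items()}

# Illustrative input: one cycle of a unit 50 Hz sinusoid sampled at 5.7 kHz, d = 2.
x = [math.sin(2 * math.pi * 50 * n / 5700) for n in range(114)]
points = [(x[i], x[i + 2]) for i in range(len(x) - 2)]
fractions = quadrant_fractions(points)
```

For this sinusoidal example nearly all points land in Q1 and Q3, which matches the behavior described for Figure 5c.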

Experimental Configuration and Results
A block diagram of the designed experimental platform is shown in Figure 7. The experimental setup consists of the following components: four types of home appliances connected to the electrical network of the house (laptop, heater, LED light bulb and blender), current and voltage transducers used for the acquisition of the electrical signals, signal conditioners, a data acquisition board (Arduino Uno R3, working at 5 V) and a personal computer (PC). The non-invasive current sensor SCT-013 (YHDC, Beijing, China), connected directly to the current wire for which the measurement is desired without the need to make any modification, has an input current of 0~30 A AC, an output voltage of 0~1 V and a working frequency of 50 Hz~1 kHz. The analog signals from the outputs of the transducers, connected to the domestic AC power line (220 V/50 Hz), are conditioned so as to be suitable for a microcontroller-based DAQ board. In this phase, analog signals are converted into digital form and these data are transmitted serially to the PC, via USB, for the processing part. After the measurements, we obtained a diverse database.
The data were recorded at 5700 samples per second, which corresponds to a sampling frequency of 5.7 kHz. The measurement time of each acquisition was 10 ms. A representation of the system used for the acquisition is shown in Figure 8.
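A short worked calculation shows what these acquisition settings imply. The sketch takes the stated figures at face value and assumes that one 10 ms acquisition is embedded on its own with d = 2 and m = 2; the variable names are illustrative.

```python
fs = 5700            # sampling frequency in samples per second (from the text)
t_acq = 10e-3        # duration of one acquisition in seconds (from the text)
f_mains = 50         # mains frequency in Hz (from the text)
m, d = 2, 2          # embedding dimension and delay used for the phase diagrams

n_samples = round(fs * t_acq)            # samples in one acquisition: 57
samples_per_cycle = fs / f_mains         # samples in one mains cycle: 114.0
cycles = n_samples / samples_per_cycle   # fraction of a mains cycle covered: 0.5
M = n_samples - (m - 1) * d              # phase diagram vectors per acquisition: 55
```

Under these assumptions, each acquisition spans half a mains cycle and yields 55 phase diagram vectors.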
An example of the appliance signature of each consumer is shown in Figure 9.
As can be seen, each signature is different. The differences can be observed very easily when we move from the representation of the signatures in the time domain to the phase diagram domain. Such an example is shown in Figure 10, which displays the phase diagram of the four signals for d = 2 and m = 2. In our experiment, we compute the delay using the mutual information method and the embedding dimension using the false nearest neighbor method.
Based on these phase diagrams, we determine the three metrics derived from the phase diagram for a dataset consisting of 50 signals from each appliance. A representation of their values is shown in the boxplots in Figure 11. Analyzing the boxplots, it can be seen that these metrics offer high potential for discrimination between consumers. Figure 12 emphasizes this by displaying the degree of separation of the appliances according to the three phase diagram metrics.
Smart Cities 2024, 7, FOR PEER REVIEW 10

In the features space, the four consumers are well separated, so this approach to identifying the appliance signature provides very suitable results. Besides this approach, which offers extremely promising results for appliance classification, another potential point of interest for quantification is provided by the recurrence matrix. In Figure 13, the recurrence matrix is displayed for each appliance. Each representation is unique and can provide important information about the analyzed signal.
Thus, a statistical quantification of these two-dimensional representations could provide another set of valuable features for machine-learning classification algorithms.
After recording a sufficient number of signals, we checked the accuracy of the proposed method. From the database obtained, we used 350 signals specific to each consumer for the training part and 150 signals for each consumer for the testing part using ML algorithms. Figure 14 shows the confusion matrices for the three ML classifiers using the three phase diagram features.
As can be seen, we obtain the maximum level of accuracy. This was expected, considering the degree of separation between the features observed in Figure 11. To validate the capabilities of the proposed method, we compare the approach based on the phase diagram with two other approaches identified in the specialized literature. We sought to identify similar approaches based on similar classification principles.
In ref. [20], four types of loads were classified based on the features extracted using the wavelet transform, and for Daubechies order 4, an accuracy of 96.51% was obtained. Because the number of analyzed signal classes is identical to ours, a comparison with this approach represents an important point of interest. In ref. [21], 42 types of signals were classified after extracting seven statistical features: interquartile range, crest factor, variance, kurtosis, mean absolute deviation, skewness and form factor. The classification was also performed from the perspective of a comparison of ML algorithms, such as SVM, NB and kNN, a comparison that is also present in our approach. Among the classifiers that we also used, the worst result was obtained using SVM, with 80.95% accuracy, and the best result was obtained with kNN, with 98.41% accuracy. Given these common elements, we consider that the comparison with this method represents a good point of interest.
We applied the two methods described in refs. [20,21] to the signals recorded during the experiment and obtained the results in Figure 15, presented as confusion matrices for the three ML classifiers. As can be seen, these methods misclassify some signals. For example, using the SVM classifier, we see that of the 150 blender signals, 17 signals are misclassified as LED light bulb signals and 16 signals are misclassified as laptop signals. Figure 16 shows the accuracy results of the three approaches. For the wavelet-based approach, we obtain a minimum accuracy with the NB classifier at 83.6% and a maximum with the SVM classifier at 94.5%. For the approach based on statistical elements, we obtain a minimum of 93.1% with the NB classifier and a maximum of 98.1% with the SVM classifier. The results obtained with the kNN classifier are intermediate.
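The relation between a confusion matrix and the reported accuracy figures can be sketched as follows. The helper names are hypothetical, and the example row is an assumption built from the blender case quoted above: 150 test signals, 17 labeled as LED light bulb and 16 as laptop, with the remaining 117 assumed to be classified correctly and none assigned to the heater (the text does not give that count).

```python
def overall_accuracy(cm):
    """Share of correctly classified signals; cm[i][j] = true class i, predicted j."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def class_recall(row, true_index):
    """Share of one true class (one confusion matrix row) classified correctly."""
    return row[true_index] / sum(row)

# Assumed class order: laptop, heater, LED light bulb, blender.
blender_row = [16, 0, 17, 117]            # heater count of 0 is an assumption
blender_recall = class_recall(blender_row, true_index=3)   # 117 / 150
```

Under that assumption, 78% of the blender signals are recognized correctly for this classifier, which is consistent in spirit with the lower overall accuracies reported for the two reference methods.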

Discussion
Using the information presented in this paper, our goal is to implement a low-cost smart microcontroller system that individuals can use for various household appliance applications. This approach is suitable because it is based on simple mathematical functions that can be supported by a microcontroller with limited resources and performance constraints, unlike the methods based on artificial intelligence. The proposed low-cost system can be a centralized one. It collects raw data from various sensors installed on different appliances and sends these data to a central server for phase diagram analysis and load monitoring. This centralization simplifies data management and ensures consistent analysis.
The feature extraction method used in our approach captures information that leads to a high degree of separation between the signal types. In general, the choice of features significantly impacts the classification performance, and our selected features showed no limitations in capturing the necessary signal characteristics because each type of signal has a unique phase diagram representation. In this experiment, the proposed approach reached a very high accuracy, outperforming the two other approaches considered.
However, when the scope of the experiment is expanded, we can identify some limitations of the method. Our approach, while showing promising results, may not achieve consistently high performance across different classifiers and a very high number of signal types. Differences in the signal complexity, noise levels or class distributions could affect the performance of the feature extraction and classification methods.
A second possible limitation stems from the hardware component. When this approach is transposed into a commercial context, the system must cost as little as possible, so a compromise will have to be reached on the implementation aspects. The higher the sampling frequency, the better the resolution of the acquired signal and the more easily the signals can be separated. A careful study must be performed to find the most reliable implementation solution.
In a complex context, for example, the use of this device in a commercial building with many consumers, several optimization methods can be implemented. The first one is classifier optimization. We can conduct a more thorough evaluation of different classifiers, including hyperparameter tuning and ensemble methods, to identify the optimal classifier for the given feature set. The second one is dataset expansion and augmentation. We can expand the dataset to include a more diverse range of signals. This could improve the model's ability to generalize and perform well in real-world scenarios.
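A simple form of hyperparameter tuning can be sketched for the kNN classifier: the number of neighbors k is chosen by leave-one-out accuracy. The data below are synthetic stand-ins for a three-feature, four-class dataset and are not the measurements of this paper; the function names and the candidate values of k are likewise assumptions made only for illustration.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    dist = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    return np.bincount(nearest).argmax()

def loo_accuracy(x, y, k):
    """Leave-one-out accuracy of kNN for a given k."""
    hits = 0
    for i in range(len(x)):
        mask = np.arange(len(x)) != i      # hold out sample i
        hits += knn_predict(x[mask], y[mask], x[i], k) == y[i]
    return hits / len(x)

# Synthetic stand-in data: 4 well-separated classes, 10 samples each
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]], dtype=float)
x = np.vstack([c + rng.normal(scale=0.5, size=(10, 3)) for c in centers])
y = np.repeat(np.arange(4), 10)

# Evaluate a few candidate values of k and keep the best one
scores = {k: loo_accuracy(x, y, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
```

With classes as well separated as in this toy dataset, every candidate k classifies all samples correctly, so the choice of k only starts to matter once the classes overlap, which is the situation expected with a larger and more diverse set of appliances.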

Conclusions
This paper presents a new approach based on phase diagram analysis to characterize and classify the load signature of each home appliance connected to the domestic AC power line. Based on this classification, decisions regarding the type of appliance can be made.
The approach proposed in this work proved superior in the presented experiment when compared with two other approaches identified in the specialized literature. The obtained results show 100% accuracy, as validated by several ML classifiers.
A smart system for the non-intrusive load monitoring of appliances in residential buildings based on the phase diagram approach can offer very good results in terms of both accuracy and simplicity.
Our future work will focus on creating an automated global system through which each analysis of a domestic AC power line will allow the detection of each signature signal and the classification of their generating sources. A further research direction will be the extraction of the best combination of statistical features from the recurrence matrix.

In the phase diagram, m is the embedding dimension, d is the delay between the samples, N is the length of the signal expressed as a time series and e_k are the axis unit vectors.

Figure 1. Generation of the phase diagram from the time series.

The distance applied on the phase diagram vectors can be the Euclidean distance, the L1 norm, the angular distance, etc. [34]. A representation of this transformation is shown in Figure 2, where the colors of the new transformation represent the magnitude of the distances between the phase diagram vectors.

Figure 2. Generation of the distance matrix from the phase diagram for the Euclidean distance, m = 2 and d = 2.
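The two transformations illustrated in Figures 1 and 2 can be sketched as follows. The sketch assumes the usual time-delay embedding, in which each phase diagram vector collects m samples spaced d samples apart, and uses the Euclidean distance; the function names and the sine test signal are assumptions made only for this example.

```python
import numpy as np

def phase_diagram(x, m=2, d=2):
    """Time-delay embedding: row i is (x[i], x[i+d], ..., x[i+(m-1)d])."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (m - 1) * d
    if n_vectors <= 0:
        raise ValueError("signal too short for this m and d")
    return np.column_stack([x[k * d : k * d + n_vectors] for k in range(m)])

def distance_matrix(vectors):
    """Pairwise Euclidean distances between phase diagram vectors."""
    diff = vectors[:, None, :] - vectors[None, :, :]
    return np.linalg.norm(diff, axis=-1)

signal = np.sin(2 * np.pi * np.arange(100) / 25)   # illustrative test signal
vectors = phase_diagram(signal, m=2, d=2)           # shape (98, 2)
distances = distance_matrix(vectors)                # shape (98, 98)
```

The distance matrix is symmetric with a zero diagonal. Its memory footprint grows with the square of the number of vectors, which is worth keeping in mind for a microcontroller implementation.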

Figure 3. Generation of the recurrence matrix from the distance matrix for m = 2, d = 2 and ε = 0.1.
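The step from the distance matrix to the recurrence matrix can be sketched as a simple thresholding. The sketch assumes the common convention in which an entry is 1 when the corresponding distance does not exceed ε and 0 otherwise; the function name and the small hand-made distance matrix are illustrative assumptions.

```python
import numpy as np

def recurrence_matrix(distances, eps=0.1):
    """1 where two vectors are no farther apart than eps, else 0."""
    return (np.asarray(distances) <= eps).astype(int)

# Tiny hand-made distance matrix (illustrative values only)
distances = np.array([[0.00, 0.05, 0.40],
                      [0.05, 0.00, 0.20],
                      [0.40, 0.20, 0.00]])
recurrence = recurrence_matrix(distances, eps=0.1)
# recurrence == [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
```

Only the first two vectors are within ε = 0.1 of each other, so they form the single off-diagonal recurrence; the diagonal is always 1 because every vector is at zero distance from itself.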

Figure 4. The test signal and its phase diagram representation.

Figure 6. Processing diagram of the algorithm based on the phase diagram.

Figure 8. Experimental setup.

An example of the appliance signature of each consumer is shown in Figure 9. As can be seen, each signature is different. The differences can be very easily observed when we move from the representation of the signatures in the time domain to the phase diagram domain. Such an example is shown in Figure 10, which displays the phase diagram of the four signals for d = 2 and m = 2. In our experiment, we compute the delay using the mutual information method and the embedding dimension using the false nearest neighbor method.
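The delay selection by mutual information can be sketched as follows. This is a minimal illustration that estimates the mutual information with a two-dimensional histogram and returns the lag of its first local minimum, a common heuristic; the number of bins, the maximum lag, the function names and the noisy sine test signal are assumptions, and the false nearest neighbor step is not covered here.

```python
import numpy as np

def mutual_information(x, lag, bins=16):
    """Histogram estimate of the mutual information of x[t] and x[t+lag]."""
    a, b = x[:-lag], x[lag:]
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x[t]
    py = pxy.sum(axis=0, keepdims=True)   # marginal of x[t+lag]
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def first_minimum_delay(x, max_lag=50):
    """Return the lag of the first local minimum and the whole MI curve."""
    mi = [mutual_information(x, lag) for lag in range(1, max_lag + 1)]
    for i in range(1, len(mi) - 1):
        if mi[i] < mi[i - 1] and mi[i] <= mi[i + 1]:
            return i + 1, mi              # index i corresponds to lag i + 1
    return int(np.argmin(mi)) + 1, mi     # fallback: global minimum

# Illustrative signal: noisy sine with a period of 40 samples
rng = np.random.default_rng(0)
n = np.arange(2000)
signal = np.sin(2 * np.pi * n / 40) + 0.05 * rng.normal(size=n.size)
delay, mi_curve = first_minimum_delay(signal)
```

The histogram estimate is sensitive to the number of bins and to the signal length, so in practice the resulting delay should be checked visually against the mutual information curve.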

Figure 9. The signatures of the four home appliances.

Figure 10. The phase diagrams of the signatures of the four home appliances for m = 2 and d = 2.

Figure 11. The boxplots of the three phase diagram metrics.

Figure 12. The appliance separation using the three phase diagram metrics.

Figure 14. The confusion matrices using the three ML classifiers: (a) training part; and (b) testing part.

Smart Cities 2024, 7

Figure 15. The confusion matrices using the methods from the literature: (a) wavelet features; and (b) statistical features.

Figure 16. The accuracy obtained using the three approaches: phase diagram, wavelet transform and statistical features [20,21].