# A Dedicated Mixture Model for Clustering Smart Meter Data: Identification and Analysis of Electricity Consumption Behaviors

## Abstract

## 1. Introduction

- We aim to cluster consumers into a reduced set of groups based on their electricity smart meter data. Our model automatically considers the day type (weekday, Saturday or Sunday), thus providing three typical consumption patterns for each cluster: one for each day type.
- We cross-reference the clustering results with the socio-economic information collected on these consumers in the CER survey. This post-analysis may offer insights into the relationships between the socio-economic characteristics of consumers and their electricity consumption.
- We investigate the variability of consumer behavior over time by analyzing how the clustering results change from one month to another.

## 2. Related Work

## 3. Data and Preprocessing

## 4. A Constrained Mixture Model-Based Clustering Approach

#### 4.1. Generative Model

#### 4.2. Maximum Likelihood Estimation via the EM Algorithm

- Expectation (E step), which consists in evaluating the expectation of the complete log-likelihood conditionally on the observed data $({\mathbf{x}}_{1},\dots ,{\mathbf{x}}_{N})$ and the current parameter estimate ${\mathsf{\Theta}}^{(q)}$. This quantity is given by:
  $$\begin{aligned} Q(\mathsf{\Theta},{\mathsf{\Theta}}^{(q)}) &= E\left[{L}_{c}(\mathsf{\Theta}) \mid {\mathbf{x}}_{1},\dots ,{\mathbf{x}}_{N},{\mathsf{\Theta}}^{(q)}\right] \\ &= \sum_{i,k}{\tau}_{ik}^{(q)}\log\left({\pi}_{k}\prod_{d}\mathcal{N}\left({\mathit{x}}_{id};\sum_{l}{\delta}_{dl}{\mu}_{kl},\sum_{l}{\delta}_{dl}{\mathsf{\Sigma}}_{kl}\right)\right), \end{aligned}$$
  where ${\tau}_{ik}^{(q)}$ is the posterior probability that consumer $i$ belongs to cluster $k$ under the current parameters:
  $${\tau}_{ik}^{(q)}=\frac{{\pi}_{k}^{(q)}\prod_{d}\mathcal{N}\left({\mathit{x}}_{id};\sum_{l}{\delta}_{dl}{\mu}_{kl}^{(q)},\sum_{l}{\delta}_{dl}{\mathsf{\Sigma}}_{kl}^{(q)}\right)}{\sum_{k'}{\pi}_{k'}^{(q)}\prod_{d}\mathcal{N}\left({\mathit{x}}_{id};\sum_{l}{\delta}_{dl}{\mu}_{k'l}^{(q)},\sum_{l}{\delta}_{dl}{\mathsf{\Sigma}}_{k'l}^{(q)}\right)}.$$
- Maximization (M step), which consists in maximizing the expectation $Q$ with respect to $\mathsf{\Theta}$. This maximization leads to the following update formulas:
  $${\pi}_{k}^{(q+1)}=\frac{1}{N}\sum_{i}{\tau}_{ik}^{(q)},$$
  $${\mu}_{kl}^{(q+1)}=\frac{\sum_{i,d}{\tau}_{ik}^{(q)}{\delta}_{dl}\,{\mathit{x}}_{id}}{\sum_{i,d}{\tau}_{ik}^{(q)}{\delta}_{dl}},$$
  $${\mathsf{\Sigma}}_{kl}^{(q+1)}=\frac{\sum_{i,d}{\tau}_{ik}^{(q)}{\delta}_{dl}\left({\mathit{x}}_{id}-{\mu}_{kl}^{(q+1)}\right){\left({\mathit{x}}_{id}-{\mu}_{kl}^{(q+1)}\right)}^{T}}{\sum_{i,d}{\tau}_{ik}^{(q)}{\delta}_{dl}}.$$
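The two steps above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names, the array layout `X[i, d, :]` (consumer, day, time of day), the integer `day_type` vector standing in for ${\delta}_{dl}$, the random soft initialization, the fixed iteration count and the small `ridge` term added to each covariance are all assumptions made for illustration. Every day type is assumed to occur at least once.

```python
import numpy as np


def log_gauss(X, mean, cov):
    """Log-density of N(mean, cov) evaluated at every vector X[..., :]."""
    T = mean.shape[0]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    # Solve cov @ s = diff for all vectors at once, then form diff^T cov^{-1} diff.
    sol = np.linalg.solve(cov, diff.reshape(-1, T).T).T.reshape(diff.shape)
    maha = np.sum(diff * sol, axis=-1)
    return -0.5 * (T * np.log(2 * np.pi) + logdet + maha)


def em_day_type_gmm(X, day_type, K, n_iter=50, ridge=1e-6, seed=0):
    """EM sketch for a mixture whose mean/covariance depend on cluster k and day type l.

    X        : (N, D, T) array, X[i, d] is the daily curve of consumer i on day d
    day_type : (D,) integer array, day_type[d] = l means delta_{dl} = 1
    """
    rng = np.random.default_rng(seed)
    N, D, T = X.shape
    L = int(day_type.max()) + 1
    tau = rng.dirichlet(np.ones(K), size=N)  # assumed random soft initialization

    for _ in range(n_iter):
        # M step: proportions, then per (cluster, day type) means and covariances.
        pi = tau.mean(axis=0)
        mu = np.zeros((K, L, T))
        sigma = np.zeros((K, L, T, T))
        for l in range(L):
            Xl = X[:, day_type == l, :]  # days of type l only
            for k in range(K):
                w = tau[:, k]
                denom = w.sum() * Xl.shape[1]  # sum over i, d of tau_ik * delta_dl
                mu[k, l] = np.einsum("i,idt->t", w, Xl) / denom
                diff = Xl - mu[k, l]
                sigma[k, l] = np.einsum("i,ids,idt->st", w, diff, diff) / denom
                sigma[k, l] += ridge * np.eye(T)  # numerical safeguard (assumption)

        # E step: posterior probabilities tau_ik, computed in the log domain.
        log_r = np.tile(np.log(pi), (N, 1))
        for l in range(L):
            Xl = X[:, day_type == l, :]
            for k in range(K):
                log_r[:, k] += log_gauss(Xl, mu[k, l], sigma[k, l]).sum(axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)
        tau = np.exp(log_r)
        tau /= tau.sum(axis=1, keepdims=True)

    return pi, mu, sigma, tau
```

The sketch follows the update formulas as written, with a full outer product for each covariance. A diagonal variant would keep only the diagonal of each `sigma[k, l]`. A hard partition can be read off with `tau.argmax(axis=1)`.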

**Algorithm 1.** EM algorithm.

## 5. Clustering during the Month of November

#### 5.1. Choosing the Number of Clusters

#### 5.2. Evaluation of the Proposed Algorithm

#### 5.3. Interpretation of the Clustering Results

- Cluster 1 is mainly characterized by a low-consumption load profile. The pattern is similar on weekdays and weekend days.
- Clusters 2 and 3 are characterized by a relatively low consumption level with a morning peak during weekdays. These peaks are not pronounced, and they are followed by a small decline. This suggests that only a minority of the residents in these households leave home during the day. Lunch-time and evening activity is also observable. The two clusters differ mainly in their evening behavior between 6 p.m. and midnight (see Figure 4).
- Clusters 4 and 5 exhibit a marked electricity consumption peak during weekday mornings. The significant gap between the morning peak value and the consumption level after the drop is linked to the number of occupants in the household. For these clusters, a slight increase in electricity consumption around lunch time can also be observed. In the evening, their electricity consumption rises to a peak. Here too, the two clusters differ in their evening behavior between 6 p.m. and midnight (see Figure 4).
- The behavior of cluster 6 is quite similar to that of clusters 4 and 5, although its consumption level is higher.

## 6. Clustering Applied to the Normalized Data for the Month of November

#### 6.1. Data Normalization

#### 6.2. Interpretation of the Clustering Results

## 7. Residential Behavior Changes over Months

#### 7.1. Methodology

#### 7.2. Discussion

## 8. Conclusions

## Author Contributions

## Conflicts of Interest

## Appendix A


**Figure 1.** Average electricity consumption of residential consumers obtained for the three day types (Saturday, Sunday and weekday) during November.

**Figure 3.** Electricity consumption profiles for the six clusters during Saturday, Sunday and working day for one month of data (November).

**Figure 4.** Close-up of the electricity consumption profiles without normalization during the working day.

**Figure 5.** (**a**) Representation of clusters according to employment, (**b**) social class (AB: managerial; C1C2: intermediate background; DE: manual background; F: farmer), (**c**) number of appliances, (**d**) household size, (**e**) age of the chief income earner, (**f**) Internet usage, (**g**) heating and (**h**) number of employees.

**Figure 6.** Electricity consumption profiles with and without normalization during Saturday, Sunday and working day.

**Figure 7.** Electricity consumption profiles for the six clusters during Saturday, Sunday and working days for one year’s data.

**Figure 12.** Evolution of the monthly electricity consumption behaviors of three consumers over the year.

| Clusters | 1 | 2 | 3 | 4 | 5 | 6 | Proportions (%) |
|---|---|---|---|---|---|---|---|
| 1 | 0 | - | - | - | - | - | 11.36 |
| 2 | 433 | 0 | - | - | - | - | 19.07 |
| 3 | 743 | 243 | 0 | - | - | - | 20.48 |
| 4 | 1871 | 449 | 349 | 0 | - | - | 19.47 |
| 5 | 2634 | 1095 | 445 | 245 | 0 | - | 20.19 |
| 6 | 6151 | 3357 | 1853 | 1226 | 441 | 0 | 9.44 |

**Table 2.** Comparison between the proposed model, K-means, Hierarchical Ascendant Classification (HAC) and the basic Gaussian Mixture Model (Basic-GMM) according to intra-class inertia, computational time and number of parameters.

| Criterion | Proposed Model | K-Means | HAC | Basic-GMM |
|---|---|---|---|---|
| Inertia, Cluster 1 | 27,224 | 437,851 | 52,990 | 35,570 |
| Inertia, Cluster 2 | 173,957 | 270,131 | 235,645 | 260,631 |
| Inertia, Cluster 3 | 189,603 | 173,658 | 556,865 | 183,004 |
| Inertia, Cluster 4 | 459,959 | 582,254 | 448,206 | 547,664 |
| Inertia, Cluster 5 | 456,532 | 430,745 | 769,036 | 465,702 |
| Inertia, Cluster 6 | 492,079 | 483,594 | 309,410 | 512,367 |
| Total inertia (${I}_{w}$) | 1,809,356 | 2,378,233 | 2,372,152 | 2,004,938 |
| Computational time (s) | 138 ± 34 | 7 ± 2 | 219 ± 4 | 154 ± 7 |
| Number of parameters | 1733 | 8640 | - | 17,285 |
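As an illustration of the inertia criterion reported in Table 2, the sketch below computes a per-cluster and total intra-class inertia under the usual definition, namely the sum of squared Euclidean distances between each member and its cluster centroid. This definition, the function name and the flat `(N, P)` data layout are assumptions; the paper may compute ${I}_{w}$ with a different centroid or scaling.

```python
import numpy as np


def within_cluster_inertia(X, labels):
    """Return ({cluster: inertia}, total inertia) for data X and hard labels.

    X      : (N, P) array, one feature vector per consumer
    labels : (N,) array of cluster labels
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    per_cluster = {}
    for k in np.unique(labels):
        members = X[labels == k]
        centroid = members.mean(axis=0)
        # Sum of squared distances of the members to their own centroid.
        per_cluster[int(k)] = float(np.sum((members - centroid) ** 2))
    return per_cluster, sum(per_cluster.values())
```

For example, `within_cluster_inertia([[0, 0], [2, 0], [10, 10], [10, 12]], [0, 0, 1, 1])` gives an inertia of 2.0 for each cluster and a total of 4.0.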

**Table 3.** Contingency table between non-normalized clusters (from 1 to 6) and normalized clusters (from A to F).

| Clusters | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| 1 | 15.84 | 9.87 | 38.70 | 18.70 | 12.98 | 3.89 |
| 2 | 33.63 | 24.28 | 7.69 | 7.54 | 18.40 | 8.44 |
| 3 | 9.97 | 12.64 | 21.76 | 39.60 | 8.70 | 7.30 |
| 4 | 27.17 | 29.54 | 3.84 | 4.28 | 20.97 | 14.18 |
| 5 | 12.07 | 15.05 | 17.18 | 35.51 | 9.09 | 11.07 |
| 6 | 10.63 | 11.55 | 37.68 | 20.06 | 12.76 | 7.29 |
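A contingency table of this kind can be built from two hard partitions of the same consumers. The sketch below is a minimal illustration; since each row of Table 3 sums to roughly 100, it assumes the entries are row percentages. The function name and the label encodings are hypothetical.

```python
import numpy as np


def contingency_percent(labels_a, labels_b):
    """Cross-tabulate two partitions and express each row as percentages.

    Returns (row labels, column labels, percentage matrix).
    """
    a_vals, a_idx = np.unique(np.asarray(labels_a).ravel(), return_inverse=True)
    b_vals, b_idx = np.unique(np.asarray(labels_b).ravel(), return_inverse=True)
    counts = np.zeros((len(a_vals), len(b_vals)))
    # Count the consumers falling in each (row cluster, column cluster) pair.
    np.add.at(counts, (a_idx, b_idx), 1)
    return a_vals, b_vals, 100 * counts / counts.sum(axis=1, keepdims=True)
```

Each row then describes how the members of one non-normalized cluster are spread over the normalized clusters.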

| Clusters | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 77.21 | 18.03 | 3.16 | 0.72 | 0.24 | 0.63 |
| 2 | 8.58 | 69.97 | 9.51 | 10.71 | 0.73 | 0.47 |
| 3 | 1.71 | 11.27 | 65.13 | 8.72 | 10.26 | 2.89 |
| 4 | 0.32 | 10.85 | 7.56 | 67.94 | 11.91 | 1.39 |
| 5 | 0.11 | 0.43 | 8.45 | 10.72 | 69.83 | 10.44 |
| 6 | 0.46 | 0.65 | 4.98 | 2.17 | 20.54 | 71.17 |
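The table above has no surviving caption. Its rows sum to roughly 100 and its diagonal dominates, which would be consistent with a matrix of cluster-to-cluster changes between consecutive months, but that reading is an assumption. Under that assumption, and also assuming that cluster labels are comparable from one month to the next, such a matrix could be estimated as in the sketch below. The function name and the `(N, M)` label layout are hypothetical.

```python
import numpy as np


def monthly_transition_percent(labels, n_clusters):
    """Row-percentage matrix of cluster changes between consecutive months.

    labels : (N, M) integer array, labels[i, m] is the 0-based cluster of
             consumer i in month m
    """
    labels = np.asarray(labels)
    counts = np.zeros((n_clusters, n_clusters))
    src = labels[:, :-1].ravel()  # cluster in month m
    dst = labels[:, 1:].ravel()   # cluster in month m + 1
    np.add.at(counts, (src, dst), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Leave rows of never-visited clusters at zero instead of dividing by zero.
    return 100 * counts / np.where(row_totals == 0, 1, row_totals)
```

A diagonal entry then gives the percentage of consumer-months in which the consumer stays in the same cluster the following month.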

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Melzi, F.N.; Same, A.; Zayani, M.H.; Oukhellou, L. A Dedicated Mixture Model for Clustering Smart Meter Data: Identification and Analysis of Electricity Consumption Behaviors. *Energies* **2017**, *10*, 1446.
https://doi.org/10.3390/en10101446
