Machine Learning for Identifying Demand Patterns of Home Energy Management Systems with Dynamic Electricity Pricing

Energy management plays a crucial role in providing necessary system flexibility to deal with the ongoing integration of volatile and intermittent energy sources. Demand Response (DR) programs enhance demand flexibility by communicating energy market price volatility to the end-consumer. In such environments, home energy management systems assist the use of flexible end-appliances, based upon the individual consumer’s personal preferences and beliefs. However, with the latter heterogeneously distributed, not all dynamic pricing schemes are equally adequate for the individual needs of households. We conduct one of the first large scale natural experiments, with multiple dynamic pricing schemes for end consumers, allowing us to analyze different demand behavior in relation with household attributes. We apply a spectral relaxation clustering approach to show distinct groups of households within the two most used dynamic pricing schemes: Time-Of-Use and Real-Time Pricing. The results indicate that a more effective design of smart home energy management systems can lead to a better fit between customer and electricity tariff in order to reduce costs, enhance predictability and stability of load and allow for more optimal use of demand flexibility by such systems.


Introduction
The energy business is going through a series of swift and radical transformations to meet the growing demand for sustainable energy. The future of the energy sector will, to a large extent, be shaped by a transformation in the electricity sector, posing challenges for traditional electrical power systems. This shift is complex in nature, but offers ample opportunities for business and information analytics to support the transition [1]. The electricity grid faces decentralized production from renewable sources, electric mobility, and related advances. These are at odds with traditional power systems, where central large-scale generation of electricity follows inelastic consumer demand. Additionally, renewable electricity sources are, unlike traditional sources, volatile in nature and strongly dependent on external factors such as weather conditions, making short-term energy production forecasting increasingly important [2]. The non-storability and volatility of sustainable energy sources, together with the required shift from a demand-driven to a supply-driven market, mean that the energy transition requires system flexibility from all market participants. On the retail side, end-consumers can offer demand flexibility to the grid by shifting their load from moments of peak demand to other times of the day. Energy services and utilities can incentivize such behavior [3] by optimizing feedback to their customers and enabling new forms of dynamic trading. Energy Management Systems (EMS) can control the use of end-appliances and optimize the flexible range, based upon the end consumer's personal preferences and beliefs [4]. Demand Response (DR) programs offer financial incentives to reduce or shift load in line with market price behavior. As such, DR builds upon the behavioral traits of energy consumers via comprehensive communication through home energy systems. The effectiveness of such programs is largely affected by the willingness of end-users to be involved in them [5]. Engaging customers requires systematic communication and interaction between the utility provider and the people it serves, with the intent of building trust and respect and achieving optimal energy usage across heterogeneous individual households.
The heterogeneity amongst customers means that the EMS and the energy service provider have to learn from the customer's individual preferences in order to make optimal recommendations in terms of dynamic tariff targeting and, ultimately, consumer welfare. In this paper, we apply machine learning techniques to show how such systems can recommend tariff schemes based upon the individual household's attributes. Our setting is a natural experiment involving real-world customers of a national utility company who participate via dynamic pricing contracts. This allows us to reach high ecological validity in testing whether an increase in household informedness influences demand for individual households, and how utilities can use machine learning techniques in designing their home energy management systems for better targeting of customers.
With utility providers starting to introduce dynamic price tariff schemes to residential households, demand-side flexibility may help stabilize the grid load under the right price incentives [6]. Demand-side management strategies, such as demand response, can have the dual effect of reducing electricity consumption and allowing greater efficiency and flexibility in grid management [7]. Demand response refers to a wide range of actions that can be taken on the customer side of the electricity meter in order to respond to specific conditions within the electricity system [8]. The different DR programs can generally be classified into two main categories, namely Incentive-Based Programs (IBP) and Price-Based Programs (PBP) [9]. In PBP, consumers are offered rates that vary over time, typically with prices significantly higher during peak periods than during off-peak periods.
In this study, we focus on the two most widely used programs within PBP. The tariffs offered to end customers in the set-up of our natural experiment are Time-Of-Use (TOU) pricing and Real-Time Pricing (RTP). Both tariff schemes vary according to timing and impact on process quality [10]. TOU pricing changes the unit price of electricity during specific time periods at a fixed rate. The tariff tries to encourage customers to shift consumption from peak to off-peak periods by reflecting the production and investment cost structure, with higher rates during high-demand periods. The tariff variations are, however, predetermined and fixed at delivery, making it easy for customers to interpret the tariff and optimize their portfolio. Market DR relies on RTP or wholesale market spot prices, where price signals reflect supply conditions. Although this creates a higher level of demand flexibility for the system, it requires greater involvement from the customer. Therefore, as customer involvement depends on the tariff scheme, correct customer targeting can enhance utility portfolio optimization and, by extension, smart grid management. In RTP schemes, customers are informed about varying electricity prices, which mostly change hourly, on a day-ahead or hour-ahead basis [11]. Given recent technological developments and advanced metering infrastructure worldwide, effective communication of prices and detailed load profiles between utility providers and electricity consumers has become possible. Dynamic pricing schemes are usually based on retail prices and reflect real-time system costs, thus encouraging energy consumers to reduce or shift energy consumption during high wholesale price periods.
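To make the difference between the two tariffs concrete, the following Python sketch compares what a household would save by shifting the same amount of load under a two-block TOU tariff and under an hourly RTP tariff. All prices, the peak window and the load profile are hypothetical values chosen for illustration; they are not taken from the experiment.

```python
# Illustrative sketch only: every price and load below is a made-up number.

def tou_prices(peak_price, offpeak_price, peak_hours):
    """Return 24 hourly prices for a two-block Time-Of-Use tariff."""
    return [peak_price if h in peak_hours else offpeak_price for h in range(24)]

def daily_cost(load_kwh, prices):
    """Cost of an hourly load profile (kWh) under hourly prices (EUR/kWh)."""
    if len(load_kwh) != 24 or len(prices) != 24:
        raise ValueError("expected 24 hourly values")
    return sum(q * p for q, p in zip(load_kwh, prices))

def shift_load(load_kwh, from_hour, to_hour, kwh):
    """Move up to `kwh` from one hour to another; the daily total is unchanged."""
    shifted = list(load_kwh)
    moved = min(kwh, shifted[from_hour])
    shifted[from_hour] -= moved
    shifted[to_hour] += moved
    return shifted

# Hypothetical TOU tariff: peak block from 07:00 up to 23:00.
TOU = tou_prices(peak_price=0.22, offpeak_price=0.18, peak_hours=range(7, 23))

# Hypothetical RTP day: flat price with a spike from 17:00 up to 21:00.
RTP = [0.15] * 24
for h in range(17, 21):
    RTP[h] = 0.30

# Hypothetical household: 0.5 kWh every hour, plus an evening peak at 18:00.
load = [0.5] * 24
load[18] = 2.0

# Shift 1.5 kWh from 18:00 to 02:00 and compare the savings per tariff.
shifted = shift_load(load, from_hour=18, to_hour=2, kwh=1.5)
saving_tou = daily_cost(load, TOU) - daily_cost(shifted, TOU)
saving_rtp = daily_cost(load, RTP) - daily_cost(shifted, RTP)
```

With these assumed numbers the same shift is worth more under RTP than under TOU, because the hourly price spread is larger. The point is only that the monetary incentive for identical behavior depends on the tariff.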
This study contributes to the growing stream of literature discussing the effects of electricity prices on energy consumption behavior. It is argued that the effectiveness of energy information is not generalizable across cultures and demographic groups [12], and large variations in the effect sizes and significance of different electricity pricing types can be found. Previous studies suggest that dynamic pricing strategies would encourage price-responsive demand to balance the supply and demand of electricity [13], with various dynamic pricing schemes impacting the consumption behavior of residential households in different ways [14]. The first ever hourly Real-Time Pricing (RTP) program for residential customers, however, was only conducted in 2011 in Chicago. The demand estimates of that study suggest that dynamic prices only influence residential electricity demand during peak hours. More specifically, they show that participating residential households engaged in peak shaving behavior, but not in load shifting behavior [15]. Earlier work, however, suggests different results depending on the dynamic pricing scheme, where demand-based TOU electricity tariffs decreased peak demand and shifted electricity demand from peak to off-peak periods [16]. We contribute by analyzing both dynamic pricing schemes, comparing the performance of TOU pricing to that of RTP in the same natural experiment. Our study extends the current line of literature by providing results from one of the first large-scale, open sign-up natural experiments involving dynamic electricity prices. In addition, we focus on the individual preferences and beliefs of each individual, which are crucial for the learning of smart home energy management systems, as household attributes influence price sensitivity and the effectiveness of dynamic pricing schemes [17].
Household attributes have been shown to play an important role in explaining how willing energy users are to conserve energy [18]. Di Cosmo et al. [19] found evidence that the influence of electricity prices differs across household groups. Households with higher income, higher education levels, and higher electricity use are more responsive to energy consumption behavior change than other groups [20,21]. In contrast, other feedback studies could not find a clear link between household-level characteristics and price effectiveness [22]. In addition to the influence of electricity prices on absolute electricity consumption, building characteristics and socio-demographic information significantly influence energy demand behavior. Testing TOU demand behavior in Germany, Schleich and Klobasa [23] find that the size of a household's building positively influences energy consumption and that the number of appliances positively influences electricity consumption. The number of occupants, the building size, the building type, and the number of bedrooms are found to positively influence electricity consumption [24]. A study characterizing domestic electricity consumption patterns in Ireland found that the number of occupants, the building size and the building type positively influence residential electricity consumption [25]. Contradictory findings have been reported as well. For example, the authors of [26] investigate determinants of residential electricity consumption, finding that the number of occupants and the building size are positively associated with electricity consumption, while the building type and the building age are not. A study investigating short- and long-term price elasticity [27] found that income positively influences peak and off-peak residential electricity consumption, while household size does not show any significant influence. Whereas the previously mentioned studies investigate the influence of household attributes in a flat or TOU pricing environment, little work has been done in dynamic RTP settings, let alone comparing tariff schemes. Alberini, Gans and Velez-Lopez [28] find that in a dynamic tariff scheme, only building type shows a significant influence on electricity usage, but they do not compare the influence of household attributes with other tariff schemes. Our study is one of the first to analyze both TOU and RTP, and to validate efficient targeting by utility companies in terms of these characteristics. An overview of related literature can be found in Table 1.
With the growth of home energy management systems and smart meters, the volume, frequency and variety of information are growing exponentially [29]. Business intelligence and machine learning techniques can provide utilities and energy-market-related companies with demand forecasts and customer usage patterns to support data-driven decision making. In a high-dimensional space with high-frequency multivariate data, as in the above context, developing effective clustering methods for customer patterns is a challenging problem due to the curse of dimensionality [30]. We apply spectral relaxation through Principal Component Analysis (PCA) to find a meaningful representation of the intrinsic dimensionality of the data, and perform a k-means clustering analysis in the projected space to analyze distinct customer group behavior in the natural experiment. This method has become one of the important machine learning algorithms for learning consumer behavior [31], particularly for energy demand forecasting [32,33]. Our paper makes several contributions to both the dynamic pricing literature and the design of home energy management systems. First, the study shows that time capabilities play a significant role when examining usage behavior in dynamic price settings. By looking at each hour independently, we aim to identify specific times during the day at which households adapt their behavior more rigorously in order to take advantage of varying price levels. We confirm different daily demand profiles for both tariff schemes and find evidence for a significant influence of household attributes. These attributes influence our different household clusters, which have varying capabilities to react to the respective dynamic pricing scheme. When utility companies implement machine learning algorithms in the EMS for identified households, this should lead to a better fit between customer and electricity tariff, in order to reduce costs, enhance the predictability and stability of load, and allow for more optimal use of demand flexibility by the EMS.
The rest of this paper is organized as follows. Section 2 clarifies the design and set-up of our natural experiment. Section 3.1 describes the effects of dynamic pricing schemes on energy demand in relation to household attributes. Section 3.2 describes the proposed clustering algorithm and discusses the distinct groups for targeting home energy management purposes. Finally, Section 4 discusses the findings and draws conclusions with future scope.
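The PCA-then-k-means idea mentioned in the introduction can be sketched in a few lines of plain Python. This is a minimal illustration under simplifying assumptions, not the procedure used in the study: it projects onto the first principal component only (found by power iteration), runs k-means on the resulting scalar scores, and uses six hypothetical households whose usage shares over four day parts are invented for the example.

```python
import math

def center(rows):
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def first_component(rows, iterations=200):
    """Leading principal direction of centered rows via power iteration."""
    d = len(rows[0])
    v = [float(j + 1) for j in range(d)]          # arbitrary non-uniform start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(d)) for r in rows]           # X v
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(d)]   # X^T X v
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def kmeans_1d(values, k=2, iterations=50):
    """Lloyd iterations on scalar scores (k >= 2); centroids start evenly spread."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * c / (k - 1) for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(x - centroids[c])) for x in values]
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids

# Hypothetical usage shares per household: night, morning, afternoon, evening.
profiles = [
    [0.10, 0.20, 0.20, 0.50],  # evening-heavy households
    [0.12, 0.18, 0.22, 0.48],
    [0.08, 0.22, 0.18, 0.52],
    [0.30, 0.30, 0.25, 0.15],  # households with flatter, earlier usage
    [0.32, 0.28, 0.24, 0.16],
    [0.28, 0.32, 0.26, 0.14],
]

centered = center(profiles)
direction = first_component(centered)
scores = [sum(x * w for x, w in zip(row, direction)) for row in centered]
labels, centroids = kmeans_1d(scores, k=2)
```

In this toy data the first component separates evening-heavy households from the rest, so k-means on the scores recovers the two groups. With real smart-meter data one would typically keep several components and use an established numerical library.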

Design of Natural Experiment
A natural experiment in which real-world end-users are exposed to dynamic pricing schemes was set up in order to identify the demand patterns of residential households. The experiment was designed and carried out in collaboration with the Dutch utility company Qurrent Energie, which provided access to the natural experiment. Although similar case studies have been carried out before [34], the dynamic pricing scheme roll-out was the first to combine different dynamic pricing schemes on a large-scale sign-up basis, providing flat and different dynamic tariff options nationwide since late 2016. Households were proactively approached and asked for voluntary participation. The allocation of customer usage is based on smart meter (P4) data, with a central server communicating tariffs and reading data to individual households and their energy management systems. An overview of the origin of information and data signals can be found in Figure 1.
Large-scale generators sell the produced electricity on wholesale markets to retailers or utility companies. The merit order curve of wholesale electricity markets varies depending on market fundamentals of the underlying fuels, such as the oil and gas price, and on the variability of weather conditions for renewable power plants. Wholesale price formation depends on the generation costs of the marginal technology in competitive wholesale power markets and on the elasticity of retailer demand. Whereas renewables run at low marginal cost, mainly consisting of maintenance costs, coal-fired power plants and several types of gas power plants run at higher marginal costs, typically due to fuel costs, but offer a higher degree of flexibility [35].
Utility companies engage in financial contracts with their customers on retail markets. The tariff scheme for the customer is set by the utility company, which determines for every contract the time granularity of the information stream between wholesale markets and retail markets. Depending on the contract, home energy management systems can guide customers or act upon their preferences and beliefs in order to optimize their demand profile, with monetary incentives given by dynamic tariff schemes. With end-appliances becoming highly digitized, so-called prosumers can both consume and locally generate electricity. Distributed production and demand add more flexibility within smart grids, and opportunities arise for end-users to shift electricity demand throughout the day. It is therefore important for the utility company to offer correct price signals to its customers and to target them optimally according to individual preferences. As these may vary with the heterogeneity amongst customers, depending on household attributes, we aim to identify the behaviors of customers operating a home energy management system in dynamic price settings.
The digitization process of the electricity sector in recent decades has created new services, products and markets, in addition to reshaping existing technical systems and infrastructure. New mechanisms are put in place with data generated by numerous smart devices connected in a so-called Internet of Things network, and opportunities arise in the context of energy and electricity markets [3]. The digitalization of energy services can improve utility providers' understanding of increasingly uncertain energy production and consumption, while encouraging better communication with electricity consumers and providing them with new information about their use of electricity. Smart grids paired with smart meters provide both the supply and demand sides with valuable real-time information on energy flows and consumption, also giving consumers more control over their energy usage and enabling the development and expansion of demand-side management programs, which contribute to the needed flexibility in volatile energy systems. In such fast-evolving domains as smart energy markets, it is necessary to develop and adapt existing pricing methods within changing market dynamics, taking into account individual user preferences and objectives with respect to the decision context.
Appl. Sci. 2017, 7, 1160
The Real-Time Prices offered in the experiment are electricity retail prices based upon the Dutch spot wholesale electricity market, APX [36], and are forwarded to the participating households on a day-ahead basis. The TOU prices vary between a peak period and an off-peak period, depending on the month-ahead forward price for electricity and a risk premium according to peak and off-peak. Both tariffs are supplemented with a fixed amount for surcharges, namely balancing, sustainability guarantees and energy taxes. Prior to the start of the experiment, home energy management devices were installed in all households. As such, households gained access to real-time energy consumption and electricity prices through a web-based dashboard for notebooks, smartphones and tablets. This home energy management system thus allows customers to get a better, real-time understanding of their energy-related activities. A tariff overview allows for the review of past household electricity usage behavior, electricity prices and the final cost of electricity usage throughout the day. The web application displays usage as the net electricity consumption of a household. Negative usage values indicate that the user generates more electricity than is consumed. The respective negative values in the cost column indicate the discount granted to the household on the next electricity bill. A visualization of the web application can be found in Figure 2.
The home energy management system allows households to get a better overview of past usage behavior and future outlook in the form of next day prices by a dashboard visualizing consumption and production next to general functions, such as profile, metering, messages and a financial overview.
The users can examine the respective load profiles for electricity and gas for any given point in the past on daily, weekly or monthly levels. Additionally, users can compare their load profile with the load profile of comparable households in the Netherlands. The households have the option of exporting their usage data from the dashboard for a more in-depth analysis. The hourly electricity tariffs are communicated on a day-ahead basis via the home energy management system.
At the beginning of the experimental period, the RTP group receives the hourly real-time electricity prices of the following day, while the TOU group continues to receive the standard time-of-use electricity prices of the energy provider. We selected data from households post experiment, where the individual households indicated no privacy concerns. In total, our dataset consists of 78 households in the RTP group and 150 households in the TOU group. Both groups had access to the home energy management system of the energy provider and were able to observe their energy consumption in real time. In addition to having their energy demand measured, the selected households answered a post-experiment survey about specific attributes and characteristics of their households. Specifically, the survey asked the households about the number of occupants, the type of house, the insulation of the house, the location, the size, the type of heating, and the use of solar panels. An extensive descriptive overview of the nomenclature and all measured variables can be found in Appendix A.
Table 2 presents the descriptive statistics of our sample. The average number of household occupants is 2.63 in the RTP group and 2.97 in the TOU group. The average building age, average building size, building type, heating type and roof insulation are at similar levels for the control and treatment groups. No significant differences in terms of our main variables could be found; however, in order to account for possible unobserved differences between our households, we use household fixed effects in our analysis.

Panel Data Analysis of Dynamic Pricing Schemes
We analyze how the RTP and TOU treatments perform in terms of demand variation, in order to validate both schemes and their difference in behavior. Assuming that each household has its own price sensitivity and energy usage patterns, a panel data regression is capable of assessing the statistical significance of the individual variables within each separate panel. It also accounts for individual-level heterogeneity, making it possible to control for variables that are not measurable, per se, across households or across time. We apply a panel data regression to measure the behavior of entities such as households across time [37], allowing double subscripts on the variables and taking cross-sections into account as well as time dimensions. We denote:
$$U_{i,t} = \alpha_i + \beta^{\top} X_{i,t} + \varepsilon_{i,t} \qquad (1)$$
where $U_{i,t}$ is the dependent variable, representing a form of the demand of household $i$ during hour $t$ of the day, with error $\varepsilon_{i,t}$. In order to study how behavior depends on the treatment throughout the day, we take an hourly approach. The coefficient $\alpha_i$ reflects household fixed effects that capture potential time-invariant, household-level heterogeneity related to net electricity usage. $\beta$ represents the vector of coefficient estimates, of dimension $N$, for the explanatory variables $X_{i,t}$. More information on the explanatory variables can be found in Table A2.
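To illustrate how household fixed effects absorb time-invariant heterogeneity, the sketch below implements the standard within (demeaning) estimator for a single regressor. It is a simplified illustration rather than the estimation code of the study: the function names, the single-regressor restriction and the tiny data set are assumptions made for the example.

```python
def within_slope(panel):
    """Fixed-effects (within) slope for one regressor.

    `panel` maps a household id to a list of (x, u) observations. Demeaning
    x and u inside each household removes the household intercept alpha_i.
    """
    sxx = sxu = 0.0
    for obs in panel.values():
        x_mean = sum(x for x, _ in obs) / len(obs)
        u_mean = sum(u for _, u in obs) / len(obs)
        for x, u in obs:
            sxx += (x - x_mean) ** 2
            sxu += (x - x_mean) * (u - u_mean)
    if sxx == 0.0:
        raise ValueError("regressor does not vary within households")
    return sxu / sxx

def household_effects(panel, beta):
    """Recover alpha_i = mean(u) - beta * mean(x) for each household."""
    return {
        hh: sum(u for _, u in obs) / len(obs)
        - beta * sum(x for x, _ in obs) / len(obs)
        for hh, obs in panel.items()
    }

# Hypothetical (price, usage) pairs generated from u = alpha_i - 2.0 * x.
panel = {
    "A": [(0.1, 0.8), (0.2, 0.6), (0.3, 0.4)],  # alpha_A = 1.0
    "B": [(0.2, 2.6), (0.4, 2.2)],              # alpha_B = 3.0
}

beta = within_slope(panel)
alphas = household_effects(panel, beta)
```

Because each household is compared only with itself, the very different usage levels of the two hypothetical households do not distort the estimated price slope.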
We verify different behavior between the tariff schemes in two steps. First, we verify demand shifting behavior for both treatments by analyzing relative electricity usage throughout the day. We set the dependent variable in (1) to electricity usage relative to total daily usage, $U_{i,t}$, comparing daily usage across treatments. The updated coefficient estimate $\beta$ is two-dimensional, consisting of the main effect for price and a control effect for solar influx. Figure 3 visualizes the relative electricity usage and the average daily electricity price for both tariff schemes. Load shifting behavior occurs more dominantly in the RTP treatment, as expected, between 3 p.m. and 9 p.m., corresponding to higher prices for electricity usage. This is compensated for by relatively higher demand from late evening until late morning.
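The relative usage measure described above can be computed as each hour's share of the daily total. The following sketch shows one way to do this; the function names and the example profile are hypothetical, and the handling of a zero daily total (possible with net consumption when solar generation offsets usage) is an assumption of this illustration.

```python
def relative_usage(hourly_kwh):
    """Share of the day's total usage that falls in each hour."""
    total = sum(hourly_kwh)
    if total == 0:
        raise ValueError("daily total is zero; shares are undefined")
    return [q / total for q in hourly_kwh]

def window_share(shares, start_hour, end_hour):
    """Sum of the shares for hours start_hour <= h < end_hour."""
    return sum(shares[start_hour:end_hour])

# Hypothetical day: 1 kWh per hour, doubled between 3 p.m. and 9 p.m.
usage = [1.0] * 24
for h in range(15, 21):
    usage[h] = 2.0

shares = relative_usage(usage)
evening = window_share(shares, 15, 21)  # share of daily usage in the 3-9 p.m. window
```

Normalizing by the daily total makes households with very different consumption levels comparable, so that only the timing of demand differs between them.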
Table 3 gives the differences in regression coefficients between the control and treatment groups. The results show that both dynamic electricity price schemes have a positive significant relationship with relative electricity usage. When we control for interaction effects in the single model, we find a significant negative difference in slopes for the price variable for both treatments (β = −0.16 (0.31)). We thus find that the price difference effect is smaller for the RTP treatment than it is for the TOU treatment. Although counterintuitive at first, we argue that this comes from the fact that RTP behavior is less reactive to price differences around peak times, as indicated by Figure 3, finding evidence for differing load shifting behavior between both dynamic price settings.
Next, we analyze what effect household attributes have on demand patterns throughout the day. We verify whether these effects are significantly tariff dependent by regressing on the willingness to use energy, and set the dependent variable in (1) to price sensitivity ϑ_{i,t}. We test for multicollinearity by observing correlations between household variables, but find that the predictor variables do not correlate enough to bias results. We perform a Hausman test to select random effects or fixed effects for the panel data regression [38]. This approach ensures the sufficient measurement of the interrelatedness of the variables of this study, while accounting for individual differences across households and time. Controlling for fixed effects obliges us to explore the relationship between the proposed individual and outcome variables within a household. A random effects model assumes that entity error terms are not correlated with the household level attributes. As the Hausman test was insignificant for all 24 data sets, we make use of random effects. Results can be found in Tables A3 and A4 in Appendices B and C. When examining the 24 different panel data regression coefficients between the RTP and TOU treatment groups, it becomes evident that the household attributes have a very different influence on both groups. The observed differences are either time-differentiated or show an opposite relationship with the willingness to use electrical energy. We find that some variables, such as the building size, have a distinct time gap in which the influence of the variable is not relevant. Moreover, the variables of building type and roof insulation are shown to be insignificant in the overall panel data regression, but have specific times during the day in which they are indeed significant.
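For reference, the Hausman test statistic in its standard textbook form is sketched below. The notation is an assumption introduced here and does not appear in the original text: the fixed effects and random effects coefficient estimates are written with subscripts FE and RE, and q denotes the number of coefficients compared.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})^{\top}
    \left[ \widehat{\operatorname{Var}}(\hat{\beta}_{FE})
         - \widehat{\operatorname{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\sim\; \chi^{2}_{q} \quad \text{under } H_0
```

An insignificant statistic means the two sets of estimates do not differ systematically, which is the usual justification for preferring the more efficient random effects estimator.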
Lastly, we test for significant differences between treatment coefficients. We introduce RTP as a dummy variable g_i for the Real-Time Price treatment and interact it with all the coefficients of interest. As the above regression indicated error terms ε_i to be independent for both treatments, we form the combined model, in which γ represents the difference in slope between both coefficients. This allows for testing the difference in slopes between regression coefficients. Results are given in Table 4.
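A combined model of this kind can be sketched for a single regressor as follows. This is an illustrative form under stated assumptions, not the specification estimated in the study: x_{i,t} stands for any one coefficient of interest, and α, β and δ are placeholder symbols introduced here; only ϑ_{i,t}, g_i, γ and ε_i are taken from the text.

```latex
\vartheta_{i,t} = \alpha + \beta\, x_{i,t} + \delta\, g_i
                + \gamma\, (g_i \cdot x_{i,t}) + \varepsilon_{i}
```

With g_i = 0 for the TOU group the slope on x_{i,t} is β, and with g_i = 1 for the RTP group it is β + γ, so testing the null hypothesis γ = 0 tests whether the slopes differ between the two treatments.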
We find that the willingness to use electricity is negatively correlated with building age, building size and building type. The analysis indicates that households in the RTP group have a significantly lower willingness to use electricity according to the mentioned household attributes. However, we find a significant positive correlation for the number of household occupants. We argue that this may be the case for larger families, where the utility of shifting demand is lower due to a more inelastic demand curve.
The panel regression analysis shows a clear influence of household attributes on the effect of energy demand by home energy systems operating under dynamic pricing contracts. Home energy management systems with different tariff structures lead to different types of behavior from the customer, depending on their household attributes. Home energy management systems should learn from their customers, via a smart home device or intelligent agent, recommending optimal tariffs to both customer and utility, based upon the individual's preferences and beliefs. The next section makes use of machine learning techniques to achieve this goal.

Reduced Dimension Clustering to Identify Demand Patterns
The development of sensor technologies and network communication technologies, via smart home energy management systems, for example, has had an explosive impact on the accumulation of customer behavior data. By extracting useful information from the resulting high-frequency multivariate data, machine learning has enormous potential for efficiently targeting and steering end-customer behavior [29]. Earlier work has applied various machine learning techniques for customer load curve typification, which can be divided into five method groups: partitioning, hierarchical, density-based, grid-based and model-based [39]. Although no single clustering method is always superior to the others, as they are used for specific applications, the most commonly used are partitioning models, such as k-means and fuzzy c-means clustering, and hierarchical models. Non-hierarchical clustering methods [40,41] are preferred over hierarchical ones in order to prioritize the minimization of internal cluster variance and cluster similarity. Fuzzy c-means and probabilistic neural networks have proven to be useful methods for identifying customer load patterns [42]; however, for distinct cluster identification, k-means is often preferred for achieving efficient and scalable results [43]. Lastly, in the above context of increasingly large datasets, efficiency can be improved by combining machine learning methods via hybrid approaches [44].
We conduct such a hybrid approach, via a form of spectral relaxation clustering [45], with a kernel Principal Component Analysis (kPCA) combined with k-means, in order to find distinct groups of households for demand targeting by home energy management systems. This section gives a descriptive overview of the reasoning; for a more comprehensive description we refer to earlier work [46]. The analysis shows how applying machine learning techniques should ideally play a role in smart home EMS in order to target individual customers along their individual attributes and send tariff recommendations to the utility.
Spectral relaxation clustering finds the eigenvectors of the data set, which are orthogonal, and aims at explaining the variability of the data in lower dimensionality. The first step involves building a similarity matrix and computing the first k eigenvectors. We apply kernel PCA to ensure an interior solution, in that this method renders the eigenvectors of the covariance matrix. As the principal components are the continuous solution of the cluster membership, the analysis allows for identifying groups in terms of their components, while the groups can be used in terms of regression clustering for optimal targeting purposes. In a second step, the algorithm builds a new data matrix in which the columns represent the eigenvectors. We interpret the rows of this matrix as our new data points and apply k-means clustering.
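The two steps described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the RBF kernel, the value of `gamma`, the farthest-first initialization of k-means, and all function names are choices made here for a small, reproducible example.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Similarity matrix with entries exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def kernel_pca_embedding(X, n_components, gamma):
    """Step 1: leading eigenvectors of the centered similarity matrix."""
    K = rbf_kernel(X, gamma)
    n = K.shape[0]
    centering = np.eye(n) - np.full((n, n), 1.0 / n)
    K_centered = centering @ K @ centering
    eigvals, eigvecs = np.linalg.eigh(K_centered)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    # Columns are eigenvectors scaled by sqrt(eigenvalue); rows are the new data points.
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


def farthest_first_centers(Z, k):
    """Deterministic start: take row 0, then repeatedly add the row farthest from all centers."""
    centers = [Z[0]]
    for _ in range(1, k):
        dists = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(dists)])
    return np.array(centers)


def kmeans(Z, k, n_iter=100):
    """Step 2: Lloyd's k-means on the rows of the eigenvector matrix."""
    centers = farthest_first_centers(Z, k)
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        sq_dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_relaxation_clusters(X, k, gamma=0.5):
    """Embed the data with kernel PCA, then cluster the embedded rows with k-means."""
    Z = kernel_pca_embedding(np.asarray(X, dtype=float), n_components=k, gamma=gamma)
    return kmeans(Z, k)
```

Calling `spectral_relaxation_clusters(X, k=2)` on a small matrix with two well-separated groups of rows assigns each group its own label. On real household data the kernel and its parameters would need to be tuned to the attributes at hand.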
The principal component analysis is conducted to identify the most powerful variables for the cluster analysis of our households. The analysis aims to find a set of new variables in order to extract the most important information from the data set and analyze the structure of the observations and variables. The results can be seen in Table 5. The analysis gives the following variables as the most important factors for clustering households: number of household occupants, building size, building type and terrain type. We evaluate the appropriate cluster solution by examining the absolute differences between the original and random sum of squared errors (SSE) against the chosen cluster solutions. The appropriate cluster solution is given by the solution where the actual SSE differs the most from the mean of the random SSE. This process is shown in Figure 4, giving evidence for selecting 5 as the appropriate number of clusters. We thus proceed with the five strongest variables, of which the first two components explain more than 80% of the point variability of the entire data set.
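The selection rule described above, comparing the actual SSE with the mean SSE of random data sets for each cluster solution, can be sketched in plain Python. How the random sets were generated is not stated in the text; the sketch assumes that each variable is permuted independently across households. The function names and the farthest-first initialization of k-means are likewise illustrative choices, while the default of 250 random sets follows Figure 4.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(points, k, n_iter=50):
    """Within-group SSE of Lloyd's k-means with farthest-first initialization."""
    points = [tuple(p) for p in points]
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def shuffled_columns(points, rng):
    """Random reference set: permute each variable independently (an assumption)."""
    columns = [list(col) for col in zip(*points)]
    for col in columns:
        rng.shuffle(col)
    return list(zip(*columns))


def sse_gap_by_k(points, k_values, n_random=250, seed=0):
    """Absolute difference between actual SSE and mean random SSE for each k."""
    rng = random.Random(seed)
    random_sets = [shuffled_columns(points, rng) for _ in range(n_random)]
    gaps = {}
    for k in k_values:
        actual = kmeans_sse(points, k)
        random_mean = sum(kmeans_sse(r, k) for r in random_sets) / n_random
        gaps[k] = abs(random_mean - actual)
    return gaps


def best_k(gaps):
    """Cluster solution where the actual SSE differs most from the random mean."""
    return max(gaps, key=gaps.get)
```

For example, `best_k(sse_gap_by_k(points, range(1, 11)))` returns the candidate number of clusters with the largest gap. Under the permutation assumption the gap at k = 1 is zero up to rounding, since shuffling columns leaves the total sum of squares unchanged.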
The visualization of the k-means clustering is given in Figure 5 against the two strongest principal components, segmenting the data set into clearly distinguishable clusters. The k-means clustering analysis resulted in a cluster sum-of-squares of 84.31. The two strongest principal components explain 78.17% of the point variability within our sample. Each of the five identified clusters of this study should be seen as an archetype that categorizes the households across our sample, with the respective individual within-cluster variance given within brackets:

• 'Cluster 1' (18.09) represents 17% of the participating households. The group of households consists of large groups of people who reside only in urban or suburban areas. The majority of this group resides in semi-detached and row houses. The group represents higher-end urban workers with families.
• 'Cluster 2' (15.75) represents the largest group of households with 36%. This group represents households living in urban areas, in row or semi-detached houses. The group represents lower-end urban workers.
• 'Cluster 3' (14.07) represents 21% of the participating households. The cluster primarily contains three-person households, which live mostly in detached houses, in the city or rural areas. We identify young starters to be present in this group.
• 'Cluster 4' (4.59) is characterized by a small group composition of 2 persons on average. They are represented in all classes of building type and terrain, but with a relatively large size of building. We find this group to consist of mainly seniors.
• 'Cluster 5' (31.81) comprises only 2 households, relatively small in number of occupants, living in detached houses. As the cluster is not found to be significant, it will be ignored for the following elaborations.
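As a quick consistency check on the figures listed for the five clusters, the bracketed within-cluster values can be summed and compared with the reported cluster sum-of-squares. The short sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Within-cluster values reported in brackets for Clusters 1 to 5.
within_cluster_ss = {1: 18.09, 2: 15.75, 3: 14.07, 4: 4.59, 5: 31.81}
total_within_ss = round(sum(within_cluster_ss.values()), 2)

# Household shares (in percent) stated in the text for Clusters 1 to 3.
stated_shares = {1: 17, 2: 36, 3: 21}
remaining_share = 100 - sum(stated_shares.values())
```

The bracketed values add up to 84.31, matching the reported cluster sum-of-squares. The stated shares add up to 74%, which leaves 26% of households for Clusters 4 and 5 combined, assuming that the five clusters partition the whole sample.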


The relative electricity usage across clusters and the TOU (control) group is visualized in Figure 6. Cluster 3 shows extreme behavioral deviations in its relative electricity usage. Comparing it to the relative electricity usage of our control group, it becomes evident that Cluster 3 shifts part of its daily energy consumption to the early noon hours and saves large amounts during the afternoon hours, when electricity prices start to rise again. In addition, Cluster 4 shows a similarly favorable relative load profile for the Real-Time Prices. More specifically, Cluster 4 has an above-average relative electricity usage during early morning and late evening times, which are usually the times in which prices are comparably low. Lastly, Clusters 1 and 2 have relative load profiles very similar to that of the control group.
The results indicate how spectral clustering can help in identifying distinct groups of residential household behavior in relation to their smart home energy management systems. Depending on certain household attributes, we find that specific households are more energy aware, so that a smart energy management system can beneficially operate the household's appliances according to market developments. Other households are not as price sensitive, rendering a suboptimal situation when RTP is applied. In this case, energy management systems should recommend a TOU or fixed tariff in order to prevent high peak prices. The analysis indicates the potential of smart home energy management systems, with transparent communication between both parties being essential in order to optimally engage residential load in dynamic pricing schemes.
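The idea of recommending a tariff per household can be illustrated with a deliberately simple rule: price one day of a household's load under each candidate tariff and suggest the cheaper one. This is a hypothetical sketch, not the recommendation logic of the study; the function names, the four pricing periods, and all prices and loads are made-up illustrative values.

```python
def daily_cost(load_kwh, prices_per_kwh):
    """Cost of one day's load under a tariff given as one price per period."""
    if len(load_kwh) != len(prices_per_kwh):
        raise ValueError("load and price series must cover the same periods")
    return sum(kwh * price for kwh, price in zip(load_kwh, prices_per_kwh))


def recommend_tariff(load_kwh, tariffs):
    """Return the name of the cheapest tariff for this load profile, plus all costs."""
    costs = {name: daily_cost(load_kwh, prices) for name, prices in tariffs.items()}
    return min(costs, key=costs.get), costs


# Hypothetical prices for four periods: night, morning, afternoon peak, evening.
toy_tariffs = {
    "RTP": [0.10, 0.20, 0.40, 0.15],
    "TOU": [0.15, 0.25, 0.25, 0.15],
}
shifting_household = [3.0, 2.0, 1.0, 3.0]    # avoids the afternoon peak
inflexible_household = [1.0, 2.0, 4.0, 2.0]  # consumes mostly at the peak
```

With these toy numbers the household that avoids the peak is cheaper off under RTP, while the inflexible household is cheaper off under TOU. A practical recommender would work with cluster-level load profiles and observed price series rather than a single illustrative day.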

Conclusions
Home energy management systems are believed to play a crucial role in efficiently capturing the benefits of DR programs to ensure demand flexibility and peak load reduction. The effectiveness of such programs is, however, largely affected by the willingness of end-users to be involved in them. Depending on the individual preferences and beliefs of the customer, not every dynamic pricing program will be equally effective for the overall welfare of customer and utility company. This paper, therefore, aims to identify demand patterns of households under the two most widely used dynamic pricing settings, TOU and RTP. As such, it allows us to study a better fit between customer and electricity tariff, allowing for a more optimal use of demand flexibility by the energy management system. Our setting is a natural experiment, involving real-world customers of a national utility company participating via dynamic pricing contracts.
In a first step, we find a significant impact of the dynamic pricing scheme on the relative electricity usage of the individual households and find evidence for different demand behavior based upon the individual household attributes. This indicates the need for individual customer tariff targeting, as static building characteristics, such as building age, building size and building type, and demographic characteristics, such as the number of household occupants, make end-consumers act significantly differently in the two dynamic pricing schemes. Secondly, we identify distinct groups of customer demand patterns by conducting a spectral clustering analysis. Our results show clusters with clearly distinguishable behavioral electricity usage patterns. Using machine learning on non-convex data sets such as household electricity demand data will allow energy management systems to target customers more optimally based upon individual preferences.
The study presents several conclusions and implications for exploiting smart home energy management systems. By segmenting households based on their household attributes, we are able to isolate groups that differ in their level of engagement with the respective dynamic pricing scheme. This indicates that home energy management systems do not perform equally over a varying set of households with respect to reducing and shifting load. Our study provides an initial understanding of households' willingness to engage with dynamic pricing schemes, indicating that successful implementation of home energy management systems will depend not only on the respective DR program, but also on the individual household preferences and beliefs.

Appendix B
Appl. Sci. 2017, 7, 1160. ... energy management systems. An overview of the origin of information and data signals can be found in Figure 1.

Figure 1. Information stream overview with respect to Home Energy Management Systems and communication with the central utility server in relation to the electricity market landscape.

Figure 2. Visualization of the Home Energy Management System in the natural experiment (translated from Dutch): (a) displays an electricity tariff overview on mobile devices and (b) a household electricity usage dashboard.

Figure 3. Average relative daily load profile and tariff price for Real-Time Pricing (RTP) (with Min and Max) and Time-Of-Use (TOU) pricing.

Figure 4. Difference of within group sum of squared error (SSE) and 250 random sets per cluster solution.

Figure 5. Visualization of Five-Cluster Solution of K-Means Analysis Along the Two Strongest Principal Components.

Figure 6. Differences in the Relative Electricity Usage (kWh) across the Identified Clusters.

Table 1. Overview of related literature with respect to household attributes' influence on individual energy demand behavior.

Table 2. Sample descriptors of household attributes.

Table 3. Panel regression coefficient (standard errors sum of squared errors (SSE)) results for both tariff treatments.

Table 4. Regression coefficient (standard errors SSE) results of complete model with interaction on price sensitivity.

Table 5. Principal component analysis of all available household variables.

Table A3. The 24 h regression coefficient (standard errors) results of household attributes on price sensitivity: Real-Time Price group.