Watt’s up at Home? Smart Meter Data Analytics from a Consumer-Centric Perspective

The key advantage of smart meters over traditional metering devices is their ability to transfer consumption information to remote data processing systems. Besides enabling the automated collection of a customer’s electricity consumption for billing purposes, the data collected by these devices makes the realization of many novel use cases possible. However, the large majority of such services are tailored to improve the power grid’s operation as a whole. For example, forecasts of household energy consumption or photovoltaic production allow for improved power plant generation scheduling. Similarly, the detection of anomalous consumption patterns can indicate electricity theft and serve as a trigger for corresponding investigations. Even though customers can directly influence their electrical energy consumption, the range of use cases to the users’ benefit remains much smaller than those that benefit the grid in general. In this work, we thus review the range of services tailored to the needs of end-customers. By briefly discussing their technological foundations and their potential impact on future developments, we highlight the great potentials of utilizing smart meter data from a user-centric perspective. Several open research challenges in this domain, arising from the shortcomings of state-of-the-art data communication and processing methods, are furthermore given. We expect their investigation to lead to significant advancements in data processing services and ultimately raise the customer experience of operating smart meters.


Introduction
Since the invention of electricity meters over a century ago, billions of such devices have been installed worldwide [1]. They are found in private households, commercial buildings, industrial sites, and all other domains that require electrical energy consumption to be tracked. Meters were initially realized as rotating-disc meters (also known as Ferraris meters) with mechanical displays, so transferring the actual energy consumption data used to be a labor-intensive manual process. However, with the rise of digital metering devices, so-called smart meters, the collection of electrical power consumption at much more fine-grained spatial and temporal resolutions has become possible. The digital communication interface to report the collected data represents a major advancement over rotating-disc meters in particular. As a result, meter data have started to become available at previously unimaginable temporal resolutions on the order of seconds to minutes, at building- or even apartment-level. Moreover, the resultant digital data can be easily transmitted to online data centers for storage and further data processing. This opens up unprecedented opportunities to analyze, compare, and combine such data and provide novel energy-based services to customers and grid operators alike [2].
Numerous research activities have accompanied the global roll-out of smart meters [3], seeking to exploit the information content of the collected data to its fullest extent. By leveraging and combining signal processing techniques from a wide range of domains (digital signal processing, stochastic analysis, artificial intelligence, and many others), various indicators and identifying features can be detected and extracted from smart meter data. They allow for the realization of plentiful use cases that benefit consumers, utility companies, or other stakeholders. It is noteworthy, however, that existing works have focused mainly on smart meter data analytics methods to the benefit of the power grid as a whole [4,5]. In contrast to this application domain, equally great potential lies in the provision of user-centric services based on electrical consumption data. We dedicatedly survey such use cases in this work, not least because we anticipate many more services that provide direct benefits to electricity consumers to be developed in the near future.
This review of the state of the art in smart meter data analytics applications is intended as a concise introduction that provides an overview of the range of user-centric applications for smart meter data and highlights promising future research avenues in this domain. It is organized as follows. Section 2 sets the definition of smart meter data used in this paper and highlights frequently used data (pre)processing steps. We survey consumer-centric applications based on smart meter data in Section 3, including the provision of electricity-related user feedback, the recognition of patterns or anomalies, the recognition of flexible loads, vital improvements in single-home demand forecasting, and finally the comparison and correlation of consumers based on their load profiles. Section 4 discusses the most widely encountered obstacles in developing customer-centric smart meter data services, such as missing standardization, mediocre-performing algorithms, and privacy concerns. To surmount these obstacles, we formulate the corresponding research challenges and finally conclude this review paper in Section 5.

Smart Meter Data Collection and Preprocessing
Metering the consumption of primary energy is commonplace and an everyday experience for most people. When refueling the storage tanks of oil- or gas-powered central heating systems or vehicles powered by combustion engines, fuel quantities are routinely measured so that customers are billed only for the amount added. Consumption metering has also established itself for commodities beyond fuels, such as running water, district heating, or pressurized air. However, none of these fields has seen the same enormous increase in data analytics research as the field of electrical power consumption monitoring. In fact, besides a single work on interpreting natural gas consumption [6], only the evaluation of water flows has seen scientific consideration in related works [7][8][9][10][11], primarily seeking to infer user activities based on the corresponding water demands. The reason for the surge of electrical consumption analytics is simple: With the rise of smart meters, electrical consumption data have become available in unprecedented temporal and spatial resolutions. This not only makes longitudinal analyses much easier to conduct, but the high penetration of the building stock with smart meters has also created the foundation to run data analytics at scale. The large variety of electrical consumers [12,13], coupled with their frequent use in everyday activities and the ensuing potentials to save energy, makes them viable candidates for analysis. Before surveying possible use cases and their practical implications in the following section, however, let us first revisit the definition of smart meter data and delineate them from other ways of consumption measurements in electrical power grids.
As shown in Figure 1, smart meters are located at the entry-point of a building's electrical grid connection. All power flows between the (smart) power grid and the appliances in the (smart) home can be captured at this point; smart meters can thus provide benefits on both sides. Their primary use case lies in billing consumers for the exact amount of electrical energy taken from the power grid or balancing between consumption and generation of prosumers, i.e., grid-connected entities with local generation facilities, respectively. Smart meters are thus distinct from customer-side monitoring systems, such as circuit-level or even plug-level power monitors, which rarely exhibit the same accuracy as smart meters and instead serve as data sources for smart home installations or Building Management Systems (BMS). Likewise, monitoring devices exist in transmission and distribution grids, yet their data can generally not be unambiguously attributed to a single customer. Coupled with their generally smaller number when compared to the scale at which smart meters have been rolled out, we also exclude such grid-level monitors (such as phasor measurement units) from our analysis in this paper. We specifically wish to highlight, however, that the presented data processing mechanisms and correspondingly enabled use cases can likely also find application on such devices or smart meters for quantities beyond electrical energy.
Smart electricity meters represent the state-of-the-art solution to collect, process, and forward load information to all stakeholders involved. Through direct connections to the Internet, or indirect connection using smart meter gateways [14], access to metered data is ubiquitously possible. Incentive schemes and policymakers in many countries furthermore contribute to the increasing market penetration of smart meters. This enables numerous user-centric use cases beyond billing, which we survey and categorize in Section 3. Before documenting how the full potential of smart meter data can be unleashed, we would like to note that the enablement of these use cases frequently relies on data preprocessing steps to isolate characteristic features from the stream of raw measurements provided by smart meters.

Data (Pre-)Processing
Data collected by smart meters are not always directly usable for the provision of user-centric services. At least some preprocessing steps are generally needed to create a uniform and error-free foundation for data analytics. On the one hand, many services rely on processed input data, such as a building's energy consumption during a specific period, rather than raw readings of electrical voltage levels and current flows. On the other hand, the sampling process, the analog-to-digital conversion step, and the transmission over communication channels can introduce errors and signal distortions that need to be eliminated. Proper preprocessing thus serves to transform the collected data into a unified and interpretable format, based on which user-centric services can be provided reliably. As a foundation for the use cases surveyed in Section 3, we list the typical preprocessing steps that precede the actual data analysis as follows.
First, obviously erroneous values are generally eliminated. These primarily occur due to faulty storage devices, unreliable communication channels, or buffer overflows on the transmitting or receiving devices. Readings that are not valid numbers and infeasible values (e.g., current flows exceeding the nominal circuit breaker limits by a large factor) are thus removed. Unless a long sequence of wrong data is being reported, the imputation of values and the interpolation of gaps in the sampled data (e.g., by using the impyute library [15]) is an effective means to prepare the data for further processing.
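The cleaning step described above can be sketched as follows. This is a minimal illustration in plain Python rather than the impyute-based procedure cited in the text; the function name `clean_readings`, the plausibility bound `max_valid`, and the gap limit `max_gap` are assumptions introduced only for this example, and the readings are assumed to be non-negative quantities such as RMS currents.

```python
import math

def clean_readings(values, max_valid, max_gap=3):
    """Mark invalid or infeasible readings as missing, then fill short gaps.

    values    -- raw readings (numbers, None, or NaN for missing samples)
    max_valid -- largest physically plausible reading (hypothetical bound)
    max_gap   -- longest run of missing samples that is still interpolated
    """
    # Step 1: anything that is not a finite, plausible number becomes None.
    cleaned = []
    for v in values:
        ok = (isinstance(v, (int, float))
              and not math.isnan(v)
              and 0 <= v <= max_valid)
        cleaned.append(float(v) if ok else None)

    # Step 2: linearly interpolate interior gaps of at most max_gap samples.
    n = len(cleaned)
    i = 0
    while i < n:
        if cleaned[i] is not None:
            i += 1
            continue
        start = i
        while i < n and cleaned[i] is None:
            i += 1
        gap = i - start
        # Only fill gaps that have a valid neighbour on both sides.
        if start > 0 and i < n and gap <= max_gap:
            left, right = cleaned[start - 1], cleaned[i]
            for k in range(gap):
                frac = (k + 1) / (gap + 1)
                cleaned[start + k] = left + frac * (right - left)
    return cleaned

# A NaN and an implausible spike (500 A) are both replaced by interpolation.
repaired = clean_readings([1.0, float("nan"), 3.0, 500.0, 5.0], max_valid=100.0)
```

Long gaps are deliberately left as `None` in this sketch, mirroring the caveat above that interpolation is only appropriate when no long sequence of wrong data has been reported.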
The fundamental mode of operation of smart meters is to measure raw voltage (V) and current (I) waveforms at sampling rates that allow for the computation of Root Mean Square (RMS) values, i.e., $V_{\mathrm{RMS}} = \sqrt{\frac{1}{T}\int_{0}^{T} V(t)^2 \, dt}$ and $I_{\mathrm{RMS}} = \sqrt{\frac{1}{T}\int_{0}^{T} I(t)^2 \, dt}$, with T denoting the duration of one or more mains periods and V(t) and I(t) being the voltage and current waveform signals, respectively. However, raw data are rarely communicated beyond the local system boundary due to their sheer size and their highly redundant information content [16]. Instead, smart meters typically process the raw samples locally and return one or multiple of the following parameters: RMS voltage ($V_{\mathrm{RMS}}$), RMS current ($I_{\mathrm{RMS}}$), power factor, i.e., the cosine of the phase angle between voltage and current (cos Φ), active power (P), reactive power (Q), apparent power (S), and/or the consumed electrical energy (E). In multi-phase electrical installations, parameters are either returned individually for all phases or merely available in an aggregated fashion. If a particular parameter is required but not directly provided by the smart meter, it may still be possible to calculate it from the provided parameters; this is, again, a part of the preprocessing step.
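To illustrate how the listed parameters relate to the raw waveforms, the following sketch derives them from uniformly sampled voltage and current signals covering a whole number of mains periods. The function name `electrical_parameters` and the synthetic test signal are assumptions made for this example; real meters perform these computations in dedicated hardware or firmware, and the reactive power formula used here is only exact for purely sinusoidal signals.

```python
import math

def electrical_parameters(v, i):
    """Compute RMS values and power quantities from sampled waveforms.

    v, i -- equally long lists of voltage and current samples that span
            an integer number of mains periods.
    """
    n = len(v)
    v_rms = math.sqrt(sum(x * x for x in v) / n)
    i_rms = math.sqrt(sum(x * x for x in i) / n)
    # Active power: mean of the instantaneous power v(t) * i(t).
    p = sum(a * b for a, b in zip(v, i)) / n
    # Apparent power: product of the RMS values.
    s = v_rms * i_rms
    # Reactive power magnitude; for non-sinusoidal signals this term
    # also contains distortion power, and its sign is not recovered.
    q = math.sqrt(max(s * s - p * p, 0.0))
    power_factor = p / s if s > 0 else 0.0
    return {"V_rms": v_rms, "I_rms": i_rms, "P": p, "Q": q, "S": s,
            "cos_phi": power_factor}

# Synthetic example: 230 V and 10 A (RMS), current lagging by 60 degrees.
N = 1000
v = [230 * math.sqrt(2) * math.sin(2 * math.pi * k / N) for k in range(N)]
i = [10 * math.sqrt(2) * math.sin(2 * math.pi * k / N - math.pi / 3)
     for k in range(N)]
params = electrical_parameters(v, i)
```

For this synthetic load, the result corresponds to an apparent power of 2300 VA, an active power of 1150 W, and a power factor of 0.5, which also shows how a missing parameter such as S can be derived from those a meter does report.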
To demonstrate the variability of data reported by practical smart meter deployments, Table 1 provides a brief overview of the attributes, sampling rate, and communication interface of smart meters and custom-built meters, which have been used to record publicly released electrical consumption datasets. The diversity of the provided data highlights why general data preprocessing is required to create a uniform data representation to realize consumer-centric use cases independently of the specific underlying smart meter hardware.
Table 1. Metering devices used and parameters provided in a selection of electricity datasets.

| Dataset | Smart Meter Model | Captured Parameters | Sampling Rate | Interface |
|---|---|---|---|---|
| Dataport [17] | EG3000 + EG201X (a) | V_RMS, I_RMS, P, Q, S, cos Φ | 1 Hz | Modbus |
| iAWE [18] | EM6400 (b) | V_RMS, I_RMS, P, cos Φ | 1 Hz | Modbus |
| AMPds [19] | Powerscout18 (c) | V_RMS, I_RMS, P, Q, S, E, cos Φ | 1/60 Hz | Modbus |
| RAE [20] | Powerscout24 (c) | V_RMS, I_RMS, P, Q, S, E, cos Φ | 1 Hz | Modbus |
| ECO [21] | E750 (d) | V_RMS, I_RMS, P | 1 Hz | SyM² |
| REDD [22] | custom design | V, I | 16.5 kHz | USB |
| SustDataED [23] | custom design | V, I | 12.8 kHz | USB |
| BLOND [24] | custom design | V, I | 250 kHz | TCP |

(a) eGauge. (b) Schneider Electric. (c) DENT Instruments. (d) Landis + Gyr.

Table 1 highlights one more aspect of heterogeneity in smart meter data, which is also confirmed in [25]: the temporal resolution at which the parameters are being reported. Reducing the rate at which values are being made available, i.e., downsampling smart meter data, is usually trivial and computationally lightweight, as long as the original data have undergone low-pass filtering to avoid aliasing artifacts. Commonly used methods to downsample data include subsampling, averaging, and interpolation [26,27]. Conversely, increasing the temporal resolution of data is not as trivial, but it may be required for smart meter data reported at very low sampling rates. Interpolation techniques such as super-resolution [28] have been shown to achieve good performance during preliminary tests on the Dataport [17] dataset. As the sampling rate is frequently limited by the smart meter's communication channel and processing power, finding the optimal sampling rates for various electricity load analysis algorithms has been investigated in numerous works (e.g., [29][30][31][32]). Similarly, lossy compression mechanisms [33,34] and pattern recognition methods [16] have been investigated as candidates to maintain high temporal resolutions while reducing the volume of exchanged data.
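Two of the downsampling methods named above, subsampling and averaging, can be sketched in a few lines. The function names are hypothetical and chosen for this example only. Block averaging acts as a simple moving-average low-pass filter before decimation, whereas plain subsampling assumes that the input has already been filtered appropriately.

```python
def downsample_mean(samples, factor):
    """Average consecutive blocks of `factor` samples (e.g., 1 Hz to 1/60 Hz
    with factor=60). Trailing samples that do not fill a block are dropped."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n_blocks = len(samples) // factor
    return [sum(samples[b * factor:(b + 1) * factor]) / factor
            for b in range(n_blocks)]

def subsample(samples, factor):
    """Keep every `factor`-th sample; assumes prior low-pass filtering."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return samples[::factor]

averaged = downsample_mean([1, 2, 3, 4, 5, 6, 7], 3)   # [2.0, 5.0]
decimated = subsample([1, 2, 3, 4, 5, 6, 7], 3)        # [1, 4, 7]
```

For power readings, averaging has the additional advantage that the energy contained in each block is preserved, which is not the case for subsampling.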

Extracting Higher-Level Information
While inspecting conditioned smart meter data may be of interest for tech-savvy users or grid operators, it has been shown to provide little benefit to the average consumer, according to Serrenho et al. [35]. Consumer-relevant information, such as that discussed in Section 3, must first be inferred from the consumption data by extracting higher-level information. This includes signal features, transient events, or individual appliance consumption data. Calculating these features from the consumption data is a widely used preprocessing step that goes beyond the data cleansing and adaptation steps described in Section 2.1. Instead, it is used to eliminate redundant information and only retain the most informative features about the consumption data. Besides this, it also generally leads to implicit data compression, e.g., to utilize the available communication channels optimally or to reduce the input size for machine learning algorithms. Domain experts have introduced and compared numerous features in related works [36][37][38]. For example, Kahl et al. [36] evaluated 36 features such as the voltage and current trajectory or the harmonic energy distribution for their suitability to serve as distinctive higher-level features for the enablement of user-centric services. Because of their virtually ubiquitous usage, we survey a selection of methods to extract higher-level features from smart meter data as follows.
Many user-centric use cases for smart meter data rely on the analysis of user-induced events, e.g., when electrical appliances are being switched on or off, or their mode of operation is changed. In Table 2, we summarize the number of such power events found in a selection of publicly available electricity datasets. The average of the tabulated values is approximately 275 events per day, i.e., approximately one event every 6 min. As such, the Switch Continuity Principle (SCP), first introduced by Hart [39] and confirmed to hold by Makonin [40], states that the total number of events is small compared to the number of samples in the overall signal. In other words, events can be assumed to be anomalies in the signal, which makes it possible to utilize a range of known methods for their detection [41]. In practice, event detection algorithms span the range from computationally lightweight solutions (e.g., using thresholds between successive power samples [39,50,51]) to the application of probabilistic models and voting methods [52][53][54]. More recently, the application of even more complex filters to electrical signals was proposed in order to suppress minor fluctuations while emphasizing actual events. Trung et al. [55] used a CUmulative SUM (CUSUM) filter to clean the power signal, while Wild et al. [56] applied a Kernel Fisher Discriminant Analysis (KFDA) on harmonics of the current signal. De Baets et al. [57] used spectral components of the current signal which have been smoothed using an inverse Hann window in the Cepstral domain, and the method of Cox et al. [58] solely uses the voltage signal and extracts the spectral envelope of the first and third harmonics.
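The computationally lightweight end of this spectrum, i.e., thresholding the difference between successive power samples, can be sketched as follows. This is a generic illustration and not a reproduction of any of the cited detectors; the function name `detect_events` and the threshold value are assumptions.

```python
def detect_events(power, threshold):
    """Return (index, delta) pairs where the power changes by at least
    `threshold` watts between two successive samples.

    A positive delta suggests a switch-on event, a negative one a
    switch-off event. Slow ramps spanning several samples may be missed
    or reported more than once by this simple scheme.
    """
    events = []
    for k in range(1, len(power)):
        delta = power[k] - power[k - 1]
        if abs(delta) >= threshold:
            events.append((k, delta))
    return events

# A 2 kW load switching on at sample 3 and off at sample 6.
trace = [100, 101, 99, 2100, 2098, 2101, 100, 100]
events = detect_events(trace, threshold=500)   # [(3, 2001), (6, -2001)]
```

Because minor fluctuations (here, a few watts) stay below the threshold, only the two actual switching events are reported, which reflects the assumption that events are rare compared to the number of samples.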
Data collection from smart meters implies that data are only available on the scale of buildings or apartments (cf. Figure 1). Consequently, the energy consumption of individual electrical consumers is not directly identifiable within the reported (aggregate) data. The concept of Non-Intrusive Load Monitoring (NILM) thus refers to the process of disaggregating a composite electrical load into the contributions of all individual consumers. NILM methods frequently utilize machine learning techniques or neural networks to this end [59][60][61][62][63][64][65][66][67][68][69]. This makes their execution on current-generation smart meters largely impossible. However, it is generally possible to send collected data to external entities that offer the required storage and processing capabilities to perform NILM and thus provide appliance-level consumption values. As will become apparent in Section 3, several use cases can benefit from the availability of appliance-level data. The use of NILM, which comes at the advantage of requiring no additional metering devices to be deployed, is thus a widely usable data preparation method to enable additional user-centric use cases when smart meter data is available.
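To make the notion of disaggregation concrete, the following sketch assigns a single aggregate power reading to the combination of appliances whose rated powers best explain it. This brute-force combinatorial search is only a didactic stand-in for the machine learning methods cited above; the appliance names and power ratings are hypothetical, and the approach scales exponentially with the number of appliances.

```python
from itertools import combinations

def disaggregate_sample(aggregate_w, appliance_w):
    """Find the set of appliances whose summed rated power is closest to
    the aggregate reading.

    aggregate_w -- aggregate power reading in watts
    appliance_w -- dict mapping appliance name to its rated power in watts
    Returns (set_of_active_appliances, absolute_residual_in_watts).
    """
    names = sorted(appliance_w)
    # Start from the hypothesis that no appliance is running.
    best_subset, best_err = (), abs(aggregate_w)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            err = abs(aggregate_w - sum(appliance_w[n] for n in subset))
            if err < best_err:
                best_subset, best_err = subset, err
    return set(best_subset), best_err

ratings = {"kettle": 2000, "fridge": 100, "tv": 150}   # hypothetical values
active, residual = disaggregate_sample(2105, ratings)
```

In this example, the reading of 2105 W is best explained by the kettle and the fridge running simultaneously, with a residual of 5 W.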

Consumer-Centric Use Cases of Smart Meter Data
While it is crucial for the operators of electrical power grids to understand the load and generation characteristics [5] in order to ensure grid stability and avoid power outages, electrical parameters can also be used to provide services to the benefit of the customers. Figure 2 depicts the primary services that can be realized when smart meter data and the corresponding higher-level information are available. We provide more details about the enabled use cases as follows.

[Figure 2: Smart meter data, electricity production data, and further inputs pass through data (pre-)processing to enable services such as user feedback, demand-side flexibility, load forecasting, and load profiling.]

Providing User Feedback
One of the vital value propositions of smart meter deployments is providing near realtime and historical information on electricity consumption to the customers. Having access to such information is expected to result in the adoption of more sustainable consumption behavior, and thus to ultimately lead to energy savings [70][71][72]. Feedback on electricity consumption has been provided in numerous ways, including In-Home Displays (IHDs) [73,74], ambient displays [75,76], web and mobile applications [77][78][79], and public displays [80,81]. While the majority of the works focused on providing information only to the home residents, other studies also looked at the potential of social pressure by enabling direct comparisons between individual consumers or consumer groups [82,83].
A meta-review of 118 studies that involved providing feedback on electricity consumption is presented in [35]. In general, the surveyed studies report that feedback can reduce a household's energy consumption by 5 % to 10 %, particularly in cases where the deployed systems are able to provide consumption information of individual appliances. The potential of feedback to yield energy savings was also confirmed in [84], where 12 studies on the efficacy of disaggregated feedback were examined. There, an average energy reduction of 4.5 % was reported across the surveyed studies. Long-term results on how to sustain the accomplished energy savings have not been reported. However, many works have identified that, without proper engagement strategies, end-users lose considerable interest in the feedback devices once habituation sets in, which can happen after as little as four weeks (e.g., [85][86][87][88]). Nevertheless, it is evident from the literature that, through visualizing smart meter data in a timely and intuitive way, consumers become increasingly literate in understanding their domestic energy consumption, and in particular how unintentional behavior can lead to unnecessary consumption [89,90].
With increasing distributed Renewable Energy Sources (RES), such as rooftop Photovoltaic (PV) installations, it also becomes increasingly important to aid users in aligning their consumption habits to their local generation [91,92]. As a result of this trend, energy feedback has received renewed interest to enable prosumers, i.e., consumers with local production facilities, to interact with the power grid optimally. Even at larger scales (e.g., smart microgrids [93]), the emergence of Peer-to-Peer (P2P) energy markets requires prosumers to have an understanding of the saving potentials and the consequences of their actions, both of which can be conveyed through feedback systems [94][95][96]. One such use case is practically studied in [97], confirming that user feedback was consistently utilized throughout the entire duration of the study (4.5 months) in order to make or defer consumption decisions.

Recognizing Patterns and Anomalies
Finding consumption patterns that do not conform to expected behavior is another consumer-centric use case for smart meter data. Even though detecting anomalies in smart meter data is challenging, signal processing and machine learning techniques can efficiently be utilized for this purpose. For example, detecting anomalies in smart meter data can be used to enable Ambient Assisted Living (AAL), where consumption patterns are indicative of the Activities of Daily Living (ADLs) executed by the residents [98][99][100][101]. Unusually short or long ADLs, or unexpected ADL sequences in general, are often suitable indicators of unusual user behavior. Knowledge of such situations can help to alert relatives early and thus contribute to safety and well-being [102]. Several different algorithmic approaches have been used to accomplish the recognition of patterns and anomalies. Clement et al. [98] presented a semi-Markov model that describes the daily use of appliances to detect human activity/behavior from smart meter data. In [99], smart meter data are analyzed to identify the behavioral patterns of the occupants, and Bousbiat et al. [100] proposed a framework for detecting abnormal ADLs from smart meter data.
Further use cases based on the application of machine learning for anomaly detection in smart meter data have emerged and manifested themselves in areas such as energy theft detection [103,104], detecting inaccurate smart meters [105], and detecting abnormal consumption behavior in general [106]. In [104], two anomaly detection schemes for detecting energy theft attacks and locating metering defects in smart meter data are presented. The work by Sial et al. [106] investigates heuristic approaches for identifying abnormal energy consumption from smart meter data, based on a combination of four distinct power-, energy-, and time-related features used in conjunction to detect anomalies. An even more sophisticated approach was presented by Liu et al. [105], who applied a deep neural network in detecting inaccurate meters to prevent the unnecessary replacement of smart meters, thus increasing their service life span. Lastly, the detection and quantification of anomalies in smart meter energy data play a crucial role in assessing the energy quality, which is essential for detecting faulty appliances, malfunctioning appliances, and non-technical losses [107][108][109][110].
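As a simple point of reference for the heuristic end of these approaches, the sketch below flags days whose energy consumption deviates strongly from a household's typical level. It uses the modified z-score based on the median and the median absolute deviation (MAD), a standard robust outlier statistic. It does not reproduce the feature sets or models of the cited works, and the function name, the example data, and the cut-off of 3.5 are assumptions.

```python
import statistics

def flag_anomalous_days(daily_kwh, cutoff=3.5):
    """Return indices of days whose modified z-score exceeds `cutoff`.

    The modified z-score is 0.6745 * (x - median) / MAD, which is less
    sensitive to the outliers themselves than mean and standard deviation.
    """
    median = statistics.median(daily_kwh)
    mad = statistics.median(abs(x - median) for x in daily_kwh)
    if mad == 0:
        # Consumption is (almost) constant; the score is undefined.
        return []
    return [idx for idx, x in enumerate(daily_kwh)
            if abs(0.6745 * (x - median) / mad) > cutoff]

# One week of hypothetical daily consumption; the last day is unusual.
week = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 30.0]
anomalies = flag_anomalous_days(week)   # [6]
```

A flagged day is only a candidate anomaly; whether it indicates a faulty appliance, a metering defect, or simply a change in occupancy requires further context.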

Enabling Demand-Side Flexibility
Demand-side flexibility (DSF) refers to the portion of electricity demand that can be reduced, increased, or shifted within a specific time window. DSF plays a crucial role in the smart grid by facilitating the integration of RES and reducing peak load demand [111]. Traditionally provided by industrial consumers (e.g., refrigerated warehouses and steel mills [112]), flexibility can also be provided to operators by domestic and commercial consumers through controllable appliances and Electric Vehicles (EVs), e.g., by triggering them to change their consumption profiles [111]. While each consumer is only able to supply a limited amount of flexibility, once controllable consumers (and RES) of multiple dwellings are aggregated, their flexibility can add a significant volume of DSF to the grid. Ultimately, this leads to direct and indirect benefits to a larger group of consumers. On the one hand, it enables an additional revenue source by offering controllable loads to help make demand and supply meet. On the other hand, balanced power grids have a more favorable eco-footprint and an overall lower cost of generation, resulting in cheaper energy tariffs. Nevertheless, this flexibility is highly dependent on consumer behaviors, which correspondingly affects their willingness to provide flexible loads [113]. In this context, smart meter data are crucial to understand the potential of device-level flexibility on the consumer's premises [114][115][116].
In [114], the authors presented one of the first works that analyzed appliance-level consumption data in order to determine the device's flexibility and its relation to device operations and usage patterns. The work shows that a significant percentage (50 % on average) of the total energy demand for a house can be considered to provide flexibility. The results of a pilot study in Belgian households are reported in [115]. Five types of appliances available within residential premises were considered (washing machines, tumble dryers, dishwashers, domestic hot water buffers, and EVs) and assessed concerning their availability for DSF. The authors concluded that, except for EVs, the DSF potential is highly asymmetrical among appliances, possibly associated with user routines. The authors also estimated that EVs and water heaters have a flexibility potential that is much greater than that of wet appliances. In [116], the authors proposed and evaluated a data-driven approach to quantify the potential of flexible loads for participation in DSF programs. Their approach considered EVs, wet appliances (dryer, washing machine, and dishwasher), and Air Conditioning Unit (AC) loads and was evaluated on data from over 300 households from the Pecan Street project [117]. Analogous to previous works' results, the study confirms that variations in providing flexibility are considerable among households. Besides this, the results show that EVs and ACs provide higher levels of flexibility compared to wet appliances. As can be observed, in the context of DSF, EVs are of particular interest to the end-users since beyond sustainable transportation, they provide additional benefits like charging flexibility and a non-stationary energy storage solution [118,119].
While these and other works (e.g., [119][120][121]) assume that individual appliance consumption profiles are readily available, other researchers tried to assess the flexibility of domestic loads relying on NILM (cf. Section 2.2) to extract their individual consumption [122][123][124]. The main motivations for this approach are twofold: (1) avoid the costs of instrumenting the household with sensors in the individual appliances; and (2) protect the consumer privacy by not directly revealing data about individual appliances consumption (see Section 4.3). Ultimately, the obtained results show that it is possible to estimate and predict device-level flexibility from NILM outputs, even though a high disaggregation performance is necessary to reduce the uncertainty of the DSF estimation.

Forecasting Power Demand and Generation
The level of detail made available by smart meters opens several opportunities for load forecasting at the individual building level. Forecasting the electricity consumption using smart meter data plays a significant role in energy management for end-customers by enabling the possibility of linking current usage behaviors to future energy costs [125]. Similarly, anomaly detection (as discussed in Section 3.2) is often closely related to the comparison of actual and predicted consumption (or generation) behavior; as such, efficient and accurate forecasting techniques are required. Forecasting individual household demands is particularly challenging, however, due to many contributing factors. These include, but are not limited to, user behavior, appliance ownership, the considered time period(s), and/or external factors such as the prevailing weather conditions. Against this background, researchers have proposed many forecasting approaches. For example, in [126], four of the most widely used machine learning methods, namely Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), Classification and Regression Tree (CART), and Long Short-Term Memory (LSTM), are used to provide forecasts of both the daily consumption peak and the hourly energy consumption of domestic buildings using historical consumption data. It was found that MLPs and especially LSTM-based approaches can significantly improve short-term (24 h) demand forecasting, as these models can capture the underlying non-linear relationships best. Several authors have tried to incorporate information from external factors into the forecasting algorithms. For instance, Amin et al. [127] proposed three different models, namely Piecewise Linear Regression (PLR), Auto-Regressive Integrated Moving Average (ARIMA), and LSTM, to forecast the electricity demand of a building leveraging smart meter data and weather information. A similar approach was followed by Gajowniczek and Ząbkowski [125]. However, instead of considering the effect of weather details, the authors focused on enhancing the forecasting algorithms by considering the impact of the residents' behavior patterns. The general consensus is that the combination of historical usage data and external features such as weather and household behavior can provide significant improvements to the forecasting results. Furthermore, these authors also confirm the suitability of LSTM models for short-term (24-48 h) forecasting. The work by Dinesh et al. [128] demonstrates a novel method to forecast the power consumption of a single house based on NILM and affinity aggregation spectral clustering. The presented work incorporates human behavior and environmental influence in terms of calendar and seasonal contexts to improve the forecasting performance for individual appliances. The house-level forecast is thus obtained by the aggregation of the individual appliance-level forecasts.
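As a point of reference for learning-based forecasting models, a simple seasonal-naive baseline can be sketched as follows. It is not one of the methods evaluated in the cited works; it merely illustrates how a day-ahead forecast can be derived from historical smart meter readings and compared against actual consumption. The function names, the hourly resolution, and the synthetic load values are assumptions made for this example.

```python
from statistics import mean
from typing import List, Sequence


def seasonal_naive_forecast(
    history: Sequence[float], period: int = 24, days: int = 3
) -> List[float]:
    """Forecast the next `period` values as the per-slot mean of the last `days` periods."""
    if len(history) < period * days:
        raise ValueError("history must cover at least `days` full periods")
    recent = history[-period * days:]
    return [
        mean(recent[d * period + h] for d in range(days))
        for h in range(period)
    ]


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute deviation between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic hourly load in kWh (illustrative): low at night, peak in the evening.
profile = [0.2] * 6 + [0.5] * 12 + [1.0] * 6
history = profile * 3                   # three identical past days
forecast = seasonal_naive_forecast(history)
actual = [1.1 * x for x in profile]     # the next day turns out 10 % higher
error = mean_absolute_error(actual, forecast)
print(f"MAE: {error:.3f} kWh")
```

A deviation between actual and forecast values that exceeds the typical error of such a baseline can also serve as a simple trigger for the anomaly detection use case mentioned above.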
Prosumers in general, but mainly when they own micro-production units (e.g., PV or wind generators) and Energy Storage Systems (ESS), can use forecasting to optimize and manage these resources. On the one hand, consumption forecasting techniques can help users to anticipate their future energy needs, so they can plan their local generation and optimize the operation of their ESS accordingly. On the other hand, users can also support the operation of the electricity grid by taking control actions to balance the electricity supply and demand while maximizing self-consumption and profiting from energy arbitrage (i.e., trading electricity by purchasing energy at times the price is low and selling it when it is expensive) [129,130]. For example, Hashmi et al. [129] proposed an algorithm to control the ESS in the presence of dynamic pricing, whereas Hashmi et al. [130] optimized the ESS to maximize the PV self-consumption in a scenario where there is no reward for feeding energy into the power grid. In either case, forecasting the future demand is necessary to decide when to charge or discharge the ESS. Particularly, if feeding surplus power into the power grid is not rewarded [130], an understanding of the residual load (i.e., the difference between consumption and production) is necessary, generally based on forecasts of the local production and demand, in order to avoid unintended grid injection or PV curtailment. Intuitively, these optimizations are sensitive to forecasting errors. For example, Kiedanski et al. [131] showed that when the optimizations are performed at higher sampling rates (every 15 min in this work), the negative implications of forecasting errors are limited. In contrast, the authors stated that lower sampling rates (e.g., a 12 h forecasting horizon) require almost perfect forecasts to unleash their full potential to optimize ESS operations.
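The role of the residual load in ESS operation can be illustrated with a minimal sketch. The code below is not the algorithm of [129] or [130]; it is a greedy rule, assumed here for illustration, that charges the storage from surplus PV production and discharges it to cover demand. Round-trip efficiency and charging power limits are deliberately ignored, and the function name `simulate_ess` as well as all numeric values are hypothetical.

```python
from typing import List, Sequence, Tuple


def simulate_ess(
    consumption: Sequence[float],
    production: Sequence[float],
    capacity_kwh: float,
    soc_kwh: float = 0.0,
) -> Tuple[List[float], List[float], float]:
    """Greedy storage control on (forecast) consumption and production in kWh per step.

    Returns the energy imported from the grid per step, the surplus per step that
    can neither be consumed nor stored, and the final state of charge.
    """
    grid_import: List[float] = []
    surplus: List[float] = []
    for c, p in zip(consumption, production):
        residual = c - p  # residual load: consumption minus production
        if residual < 0:
            # Surplus production: charge as much as the free capacity allows.
            charge = min(-residual, capacity_kwh - soc_kwh)
            soc_kwh += charge
            grid_import.append(0.0)
            surplus.append(-residual - charge)
        else:
            # Demand exceeds production: discharge before importing from the grid.
            discharge = min(residual, soc_kwh)
            soc_kwh -= discharge
            grid_import.append(residual - discharge)
            surplus.append(0.0)
    return grid_import, surplus, soc_kwh


# Illustrative four-step horizon with a 3 kWh storage that starts empty.
grid_import, surplus, final_soc = simulate_ess(
    consumption=[1.0, 1.0, 1.0, 2.0],
    production=[0.0, 3.0, 3.0, 0.0],
    capacity_kwh=3.0,
)
print(grid_import, surplus, final_soc)
```

When the inputs are forecasts rather than measurements, the `surplus` series indicates when unintended grid injection or PV curtailment is to be expected, which is exactly where forecasting errors affect the outcome.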
With the increasing number of EV sales and the high power consumption of EVs during charging, it is also necessary to forecast their charging needs, as this will allow for better scheduling and capacity planning [132,133]. Ai et al. [133] attempted to forecast household day-ahead charging needs using machine learning ensembles. Such forecasts gain particular importance if the EV owners are also prosumers, since in these cases their EVs also function as an ESS. The ability to increase self-consumption and reduce peak demand using EVs was studied by Fachrizal and Munkhammar [134], who showed that, in a single (Swedish) household, the self-consumption could be increased up to 8.7 %. However, this result was obtained in the presence of perfect load demand and PV production forecasts, which again raises the question of sensitivity to forecasting errors. In sum, as a growing number of research works indicate that EV owners generally favor domestic over public charging infrastructures (e.g., [135][136][137]), it becomes evident that accurate load demand and production forecasts will gain increasing importance in the near future.

Load Profiling
Standard load profiles [138], i.e., averaged models of customer energy consumption over time, have traditionally found their application in power grid capacity planning. However, standard load profiles are only accurate when considering many connected customers in conjunction and generally do not adequately reflect individual consumers' consumption characteristics. Smart meters can mitigate this situation and allow load profiles to be captured at an unprecedented resolution. The resulting understanding of energy consumption profiles empowers users not only to better recognize how much energy they consume but also to compare their consumption profiles to the profiles of other dwellings [139]. This gives households greater control of their energy consumption and enables the adoption of more energy-efficient and responsible behaviors [139][140][141][142]. Instead of considering the load profiles of buildings individually, it is often sufficient to know the category that best describes the dwelling. In other words, by categorizing the electrical power consumption, it is possible to approximate the load profile of a household sufficiently well. It is therefore not unexpected that most of the existing load profiling techniques rely on clustering algorithms, such as k-means [143][144][145], fuzzy k-means [143], hierarchical clustering [143,146], Self-Organizing Maps (SOM) [143], neural network-based clustering [147][148][149], Gaussian Mixture Models (GMM) [150,151], Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [152], and agglomerative clustering [153]. Due to the high stochasticity and irregularity of household-level consumption, clustering techniques that analyze the variability and uncertainty of smart meter data have also been considered in the literature [150,151]. For example, the work by Lee et al. [143] proposes a two-stage (feature extraction and load pattern identification) k-means clustering for customer segmentation in residential demand response programs.
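The clustering-based categorization of load profiles can be sketched with a plain k-means (Lloyd's algorithm) implementation. This is a generic textbook variant and not the two-stage method of [143]; the four-slot daily profiles, the explicit initial centroids, and the function name `kmeans` are assumptions made to keep the example small and deterministic.

```python
from typing import List, Sequence, Tuple

Profile = Sequence[float]


def _squared_distance(a: Profile, b: Profile) -> float:
    """Squared Euclidean distance between two equally long profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    profiles: List[Profile], initial_centroids: List[Profile], iterations: int = 20
) -> Tuple[List[int], List[List[float]]]:
    """Assign each profile to its nearest centroid and update centroids iteratively."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every profile.
        labels = [
            min(range(len(centroids)), key=lambda j: _squared_distance(p, centroids[j]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(profiles, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Illustrative daily profiles in kWh for four slots: night, morning, afternoon, evening.
profiles = [
    [0.2, 0.4, 0.3, 1.2],  # evening peak
    [0.3, 0.5, 0.3, 1.0],  # evening peak
    [0.2, 0.9, 1.0, 0.4],  # daytime use
    [0.3, 1.1, 0.9, 0.3],  # daytime use
]
labels, centroids = kmeans(profiles, initial_centroids=[profiles[0], profiles[2]])
print(labels)
```

Each resulting centroid acts as the representative load profile of its category, so that a new dwelling can be described by the index of its nearest centroid instead of its full consumption trace.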
Load profiling results have been documented to find use in supporting and enhancing continuous energy audits in buildings that currently require multiple measurements [61,154]. Furthermore, the insights generated from load profiling can be used to enhance many other use cases of smart meter data. Eco-feedback techniques often utilize load profiles, e.g., to compare the consumption of individual days or different homes (see Section 3.1). For instance, in [141], an algorithm for computing the carbon footprint derived from load curves is presented. Likewise, load profiles can be used along with load demand forecasts (see Section 3.4) to generate optimal schedules of home appliance usage. This is presented in [155], in which a NILM-based energy management system was developed to schedule controllable loads taking into account customers' preferences and overall satisfaction.
In summary, load profiling at the end-user level enables and potentiates consumer-centric services. Furthermore, as load profiling can play an essential role in assisting the smart grid, it will become even more relevant to the individual consumer when Distributed Energy Resources (DERs) and local energy communities become ubiquitous and require the active participation of citizens [156].

Open Research Challenges
The range of customer-centric use cases for smart meter data contributes to numerous areas of daily living. Besides allowing for monetary savings, grid-friendly appliance scheduling, and the detection of atypical and anomalous appliance operations, it has been shown to serve as the enabling technology for AAL as well as the integration of RES, ESS, and EVs. Many of the underlying research challenges have been solved to a satisfactory degree to date, and corresponding commercial solutions are already on the market. During our survey of user-centric applications in Section 3, however, we identified obstacles to the enablement of the services, which potentially impact their widespread acceptance. We thus summarize the most important observed challenges as follows.

Standardized Hardware and Data Formats
As stated in Section 2, there are no universally acknowledged definitions of: (1) the parameters to be reported by smart meters; (2) the temporal resolutions at which they are being made available; and (3) the interface using which service providers can access these data. As such, delivering the use cases to customers generally requires non-negligible adaptation efforts. A widespread approach to achieve compatibility nowadays is when the same company that rolls out and operates the smart meters also acts as the service provider. This "vendor lock-in", however, severely hampers the scale of services that can be provided, as well as their interoperability with other external services. Thus, creating an open ecosystem in which different stakeholders can synergistically combine their (often complementary) components to create an environment that leverages the full potential of smart meter data is currently not possible from a technological point of view. As one first step towards overcoming this obstacle, the International Electrotechnical Commission (IEC) has established a dedicated technical committee (TC 85) for the standardization of equipment, systems, and methods for the analysis of steady-state and transient electrical quantities. One of the committee's publications, DIN IEC/TS 63297, is a project report on "Sensing Devices for Non-Intrusive Load Monitoring" [157], seeking to unify the access to the required data from smart metering devices. Despite the ongoing standardization efforts, however, solutions that cater to the needs of metering and consumer-centric service providers alike remain to be found and widely adopted.
Besides the limited access to the electrical parameters measured by a smart meter, a second major impediment to the roll-out of services is the unavailability of a local execution environment for data processing code. This is particularly relevant for addressing privacy considerations, as detailed in Section 4.3. Most smart meters do not offer possibilities to run code apart from the device's (metering) firmware. While this is generally intentional, to prevent tampering with the reported data (e.g., to avoid electricity theft, cf. Section 3.2), it does not allow any of the user-centric services to be executed directly on the smart meter. As technology advances and embedded devices become increasingly capable in terms of processing power and the ability to run user code in dedicated sandboxes, the provision of an execution environment on smart metering devices appears to be a promising and potentially groundbreaking approach. Retrofitting existing and upcoming smart meter generations will represent a challenge, given the expected operational life of smart meters (often more than a decade) and, more importantly, the expected evolution of software frameworks in the same period. One viable solution is to offload these services to dedicated data processing devices, such as local set-top boxes, edge-clouds, and Multi-access Edge Computing (MEC), or cloud computing in general. The widespread adoption of 5G networks (and more recent developments, such as 6G) will allow sending data of even finer temporal resolution to external processing devices. It can be expected that communication will no longer be the bottleneck; thus, the execution of services can be distributed across the aforementioned range of possible processing devices in order to meet consumer-specific expectations regarding reliability, privacy, and real-time needs. However, this requires standardized interfaces, (real-time) transport protocols, and data formats aligned with our above observations.
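To illustrate what a self-describing data format could look like, the following sketch defines a purely hypothetical reading record that states the reported parameter, its unit, and the temporal resolution explicitly. It does not correspond to any existing standard or to the IEC documents mentioned above; the class name `MeterReading` and all field names are assumptions made for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MeterReading:
    """Hypothetical, self-describing smart meter reading (not a standardized format)."""

    meter_id: str        # identifier of the reporting device
    timestamp_utc: str   # start of the interval, ISO 8601
    interval_s: int      # temporal resolution in seconds
    quantity: str        # reported parameter, e.g., "active_energy"
    unit: str            # unit of the value, e.g., "kWh"
    value: float


def to_json(reading: MeterReading) -> str:
    """Serialize a reading to a JSON string with a stable key order."""
    return json.dumps(asdict(reading), sort_keys=True)


def from_json(payload: str) -> MeterReading:
    """Reconstruct a reading from its JSON representation."""
    return MeterReading(**json.loads(payload))


reading = MeterReading(
    meter_id="meter-001",
    timestamp_utc="2021-01-01T00:00:00Z",
    interval_s=900,
    quantity="active_energy",
    unit="kWh",
    value=0.25,
)
restored = from_json(to_json(reading))
print(to_json(reading))
```

Because every record carries its own resolution and unit, a service provider could consume data from meters of different vendors without vendor-specific adaptation code, which is the kind of interoperability that the standardization efforts aim for.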

Innovative Consumer-Centric Data Processing Algorithms
Smart meter data analytics services are frequently based on the combination of data preprocessing with novel methods to find correlations, patterns, and outliers in the available input data. Although the concept of electrical load signature analysis has been investigated since 1985 [158], the underlying preprocessing methods (e.g., event detection and NILM) are still not perfect and yield mediocre disaggregation performances in certain settings [159]. Sometimes this limitation can be circumvented by enriching electrical data with other sensed parameters (e.g., ambient conditions) and combining the data collected from different dwellings. However, only when the full amount of information can be extracted from smart meter data, the complete spectrum of user-centric services can be realized. Increasing the data processing methods' reliability and accuracy is crucial for widespread user acceptance and remains a significant future research challenge.
However, the sole availability of a range of data processing services does not necessarily lead to their ubiquitous adoption. Rather, the selection of useful and necessary services is expected to differ significantly between users. Fitting all smart meters with the same processing methods will thus not only incur the excessive and unnecessary use of computational power but still not serve all customers' needs equally well. Ultimately, we expect customers to utilize services depending on their situations. This implies that they need to selectively decide which of the services are of relevance to them. Thus, helping users identify the required services and understand the privacy implications when sharing data with the service providers (cf. Section 4.3) is crucial. This is also well-aligned with Section 4.1, confirming that a more flexible configuration of services and corresponding data sources are needed.
Lastly, we would like to recall our statement from Section 2, in which we emphasize the enormous potential of analyzing electrical signals for the provision of user-centric services. In fact, smart meter data are primarily related to electrical quantities, whose interpretation requires an in-depth understanding of electrical engineering and power engineering. Many of the use cases surveyed in Section 3 rely on the interpretation of time-series data (i.e., sequences of measurements), which, in turn, calls on the expertise of mathematicians and signal processing experts. Simultaneously, experts in artificial intelligence and machine learning (e.g., computer scientists) can contribute further data analytics methods, especially when the volume of data to process is enormous. Smart meter data analytics is thus a cross-domain challenge, and transdisciplinary research communities are indispensable for applying state-of-the-art methods to smart meter data and thereby enabling accurate service provision. Finally, we expect that the same methods that apply to smart meter data can also find their application to other metered commodities, such as water or natural gas.

User Privacy Protection
The collection of smart meter data at high temporal resolutions bears enormous potential to provide services to the electricity customers' benefit. Simultaneously, however, the collected data must be appropriately protected against access by unauthorized third parties. The reason is straightforward: Any processing method applicable to captured data (cf. Section 3) can be applied equally well by an attacker seeking to profile a building's inhabitants (see Section 3.5) or learn about their habits. Often, this includes learning about usual sequences of household activities from consumption profiles (as discussed in Section 3.2). Solutions to ensure the secure transmission of smart meter data and adequate user privacy preservation are thus indispensable.
One method to circumvent security and privacy implications from the transmission of smart meter data to centralized processing systems (e.g., cloud computing) is their purely local processing. Due to the high resource requirements of many (pre-)processing methods and services (see, e.g., Section 2.2), however, this approach cannot always be applied. Particularly, when the data processing methods depend on parameters unavailable locally, corresponding computations must be executed on remote systems. Collaborative data processing approaches, i.e., the local extraction of features and their forwarding (devoid of most sensitive information) to remote data processing centers, represent an important future research direction. As a side effect, this also increases the services' adherence to the "data minimization" and "purpose limitation" requirements of data protection laws (such as the European Union's General Data Protection Regulation (GDPR) [160]).
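The idea of collaborative processing, i.e., extracting features locally and forwarding only those, can be sketched as follows. The chosen features (energy, mean power, and peak power per reporting window) and the function name `extract_features` are assumptions made for illustration; whether a given feature set is sufficiently devoid of sensitive information depends on the service and must be assessed separately.

```python
from typing import Dict, Sequence


def extract_features(power_w: Sequence[float], interval_s: float) -> Dict[str, float]:
    """Condense raw power samples (in W) of one window into a few aggregate features."""
    if not power_w:
        raise ValueError("at least one sample is required")
    # Energy in kWh: sum of W * s gives joules; 3.6e6 J correspond to 1 kWh.
    energy_kwh = sum(power_w) * interval_s / 3_600_000.0
    return {
        "energy_kwh": energy_kwh,
        "mean_w": sum(power_w) / len(power_w),
        "peak_w": max(power_w),
    }


# Illustrative window: four samples at 15-minute resolution.
raw_samples = [100.0, 200.0, 300.0, 400.0]
features = extract_features(raw_samples, interval_s=900)
print(features)  # only this dictionary would leave the premises, not raw_samples
```

Only the resulting dictionary is forwarded to the remote system, while the raw samples remain on the local device, which reduces both the transmitted data volume and the level of detail exposed to the service provider.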
When users cannot exert full control over the data their smart meters report, covering up characteristics in smart meter traces to hide user actions/intentions may also be necessary to protect their privacy. Current approaches mostly realize this functionality by operating controllable generators or consumers to obfuscate the operation of sensitive appliances (e.g., [161,162]) or by intentionally falsifying reported data [27]. The potentially negative impact on the achievable services based on smart meter data, however, needs to be weighed up individually by clients against their willingness to pay the "cost of privacy" [163].

Conclusions
The operation of smart electrical power grids has become unimaginable without the opportunity to capture the status of grid-connected consumers in real-time and at fine resolution. Processing smart meter data has traditionally been centered around use cases that benefit the operations of electricity providers and the stability of the power grid [5]. The range of services that are tailored to the needs of end-customers is still comparably small. In this review paper, we present and discuss the range of use cases that are enabled through the collection of smart meter data but primarily benefit the consumers of electrical energy. We believe that three major preconditions are crucial for the long-term establishment of user-centric service provision. First, smart meters and the corresponding data processing mechanisms must be capable of reporting accurate information. They must undergo continuous improvements in order to extract the information content to the fullest extent possible. Second, adequate measures must be provided to protect user privacy. Established methods to provide secure networking must be combined with meaningful local preprocessing steps to remove sensitive features before data leave the customers' premises. Third, not all services apply to all users in the same way. A dedicated ecosystem, such as an "app store" for energy-based services (similar to the proposition in [164]), thus represents a viable option to allow consumers to individually subscribe to their desired services and understand the ensuing privacy implications. The range of user-centric data analysis methods, as surveyed in this work, can then be executed either locally or with the help of remote execution environments. A corresponding ecosystem will ultimately make it possible for both developers and providers of smart meter data processing methods to easily offer novel services, and simultaneously lower the barrier for customers to consume these services and avail of their benefits.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: