1. Introduction
Water is an essential resource for survival, and ensuring a good supply represents a huge challenge in metropolitan areas. In recent years, drought and the low flow during peak hours have changed the demand for water [1]. The emergence of digital water meters (DWMs) represents a major benefit in terms of managing water demand, being an important step in water conservation.
Several recent studies on water management have empirically shown the importance of water consumption awareness. In one use case [2], when the examined households received information on their own large and excessive consumption, a significant reduction in consumption was noted. In practice, summarizing the collected data and sending it to consumers as feedback can lead them to self-educate and change their habits to reduce water consumption. Similarly, an average decrease of 5.5% in water consumption can be obtained when consumers follow the feedback they receive [2].
The data collected from DWMs can be used to create short-term water demand forecasting models, as shown by [1], which reported a significant improvement in peak demand. Moreover, data can be structured into five main categories [3]: (1) water use feedback; (2) water event categorization; (3) water demand forecasting; (4) behavior analysis; (5) socioeconomic analysis.
By overcoming technological barriers, current systems can be optimized by integrating recommendation systems (RSs). There is a close connection between personalized feedback and its effects on water conservation. Thus, personalization can be obtained through proper profiling of users. RSs highlight the preferences and habits of consumers, offering consumption recommendations while considering the needs of users. User profiling consists of analyzing the collected data and extracting specific characteristics and behaviors [4]. This technique is often applied in research in various fields, such as artificial intelligence and machine learning.
Homes offer tenants a comfortable environment to spend time with family, but at the same time, the lifestyle can be influenced by their habits. Housing-related costs are considered fundamental and should be handled carefully so as not to negatively affect the needs of household residents [5]. Moreover, a household is a place where families grow and evolve, having a significant social impact and influencing their well-being [6].
Making decisions based on the consumption of resources in a household involves evaluating several aspects such as the size and location of a household, the number of inhabitants, and their occupations and habits. Moreover, the choice of housing reflects the subjective preferences of the tenants, which fall within their corresponding social and economic limits [7].
Much research has been carried out for this purpose [8,9,10], analyzing the sociodemographic attributes (age and income) correlated with the preferences of the inhabitants. Thus, specific preferences for groups that share certain demographic attributes are highlighted. However, these studies have some limitations, as they overlook users who belong to the same group but have different preferences.
Therefore, a personalized RS that incorporates and analyzes all data of the inhabitants has drawn attention, gaining ground for new research and development. Collaborative filtering technology has emerged as a highly accurate method among the various approaches employed in implementing recommender systems. To make a substantial impact on promoting water conservation practices among consumers and achieving sustainable water management, advanced machine learning techniques and data analysis methods need to be integrated into a personalized recommender system.
This paper presents advanced recommendation strategies based on urban water consumption, using data collected from sensors installed in various households with different numbers of inhabitants, different habits, ages, and occupations. The aim of this paper is to use the collaborative filtering recommender system with the rule-based recommender system to generate personalized recommendations for each household based on both their consumption patterns and the predefined rules. The novelty of this article lies in the combination of the two methods to obtain a complex system of recommendations, precise and personalized for each household.
The rest of the paper is structured as follows:
Section 2 summarizes the most relevant studies in the field of consumer profiling and recommendations;
Section 3 presents the methodology used and describes the architecture of the proposed system with details for each component, focusing on the theoretical background;
Section 4 presents the experimental results obtained using the data collected from several households, with references to the source code and data repository;
Section 5 provides a discussion of the results. Finally, general conclusions are drawn in Section 6.
2. Related Work
This section summarizes the most relevant studies in the field of urban water, with an emphasis on consumer profiling and providing personalized feedback and recommendations.
The introduction of RSs can be considered a potential solution for achieving user personalization. Nowadays, RSs can be easily applied in most fields of activity. However, the water sector is still immature in the development and integration of RSs within classic water demand monitoring systems. Researchers have recognized this potential and have promoted, through numerous studies, the importance of increasing awareness of water demand [11]. For example, demand profiles have been extracted from residential DWM data to achieve a more accurate forecast of water demand.
McKenna et al. [12] presented a solution based on Gaussian mixture models (GMMs) to extract and then classify water demand patterns. Furthermore, in [13], the researchers presented an approach that includes two phases, clustering and classification, to detect water consumption patterns.
Most studies that analyze water consumption data aim to understand consumption patterns and consumer dynamics. A new approach is described in [14], where the water events were classified considering the probability that a certain event will occur at a certain time. Moreover, detecting the habits of a consumer profile can also be achieved by applying constraints. Such an algorithm is described in detail in [15].
Effective decision support in the context of smart water networks involves discovering consumer habits and identifying outliers [16,17]. For example, [18] extracted different types of consumption patterns from the dataset, highlighting trends, variation, and overall consumer profiles. K-means clustering and seasonal decomposition were used to assess weekly trends, and normalization was used to analyze relative variations in consumer demand.
A new approach to evaluating consumer profiles is described in [19] and consists of comparing clustering models with classification models to analyze their degree of similarity with consumer classes. This shows the level of similarity between the initial classification and consumer profiles obtained from actual consumption data.
Furthermore, consumer profiling can also be obtained based on geographic coordinates to highlight the distribution of consumer behaviors in a certain area. In [20], the OPTICS clustering method was used to group the data based on the coordinates, while K-means clustering was used to extract the consumption patterns for each identified area. The standard deviation of the seasonal component was also considered to classify the resulting consumer behaviors from the least desirable to the most desirable.
Constant monitoring of water consumption presents an overview of current activities and helps us understand where and how much is consumed. In [21], an analysis of four different types of activities (hot water/cold water sink, toilet, and shower) was conducted. It was found that the toilet requires the largest volume of water, and the sink has the most variable consumption. Using the same four types of activities, another study [22] predicted the source of water consumption events using four algorithms based on machine learning and deep learning. The results are promising, showing a correct prediction for toilet and shower data, while sink data were sometimes confused with toilet data.
Moreover, the four types of household activities were analyzed in [23] to monitor and estimate water consumption. Clustering, classification methods, and evaluation metrics were applied, showing high accuracy by extracting consumption events from the time series. Both machine learning and deep learning algorithms were used, and good results were obtained in terms of prediction accuracy. This study can be extended to more households, including location-based clustering and demographic analysis in large-scale implementations.
Rahim et al. [11] proposed a recommendation system capable of anticipating the next moment of water demand. The solution is based on a neural network with long short-term memory (LSTM) and can predict consumption events from 83 households (showers, baths, and irrigation).
Luo et al. [24] presented a personalized recommendation system (PRS) to optimize energy consumption for a large sample of residential users. This study provides appropriate plans for the use of appliances, taking into account the lifestyle of the residents. The solution is based on collaborative filtering (CF) recommendations, classifying users into “highly responsive” and “less responsive”. For each less responsive user, the PRS infers the lifestyle and associates it with the similar habits of highly responsive users. The device usage experiences of highly responsive users are then accumulated and, finally, recommendations with programs for using energy-efficient devices are offered to the less responsive user.
Bassiliades et al. [25] described an intelligent system that monitors water quality through two different networks: Andromeda (seawater) and Interrisk (freshwater). This solution sends early alerts when an environmental parameter exceeds the pollution limit. Moreover, adaptive filtering techniques are also applied to predict water quality parameters and avoid unwanted situations.
Dai et al. [26] built a recommendation system for daily water consumption through fuzzy methods. The input data of the system are age, physical activity, and ambient temperature, and the output data are the values of daily water consumption. By applying fuzzy methods, the recommended values of daily consumption are calculated and then compared with the actual values. The authors report promising results and an effective recommendation system.
Another approach that uses fuzzy theory is described in [27] and aims to estimate a correct water price, taking into account the cost of water quantity and quality and socioeconomic development. Furthermore, the household sector in Shanghai was assessed, where residents face the problem of water scarcity. The proposed solution aims to ensure a sustainable use of water for the future.
Multiple research works have observed that the financial factor is not the only one that determines the behavior of consumers in terms of daily water consumption. Social and psychological factors also make a substantial contribution. Providing feedback helps consumers reduce their individual water consumption by applying certain rules in their daily activities. Amir et al. [28] analyzed consumer focus groups in three different cities in Israel with a focus on online feedback applications and behavioral incentives for water conservation. The results emphasize that citizens have limited knowledge about the costs of water demand. Instead, they showed a greater interest in preserving the environment and receiving alerts when leaks are detected or abnormal consumption occurs.
Another study examined the challenges of community involvement in water policy decision-making in Thailand. Questionnaires applied to the community showed that, after some discussions, the problems remained within the community sector, without applicability at the governmental level. Moreover, there is a need for further research on opportunities in water decision-making between communities in rural and urban areas [29].
On the other hand, the limited involvement of communities or the lack of understanding of the problems related to water consumption could affect the decision-making process. In [30], a serious game concept was analyzed to explain strategic environmental assessment (SEA) and showed encouraging results in terms of increasing the awareness and skills of a community in making decisions about water policy. The authors applied this study to 39 community members from the East Coast River Basin of Thailand and the results showed a significant improvement in the participants’ knowledge of SEA.
The limited number of existing studies shows the need for a recommendation system to increase awareness of water demand and to promote new approaches for water conservation.
3. Methodology
Considering multiple households having different profiles, a comparative feedback and recommendation system can be used to compare the water usage of each household against the usage of the others in the same apartment complex or neighborhood. The outline of the proposed methodology defines the steps for a comparative feedback and recommendation system as follows:
Collecting water usage data from sensors installed in each household;
Clustering households based on their water usage patterns using clustering algorithms such as K-means clustering;
Analyzing the clusters to identify households that are consuming more water than others in the same cluster;
Generating personalized recommendations for each household based on their water usage patterns and recommendations that have worked for similar households in the same cluster;
Providing comparative feedback to each household by comparing their water usage against the usage of other households in the same cluster, and by showing how they rank compared to others in terms of water usage;
Encouraging households to adopt water-saving measures and to compete with others in the same cluster to reduce their water usage.
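The comparative feedback step in the outline above can be sketched as follows. This is a minimal illustration, assuming cluster labels are already available (e.g., from K-means); the function name `comparative_feedback`, the household identifiers, and the usage values are hypothetical and are not taken from the dataset.

```python
from statistics import median


def comparative_feedback(usage_by_household, cluster_of):
    """Rank each household within its cluster by total water usage.

    usage_by_household: dict household -> total volume (illustrative units)
    cluster_of: dict household -> cluster label (assumed to be precomputed)
    Returns dict household -> (rank, cluster_size, percent_vs_cluster_median),
    where rank 1 is the lowest consumer in the cluster.
    """
    clusters = {}
    for household, label in cluster_of.items():
        clusters.setdefault(label, []).append(household)

    feedback = {}
    for members in clusters.values():
        ordered = sorted(members, key=lambda h: usage_by_household[h])
        # Assumes a non-zero cluster median; a real system would guard this.
        cluster_median = median(usage_by_household[h] for h in members)
        for rank, household in enumerate(ordered, start=1):
            usage = usage_by_household[household]
            pct = 100.0 * (usage - cluster_median) / cluster_median
            feedback[household] = (rank, len(members), round(pct, 1))
    return feedback


# Hypothetical usage values and cluster labels for five households.
usage = {"H1": 900.0, "H2": 1200.0, "H3": 1500.0, "H4": 400.0, "H5": 600.0}
labels = {"H1": 0, "H2": 0, "H3": 0, "H4": 1, "H5": 1}
feedback = comparative_feedback(usage, labels)
```

Using the cluster median rather than the mean is a design choice of this sketch, made so that a single very high consumer does not shift the reference value for the whole cluster.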
The proposed scenario involves multiple households, while the clustering stage was described in previous research [18,19,20,21]. The processing pipeline is shown in Figure 1, highlighting the proposed methodology, with a focus on providing feedback and recommendations based on the data collected from water consumption sensors installed in households.
The data were acquired from water consumption sensors installed in households, measuring independent water outlets. The data acquisition module detailed in Figure 2 receives the data from the sensors connected via the MQTT interface, then stores the data in a database. The measurement data include the reference to the measurement node ID, measured channel (there can be multiple sensors connected to a measurement node), value, and timestamp.
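The measurement record described above can be represented as a small data structure. The sketch below assumes that each MQTT message body is a JSON object; the key names (`node_id`, `channel`, `value`, `ts`) and the Unix timestamp encoding are hypothetical, since the payload format is not specified here, and the MQTT client and database layers are left out.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    node_id: str         # measurement node ID
    channel: int         # sensor channel on the node
    value: float         # measured value
    timestamp: datetime  # time of the measurement (UTC)


def parse_measurement(payload: bytes) -> Measurement:
    """Decode one message body into a Measurement.

    The JSON layout is an assumption made for illustration only.
    """
    raw = json.loads(payload.decode("utf-8"))
    return Measurement(
        node_id=str(raw["node_id"]),
        channel=int(raw["channel"]),
        value=float(raw["value"]),
        timestamp=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
    )


# Hypothetical message body, as it might arrive from a sensor node.
sample = b'{"node_id": "node-01", "channel": 2, "value": 0.35, "ts": 1686528000}'
record = parse_measurement(sample)
```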
The processing pipeline queries the data from the application server and receives the consumption data formatted as time series. Depending on the scenario, the analysis can involve aggregating data from multiple households over a given timeframe (e.g., daily, weekly, or monthly).
The preprocessing module is detailed in Figure 3 and involves creating a pivot table (i.e., user–item matrix) based on the water consumption dataset, with households (users) and outlets (items). The first step is to extract water consumption events from the time series dataset, represented by their consumption amount (volume) and duration.
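One simple way to extract consumption events is to group consecutive non-zero samples of an outlet into a single event. The sketch below assumes this segmentation rule, which may differ from the procedure actually used; the function name `extract_events` and the sample series are hypothetical.

```python
def extract_events(samples, sampling_s=60):
    """Group consecutive non-zero samples into consumption events.

    samples: per-interval volumes for one outlet, in time order.
    sampling_s: sampling time in seconds.
    Returns a list of (total_volume, duration_s) tuples.
    """
    events = []
    volume, count = 0.0, 0
    for value in samples:
        if value > 0:
            # The outlet is in use: extend the current event.
            volume += value
            count += 1
        elif count:
            # A zero sample closes the event that was open.
            events.append((volume, count * sampling_s))
            volume, count = 0.0, 0
    if count:
        # Close an event that is still open at the end of the series.
        events.append((volume, count * sampling_s))
    return events


# Hypothetical series for one outlet.
series = [0, 0, 1.5, 2.0, 0, 0, 6.0, 0, 0.5, 0.5, 0.5]
events = extract_events(series)
```

Each resulting tuple carries the two features used later in the pipeline: the event volume and the event duration.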
The AI-based recommender system implements collaborative filtering to generate recommendations based on the consumption patterns of similar households. The algorithm identifies patterns in the consumption data across multiple households and then recommends actions based on those patterns. The key is to identify similar households based on their consumption behavior. This can be achieved by clustering households based on their consumption patterns, and then recommending actions based on the consumption patterns of similar households. The algorithm for the collaborative filtering stage is as follows:
The dataset is transformed into a pivot table representing the user–item matrix, where items are represented by consumption outlets (i.e., sink_hot, sink_cold, and toilet), with their average consumption for each household.
The SVD (singular value decomposition) model is trained using the extracted water consumption events to capture latent features that represent user behaviors and consumption characteristics. The SVD matrix factorization model is used to factorize the user–item matrix $A$ into three matrices, $A = U \Sigma V^{T}$, where:
- $U$ is the orthogonal left singular matrix containing the left singular vectors of $A$ (the eigenvectors of $AA^{T}$), where $U^{T}U = I$;
- $V$ is the orthogonal right singular matrix containing the right singular vectors of $A$ (the eigenvectors of $A^{T}A$), where $V^{T}V = I$;
- $\Sigma$ is a diagonal matrix with non-negative singular values sorted in descending order, $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_r \geq 0$.
The resulting matrices are then used to provide recommended consumption values based on similarities to other households or consumption characteristics. Within the proposed recommender system, the results obtained with SVD are used to generate actionable recommendations for new households.
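The factorization can be illustrated with a truncated SVD computed with NumPy. This is a sketch of the classical decomposition only, not of the trained model used in the experiments; the matrix values and the choice of rank `k = 2` are hypothetical. Reading the rank-`k` reconstruction as a "recommended" consumption value is an interpretation made for this example.

```python
import numpy as np

# Hypothetical user-item matrix: rows are households, columns are the outlets
# sink_hot, sink_cold, and toilet; entries are average event volumes.
A = np.array([
    [2.0, 3.0, 6.0],
    [2.5, 3.5, 6.5],
    [1.0, 1.5, 9.0],
    [4.0, 5.0, 6.0],
    [2.2, 3.1, 6.2],
])

# Thin SVD: A = U @ diag(s) @ Vt, with s sorted in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the k strongest latent features and rebuild the matrix from them.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Positive entries mark outlets where a household consumes more than the
# value suggested by the patterns shared across households.
deviation = A - A_k
```

Truncating to `k` singular values discards the weakest latent feature, so each household is described only by the patterns it shares with the others.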
The output of the collaborative filtering stage represents personalized recommendations for each household based on the consumption patterns of similar households. The recommended water consumption provides an important feedback loop for consumers that can evaluate their current behavior relative to other households and adjust accordingly.
To translate the recommendations into actionable items, the next module, depicted in Figure 4, collects data from the collaborative filtering module and implements a rule-based recommender system.
The process requires expert knowledge or the involvement of a domain expert to identify water-saving actions that are relevant to the consumption behavior in households (e.g., taking shorter showers, fixing leaks, using low-flow showerheads, or turning off the tap while brushing teeth) and to define the set of rules that map specific consumption behaviors to corresponding actions (e.g., a rule for taking shorter showers could be triggered if the average shower time exceeds a certain threshold).
For example, if the collaborative filtering system identifies a particular household as having high water consumption compared to similar households (i.e., based on defined thresholds), the rule-based system could then analyze the water usage patterns of that household’s specific outlets (e.g., bathroom sink, shower, or toilet) and provide targeted recommendations (e.g., fix leaky faucet in the bathroom sink).
Therefore, the combined recommender system can propose recommendations and notify the users for the following general scenarios:
When the consumption volume exceeds the proposed threshold above the recommended values, the recommendations can target consumption behaviors and possible leaks.
When the consumption duration exceeds the proposed threshold above the recommended duration, the recommendations can target consumption behaviors.
When both consumption volume and duration exceed the proposed threshold above the recommended values, the recommendations can target consumption behaviors, possible leaks, or installations (i.e., consumption outlets or infrastructure).
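The three scenarios above can be expressed as a small rule function. The sketch assumes that "exceeds the proposed threshold above the recommended values" means that the difference between the current and the recommended value is larger than the threshold; the function name, the argument names, and the target labels are hypothetical.

```python
def recommend(volume, duration, rec_volume, rec_duration,
              volume_threshold, duration_threshold):
    """Return the recommendation targets for one outlet of one household.

    volume, duration: current average event volume and duration.
    rec_volume, rec_duration: values recommended by collaborative filtering.
    volume_threshold, duration_threshold: allowed excess before a rule fires.
    """
    exceeds_volume = (volume - rec_volume) > volume_threshold
    exceeds_duration = (duration - rec_duration) > duration_threshold

    if exceeds_volume and exceeds_duration:
        return ["consumption behavior", "possible leaks", "installations"]
    if exceeds_volume:
        return ["consumption behavior", "possible leaks"]
    if exceeds_duration:
        return ["consumption behavior"]
    return []  # within thresholds: no notification


# Hypothetical values: the volume is well above the recommendation,
# while the duration stays within the allowed excess.
targets = recommend(volume=12.0, duration=90.0,
                    rec_volume=8.0, rec_duration=60.0,
                    volume_threshold=2.0, duration_threshold=45.0)
```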
To determine dynamically from the dataset the thresholds that are required to trigger rule-based recommendations, we can use statistical methods to analyze the data and identify the points at which water consumption changes significantly. Quartiles are used to split the data into groups and determine the thresholds based on the values in these groups.
Finally, the proposed solution presents personalized recommendations for each household based on the collaborative filtering and the rule-based system. These recommendations are based on a combination of the rule-based system, which identifies common water-saving behaviors, and the collaborative filtering system, which identifies patterns across multiple households and suggests personalized recommendations based on those patterns.
4. Experimental Results
The measurements were obtained using flow sensors and wireless transmitters installed in households, sending data to the application server. The original dataset contained approximately 480,000 water consumption measurements over a time period of two weeks, from five households, each with three measured outlets (i.e., sink_cold, sink_hot, and toilet), with a sampling time of 60 s. Each tested household included families with different numbers of inhabitants, different habits, ages, and occupations.
The consumption events were extracted from the time series dataset, which resulted in over 6000 data points, each represented by total volume and duration.
Figure 5 shows the water consumption volume and duration for the consumption events generated by the households with multiple outlets. Clustering methods have been applied in previous research to identify consumption patterns and profile households based on their relative consumption, as presented in [21].
Within the proposed recommender system, the consumption events were then used to create a pivot table, mapping the households to their average consumption event data for each outlet. A sample of the results is shown in Table 1 for both the average volume and duration determined by the extracted consumption events for the measured outlets (i.e., sink_cold, sink_hot, and toilet).
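The pivot step can be sketched with the standard library alone. The event tuples below are hypothetical and do not reproduce the values in Table 1; the function name `pivot_average` is also an assumption of this example.

```python
from collections import defaultdict
from statistics import mean


def pivot_average(events):
    """Build a household x outlet table of average event volumes.

    events: iterable of (household, outlet, volume) tuples.
    Returns dict household -> dict outlet -> average volume.
    """
    grouped = defaultdict(list)
    for household, outlet, volume in events:
        grouped[(household, outlet)].append(volume)

    table = defaultdict(dict)
    for (household, outlet), volumes in grouped.items():
        table[household][outlet] = mean(volumes)
    return {household: dict(row) for household, row in table.items()}


# Hypothetical consumption events.
events = [
    ("H1", "sink_cold", 2.0),
    ("H1", "sink_cold", 4.0),
    ("H1", "toilet", 6.0),
    ("H2", "sink_hot", 1.5),
    ("H2", "toilet", 5.0),
    ("H2", "toilet", 7.0),
]
pivot = pivot_average(events)
```

The same function can be applied to event durations instead of volumes to obtain the second table.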
The next stage involves training the collaborative filter to evaluate recommended consumption for households based on their similarity. The Surprise Python library (a Python scikit for recommender systems, https://surpriselib.com/, last accessed: 12 June 2023) is commonly used for building and analyzing recommender systems [31]. The SVD algorithm was used to perform matrix factorization and predict water consumption patterns for each household based on the consumption patterns of similar households.
The results are shown in Figure 6 for the recommended consumption volume and Figure 7 for the recommended consumption duration. In both scenarios, the current values were compared to the recommended values for each consumption outlet, providing a measurable overview of the relative consumption.
The results show that the recommended consumption can provide a threshold for water-saving actions based on actionable recommendations. Therefore, households with higher consumption than the recommendations provided by the collaborative filtering would be able to have a quick overview of their relative consumption. Furthermore, households with lower consumption can be encouraged to maintain their profile by using social incentives as part of a smart government strategy, as discussed in [32].
The rule-based recommender module received the output of the collaborative filter and a set of rules that define water-saving actions based on the deviation from the recommended consumption relative to similar households. To provide dynamic thresholds, the interquartile range was calculated by subtracting the first quartile from the third quartile. We then calculated the upper and lower bounds for each feature using the interquartile range and a constant factor of 1.5. Finally, we obtained thresholds for each consumption outlet by combining the upper and lower bounds. These thresholds were then used in the rule-based recommender system to generate personalized recommendations for each household based on their water usage patterns.
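The interquartile range computation described above can be sketched as follows. The quartile interpolation method is not specified, so the "inclusive" method of `statistics.quantiles` is an assumption of this example, as are the function name `iqr_bounds` and the sample values.

```python
from statistics import quantiles


def iqr_bounds(values, factor=1.5):
    """Return (lower, upper) bounds based on the interquartile range.

    values: one feature of one outlet, e.g., deviations from the
            recommended event duration.
    factor: constant multiplier applied to the interquartile range.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # third quartile minus first quartile
    return q1 - factor * iqr, q3 + factor * iqr


# Hypothetical feature values for one outlet.
deviations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
lower, upper = iqr_bounds(deviations)
```

In this sketch, a value above `upper` would trigger a rule for the corresponding outlet, while values between the two bounds are treated as normal variation.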
The sample results of the rule-based system are summarized in Table 2, showing the household and the identified outlet that exceeds the dynamic threshold in terms of consumption event duration compared to the collaborative filter recommendations, with the suggested action defined according to the rule set.
5. Discussion
Designing a machine learning or AI-based recommender system requires careful consideration of the specific problem and data at hand and may involve additional steps such as feature engineering and model selection. The proposed solution requires a large dataset of consumption data and household features to improve the level of detail and prediction accuracy for providing personalized recommendations, as well as a method for evaluating the effectiveness of the recommendations. The dataset can be used to train different machine learning models, and the models can then be used to generate recommendations for new households.
The proposed collaborative filtering method provides comparative feedback to each household by comparing their water consumption against the consumption of other households in the same cluster, and by showing how they rank compared to others in terms of water consumption. With an effective strategy for user interaction, the results can encourage households to adopt water-saving measures and to compete with others in the same cluster to reduce their water consumption.
The specific rules and recommendations involved in the rule-based recommender system depend on the context and goals of the water distribution system, as well as the available data and resources. In this sense, the set of rules can be defined based on real-world scenarios, considering residential buildings, such as:
Using a low-flow showerhead for households that have higher water usage in their showers compared to others in the same cluster, which can provide significant water savings and still provide a comfortable shower experience, especially in apartments where multiple households share the same water supply;
Fixing leaks for households that have higher water usage in their sinks or toilets compared to others in the same cluster. Leaks can cause significant water wastage over time and can lead to higher water bills. Fixing leaks can help conserve water and save money for individual households, as well as the entire apartment complex;
Using a dishwasher for households that have higher water usage in their kitchen sink compared to others in the same cluster, as dishwashers can be more water-efficient than handwashing dishes, thus helping to reduce overall water usage in the apartment complex.
Another perspective is given by the potential detection of anomalies based on the recommended consumption characteristics, using a change point detection strategy as described in [16]. The authors evaluated multiple real-world scenarios to validate the proposed rule-based decision support system, with a focus on automating the detection of anomalies, based on change point detection and machine learning models. In the context of our work, detecting anomalies can further translate into actionable recommendations in large-scale water distribution systems, providing decision support for human operators. In addition, anomaly detection modules can be adapted to improve the separation between changes in consumption behaviors and problems related to the infrastructure (e.g., leaks).
The overall effectiveness of the recommendations can be evaluated by measuring the change in consumption behavior of the households after they receive the recommendations, which will be the subject of future work that involves large-scale deployment or integration with the water distribution network. In this sense, the platform should continuously evaluate the effectiveness of the recommendations and refine the rules and algorithms based on user feedback and real-world outcomes. This could involve:
Conducting A/B tests to evaluate the effectiveness of different rule sets on the consumption behaviors for randomized households;
Analyzing user engagement and adoption rates to evaluate the effectiveness of the recommendation system to provide an overall incentive;
Measuring the actual water savings achieved by users who follow the recommendations to evaluate the overall impact on sustainability.
6. Conclusions
In this research paper, we explored the effectiveness of household profiling and personalized feedback for improving decision-making regarding water consumption. Our study proposed the development of a personalized recommendation system based on data collected from various sensors installed in households with different profiles, specifically focusing on water outlets such as sink (cold water), sink (hot water), and toilet.
The implementation of an AI-based recommendation system based on a dual approach, combining collaborative filtering and rule-based recommendations, was proven to be highly valuable in generating feedback based on consumption patterns observed in similar households.
One of the significant outcomes of our study is the identification of households with higher water consumption compared to similar households. This detection enables the rule-based system to offer targeted recommendations, such as fixing leaking faucets in bathroom sinks, to reduce excessive water usage. Conversely, households with lower consumption can be encouraged to maintain their water-saving habits through external incentives, which will be the subject of future research.
Furthermore, we recognize the potential for scaling up the system to evaluate the effectiveness of the recommendations provided. In this sense, the dataset can be expanded to train different machine learning models and generate recommendations for new households. Continuous data analysis, refining the rules based on real-world expert knowledge, and interactive user interfaces are essential to enhance the accuracy of results and adapt to changing household consumption behaviors.
In summary, our research emphasizes the significance of household profiling, personalized feedback, and AI-driven recommendation systems in facilitating informed decision-making and promoting sustainable water consumption. Based on our study, we recommend the implementation of our proposed system, along with continuous analysis and improvement, to enhance water conservation efforts and contribute to a more sustainable future.