# Implications of Experiment Set-Ups for Residential Water End-Use Classification


## Abstract


## 1. Introduction

## 2. Methods

#### 2.1. Simulation of Common Database and Data Preprocessing

#### 2.2. Feature Description and Feature Extraction

#### 2.3. Supervised Techniques

#### 2.4. Unsupervised Techniques

#### 2.4.1. Threshold-Based Clustering Using Dynamic Time Warping

#### 2.4.2. K-Means

#### 2.4.3. DBSCAN

#### 2.4.4. OPTICS

Instead of a single fixed radius, a range of distance parameters eps_i is applied, which are smaller than a “generating distance” eps_max (0 ≤ eps_i ≤ eps_max). This allows OPTICS to find clusters of different densities in the feature space. eps_max can simply be set to infinity, as this will identify clusters across all scales [20]. In the original paper [26], experiments indicated that values between 10 and 20 for minPts consistently lead to promising results. Hence, minPts is chosen to be 15 in this work.
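As a concrete illustration, these parameter choices (minPts = 15, eps_max = infinity) can be sketched with scikit-learn's `OPTICS` implementation. The feature values below are synthetic stand-ins for extracted event features, not data from this study:

```python
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.default_rng(0)
# Toy stand-in for end-use feature vectors (e.g., duration, volume):
# two groups of clearly different density plus uniform background noise.
features = np.vstack([
    rng.normal(loc=[1.0, 1.0], scale=0.05, size=(60, 2)),
    rng.normal(loc=[4.0, 4.0], scale=0.40, size=(60, 2)),
    rng.uniform(low=0.0, high=5.0, size=(20, 2)),
])

# max_eps=np.inf corresponds to eps_max = infinity (clusters across all
# scales); min_samples corresponds to minPts = 15 as chosen above.
optics = OPTICS(min_samples=15, max_eps=np.inf).fit(features)
labels = optics.labels_  # -1 marks points classified as noise
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
```

Because the two dense groups have different spreads, a single-radius method would need two different eps values to separate them, whereas OPTICS handles both in one run.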

#### 2.4.5. CASH

## 3. Results and Discussion

#### 3.1. Evaluation of the Supervised Techniques

#### 3.1.1. Evaluation Results

#### 3.1.2. Discussion

#### 3.2. Evaluation of the Unsupervised Techniques

#### 3.2.1. Evaluation Results

#### 3.2.2. Discussion

## 4. Conclusions

Our **first recommendation** for the experiment set-up of residential water end-use classification is to **incorporate prior knowledge** as much as possible. More concretely, this can be realized by understanding the structure of the water network, by acquiring meta-information about each consumer (e.g., size of the household, location, devices, etc.), or by generating labels manually or automatically. The benefits of these activities are manifold. First, the utilization of prior knowledge increases the overall performance of the classification methods. Moreover, it leads to a deeper understanding of water consumption behavior and consequently facilitates the feature engineering step as well as the preprocessing. Finally, known operations encoded in prior knowledge increase the interpretability of algorithms based on neural networks and reduce the number of trainable parameters at the same time [28].

**A further implication** derived from our study is that the **database for water end-use classification needs to be representative**. As stated in the introduction, one of the major limitations of the state-of-the-art literature is that the databases are commonly acquired in a specific country or region. Although our study provides a first comprehensive comparison between supervised and unsupervised ML methods, the utilized database still suffers from the same limitation (refer to Section 3.1.2 and Section 3.2.2). Hence, the quantitative results in the state-of-the-art as well as in our study can hardly be transferred to other datasets. In order to increase the generalizability and reproducibility of ML techniques for end-use classification, we suggest **establishing a large representative dataset comprising raw data and annotations from various countries or regions** for future research. For this purpose, end consumers in different circumstances (e.g., housing and household situation, age, gender, etc.) need to be encouraged to participate. Considering the above-mentioned aspects, crowdsourcing approaches, which have been applied in medical applications [29], could be a possible solution. A suitable framework to implement such approaches successfully is the so-called citizen science project, in which citizens actively contribute their resources and knowledge to a scientific project. Scientific results and other outputs of the project are accessible to the participants as an exemplary reward.

Our **next implication** is that the **characteristics of the underlying database** should **meet the requirements of the water end-use classification task.** An orientation for the estimation of an optimal temporal resolution (i.e., sampling frequency) is given by the Nyquist sampling theorem. It states that a system with uniform sampling can reconstruct a continuous function (i.e., an analog signal) adequately only if the sampling rate is at least two times the maximum frequency of the original signal.
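The criterion can be illustrated with a small, purely hypothetical calculation; the 0.5 Hz figure below is an assumed value for the fastest flow fluctuation of interest, not a number measured in this study:

```python
def min_sampling_rate(f_max_hz: float) -> float:
    """Lowest uniform sampling rate (Hz) that satisfies the Nyquist
    criterion for a signal whose highest frequency is f_max_hz."""
    return 2.0 * f_max_hz

# If the fastest flow fluctuation of interest were 0.5 Hz (a change
# over roughly two seconds), a smart meter would need to log at no
# less than 1 Hz, i.e., at least one reading per second.
print(min_sampling_rate(0.5))  # 1.0
```

Conversely, a meter logging once per minute (1/60 Hz) can only faithfully represent flow components slower than 1/120 Hz, which is why coarse temporal resolutions blur short events such as faucet pulses.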

Our **final recommendation** derived from our study is that we should **be aware of the “class imbalance” problem** in water end-use classification. In common residential households, events such as faucet and toilet use occur more frequently than bathtub, washing machine, or dishwasher events. Furthermore, a bathtub, washing machine, or dishwasher is not available in every household. Consequently, the water end-use classification problem generally suffers from an imbalanced class distribution, which can result in a lower classification accuracy of ML models. Effective ways to handle it are upsampling the minority class or downsampling the majority class. The former can be realized with data augmentation techniques, whilst the latter can be achieved by removing redundant data.
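A minimal sketch of both strategies, assuming scikit-learn's `resample` utility and an illustrative two-class toy set (the class names and counts are hypothetical, not taken from the study's database):

```python
from sklearn.utils import resample

# Illustrative event lists; "faucet" plays the majority class and
# "bathtub" the minority class, as discussed above.
faucet = [("faucet", i) for i in range(100)]
bathtub = [("bathtub", i) for i in range(5)]

# Upsampling the minority class: sample with replacement until it
# matches the majority class size (a simple form of augmentation).
bathtub_up = resample(bathtub, replace=True, n_samples=len(faucet), random_state=42)

# Downsampling the majority class: keep a random subset, without
# replacement, of the same size as the minority class.
faucet_down = resample(faucet, replace=False, n_samples=len(bathtub), random_state=42)

print(len(bathtub_up), len(faucet_down))  # 100 5
```

In practice, naive duplication of minority events risks overfitting, which is why the text suggests data augmentation (i.e., generating plausible variations) rather than plain copies.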

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| ML | Machine Learning |
| SVM | Support Vector Machine |
| DTW | Dynamic Time Warping |
| DBSCAN | Density-Based Spatial Clustering of Applications with Noise |
| OPTICS | Ordering Points To Identify the Clustering Structure |
| CASH | Clustering in Arbitrary Subspaces based on the Hough transform |
| STREaM | STochastic Residential water End-use Model |
| PCC | Pearson Correlation Coefficient |
| ARI | Adjusted Rand Index |
| AMI | Adjusted Mutual Information |
| RNN | Recurrent Neural Network |

## Appendix A

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 ^{1} | 0.66 | 0.63 | 0.65 | 150 |
| 1 ^{1} | 0.80 | 0.84 | 0.82 | 466 |
| 2 ^{1} | 0.80 | 0.96 | 0.87 | 49 |
| 3 ^{1} | 0.83 | 0.89 | 0.86 | 775 |
| 4 ^{1} | 0.84 | 0.56 | 0.67 | 137 |
| 5 ^{1} | 0.25 | 0.02 | 0.04 | 47 |
| 6 ^{1} | 0.56 | 0.56 | 0.56 | 9 |
| Accuracy ^{2} | | | 0.80 | 1633 |
| Macro Average | 0.68 | 0.64 | 0.64 | 1633 |
| Weighted Average | 0.79 | 0.80 | 0.79 | 1633 |

^{1}The rows in the classification report correspond to the end-use labels: 0 = Overlapping, 1 = Toilet, 2 = Shower, 3 = Faucet, 4 = Washing Machine, 5 = Dishwasher, 6 = Bathtub.

^{2}The micro average corresponds to the accuracy. The two metrics differ only when merely a subset of the classes is present in the predictions for the test data.
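The footnote's claim can be checked numerically with scikit-learn's metrics; the label vectors below are illustrative and use the end-use codes from the tables, not the study's actual predictions:

```python
from sklearn.metrics import accuracy_score, precision_score

# Toy predictions over end-use codes 0-4; every class occurring in
# y_true is also predicted at least once, so micro precision and
# accuracy coincide.
y_true = [0, 1, 1, 2, 3, 3, 3, 4]
y_pred = [0, 1, 2, 2, 3, 3, 1, 4]

micro_precision = precision_score(y_true, y_pred, average="micro")
accuracy = accuracy_score(y_true, y_pred)
print(micro_precision == accuracy)  # True
```

Both quantities reduce to (number of correct predictions) / (total predictions), here 6/8 = 0.75, which is why classification reports may list a single "Accuracy" row in place of the micro averages.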

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 ^{1} | 0.69 | 0.66 | 0.68 | 289 |
| 1 ^{1} | 0.79 | 0.87 | 0.83 | 692 |
| 2 ^{1} | 0.89 | 0.91 | 0.90 | 56 |
| 3 ^{1} | 0.85 | 0.86 | 0.86 | 1098 |
| 4 ^{1} | 0.77 | 0.58 | 0.66 | 110 |
| 5 ^{1} | | | | 0 |
| 6 ^{1} | 1.00 | 0.50 | 0.67 | 2 |
| Micro Average | 0.81 | 0.82 | 0.82 | 2247 |
| Macro Average | 0.83 | 0.73 | 0.77 | 2247 |
| Weighted Average | 0.81 | 0.82 | 0.82 | 2247 |

^{1}The rows in the classification report correspond to the end-use labels: 0 = Overlapping, 1 = Toilet, 2 = Shower, 3 = Faucet, 4 = Washing Machine, 5 = Dishwasher, 6 = Bathtub.

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 ^{1} | 0.69 | 0.70 | 0.69 | 379 |
| 1 ^{1} | 0.77 | 0.80 | 0.78 | 708 |
| 2 ^{1} | 0.87 | 0.92 | 0.90 | 66 |
| 3 ^{1} | 0.81 | 0.85 | 0.83 | 1253 |
| 4 ^{1} | 0.82 | 0.61 | 0.70 | 137 |
| 5 ^{1} | | | | 0 |
| 6 ^{1} | 0.80 | 0.57 | 0.67 | 7 |
| Micro Average | 0.78 | 0.80 | 0.79 | 2550 |
| Macro Average | 0.79 | 0.74 | 0.76 | 2550 |
| Weighted Average | 0.78 | 0.80 | 0.79 | 2550 |

^{1}The rows in the classification report correspond to the end-use labels: 0 = Overlapping, 1 = Toilet, 2 = Shower, 3 = Faucet, 4 = Washing Machine, 5 = Dishwasher, 6 = Bathtub.

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 ^{1} | 0.78 | 0.75 | 0.77 | 471 |
| 1 ^{1} | 0.79 | 0.83 | 0.81 | 844 |
| 2 ^{1} | 0.95 | 0.89 | 0.92 | 89 |
| 3 ^{1} | 0.82 | 0.86 | 0.84 | 1275 |
| 4 ^{1} | 0.76 | 0.71 | 0.73 | 166 |
| 5 ^{1} | | | | 0 |
| 6 ^{1} | 1.00 | 0.33 | 0.50 | 3 |
| Micro Average | 0.81 | 0.82 | 0.81 | 2848 |
| Macro Average | 0.85 | 0.73 | 0.76 | 2848 |
| Weighted Average | 0.81 | 0.82 | 0.81 | 2848 |

^{1}The rows in the classification report correspond to the end-use labels: 0 = Overlapping, 1 = Toilet, 2 = Shower, 3 = Faucet, 4 = Washing Machine, 5 = Dishwasher, 6 = Bathtub.

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 ^{1} | 0.70 | 0.77 | 0.73 | 505 |
| 1 ^{1} | 0.77 | 0.82 | 0.79 | 873 |
| 2 ^{1} | 0.94 | 0.93 | 0.93 | 113 |
| 3 ^{1} | 0.81 | 0.83 | 0.82 | 1315 |
| 4 ^{1} | 0.81 | 0.53 | 0.64 | 207 |
| 5 ^{1} | 0.60 | 0.05 | 0.08 | 66 |
| 6 ^{1} | 0.80 | 0.60 | 0.69 | 20 |
| Accuracy ^{2} | | | 0.78 | 3099 |
| Macro Average | 0.77 | 0.65 | 0.67 | 3099 |
| Weighted Average | 0.78 | 0.78 | 0.77 | 3099 |

^{1}The rows in the classification report correspond to the end-use labels: 0 = Overlapping, 1 = Toilet, 2 = Shower, 3 = Faucet, 4 = Washing Machine, 5 = Dishwasher, 6 = Bathtub.

^{2}The micro average corresponds to the accuracy. The two metrics differ only when merely a subset of the classes is present in the predictions for the test data.

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 ^{1} | 0.73 | 0.78 | 0.76 | 747 |
| 1 ^{1} | 0.73 | 0.82 | 0.77 | 956 |
| 2 ^{1} | 0.87 | 0.82 | 0.84 | 112 |
| 3 ^{1} | 0.85 | 0.83 | 0.84 | 1604 |
| 4 ^{1} | 0.78 | 0.29 | 0.42 | 132 |
| 5 ^{1} | 0.67 | 0.04 | 0.08 | 50 |
| 6 ^{1} | 1.00 | 0.50 | 0.67 | 4 |
| Accuracy ^{2} | | | 0.79 | 3605 |
| Macro Average | 0.80 | 0.58 | 0.62 | 3605 |
| Weighted Average | 0.79 | 0.79 | 0.78 | 3605 |

^{1}The rows in the classification report correspond to the end-use labels: 0 = Overlapping, 1 = Toilet, 2 = Shower, 3 = Faucet, 4 = Washing Machine, 5 = Dishwasher, 6 = Bathtub.

^{2}The micro average corresponds to the accuracy. The two metrics differ only when merely a subset of the classes is present in the predictions for the test data.

| | 1P | 2P | 3P | 4P | 5P | 6P |
|---|---|---|---|---|---|---|
| 0 ^{1} | $\left[\begin{array}{cc}1456& 27\\ 73& 77\end{array}\right]$ | $\left[\begin{array}{cc}1941& 53\\ 127& 162\end{array}\right]$ | $\left[\begin{array}{cc}2164& 71\\ 154& 225\end{array}\right]$ | $\left[\begin{array}{cc}2377& 62\\ 149& 322\end{array}\right]$ | $\left[\begin{array}{cc}2492& 102\\ 159& 346\end{array}\right]$ | $\left[\begin{array}{cc}2719& 139\\ 239& 508\end{array}\right]$ |
| 1 ^{1} | $\left[\begin{array}{cc}1077& 90\\ 79& 387\end{array}\right]$ | $\left[\begin{array}{cc}1470& 121\\ 121& 571\end{array}\right]$ | $\left[\begin{array}{cc}1759& 147\\ 168& 540\end{array}\right]$ | $\left[\begin{array}{cc}1898& 168\\ 174& 670\end{array}\right]$ | $\left[\begin{array}{cc}2032& 194\\ 188& 685\end{array}\right]$ | $\left[\begin{array}{cc}2424& 225\\ 256& 700\end{array}\right]$ |
| 2 ^{1} | $\left[\begin{array}{cc}1579& 5\\ 4& 45\end{array}\right]$ | $\left[\begin{array}{cc}2224& 3\\ 7& 49\end{array}\right]$ | $\left[\begin{array}{cc}2542& 6\\ 6& 60\end{array}\right]$ | $\left[\begin{array}{cc}2817& 4\\ 14& 75\end{array}\right]$ | $\left[\begin{array}{cc}2983& 3\\ 9& 104\end{array}\right]$ | $\left[\begin{array}{cc}3486& 7\\ 25& 87\end{array}\right]$ |
| 3 ^{1} | $\left[\begin{array}{cc}741& 117\\ 100& 675\end{array}\right]$ | $\left[\begin{array}{cc}1062& 123\\ 180& 918\end{array}\right]$ | $\left[\begin{array}{cc}1167& 194\\ 233& 1020\end{array}\right]$ | $\left[\begin{array}{cc}1448& 187\\ 225& 1050\end{array}\right]$ | $\left[\begin{array}{cc}1588& 196\\ 290& 1025\end{array}\right]$ | $\left[\begin{array}{cc}1835& 166\\ 312& 1292\end{array}\right]$ |
| 4 ^{1} | $\left[\begin{array}{cc}1484& 12\\ 73& 64\end{array}\right]$ | $\left[\begin{array}{cc}2159& 14\\ 57& 53\end{array}\right]$ | $\left[\begin{array}{cc}1077& 90\\ 79& 387\end{array}\right]$ | $\left[\begin{array}{cc}2718& 26\\ 82& 84\end{array}\right]$ | $\left[\begin{array}{cc}2871& 21\\ 124& 83\end{array}\right]$ | $\left[\begin{array}{cc}3471& 2\\ 119& 13\end{array}\right]$ |
| 5 ^{1} | $\left[\begin{array}{cc}1586& 0\\ 47& 0\end{array}\right]$ | $\left[\begin{array}{cc}2247& 0\\ 36& 0\end{array}\right]$ | $\left[\begin{array}{cc}2550& 0\\ 64& 0\end{array}\right]$ | $\left[\begin{array}{cc}2848& 0\\ 62& 0\end{array}\right]$ | $\left[\begin{array}{cc}3033& 0\\ 66& 0\end{array}\right]$ | $\left[\begin{array}{cc}3555& 0\\ 50& 0\end{array}\right]$ |
| 6 ^{1} | $\left[\begin{array}{cc}1620& 4\\ 5& 4\end{array}\right]$ | $\left[\begin{array}{cc}2281& 0\\ 1& 1\end{array}\right]$ | $\left[\begin{array}{cc}2606& 1\\ 3& 4\end{array}\right]$ | $\left[\begin{array}{cc}2907& 0\\ 2& 1\end{array}\right]$ | $\left[\begin{array}{cc}3076& 3\\ 8& 12\end{array}\right]$ | $\left[\begin{array}{cc}3601& 0\\ 3& 1\end{array}\right]$ |

^{1}The rows correspond to the end-use labels: 0 = Overlapping, 1 = Toilet, 2 = Shower, 3 = Faucet, 4 = Washing Machine, 5 = Dishwasher, 6 = Bathtub.

## References

1. Pastor-Jabaloyes, L.; Arregui, F.J.; Cobacho, R. Water End Use Disaggregation Based on Soft Computing Techniques. Water **2018**, 10, 46.
2. Nguyen, K.A.; Zhang, H.; Stewart, R.A. Application of Dynamic Time Warping Algorithm in Prototype Selection for the Disaggregation of Domestic Water Flow Data into End Use Events. In Proceedings of the 34th World Congress of the International Association for Hydro-Environment Research and Engineering: 33rd Hydrology and Water Resources Symposium and 10th Conference on Hydraulics in Water Engineering, Brisbane, Australia, 26 June–1 July 2011; Valentine, E.M., Apelt, C.J., Ball, J., Chanson, H., Cox, R., Ettema, R., Kuczera, G., Lambert, M., Melville, B.W., Sargison, J.E., Eds.; Engineers Australia: Barton, Australia, 2011.
3. Yang, A.; Zhang, H.; Stewart, R.A.; Nguyen, K.A. Water End Use Clustering Using Hybrid Pattern Recognition Techniques—Artificial Bee Colony, Dynamic Time Warping and K-Medoids Clustering. IJMLC **2018**, 8, 483–487.
4. Yang, A. Artificial Intelligent Techniques in Residential Water End-use Studies for Optimized Urban Water Management. Master’s Thesis, Griffith University, Brisbane, Australia, 26 September 2018.
5. Kalogridis, G.; Farnham, T.; Wilcox, J.; Faies, M. Privacy and Incongruence-Focused Disaggregation of Water Consumption Data in Real Time. Procedia Eng. **2015**, 119, 854–863.
6. Nguyen, K.A.; Stewart, R.A.; Zhang, H. An Intelligent Pattern Recognition Model to Automate the Categorisation of Residential Water End-Use Events. Environ. Model. Softw. **2013**, 47, 108–127.
7. Nguyen, K.A.; Zhang, H.; Stewart, R.A. Development of an Intelligent Model to Categorise Residential Water End Use Events. J. Hydro. Environ. Res. **2013**, 7, 182–201.
8. Nguyen, K.A.; Stewart, R.A.; Zhang, H. An Autonomous and Intelligent Expert System for Residential Water End-Use Classification. Expert Syst. Appl. **2014**, 41, 342–356.
9. Nguyen, K.A.; Stewart, R.A.; Zhang, H. Development of An Autonomous and Intelligent System for Residential Water End Use Classification. In Proceedings of the 11th International Conference on Hydroinformatics, New York, NY, USA, 17–21 August 2014; Curran Associates, Inc.: New York, NY, USA, 2015.
10. Nguyen, K.A.; Stewart, R.A.; Zhang, H.; Jones, C. Intelligent Autonomous System for Residential Water End Use Classification: Autoflow. Appl. Soft Comput. **2015**, 31, 118–131.
11. Nguyen, K.A.; Sahin, O.; Stewart, R.A. AUTOFLOW©—A Novel Application for Water Resource Management and Climate Change Response Using Smart Technology. In Proceedings of the 8th International Congress on Environmental Modelling and Software, Toulouse, France, 10–14 July 2016.
12. Nguyen, K.A.; Sahin, O.; Stewart, R.A.; Zhang, H. Smart Technologies in Reducing Carbon Emission: Artificial Intelligence and Smart Water Meter. In Proceedings of the 9th International Conference on Machine Learning and Computing, Singapore, 24–26 February 2017; Association for Computing Machinery: New York, NY, USA, 2017.
13. Nguyen, K.A.; Stewart, R.A.; Zhang, H.; Sahin, O. An Adaptive Model for the Autonomous Monitoring and Management of Water End Use. Smart Water **2018**, 3, 5.
14. Nguyen, K.A.; Stewart, R.A.; Zhang, H.; Sahin, O.; Siriwardene, N. Re-Engineering Traditional Urban Water Management Practices with Smart Metering and Informatics. Environ. Model. Softw. **2018**, 101, 256–267.
15. Cominola, A. Modelling Residential Water Consumers’ Behavior. Ph.D. Thesis, Politecnico di Milano, Milano, Italy, 2016.
16. Cominola, A.; Giuliani, M.; Castelletti, A.; Rosenberg, D.E.; Abdallah, A.M. Implications of Data Sampling Resolution on Water Use Simulation, End-Use Disaggregation, and Demand Management. Environ. Model. Softw. **2018**, 102, 199–212.
17. Vašak, M.; Banjac, G.; Novak, H. Water Use Disaggregation Based on Classification of Feature Vectors Extracted from Smart Meter Data. Procedia Eng. **2015**, 119, 1381–1390.
18. Vitter, J.S.; Webber, M. Water Event Categorization Using Sub-Metered Water and Coincident Electricity Data. Water **2018**, 10, 714.
19. Carranza, J.C.I.; Morales, R.D.; Sánchez, J.A. Pattern Recognition in Residential End Uses of Water Using Artificial Neural Networks and Other Machine Learning Techniques. In Proceedings of the Computing and Control for the Water Industry, Sheffield, UK, 5–7 September 2020.
20. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. JMLR **2011**, 12, 2825–2830.
21. Schubert, E.; Zimek, A. ELKI: A Large Open-Source Library for Data Analysis—ELKI Release 0.7.5 “Heidelberg”. arXiv **2019**, arXiv:1902.03616. Available online: http://arxiv.org/abs/1902.03616 (accessed on 7 December 2020).
22. Bishop, C. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006; pp. 325–343.
23. Granata, F.; Gargano, R.; de Marinis, G. Support Vector Regression for Rainfall-Runoff Modeling in Urban Drainage: A Comparison with the EPA’s Storm Water Management Model. Water **2016**, 8, 69.
24. Ester, M.; Kriegel, H.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996.
25. Sander, J.; Ester, M.; Kriegel, H.; Xu, X. Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications. Data Min. Knowl. Discov. **1998**, 2, 169–194.
26. Ankerst, M.; Breunig, M.M.; Kriegel, H.P.; Sander, J. OPTICS: Ordering Points to Identify the Clustering Structure. SIGMOD Record **1999**, 28, 49–60.
27. Achtert, E.; Böhm, C.; David, J.; Kröger, P.; Zimek, A. Global Correlation Clustering Based on the Hough Transform. Stat. Anal. Data Min. ASA Data Sci. J. **2008**, 1, 111–127.
28. Maier, A.K.; Syben, C.; Stimpel, B.; Würfl, T.; Hoffmann, M.; Schebesch, F.; Fu, W.; Mill, L.; Kling, L.; Christiansen, S. Learning with Known Operators Reduces Maximum Error Bounds. Nat. Mach. Intell. **2019**, 1, 373–380.
29. Marzahl, C.; Aubreville, M.; Bertram, C.A.; Gerlach, S.; Maier, J.; Voigt, J.; Hill, J.; Klopfleisch, R.; Maier, A. Is Crowd-Algorithm Collaboration an Advanced Alternative to Crowd-Sourcing on Cytology Slides? In Bildverarbeitung für die Medizin 2020; Tolxdorff, T., Deserno, T., Handels, H., Maier, A., Maier-Hein, K., Palm, C., Eds.; Springer Vieweg: Wiesbaden, Germany, 2020; pp. 26–31.

**Figure 1.** Comparison of supervised and unsupervised learning: (**a**) supervised techniques have a learning phase, in which the classifier trains on given labels; in the classification phase, the classifier is tested on a part of the dataset which is not used in the training phase; (**b**) unsupervised techniques search for grouping structures in the complete dataset.

**Figure 3.** Network architecture of Support Vector Machines (SVM). Figure adapted from [23].

**Figure 4.**Accuracy ranges for multi-class and binary SVMs. Note that the axis does not start at 0 for readability.

**Table 1.** Evaluation metrics for threshold-based clustering using Dynamic Time Warping (DTW).

| | 1P | 2P | 3P | 4P | 5P | 6P |
|---|---|---|---|---|---|---|
| Estimated number of clusters | 21 | 17 | 19 | 23 | 27 | 19 |
| ARI | 0.06 | 0.06 | 0.05 | 0.05 | 0.05 | 0.05 |
| AMI | 0.15 | 0.11 | 0.10 | 0.14 | 0.12 | 0.12 |

**Table 2.** Evaluation metrics for K-Means.

| | 1P | 2P | 3P | 4P | 5P | 6P |
|---|---|---|---|---|---|---|
| ARI | 0.17 | 0.10 | 0.09 | 0.12 | 0.11 | 0.08 |
| AMI | 0.24 | 0.18 | 0.18 | 0.20 | 0.19 | 0.15 |

**Table 3.**Evaluation metrics for Density-Based Spatial Clustering of Applications with Noise (DBSCAN).

| | 1P | 2P | 3P | 4P | 5P | 6P |
|---|---|---|---|---|---|---|
| Estimated number of clusters | 2 | 2 | 1 | 1 | 1 | 1 |
| ARI | 0.04 | 0.03 | 0.02 | 0.01 | 0.01 | 0.01 |
| AMI | 0.06 | 0.04 | 0.03 | 0.02 | 0.02 | 0.01 |

**Table 4.** Evaluation metrics for Ordering Points To Identify the Clustering Structure (OPTICS).

| | 1P | 2P | 3P | 4P | 5P | 6P |
|---|---|---|---|---|---|---|
| Estimated number of clusters | 11 | 17 | 26 | 21 | 17 | 27 |
| ARI | −0.03 | −0.03 | −0.03 | −0.01 | −0.02 | −0.02 |
| AMI | 0.02 | 0.03 | 0.03 | 0.03 | 0.03 | 0.02 |

**Table 5.**Evaluation metrics for Clustering in Arbitrary Subspaces based on the Hough transform (CASH).

| | 1P | 2P | 3P | 4P | 5P | 6P |
|---|---|---|---|---|---|---|
| Estimated number of clusters | 30 | 37 | 44 | 47 | 51 | 98 |
| ARI | −0.04 | −0.04 | −0.04 | −0.04 | −0.03 | −0.06 |
| AMI | 0.04 | 0.04 | 0.04 | 0.05 | 0.05 | 0.07 |
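For reference, ARI and AMI values like those in the tables above can be computed with scikit-learn; both compare a clustering against ground-truth labels with a correction for chance agreement. The labelings below are toy examples, not the study's data:

```python
from sklearn.metrics import adjusted_mutual_info_score, adjusted_rand_score

ground_truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # true end-use labels
clustering   = [0, 0, 1, 1, 1, 1, 2, 2, 2]  # one point misassigned

ari = adjusted_rand_score(ground_truth, clustering)
ami = adjusted_mutual_info_score(ground_truth, clustering)
# Both scores reach 1.0 for a perfect match, stay near 0 for random
# assignments, and can turn slightly negative, as several entries in
# the tables above do.
print(ari, ami)
```

Because of the chance correction, the near-zero and negative scores above indicate clusterings that agree with the end-use labels no better than a random partition would.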

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Gourmelon, N.; Bayer, S.; Mayle, M.; Bach, G.; Bebber, C.; Munck, C.; Sosna, C.; Maier, A.
Implications of Experiment Set-Ups for Residential Water End-Use Classification. *Water* **2021**, *13*, 236.
https://doi.org/10.3390/w13020236
