Article

Towards a Realistic Data-Driven Leak Localization in Water Distribution Networks

1
School of Civil Engineering, College of Engineering, University of Tehran, Tehran 1417614411, Iran
2
Department of Mathematics and Statistics, Brock University, St. Catharines, ON L2S 3A1, Canada
*
Author to whom correspondence should be addressed.
Water 2025, 17(13), 1988; https://doi.org/10.3390/w17131988
Submission received: 30 April 2025 / Revised: 28 June 2025 / Accepted: 29 June 2025 / Published: 2 July 2025
(This article belongs to the Special Issue Sustainable Management of Water Distribution Systems)

Abstract

Current data-driven methods for leak localization (LL) in water distribution networks (WDNs) rely on two unrealistic assumptions: they frame LL as a node-classification task, requiring leak examples for every node—which rarely exists in practice—and they validate models using random data splits, ignoring the temporal structure inherent in hydraulic time-series data. To address these limitations, we propose a temporal, regression-based alternative that directly predicts the leak coordinates, training exclusively on past observations and evaluating performance strictly on future data. By comparing five machine-learning techniques—k-nearest neighbors, linear regression, decision trees, support vector machines, and multilayer perceptrons—in both classification and regression modes, and using both random and temporal splits, we show that conventional evaluation methods can misleadingly inflate model accuracy by up to four-fold. Our results highlight the importance and suitability of a temporally consistent, regression-based approach for realistic and reliable leak localization in WDNs.

1. Introduction

Water distribution networks (WDNs) are critical infrastructure systems that ensure reliable access to potable water. However, water loss due to undetected leaks continues to pose significant challenges globally, leading to economic loss, infrastructure damage, and resource waste [1]. Leak localization (LL), the task of identifying the location of leaks within a WDN, is therefore essential for improving operational efficiency and sustainability. Current LL methods rely mostly on field inspection with acoustic devices, which is slow and inefficient [2].
To overcome the disadvantages of physical methods, model-based, data-driven, and hybrid models are used to localize the leakage using hydraulic and topological data of WDNs. Model-based methods require a well-calibrated hydraulic model of the WDN and address the LL problem using mechanistic approaches, such as inverse problem-solving methods [3], sensitivity analysis [4], and fuzzy logic techniques [5]. In contrast, data-driven methods do not rely on a hydraulic model of the network; instead, they only require hydraulic data, either real or simulated. Some studies combine data-driven and model-based methods in what are referred to as hybrid approaches: certain phases of the method depend on a hydraulic model, while other steps can be executed without it [6]. Data-driven models have attracted more attention in the last few years as they require a less profound understanding of the complex nonlinear behavior of WDNs [6].
Most data-driven models used for LL of WDNs are supervised machine learning (ML) models [7], such as random forests (RFs) and Bayesian classifiers. Inputs of these models are hydraulic features of the network, the most common one of which is residual pressure, which represents the amount of pressure drop in nodes due to the leakage. The output of these models is the location of the leakage, represented by the label of the leakage node [1] or nodes close to the leakage point [8]. The dataset required for tuning the models can come from field inspections, which are both time-consuming and costly, or, more commonly, from hydraulic simulations in software such as EPANET 2 [9]. In theory, one could simply generate synthetic leak events at every node (and even inject random noise) to guarantee full nodal coverage. However, achieving representative synthetic data relies on a fully calibrated hydraulic model (accurate pipe roughness, demand patterns, boundary conditions, etc.), which itself demands high-quality field measurements and past leak records. Moreover, adding uncorrelated random noise does not reproduce the complex, correlated uncertainties in real sensor readings, demand-driven pressure variations, valve operations, and other network dynamics. As a result, purely noise-augmented simulations may still fail to capture the nuances of real-world leakage signatures.
The collected dataset is then divided into calibration and testing datasets for developing ML models. Calibration, also called training in ML, is the process of changing the learnable parameters to reduce the differences between the actual values of the target variable and the model predictions [10]. Testing is the process of evaluating the model’s performance over a different dataset from the training [11]. The generation of the training and testing datasets, although synthetic and simulation-based, should closely mimic real-world constraints.
High performances, up to 100%, were reported for data-driven models developed for the LL of WDNs. However, there are concerns regarding the development of these models that make the reported performances questionable. First of all, often a classifier was used for LL, that is, a model that classifies the leakage node as one of the predefined nodes (labels). This, however, requires the model to be trained with a training dataset that includes, for every node, a scenario where that is the leakage node. In a real-world application where the dataset consists of leakage records, this means that all network nodes should have leaked at least once in the past. Since regressor models do not require all possible outputs among their training samples, training a regressor to predict the location (coordinates) of the leaking node can be a more realistic approach.
A second issue is the random partitioning of the dataset into training and testing datasets. In practice, the model only has access to past data of the network, not future data [12,13]. In a random partitioning, however, some instances in the testing dataset precede those in the training dataset, which resembles the unrealistic situation where the model can access future data during calibration. The realistic approach is a temporal partitioning, where a time point is taken as a reference (e.g., the current time), and the data instances before and after that time point comprise the training and testing datasets, respectively (Figure 1). If the reference time point is placed at the end of one of the leakage scenarios, no leakage scenario spans both sides of the reference point. Temporal partitioning then becomes equivalent to nodal partitioning, where a specific percentage of the network nodes leak in both the training and testing phases and the rest leak only in the testing phase. The network has relatively similar hydraulic conditions during the leakage of the same node at different time steps. It is realistic to have data instances of some nodes at different time steps in both training and testing datasets, because nodes that leaked before could leak again in the future, but a random partitioning often puts data instances of all nodes in both training and testing datasets. This similarity boosts the performance, which is misleading and not necessarily replicable in real conditions. Using a high percentage of nodes in the training phase is also unrealistic, because it is rare for all nodes to have leakage records and for their data to be readily available. Therefore, only a limited number of nodes should be used in the training phase.
The most common data-driven approach found in previous studies on LL of WDNs involves training a classifier to predict the location of a leak, whether it be at a node, pipe, or area, using hydraulic data from the network. These studies vary in three key aspects: the type of hydraulic data utilized as input, which can include pressure, residual pressure, or flow; the classifiers employed for training; and the specific case studies examined. For example, one of the recent works that employed this general approach [14] trained three classifiers, support vector machine (SVM), k-nearest neighbor (KNN), and artificial neural network (ANN), to predict the label of the leaking node in a campus-scale WDN, using residual pressures as the input. The SVM classifier achieved the highest overall accuracy at 79%. This was followed by KNN, which achieved an accuracy of 70%, and ANN, which achieved 61%. The other works, which used almost the same method, are summarized in Table 1.
Some studies used approaches different from those listed in Table 1. Ref. [23] trained a neural network regressor to predict the coordinates of the leaking node, using pressure at some nodes, on a WDN in Portugal. The model was able to predict the coordinates of the leaking node with a coefficient of determination (R²) of 0.98. In [24], an image processing technique was employed to locate the leaking node in the benchmark Hanoi network. Each node in the water network was subjected to different demand pattern scenarios with leaks, and residual pressure was recorded at 12 observation nodes. For each leak scenario, an RGB image was generated in which the value of each pixel represented the residual pressure at that location. The residual pressure at pixels corresponding to observation nodes was directly recorded, while the pressure at other pixels was estimated using spatial Kriging interpolation based on the network’s topological information and the observed pressures. A convolutional neural network (CNN) model was trained to identify the pixel corresponding to the leak location. The trained model was able to identify the leaking node with an accuracy of 94%.
The primary and common research gap among the discussed works is the reliance on a full-label dataset for training a classifier and the use of random partitioning, which results in an unrealistic train/test split.
To the best of the authors’ knowledge, no existing data-driven model has specifically addressed LL in WDNs using training data from only a limited number of nodes. In practical scenarios, generating a comprehensive dataset covering leakages at every network node is often unrealistic due to limitations in historical records and the cost and practicality of inducing artificial leaks. Most real-world leakage data come from historical records or controlled tests (e.g., opening fire hydrants) at selected locations, making it improbable to have data for every node. Therefore, it is critical that simulated datasets used for training closely mimic these practical constraints by not assuming leakage data from all nodes.
This paper addresses two primary objectives: First, it proposes a more realistic and practical alternative approach for leak localization. Second, it demonstrates that unrealistic modeling practices, such as training classifiers on datasets that include leakage scenarios from every node or using random partitioning for training and testing datasets, can result in overly optimistic and misleading performance metrics that are unlikely to be reproducible under real-world conditions.
To achieve these objectives, multiple machine-learning models, including classifiers (KNN, logistic regression (LoR), decision tree (DT), SVM, and multilayer perceptron (MLP)) and their corresponding regressors (with linear regression (LR) as the counterpart of LoR), are trained using both random and temporal (nodal) partitioning methods on two synthetic benchmark networks. The classifiers predict labels corresponding to the leaking nodes, while the regressors predict the coordinates of the leakage locations.
The contribution of this work is two-fold: first, explicitly illustrating how common unrealistic practices, such as training classifiers with complete nodal leakage data and random dataset partitioning, lead to inflated, unrealistic performance metrics; second, proposing and validating a practical alternative using ML regressors trained via temporal (nodal) partitioning to estimate leakage coordinates, effectively addressing the challenge of incomplete node leakage data.
The main audience for this study includes researchers focused on data-driven LL in WDNs. However, the findings concerning dataset partitioning approaches (random vs. temporal) may also be beneficial for researchers developing data-driven models involving time-dependent datasets in other domains.

2. Methodology

Figure 2 illustrates the detailed methodology proposed in this research. In steps 1 through 8, the required samples were generated, with each sample corresponding to a specific leakage scenario at a particular node (n) within a given time interval, set equal to the length of the demand pattern. The number of time steps depends on the demand pattern—for instance, a 24-h demand pattern contains 24 time steps. To capture fluctuations in water demand during leakage scenarios, the base demands of nodes were multiplied by a random coefficient (d), creating variations among samples for identical nodes leaking at the same time step. To maintain realistic leakage scenarios, the leakage flow was constrained not to exceed 30% of the total network flow at any given time step. If this threshold was surpassed, the coefficient d was adjusted, and the process was repeated until the condition was satisfied.
In steps 9 and 10, the generated dataset was divided into training and testing datasets using two different methods: random partitioning and nodal (temporal) partitioning. The following step involved using a grid search approach to evaluate the model’s training effectiveness. In step 12, we checked whether the model’s performance had improved by more than 1% compared to the previous iteration, which had a smaller number of samples. If the improvement exceeded 1%, the number of samples was increased; otherwise, the process of expanding the dataset was stopped.
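The stopping rule in step 12 can be sketched as follows. This is only an illustration under stated assumptions: `evaluate` is a hypothetical callback that trains and scores a model on a given number of samples (higher is better), and the 1% threshold is interpreted here as a relative improvement over the previous, smaller dataset.

```python
def grow_until_plateau(evaluate, step, max_rounds=50, min_gain=0.01):
    """Increase the sample count by `step` until the score stops improving.

    `evaluate(n_samples)` is a hypothetical callback returning a score
    (higher = better) for a model trained with `n_samples` samples.
    Returns the last accepted sample count and the full (size, score) history.
    """
    n_samples = step
    previous = evaluate(n_samples)
    history = [(n_samples, previous)]
    for _ in range(max_rounds - 1):
        candidate = n_samples + step
        score = evaluate(candidate)
        history.append((candidate, score))
        # Assumption: the 1% rule is a relative gain over the previous size.
        gain = (score - previous) / abs(previous) if previous else float("inf")
        if gain <= min_gain:
            break  # improvement too small: stop expanding the dataset
        n_samples, previous = candidate, score
    return n_samples, history


# Toy scores (made up) for dataset sizes growing in steps of 144 samples.
toy_scores = {144: 0.50, 288: 0.60, 432: 0.604, 576: 0.70}
chosen_size, trace = grow_until_plateau(lambda n: toy_scores[n], step=144)
print(chosen_size, trace)
```

With the toy scores, the loop stops once the gain from 288 to 432 samples falls below 1%, so the size of 576 is never evaluated.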

2.1. Benchmark Networks

Two benchmark WDNs are considered to test the proposed methodology in this study. The details and characteristics of these WDNs are explained in this section.
(a) Hanoi network
The Hanoi WDN, a common benchmark in LL, was used as the first case study (Figure 3). This network was introduced first in [25] and includes 31 internal nodes, one reservoir, and 34 pipes.
The pipes’ diameters were considered based on the values suggested by [26] (Table A1 in Appendix A). The suggested demand pattern for each node of the network by [19] was used as the network’s base demand patterns (Table A2 and Figure A1 in Appendix A).
(b) Anytown WDN
The Anytown WDN [27], which includes 22 nodes, was used as the second case study (Figure 4). The network’s 24-h flow pattern is given in Table A3 of Appendix A.

2.2. Data Simulation in EPANET

EPANET 2.2 [9] was used to simulate the networks under various leakage conditions. Each sample was generated by leaking exactly one node at one timestep. Using engineering judgment, it was assumed that each leakage continued on average for 24 h, so a leak was simulated at each node in 24 different timesteps. To simulate changes in the nodal demands over time, make the samples more distinct from one another, and increase the number of samples for better model training, additional demand patterns were created by multiplying the 24-h demand pattern coefficients by a random number between 0 and 2.
In EPANET, the leakage is simulated by assuming an emitter installed at the leaking node. The following experimental formula is used for measuring the output flow of an emitter [20]:
$$Q = C P^{0.5}$$
where Q, C, and P are the leak flow (m³/s), the emitter coefficient, and the pressure head (m) at the leaking node, respectively. The emitter coefficient represents the size and form of the emitter’s nozzle.
To simulate leaking in a node, its emitter coefficient value should be determined. Theoretically, the emitter coefficient can have any positive value, but for simplification, it was assumed that all nodes experienced leakage due to physical damage with the same shape and size, so all leakages were set to have C = 1. The size of the leaking flows was up to 3% of the network’s total flow.
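As a rough illustration of the emitter relation above, the sketch below evaluates Q = C·P^0.5 and checks the leak's share of the total network flow. The function names, the zero-flow guard for non-positive heads, and the example numbers are assumptions made for illustration, not part of the study's code; the admissible share is passed in as a parameter rather than hard-coded.

```python
import math


def emitter_flow(c, pressure_head):
    """Leak flow through an emitter: Q = C * P**0.5."""
    if pressure_head <= 0:
        # Assumption: no outflow when the pressure head is not positive.
        return 0.0
    return c * math.sqrt(pressure_head)


def leak_within_limit(leak_flow, total_flow, max_share):
    """True if the leak does not exceed `max_share` of the total network flow."""
    return leak_flow <= max_share * total_flow


# Hypothetical values: C = 1 as in the text, a pressure head of 25 m,
# and a made-up total network flow of 200 in the same flow units.
q = emitter_flow(c=1.0, pressure_head=25.0)
print(q, leak_within_limit(q, total_flow=200.0, max_share=0.03))
```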

2.3. Feature and Target Engineering

The input features for all models were the residual pressures of all nodes. In real networks, only a subset of nodes has pressure sensors. Since determining the optimal sensor placements is beyond the scope of this work, the pressures of all nodes were used as the input. The output was the coordinates of the leaking node for the regressor ML models and the leaking node label for the classifier ML models.
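A minimal sketch of how one sample could be assembled from two simulation runs is given below. It assumes that the residual pressure at a node is the leak-free pressure minus the pressure under the leak (the sign convention is an assumption), and the function names, node identifiers, and numbers are hypothetical.

```python
def residual_pressures(leak_free, with_leak):
    """Per-node pressure drop: leak-free pressure minus pressure under the leak."""
    if len(leak_free) != len(with_leak):
        raise ValueError("pressure vectors must cover the same nodes")
    return [p0 - p1 for p0, p1 in zip(leak_free, with_leak)]


def make_sample(leak_free, with_leak, leak_node, node_coords):
    """Bundle the features with both target types for one leakage scenario."""
    return {
        "features": residual_pressures(leak_free, with_leak),
        "label": leak_node,                # target for the classifiers
        "coords": node_coords[leak_node],  # (x, y) target for the regressors
    }


# Hypothetical three-node network with a leak at node "N2".
coords = {"N1": (0.0, 0.0), "N2": (100.0, 0.0), "N3": (100.0, 50.0)}
sample = make_sample([50.0, 48.0, 45.0], [49.5, 46.0, 45.0], "N2", coords)
print(sample)
```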

2.4. Dataset Partitioning Strategy

Two types of datasets were generated. For the nodal (temporal) partitioning, 20% of the network nodes were chosen randomly to leak in various demand conditions to generate the training dataset. Then, all nodes leaked the same way to generate the testing dataset. For the random partitioning, all nodes leaked, and then samples were divided randomly to generate training and testing datasets. As the common practice for random partitioning, 80% of the dataset was taken as the training dataset, and the remaining 20% was used for testing.
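The two partitioning strategies can be contrasted with a short sketch. Samples are represented here as hypothetical `(node_id, timestep)` tuples, and the helper names and the rounding of the 20% and 80% fractions are assumptions chosen so that the counts match those reported for the Hanoi network.

```python
import random


def random_split(samples, train_fraction=0.8, seed=0):
    """Shuffle all samples and cut them into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def pick_training_nodes(node_ids, fraction=0.2, seed=0):
    """Randomly choose the subset of nodes that are allowed to leak in training."""
    rng = random.Random(seed)
    nodes = sorted(node_ids)
    k = max(1, int(fraction * len(nodes)))  # assumption: round down
    return set(rng.sample(nodes, k))


def nodal_split(train_run, test_run, train_nodes):
    """Training keeps only leaks at `train_nodes`; testing keeps every node."""
    train = [s for s in train_run if s[0] in train_nodes]
    return train, list(test_run)


# Hanoi-sized example: 31 nodes, 24 timesteps per node.
all_samples = [(node, t) for node in range(1, 32) for t in range(24)]
rand_train, rand_test = random_split(all_samples)
nodes = pick_training_nodes(range(1, 32))
nodal_train, nodal_test = nodal_split(all_samples, all_samples, nodes)
print(len(rand_train), len(rand_test), len(nodal_train), len(nodal_test))
```

In the random split, every node can appear on both sides, whereas the nodal split exposes the model to only the selected nodes during training while still testing on all of them.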

2.5. ML Model Selection

Scikit-learn, an ML library in Python 3.9, was used to train the KNN, LR, logistic regression (LoR), DT, SVM, and MLP models. The grid-search algorithm was used for the hyperparameter optimization of these ML models. In this approach, the model is trained with various combinations of hyperparameters, and the combination that has the best performance over the training dataset is used as the model configuration. The main hyperparameters of each ML model were chosen according to [29] (Table A4 in Appendix A). Since the highest accuracy among all models was approximately 90%, no evaluation for overfitting was performed [30].

2.6. Training

For nodal partitioning, the number of training samples used for the Hanoi WDN started from 144 (20% of 31 nodes, i.e., 6 nodes, each leaking over 24 h) and increased by a constant step of 144 (the data for each extra day of leakage). This process was repeated until none of the metrics used for performance evaluation of the model improved by more than 1% (the performance became almost constant). The same approach was used for the Anytown network, in which the training dataset size started from 96 samples (4 nodes, each leaking over 24 h) and increased by a constant step of 96. A total of 12 and 9 different training sample sizes were utilized for the Hanoi and Anytown networks, respectively. The maximum number of training samples reached 8928 for the Hanoi network and 4752 for the Anytown network in order to determine the optimal size of the training dataset.
For random partitioning, the number of training samples used for the Hanoi WDN started from 595 (80% of the total samples, which were made by leaking 31 nodes over 24 h) and increased by a constant step of 595. For the Anytown network, we followed the same procedure, starting with 423 training samples and increasing the dataset in fixed increments of 423. A total of 20 and 15 different training sample sizes were utilized for the Hanoi and Anytown networks, respectively. The maximum number of training samples reached 11,900 for the Hanoi network and 6345 for the Anytown network in order to determine the optimal size of the training dataset. In random partitioning, data from every node appears in both training and testing sets; by contrast, nodal (temporal) partitioning ensures that only a subset of nodes contributes samples to both datasets.
Five-fold cross-validation, a common method [31], was used to ensure that the performance of the models is not biased by a specific dataset partitioning. In the case of random partitioning, the dataset was divided into 5 parts. One part was selected for testing, while the other four parts were used for training the model. This process was repeated five times so that each part served as the testing dataset once. For nodal partitioning, the nodes of the Hanoi and Anytown networks were each divided into 5 clusters. In the Hanoi network, four clusters contained 6 nodes and one cluster contained 7 nodes, while in the Anytown network, three clusters had 4 nodes and two clusters had 5 nodes. The clustering was performed randomly, as there is no specific pattern for the nodes based on historical records of leakage. One cluster was used to generate the training dataset, and this process was repeated until all clusters had been used once to create the training dataset. Finally, the performance of the model was evaluated under each partitioning condition.
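The random clustering of nodes for the nodal folds can be sketched as below. The function name and the seed are hypothetical, and assigning the leftover nodes to the first clusters is an assumption; it reproduces the cluster sizes reported for both networks.

```python
import random


def random_node_clusters(node_ids, n_clusters=5, seed=0):
    """Shuffle the nodes and cut them into `n_clusters` near-equal clusters."""
    rng = random.Random(seed)
    nodes = sorted(node_ids)
    rng.shuffle(nodes)
    base, extra = divmod(len(nodes), n_clusters)
    clusters, start = [], 0
    for i in range(n_clusters):
        # The first `extra` clusters receive one additional node.
        size = base + (1 if i < extra else 0)
        clusters.append(nodes[start:start + size])
        start += size
    return clusters


hanoi = random_node_clusters(range(1, 32))    # 31 nodes
anytown = random_node_clusters(range(1, 23))  # 22 nodes
# In each fold, one cluster supplies the training nodes.
for fold, train_nodes in enumerate(hanoi, start=1):
    print(f"fold {fold}: training nodes {sorted(train_nodes)}")
```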

2.7. Evaluation Metrics

The classifiers’ output, the leaking node label, could be used directly to evaluate the model’s performance. However, the regressors’ output needed pre-processing before the evaluation. The predicted coordinates rarely coincide with any node of the network, so the nearest node to the predicted coordinates was considered the model-predicted leaking node. Three indices, accuracy, average topological distance (ATD), and average ranking (AR), were used. Accuracy is defined as:
$$\text{Accuracy} = \frac{c}{s} \times 100$$
where s is the number of samples in the testing dataset and c is the number of those samples that were correctly predicted. Accuracy is the most commonly used metric for evaluating models developed for multi-label classification tasks [32]. It ranges from 0% to 100%, with higher values indicating better model performance.
However, since overall accuracy provides only a general assessment, we also defined a more detailed performance metric, Accuracy_i, to capture how well the model performs for each individual label.
$$\text{Accuracy}_i = \frac{c_i}{s}$$
where $c_i$ represents the number of samples in which the leaking node is among the i nearest nodes to the predicted node. While accuracy simply measures how many samples were localized correctly versus how many were not, this new index classifies the testing samples into n categories, where n denotes the number of nodes in the network. As i increases, Accuracy_i is expected to increase as well, because locating the leaking node among a larger number of nearest nodes is an easier task. Unlike accuracy, this new metric does not treat all cases in which the model fails to identify the leakage node uniformly. Instead, it highlights instances where the predicted node is nearer to the leakage node.
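The snapping of predicted coordinates to the nearest node and the Accuracy_i index can be illustrated with the sketch below. It assumes Euclidean distance between node coordinates for the "nearest" relation (the text does not fix the distance type here) and returns a fraction, so i = 1 corresponds to accuracy before multiplying by 100. The node names and coordinates are hypothetical.

```python
import math


def nodes_by_distance(point, node_coords):
    """Node ids sorted by Euclidean distance from `point`, nearest first."""
    return sorted(node_coords, key=lambda n: math.dist(point, node_coords[n]))


def accuracy_i(predicted_points, true_nodes, node_coords, i=1):
    """Share of samples whose true leak is among the i nodes nearest
    to the predicted node (the node closest to the predicted coordinates)."""
    hits = 0
    for point, true_node in zip(predicted_points, true_nodes):
        predicted_node = nodes_by_distance(point, node_coords)[0]
        nearest = nodes_by_distance(node_coords[predicted_node], node_coords)[:i]
        hits += true_node in nearest
    return hits / len(true_nodes)


# Hypothetical three-node layout and three regressor predictions.
coords = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (10.0, 8.0)}
predictions = [(1.0, 1.0), (9.0, 1.0), (6.0, 9.0)]
truth = ["A", "C", "C"]
print(accuracy_i(predictions, truth, coords, i=1))
print(accuracy_i(predictions, truth, coords, i=2))
```

In this toy case the second prediction snaps to node B while the true leak is at C, so it counts as a miss for i = 1 but as a hit for i = 2, because C is the nearest neighbor of B.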
ATD is formulated as follows [16]:
$$\text{ATD} = \frac{\sum_{j=1}^{s} d_j}{s}$$
where $d_j$ is defined as the shortest path on the pipelines between the real leakage node and the predicted leakage node by the model for the jth sample in the testing dataset, and s denotes the number of samples in the testing dataset. The ATD is a real number that ranges from 0 to the longest path between two network nodes. A lower value indicates better model performance.
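One way to compute the ATD is to run a shortest-path search over the pipe graph for each test sample. The sketch below uses Dijkstra's algorithm on a hypothetical undirected graph stored as `{node: {neighbor: pipe_length_m}}`; the graph, the function names, and the sample data are illustrative assumptions.

```python
import heapq


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm on a pipe graph {node: {neighbor: length_m}}."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph[node].items():
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    raise ValueError("target is not reachable from source")


def atd(graph, true_nodes, predicted_nodes):
    """Mean shortest-path distance between true and predicted leak nodes."""
    total = sum(shortest_path_length(graph, t, p)
                for t, p in zip(true_nodes, predicted_nodes))
    return total / len(true_nodes)


# Hypothetical triangle network with pipe lengths in metres.
pipes = {
    "A": {"B": 100.0, "C": 500.0},
    "B": {"A": 100.0, "C": 200.0},
    "C": {"A": 500.0, "B": 200.0},
}
print(atd(pipes, ["A", "B", "C"], ["A", "C", "A"]))
```

Here the third sample contributes 300 m rather than 500 m, since the route from C to A through B is shorter than the direct pipe.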
The AR indicator is defined to examine how close the real leaking is to the model prediction compared to other nodes of the network, and it is formulated as follows:
$$\text{Average ranking} = \frac{\sum_{j=1}^{s} r_j}{s}$$
To calculate the Average Ranking (AR), all network nodes are first sorted based on their distance from the node predicted by the model. For each test sample, the rank $r_j$ represents the position of the true leaking node within this sorted list, i.e., how close the actual leak is to the model’s prediction. The AR metric is then computed as the average of these ranks across all s samples in the testing dataset. The AR value ranges from 1 (best possible performance, where the true leak is always the closest node to the prediction) to N, the total number of nodes (worst case, where the true leak is always the farthest). In this study, the maximum value of this index is 31 for the Hanoi network and 22 for the Anytown network. A lower AR indicates better localization accuracy. Figure 5 provides a visual example to illustrate this concept.
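The AR computation can be sketched in a few lines. The example assumes Euclidean distance between node coordinates when sorting the nodes (the distance type is an assumption), counts the predicted node itself as rank 1, and uses hypothetical node names and coordinates.

```python
import math


def average_ranking(predicted_nodes, true_nodes, node_coords):
    """Mean rank of the true leak node when all nodes are sorted by
    distance from the predicted node (rank 1 = the predicted node itself)."""
    ranks = []
    for pred, true in zip(predicted_nodes, true_nodes):
        origin = node_coords[pred]
        ordered = sorted(node_coords,
                         key=lambda n: math.dist(origin, node_coords[n]))
        ranks.append(ordered.index(true) + 1)
    return sum(ranks) / len(ranks)


# Hypothetical three-node layout: two exact hits and one near miss.
coords = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (10.0, 8.0)}
print(average_ranking(["A", "B", "C"], ["A", "C", "C"], coords))
```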

3. Results

All models were trained and evaluated under five different data partitioning conditions, and the average performance according to the ATD and AR metrics on the test datasets is illustrated in Figure 6. The results of the grid search for all models, under the condition with the best performance (among the five conditions of the 5-fold cross-validation method), are shown in Table A5 of Appendix A.

3.1. Comparing Various Models Based on ATD and AR Metrics

For both the Hanoi and Anytown water-distribution networks (WDNs), the highest accuracy was obtained when the classifier was trained with random partitioning. In descending order of performance, the next-best approaches were (i) regressors with random partitioning, (ii) regressors with nodal partitioning, and (iii) classifiers with nodal partitioning.
Random partitioning allows the training set to contain leakage samples from every node, which—because leaks simulated at the same node are far more alike than leaks at different nodes—gives the model information it would never see in practice. This “peeking” explains the consistently superior scores of models trained in this way. Within that group, classifiers outperformed regressors because classification maps the input features to a small, discrete set of outputs, enabling the model to learn the feature-to-label relationships more readily.
By contrast, nodal (temporal) partitioning with classifiers performed worst. Only 20% of the nodes (those present in the training subset) were represented during learning, so the model could localize leaks on those nodes but was effectively blind to the remaining 80%. Regressors trained with the same nodal split fared somewhat better: although they also saw only a subset of nodes, they predict continuous coordinates rather than discrete labels, so the pressure-to-location relationship learned from the known nodes could be partially transferred to unseen nodes.
Despite these differences, all classifiers trained with random partitioning still located the true leak very close to their predictions. According to the AR index, every classifier except the decision tree (DT) identified the actual leakage node within its four nearest-neighbor predictions in both WDNs. Using the ATD index, their predicted leak positions were, on average, within roughly 1 km in Hanoi and 4 km in Anytown—impressive given that the total pipeline lengths are 40 km and 90 km, respectively.

3.2. Comparing Results of Models on Train and Test Datasets

The numerical results for all models on both the training and test datasets are presented in Table 2 and Table 3. For all models except the classifiers that used nodal partitioning, AR and ATD were better (lower) on the training dataset than on the test dataset. Moreover, overfitting did not occur for these models, as the performance on the training dataset is not significantly better than the performance on the test datasets [30].
Classifiers trained with nodal partitioning tell a different story. They performed markedly better on the training set than on the test set, producing a large apparent generalization gap. This gap should not be read as conventional overfitting. Under nodal partitioning, the model is exposed to leakage labels for only 20% of the nodes and is entirely ignorant of the remaining 80%. In routine practice such a setup would be deemed infeasible, but we included it to complete the factorial comparison of (i) classifier vs. regressor and (ii) random vs. nodal splits. Given that most classes are absent during training, a steep drop in test-set performance is inevitable and simply reflects the information withheld from the model rather than a failure of the learning algorithm itself.

3.3. Comparing Various KNN Models Based on the Accuracyi Index

All algorithms behaved similarly according to the Accuracy_i index. As an example, the performance of the KNN models is shown in Figure 7. As the value of i increases, the difference between the KNN models becomes less discernible. The probability of the real leakage node being among the i nearest nodes to the predicted leaking node increases with i, regardless of the model development approach.
The findings from models employing random partitioning clearly highlight the limitations of this widely used practice. Random partitioning, by design, places very similar samples, representing leakages from the same node under various demand conditions, into both the training and testing datasets. Consequently, models can artificially achieve high accuracy since they have already encountered nearly identical scenarios during training. This explains the consistently high performance reported in earlier studies, regardless of the choice of machine learning algorithm or network complexity. However, this apparent success is misleading, as real-world conditions rarely guarantee prior exposure to all nodes or scenarios during model training, severely limiting the practical applicability of such classifiers.

3.4. Assessing Error Distribution of the Models

Figure 8 plots the probability distributions of normalized ATD for the DT models trained on the Hanoi and Anytown networks; analogous curves for the remaining models are provided in Figure A2, Figure A3, Figure A4 and Figure A5 of Appendix A.
For classifiers with random partitioning, the distributions are narrow and tightly clustered around the center, indicating highly consistent performance across all samples. This behavior is expected because the test instances strongly resemble the training set, enabling the models to pinpoint the corresponding leakage nodes with ease.
When classifiers are trained with nodal (temporal) partitioning, the distributions become asymmetric, with a pronounced peak at lower ATD values. This peak reflects superior performance on leakages originating from nodes that were present during training, which are inherently easier to localize.
The regressors exhibit similar qualitative patterns. Under random partitioning, their distributions also center on a mean but show a wider spread than the classifiers, a consequence of the more demanding task: regressors must predict two continuous outputs (x- and y-coordinates) rather than a single categorical label, and their output space is effectively unbounded.
With nodal (temporal) partitioning, the regressor curves again display a left-side peak corresponding to training-set nodes. Compared with the classifiers, two distinctions emerge: (i) the classifier peak lies closer to zero, consistent with the relative simplicity of the classification task, and (ii) the regressor distributions assign lower probability to very large errors (values approaching 1), indicating that even with limited nodal data, the regression models still capture underlying spatial patterns.

3.5. Analysis of the Spatial Performance of the Models

In models that used random partitioning, every node had corresponding samples in the training dataset, resulting in similar accuracy across all nodes. In contrast, with nodal partitioning, only some nodes had samples in the training dataset, leading to varying performance levels across the different nodes. To illustrate this concept, the performance of the MLP regressor (in one of the five conditions evaluated using the 5-fold cross-validation method), based on the ATD metric, at each node of the Hanoi network is shown in Figure 9.
Although the overall ATD value for the network was 2473 m, the ATD for individual nodes varied, ranging from 1863 m to 2907 m. Nodes that were generally farther away from those included in the training dataset had higher ATD values, indicating that the model performed poorly in locating these nodes while they were leaking.

4. Discussion and Conclusions

In this paper, we addressed the unrealistic assumptions underlying many previous data-driven approaches for LL in WDNs. Traditionally, these approaches assumed that leakage data were available for all nodes, used random train-test partitioning, and mostly relied on classifiers to identify the leaking node from hydraulic data. To introduce greater realism and applicability, this study proposed and validated nodal (temporal) partitioning as an alternative. Unlike random partitioning, nodal partitioning uses leakage data from only a subset of nodes during training, while testing incorporates data from all nodes. Additionally, we emphasized regression models that directly predict leak coordinates, an output that is inherently more applicable to practical scenarios.
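The difference between the two partitioning schemes can be sketched on a toy timeline. The leak schedule, the cut hour, and the function names below are hypothetical and chosen only for illustration; they loosely mirror the four-node example of Figure 1 and are not the study's actual data pipeline.

```python
import random

# Hypothetical leak records: (hour, leaking_node). Each leak lasts four
# hourly time steps; node 2 never leaks (it plays the observing node).
LEAK_SCHEDULE = [(0, 1), (4, 3), (8, 4), (12, 1), (16, 3), (20, 4)]
samples = [(hour, node)
           for start, node in LEAK_SCHEDULE
           for hour in range(start, start + 4)]


def temporal_split(records, cut_hour):
    """Train only on records before cut_hour; test on everything after."""
    train = [r for r in records if r[0] < cut_hour]
    test = [r for r in records if r[0] >= cut_hour]
    return train, test


def random_split(records, test_fraction, seed=0):
    """Shuffle the records and hold out a fraction, ignoring time order."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def nodes_of(records):
    """Set of leaking nodes that appear in a list of records."""
    return {node for _, node in records}


t_train, t_test = temporal_split(samples, cut_hour=8)
r_train, r_test = random_split(samples, test_fraction=1 / 3)

print("temporal: train nodes", sorted(nodes_of(t_train)),
      "| test nodes", sorted(nodes_of(t_test)))
print("random:   train nodes", sorted(nodes_of(r_train)),
      "| test nodes", sorted(nodes_of(r_test)))
```

In this toy timeline the temporal split never sees a leak at node 4 during training, so the test set contains an unseen node. A random split of the same records mixes hours from the same leak event across training and testing.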
The comparative analysis showed that random partitioning inflated model performance substantially, by up to a factor of four according to accuracy measures, compared with the more realistic nodal partitioning. This indicates that random partitioning yields overly optimistic and unrealistic evaluations, reinforcing our argument for adopting nodal partitioning and regression methods to obtain more dependable results that apply in practice.
Nevertheless, the present study has limitations, including the assumption of a single leakage at each timestep, the use of a constant emitter coefficient, and reliance on pressure data from all nodes as inputs. Future research should focus on overcoming these limitations by: (1) evaluating model performance under simultaneous (overlapping) leak scenarios, (2) optimizing sensor placements to enhance performance and applicability, and (3) repeating the model development and comparison procedure proposed in this work on large-scale real-life networks to gain more confidence in the validity of the reported results. While the evaluation methodologies of some previous studies may be questioned, their parameter-tuning techniques and methodological insights remain valuable and should be revisited within this more realistic framework. Finally, practitioners and decision-makers should ensure that the LL models they adopt are evaluated under realistic assumptions. They must scrutinize the dataset generation process, verify that their models can generalize to unseen nodes, and avoid overreliance on inflated metrics derived from flawed validation strategies.

Author Contributions

Conceptualization, S.N. and P.R.; Methodology, A.A.; Software, A.A.; Validation, S.N. and P.R.; Investigation, A.A.; Data curation, A.A.; Writing—original draft, A.A., S.N. and P.R.; Supervision, S.N. and P.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data available on request due to privacy restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. The diameter values of the Hanoi WDN’s pipes.
Label | Diameter (mm)
1 | 1016
2 | 1016
3 | 1016
4 | 1016
5 | 1016
6 | 1016
7 | 1016
8 | 1016
9 | 1016
10 | 762
11 | 609.6
12 | 609.6
13 | 508
14 | 406.4
15 | 304.8
16 | 304.8
17 | 406.4
18 | 508
19 | 508
20 | 1016
21 | 508
22 | 304.8
23 | 1016
24 | 762
25 | 762
26 | 508
27 | 304.8
28 | 304.8
29 | 406.4
30 | 406.4
31 | 304.8
32 | 304.8
33 | 406.4
34 | 1016
Figure A1. Base demand multipliers for different demand patterns [19].
Table A2. Nodes’ base demands and their demand patterns for the Hanoi WDN.
Label | Base Demand (CMH) | Demand Pattern
1 (Reservoir) | N.A. | N.A.
2 | 890 | 1
3 | 850 | 1
4 | 130 | 1
5 | 725 | 1
6 | 1005 | 1
7 | 1350 | 1
8 | 550 | 1
9 | 525 | 1
10 | 525 | 1
11 | 500 | 4
12 | 560 | 4
13 | 940 | 4
14 | 615 | 1
15 | 280 | 1
16 | 310 | 5
17 | 865 | 5
18 | 1345 | 5
19 | 60 | 5
20 | 1275 | 2
21 | 930 | 4
22 | 485 | 4
23 | 1045 | 6
24 | 820 | 6
25 | 170 | 6
26 | 900 | 2
27 | 370 | 2
28 | 290 | 3
29 | 360 | 3
30 | 360 | 3
31 | 105 | 3
32 | 805 | 3
Table A3. 24-h demand patterns of Anytown nodes.
Hour | Demand Multiplier
0 | 1
1 | 1
2 | 1
3 | 0.9
4 | 0.9
5 | 0.9
6 | 0.7
7 | 0.7
8 | 0.7
9 | 0.6
10 | 0.6
11 | 0.6
12 | 1.2
13 | 1.2
14 | 1.2
15 | 1.3
16 | 1.3
17 | 1.3
18 | 1.2
19 | 1.2
20 | 1.2
21 | 1.1
22 | 1.1
23 | 1.1
Table A4. The list of all models’ hyperparameters used for the grid search. Names and values are based on Scikit-learn version 1.4.0.
Model | Hyperparameters and Their Options
KNN | n_neighbors: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
DT | criterion: {squared_error, friedman_mse, absolute_error, poisson} (for regressor); criterion: {gini, entropy, log_loss} (for classifier); max_depth: {None, 1, 2, 5, 10, 20}; min_samples_split: {2, 5, 10, 20}; min_samples_leaf: {1, 2, 5, 10, 20}; max_features: {1, 2, 5, 10, 20, 50, 100}
LR | None
LoR (as an equivalent classifier for LR) | penalty: {None, l1, l2, elasticnet}; C: {1, 2, 5, 10, 20, 50, 100}; solver: {liblinear, newton-cg, lbfgs, sag, saga}
MLP | hidden_layer_sizes: {(10), (20), (30), (40), (50), (60), (70), (80), (90), (100)}; activation: {relu, tanh, logistic}; solver: {lbfgs, sgd, adam}; learning_rate: {constant, adaptive}; learning_rate_init: {0.01, 0.01, 0.001, 0.0001}
SVM | C: {10,000, 20,000, 50,000, 100,000}; kernel: {linear, rbf, poly}; degree: {2, 3} (for kernel: poly); epsilon: {0.01, 0.02, 0.05, 0.1} (for regression)
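To give a sense of the search-space size implied by Table A4, the sketch below enumerates the decision-tree regressor grid with the standard library only. The option values are those listed in the table, and the leaf parameter is written with scikit-learn's spelling `min_samples_leaf`. The helper name `expand_grid` is hypothetical, and the sketch does not fit any model or reproduce the authors' search code.

```python
from itertools import product

# Decision-tree regressor options as listed in Table A4.
dt_regressor_grid = {
    "criterion": ["squared_error", "friedman_mse", "absolute_error", "poisson"],
    "max_depth": [None, 1, 2, 5, 10, 20],
    "min_samples_split": [2, 5, 10, 20],
    "min_samples_leaf": [1, 2, 5, 10, 20],
    "max_features": [1, 2, 5, 10, 20, 50, 100],
}


def expand_grid(grid):
    """Return every hyperparameter combination as a dict (Cartesian product)."""
    names = list(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]


candidates = expand_grid(dt_regressor_grid)
print(len(candidates))  # 4 * 6 * 4 * 5 * 7 = 3360
print(candidates[0])
```

Each candidate would then be fitted and scored on every cross-validation fold, so the number of model fits is the grid size multiplied by the number of folds.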
Table A5. The best hyperparameters for all models. The optimal configuration for each model is presented for the best dataset size.
Model | Results on Hanoi WDN | Results on Anytown WDN
KNN regressor with temporal partitioning | n_neighbors: 1 | n_neighbors: 1
KNN regressor with random partitioning | n_neighbors: 2 | n_neighbors: 2
KNN classifier with temporal partitioning | n_neighbors: 1 | n_neighbors: 1
KNN classifier with random partitioning | n_neighbors: 2 | n_neighbors: 3
DT regressor with temporal partitioning | criterion: absolute_error; max_depth: 10; max_features: 20; min_samples_leaf: 1; min_samples_split: 2 | criterion: poisson; max_depth: 20; max_features: 10; min_samples_leaf: 2; min_samples_split: 5
DT regressor with random partitioning | criterion: friedman_mse; max_depth: None; max_features: 50; min_samples_leaf: 2; min_samples_split: 5 | criterion: absolute_error; max_depth: 20; max_features: 10; min_samples_leaf: 1; min_samples_split: 2
DT classifier with temporal partitioning | criterion: gini; max_depth: 5; max_features: 100; min_samples_leaf: 2; min_samples_split: 2 | criterion: log_loss; max_depth: 10; max_features: 20; min_samples_leaf: 1; min_samples_split: 2
DT classifier with random partitioning | criterion: gini; max_depth: None; max_features: 50; min_samples_leaf: 1; min_samples_split: 2 | criterion: gini; max_depth: None; max_features: 20; min_samples_leaf: 1; min_samples_split: 2
LR regressor with temporal partitioning | None | None
LR regressor with random partitioning | None | None
LoR classifier with temporal partitioning | C: 20; penalty: l1; solver: liblinear | C: 100; penalty: l1; solver: liblinear
LoR classifier with random partitioning | C: 50; penalty: l1; solver: liblinear | C: 100; penalty: l1; solver: liblinear
MLP regressor with temporal partitioning | activation: logistic; hidden_layer_sizes: (70); learning_rate: adaptive; learning_rate_init: 0.001; solver: lbfgs | activation: logistic; hidden_layer_sizes: (90); learning_rate: adaptive; learning_rate_init: 0.01; solver: lbfgs
MLP regressor with random partitioning | activation: relu; hidden_layer_sizes: (30); learning_rate: constant; learning_rate_init: 0.0001; solver: lbfgs | activation: relu; hidden_layer_sizes: (30); learning_rate: constant; learning_rate_init: 0.01; solver: lbfgs
MLP classifier with temporal partitioning | activation: relu; hidden_layer_sizes: (10); learning_rate: constant; learning_rate_init: 0.01; solver: adam | activation: logistic; hidden_layer_sizes: (50); learning_rate: constant; learning_rate_init: 0.01; solver: lbfgs
MLP classifier with random partitioning | activation: logistic; hidden_layer_sizes: (80); learning_rate: constant; learning_rate_init: 0.0001; solver: lbfgs | activation: relu; hidden_layer_sizes: (30); learning_rate: constant; learning_rate_init: 0.01; solver: lbfgs
SVM regressor with temporal partitioning | C: 20,000; epsilon: 10; kernel: rbf; degree: N.A. | C: 100,000; epsilon: 100; kernel: rbf; degree: N.A.
SVM regressor with random partitioning | C: 100,000; epsilon: 100; kernel: rbf; degree: N.A. | C: 50,000; epsilon: 100; kernel: rbf; degree: N.A.
SVM classifier with temporal partitioning | C: 10,000; kernel: poly; degree: 2 | C: 100,000; kernel: poly; degree: 2
SVM classifier with random partitioning | C: 10,000; kernel: poly; degree: 2 | C: 100,000; kernel: poly; degree: 2
Figure A2. The distribution of normalized ATD for various KNN models trained on Hanoi and Anytown networks.
Figure A3. The distribution of normalized ATD for various LR models trained on Hanoi and Anytown networks.
Figure A4. The distribution of normalized ATD for various MLP models trained on Hanoi and Anytown networks.
Figure A5. The distribution of normalized ATD for various SVM models trained on Hanoi and Anytown networks.

References

1. Sun, C.; Parellada, B.; Puig, V.; Cembrano, G. Leak localization in water distribution networks using pressure and data-driven classifier approach. Water 2020, 12, 54.
2. Fares, A.; Tijani, I.A.; Rui, Z.; Zayed, T. Leak detection in real water distribution networks based on acoustic emission and machine learning. Environ. Technol. 2023, 44, 3850–3866.
3. Daniel, I.; Pesantez, J.; Letzgus, S.; Khaksar Fasaee, M.A.; Alghamdi, F.; Berglund, E.; Mahinthakumar, G.; Cominola, A. A Sequential Pressure-Based Algorithm for Data-Driven Leakage Identification and Model-Based Localization in Water Distribution Networks. J. Water Resour. Plan. Manag. 2022, 148, 04022025.
4. Steffelbauer, D.B.; Deuerlein, J.; Gilbert, D.; Abraham, E.; Piller, O. Pressure-Leak Duality for Leak Detection and Localization in Water Distribution Systems. J. Water Resour. Plan. Manag. 2022, 148, 04021106.
5. Sanz, G.; Perez, R.; Escobet, A. Leakage localization in water networks using fuzzy logic. In Proceedings of the 2012 20th Mediterranean Conference on Control & Automation (MED), Barcelona, Spain, 3–6 July 2012; pp. 646–651.
6. Romero-Ben, L.; Alves, D.; Blesa, J.; Cembrano, G.; Puig, V.; Duviella, E. Leak detection and localization in water distribution networks: Review and perspective. Annu. Rev. Control 2023, 55, 392–419.
7. Burkart, N.; Huber, M.F. A Survey on the Explainability of Supervised Machine Learning. J. Artif. Intell. Res. 2021, 70, 245–317.
8. Soldevila, A.; Boracchi, G.; Roveri, M.; Tornil-Sin, S.; Puig, V. Leak detection and localization in water distribution networks by combining expert knowledge and data-driven models. Neural Comput. Appl. 2022, 34, 4759–4779.
9. Rossman, L.A. EPANET 2 USERS MANUAL. 2000. Available online: https://www.microimages.com/documentation/tutorials/epanet2usermanual.pdf (accessed on 29 April 2025).
10. Pernot, P. Calibration in Machine Learning Uncertainty Quantification: Beyond consistency to target adaptivity. APL Mach. Learn. 2023, 1, 046121.
11. Braiek, H.B.; Khomh, F. On testing machine learning programs. J. Syst. Softw. 2020, 164, 110542.
12. Ramazi, P.; Haratian, A.; Meghdadi, M.; Mari Oriyad, A.; Lewis, M.A.; Maleki, Z.; Vega, R.; Wang, H.; Wishart, D.S.; Greiner, R. Accurate long-range forecasting of COVID-19 mortality in the USA. Sci. Rep. 2021, 11, 13822.
13. Ramazi, P.; Kunegel-Lion, M.; Greiner, R.; Lewis, M.A. Predicting insect outbreaks using machine learning: A mountain pine beetle case study. Ecol. Evol. 2021, 11, 13014–13028.
14. Sousa, C.; Calheiros, C.; Maria, A.; Geraldes, A.; Onukwube, C.U.; Aikhuele, D.O.; Sorooshian, S. Development of a Fault Detection and Localization Model for a Water Distribution Network. Appl. Sci. 2024, 14, 1620.
15. Mazaev, G.; Weyns, M.; Vancoillie, F.; Vaes, G.; Ongenae, F.; Van Hoecke, S. Probabilistic leak localization in water distribution networks using a hybrid data-driven and model-based approach. Water Supply 2023, 23, 162–178.
16. Tyagi, V.; Pandey, P.; Jain, S.; Ramachandran, P. A Two-Stage Model for Data-Driven Leakage Detection and Localization in Water Distribution Networks. Water 2023, 15, 2710.
17. Mazaev, G.; Weyns, M.; Moens, P.; Haest, P.J.; Vancoillie, F.; Vaes, G.; Debaenst, J.; Waroux, A.; Marlein, K.; Ongenae, F.; et al. A microservice architecture for leak localization in water distribution networks using hybrid AI. J. Hydroinformatics 2023, 25, 851–866.
18. Li, J.; Zheng, W.; Lu, C. An Accurate Leakage Localization Method for Water Supply Network Based on Deep Learning Network. Water Resour. Manag. 2022, 36, 2309–2325.
19. Lučin, I.; Lučin, B.; Čarija, Z.; Sikirica, A. Data-driven leak localization in urban water distribution networks using big data for random forest classifier. Mathematics 2021, 9, 672.
20. Mashhadi, N.; Shahrour, I.; Attoue, N.; El Khattabi, J.; Aljer, A. Use of machine learning for leak detection and localization in water distribution systems. Smart Cities 2021, 4, 1293–1315.
21. Soldevila, A.; Blesa, J.; Fernandez-Canti, R.M.; Tornil-Sin, S.; Puig, V. Data-driven approach for leak localization in water distribution networks using pressure sensors and spatial interpolation. Water 2019, 11, 1500.
22. Zhou, X.; Tang, Z.; Xu, W.; Meng, F.; Chu, X.; Xin, K.; Fu, G. Deep learning identifies accurate burst locations in water distribution networks. Water Res. 2019, 166, 115058.
23. Capelo, M.; Brentan, B.; Monteiro, L.; Covas, D. Near–real time burst location and sizing in water distribution systems using artificial neural networks. Water 2021, 13, 1841.
24. Javadiha, M.; Blesa, J.; Soldevila, A.; Puig, V. Leak localization in water distribution networks using deep learning. In Proceedings of the 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), Paris, France, 23–26 April 2019.
25. Fujiwara, O.; Khang, D.B. A two-phase decomposition method for optimal design of looped water distribution networks. Water Resour. Res. 1990, 26, 539–549.
26. Geem, Z.W. Optimal cost design of water distribution networks using harmony search. Eng. Optim. 2006, 38, 259–277.
27. Walski, T.M.; Brill, E.D.; Gessler, J.; Goulter, I.C.; Jeppson, R.M.; Lansey, K.; Lee, H.; Liebman, J.C.; Mays, L.; Morgan, D.R.; et al. Battle of the Network Models: Epilogue. J. Water Resour. Plan. Manag. 1987, 113, 191–203.
28. Xu, J.; Wang, H.; Rao, J.; Wang, J. Zone scheduling optimization of pumps in water distribution networks with deep reinforcement learning and knowledge-assisted learning. Soft Comput. 2021, 25, 14757–14767.
29. Yang, L.; Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 2020, 415, 295–316.
30. Halabaku, E.; Bytyçi, E. Overfitting in Machine Learning: A Comparative Analysis of Decision Trees and Random Forests. Intell. Autom. Soft Comput. 2024, 39, 987–1006.
31. Wong, T.T.; Yeh, P.Y. Reliable Accuracy Estimates from k-Fold Cross Validation. IEEE Trans. Knowl. Data Eng. 2020, 32, 1586–1594.
32. Pereira, R.B.; Plastino, A.; Zadrozny, B.; Merschmann, L.H.C. Correlation analysis of performance measures for multi-label classification. Inf. Process. Manag. 2018, 54, 359–369.
Figure 1. (a) Temporal and (b) random partitioning of a dataset into training and testing datasets. The demand pattern is a 24-h interval of a WDN with four nodes. The nodes are numbered from 1 to 4, with node 2 in yellow representing the observing node. The duration of all leakages is four hours, and data are collected in one-hour time steps. In the temporal partitioning, the 8th hour is the reference cutting point.
Figure 2. The detailed methodology diagram.
Figure 3. The Hanoi WDN [19].
Figure 4. The Anytown WDN [28].
Figure 5. Concept of (a) accuracy_i and (b) ATD and AR on an example leakage sample. The metrics accuracy_1 and accuracy_2 are equal to 0 because the model’s prediction (node 20) is not among the two nearest nodes to the leaking node (node 23). However, accuracy_3 is equal to 1 because node 20 is within the three nearest nodes to the leaking node. Consequently, accuracy_4, accuracy_5, and all subsequent metrics are also equal to 1. Nodes are ranked according to their distance from the leaking node, with node 20 receiving a rank of 3, which corresponds to the concept of AR. The shortest path between the leaking node and the model’s prediction is 2650 m in length, illustrating the concept of ATD.
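The relationship between the rank of a prediction, accuracy_i, and AR described in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions: the ranking is assumed to include the leaking node itself at rank 1, and all distances except the 2650 m to node 20 are hypothetical placeholders rather than real Hanoi path lengths.

```python
def rank_of_prediction(distances_from_leak, predicted_node):
    """Rank of the predicted node when all nodes are sorted by their
    distance from the leaking node (assumption: the leak itself is rank 1)."""
    ordered = sorted(distances_from_leak, key=distances_from_leak.get)
    return ordered.index(predicted_node) + 1


def accuracy_i(rank, i):
    """1 if the prediction is among the i nearest nodes to the leak, else 0."""
    return 1 if rank <= i else 0


def average_rank(ranks):
    """AR: mean rank of the predictions over a set of samples."""
    return sum(ranks) / len(ranks)


# Shortest-path distances (m) from leaking node 23. Only the 2650 m to
# node 20 comes from the Figure 5 caption; the other values are hypothetical.
distances = {23: 0.0, 24: 1230.0, 20: 2650.0, 28: 3400.0, 25: 3900.0}

rank = rank_of_prediction(distances, predicted_node=20)
accuracies = [accuracy_i(rank, i) for i in range(1, 6)]

print("rank of node 20:", rank)
print("accuracy_1..accuracy_5:", accuracies)
print("topological distance (m):", distances[20])
print("AR over two samples with ranks 1 and 3:", average_rank([1, rank]))
```

Because accuracy_i is a threshold on the rank, it can only stay the same or increase as i grows, which is the behavior the figure illustrates.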
Figure 6. The performance of all models trained on both case studies according to ATD and AR metrics. Blue columns represent regressors with nodal partitioning, while red columns represent regressors with random partitioning. Green columns represent classifiers with nodal partitioning, and purple columns represent classifiers with random partitioning.
Figure 7. The evaluation of different KNN models on the Hanoi network according to accuracy_i. A model is expected to perform better as i (the number of nearest neighbors among which the presence of the leaking node is assessed) increases, because the task becomes easier. According to the figure, however, only the models with nodal partitioning improved as i increased. Models with random partitioning showed almost no sensitivity to this parameter and performed well for all values of i. This unexpected behavior shows that the promising performance of the models with random partitioning does not stem from their understanding of the problem but is the consequence of having highly similar samples in both the training and testing datasets.
Figure 8. The distribution of normalized ATD for various DT models trained on Hanoi and Anytown networks.
Figure 9. The performance of the MLP regressor at each node in the Hanoi network is presented. Nodes whose corresponding samples appeared in the training phase are shown, and all nodes are color-coded based on their ATD values.
Table 1. Summary of works with a similar approach to [13].
Work | Input | Output | Trained Classifiers | Case Study | Results
[15] | pressure at nodes | label of the leaking node | logistic regression (LoR) | a district-metered area in Belgium | Average topological distance (ATD) = 0.18 to 4.96 km
[16] | flow of some pipes and pressure at some nodes | label of the leaking node | LoR | Hanoi | Accuracy = 91%
[16] | flow of some pipes and pressure at some nodes | label of the leaking node | LoR | Net3 | Accuracy = 79%
[16] | flow of some pipes and pressure at some nodes | label of the leaking node | LoR | C-Town | Accuracy = 30%
[17] | pressure at nodes | label of the leaking node | elastic-net LoR | a district-metered area in Belgium | ATD = 0.17 to 1.2 km
[18] | pressure at some nodes | label of the leaking pipe | ResNet | Anytown | Accuracy = 94%
[18] | pressure at some nodes | label of the leaking pipe | ResNet | Net3 | Accuracy = 91%
[8] | flow of pipes | label of the leaking area | KNN | Barcelona WDN | Accuracy = 80%
[19] | residual pressures at some nodes | label of the leaking node | Random forest (RF) | Hanoi | Accuracy = 100%
[20] | flow of some pipes and residual pressure at some nodes | label of the leaking node | LoR | Lille University network | Accuracy = 100%
[21] | residual pressure at nodes | label of the leaking node | KNN | Hanoi | ATD = 2.3 nodes
[22] | pressure at some nodes | label of the leaking pipe | ANN | Anytown | Accuracy = 100%
Table 2. The performance of all models trained on the Hanoi and Anytown networks based on the ATD (m) metric. Lower values indicate better performance.
Model Name | Model Type | Partitioning Type | Hanoi Train | Hanoi Test | Anytown Train | Anytown Test
KNN | Regressor | Nodal | 1895 | 1951 | 8306 | 8553
KNN | Regressor | Random | 1154 | 1180 | 5338 | 5461
KNN | Classifier | Nodal | 810 | 2239 | 3587 | 9919
KNN | Classifier | Random | 272 | 277 | 2043 | 2087
DT | Regressor | Nodal | 2266 | 2367 | 10,693 | 11,168
DT | Regressor | Random | 1872 | 1970 | 6201 | 6526
DT | Classifier | Nodal | 1011 | 2905 | 4259 | 12,241
DT | Classifier | Random | 1086 | 1121 | 4121 | 4255
LR | Regressor | Nodal | 2597 | 2649 | 11,425 | 11,654
LR | Regressor | Random | 2173 | 2212 | 9178 | 9341
LR | Classifier | Nodal | 2012 | 4481 | 7423 | 16,528
LR | Classifier | Random | 86 | 89 | 1958 | 2005
MLP | Regressor | Nodal | 1988 | 2029 | 10,092 | 10,297
MLP | Regressor | Random | 1929 | 1939 | 9445 | 9493
MLP | Classifier | Nodal | 1170 | 2473 | 5173 | 10,930
MLP | Classifier | Random | 138 | 139 | 1353 | 1355
SVM | Regressor | Nodal | 2778 | 2846 | 9326 | 9555
SVM | Regressor | Random | 2245 | 2311 | 7300 | 7517
SVM | Classifier | Nodal | 923 | 2874 | 3908 | 12,174
SVM | Classifier | Random | 167 | 171 | 1009 | 1033
Table 3. The performance of all models trained on the Hanoi and Anytown networks based on the AR metric. Lower values indicate better performance.
Model Name | Model Type | Partitioning Type | Hanoi Train | Hanoi Test | Anytown Train | Anytown Test
KNN | Regressor | Nodal | 12.33 | 12.70 | 8.02 | 8.26
KNN | Regressor | Random | 4.03 | 4.12 | 3.47 | 3.55
KNN | Classifier | Nodal | 4.70 | 12.99 | 3.34 | 9.23
KNN | Classifier | Random | 3.35 | 3.42 | 3.06 | 3.13
DT | Regressor | Nodal | 8.48 | 8.86 | 9.16 | 9.57
DT | Regressor | Random | 7.71 | 8.12 | 6.74 | 7.09
DT | Classifier | Nodal | 4.29 | 12.33 | 3.48 | 10.01
DT | Classifier | Random | 7.17 | 7.40 | 5.35 | 5.53
LR | Regressor | Nodal | 11.49 | 11.73 | 8.81 | 8.99
LR | Regressor | Random | 2.88 | 2.93 | 5.07 | 5.16
LR | Classifier | Nodal | 5.31 | 11.82 | 4.26 | 9.49
LR | Classifier | Random | 1.13 | 1.15 | 4.53 | 4.64
MLP | Regressor | Nodal | 10.88 | 11.10 | 8.05 | 8.22
MLP | Regressor | Random | 3.53 | 3.55 | 5.90 | 5.93
MLP | Classifier | Nodal | 5.67 | 11.99 | 4.13 | 8.73
MLP | Classifier | Random | 2.46 | 2.47 | 2.96 | 2.97
SVM | Regressor | Nodal | 12.03 | 12.33 | 9.03 | 9.25
SVM | Regressor | Random | 5.71 | 5.88 | 4.70 | 4.84
SVM | Classifier | Nodal | 4.43 | 13.81 | 3.15 | 9.83
SVM | Classifier | Random | 1.68 | 1.72 | 2.94 | 3.01
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ajoodani, A.; Nazif, S.; Ramazi, P. Towards a Realistic Data-Driven Leak Localization in Water Distribution Networks. Water 2025, 17, 1988. https://doi.org/10.3390/w17131988