Forecasting Urban Water Demand Using Multi-Scale Artificial Neural Networks with Temporal Lag Optimization

Farah, Elias; Shahrour, Isam

doi:10.3390/w17192886

by Elias Farah ¹,* and Isam Shahrour ²

¹ Department of Civil Engineering, School of Engineering, Holy Spirit University of Kaslik (USEK), Jounieh P.O. Box 446, Lebanon

² Laboratoire de Génie Civil et Géo-Environnement (LGCgE), Université de Lille, 59650 Villeneuve d’Ascq, France

* Author to whom correspondence should be addressed.

Water 2025, 17(19), 2886; https://doi.org/10.3390/w17192886

Submission received: 27 August 2025 / Revised: 28 September 2025 / Accepted: 30 September 2025 / Published: 3 October 2025

(This article belongs to the Section Urban Water Management)

Abstract

Accurate short-term forecasting of urban water demand is a persistent challenge for utilities seeking to optimize operations, reduce energy costs, and enhance resilience in smart distribution systems. This study presents a multi-scale Artificial Neural Network (ANN) modeling approach that integrates temporal lag optimization to predict daily and hourly water consumption across heterogeneous user profiles. Using high-resolution smart metering data from the SunRise Smart City Project in Lille, France, four demand nodes were analyzed: a District Metered Area (DMA), a student residence, a university restaurant, and an engineering school. Results demonstrate that incorporating lagged consumption variables substantially improves prediction accuracy, with daily R² values increasing from 0.490 to 0.827 at the DMA and from 0.420 to 0.806 at the student residence. At the hourly scale, the 1-h lag model consistently outperformed other configurations, achieving R² up to 0.944 at the DMA, thus capturing both peak and off-peak consumption dynamics. The findings confirm that short-term autocorrelation is a dominant driver of demand variability, and that ANN-based forecasting enhanced by temporal lag features provides a robust, computationally efficient tool for real-time water network management. Beyond improving forecasting performance, the proposed methodology supports operational applications such as leakage detection, anomaly identification, and demand-responsive planning, contributing to more sustainable and resilient urban water systems.

Keywords: Artificial Neural Networks (ANNs); prediction; smart metering; temporal lag optimization; water consumption

1. Introduction

Urban water utilities are increasingly facing challenges in managing distribution systems under uncertain and dynamic demand patterns. The growing urban population, climate variability, aging infrastructure, and the need for sustainable resource allocation have prompted utilities to adopt advanced data analytics and predictive tools [1,2]. Accurate short-term and long-term water demand forecasting plays a crucial role in optimizing system operation, enhancing resilience, reducing energy costs, and supporting leak detection mechanisms. In recent years, the integration of smart metering technologies has enabled the collection of high-frequency water consumption data, opening new opportunities for intelligent modeling and decision support in water distribution systems [3,4].

Among the various modeling techniques employed for water demand forecasting, Artificial Neural Networks (ANNs) and hybrid machine learning models have demonstrated significant success due to their ability to capture complex linear and non-linear relationships within time series data [5,6]. ANNs, inspired by the structure of the human brain, function as adaptive systems capable of learning from examples without requiring explicit assumptions about the underlying physical processes [7,8]. This makes them particularly suitable for water-related applications where consumption behaviors are influenced by multiple interdependent and often unobservable factors such as climate, socioeconomic indicators, user habits, and institutional patterns.

Numerous studies have confirmed the effectiveness of ANNs in modeling water demand across different spatial and temporal scales. Herrera et al. [9] developed ANN models to predict urban water demand based on historical data, demonstrating high accuracy in medium-range forecasting. Romano and Kapelan [10] extended this approach by incorporating climatic variables and demand disaggregation to improve model performance. Al-Zahrani and Abo-Monasar [11] applied ANN-based techniques to predict residential water usage in arid regions and showed that neural networks outperformed classical statistical methods under data scarcity.

Recent studies have also explored a variety of data-driven approaches for improving water demand prediction. For instance, Kim et al. [12] showed that Long Short-Term Memory (LSTM) models leveraging smart meter data and incorporating prior-day consumption, weather conditions, and calendar effects significantly outperformed traditional ARIMA methods across residential, commercial, and school buildings, with mean correlation coefficients reaching 89% compared to 62% for ARIMA. These findings highlight the importance of temporal dependencies and external drivers in enhancing prediction accuracy. Complementary research by Adamowski and Karapataki [13] and Bakker et al. [14] emphasized that integrating exogenous information such as climatic variables, day-of-week patterns, and institutional schedules can capture non-linear dynamics often missed by simpler statistical methods.

Despite these advancements, forecasting peak consumption remains a critical challenge. Walker et al. [15] reported that while ANN models were capable of capturing overall consumption patterns, they struggled to predict peak values accurately due to sudden changes in user behavior and the stochastic nature of demand spikes. This highlights the need for improved feature engineering, seasonality detection, and the inclusion of external variables such as holidays and event-based signals, as demonstrated in the Hong Kong study by Wong et al. [16], where calendar effects and holidays each contributed as much as 17% to explaining urban water consumption variability.

Beyond forecasting accuracy, another major challenge is the treatment of uncertainty. Conventional point forecasts may underestimate risks, particularly when residuals are autocorrelated or when demand spikes occur. In this regard, Li [17] introduced an uncertainty-aware grey forecasting framework that integrates residual modeling and kernel density estimation to construct calibrated prediction intervals. Their results, validated across several Chinese cities, demonstrated that grey-based approaches remain effective when data are scarce or rapidly evolving, while simultaneously providing interval estimates that improve planning reliability. Similarly, Wu et al. [18] applied a weighted grey wave forecasting model to urban domestic water use in Chongqing and achieved high predictive accuracy, underscoring the influence of conservation policies on long-term demand trajectories. These contributions highlight the need not only for accurate point forecasts but also for robust confidence intervals to support risk-sensitive water management.

In addition to forecasting, ANNs have also been applied in anomaly detection and leakage identification. Mounce et al. [19,20] developed a suite of AI-based models combining neural networks with fuzzy inference systems to detect leaks and bursts in water networks using pressure and flow data. Their methodology, tested on a dataset from 144 District Metered Areas (DMAs) in the United Kingdom, demonstrated promising results, with 44% of the generated alerts corresponding to confirmed bursts. Another study by Mahdi et al. [21] employed Artificial Neural Networks for leakage diagnosis and localization in laboratory-scale water distribution systems. Using accelerometer and pressure data transformed into statistical features (autocorrelation coefficient and signal energy), their ANN achieved an accuracy of 86.5%, demonstrating the potential of feature-engineered neural models for reliable leak detection and localization.

Building on these insights, the literature confirms the growing relevance of intelligent, multi-scale, and uncertainty-aware approaches to water demand forecasting. The Battle of Water Demand Forecasting (BWDF), organized during the 2024 WDSA/CCWI conference and reported by Alvisi et al. [22], compared multiple forecasting approaches across 10 District Metered Areas (DMAs) in Italy. The study demonstrated that while ANN-based models remain competitive, hybrid methods combining machine learning with time-series analysis often achieved superior performance. Importantly, the BWDF underscored the need for evaluation across multiple case studies and performance indicators to ensure robust conclusions.

This study builds on these insights by proposing an ANN-based methodology with explicit temporal lag optimization to predict water consumption at both daily and hourly resolutions using smart meter data from the SunRise Smart City Project in Lille, France. The analysis covers heterogeneous user types, including a university restaurant, an engineering school, and a student residence, alongside an aggregated District Metered Area (DMA). The models are trained on historical data with multiple time lags and use calendar inputs that distinguish business days, weekends, and holidays to reflect behavioral variations in usage. Forecasting at both time scales supports operational and strategic planning. The results of this work contribute to advancing predictive analytics for smart urban water networks, with direct implications for operational efficiency, anomaly detection, and long-term resource planning.

2. Materials and Methods

2.1. Background

Artificial Neural Networks (ANNs) provide a flexible framework for capturing the nonlinear relationships inherent in complex processes by learning directly from input-output data pairs [23]. Conceptually, an ANN is composed of interconnected processing elements called neurons, which are organized into an input layer, one or more hidden layers, and an output layer [24]. Each connection between neurons carries a weight that reflects the strength of influence from one neuron to the next. The network essentially defines a mapping from the input space to the output space, which is iteratively adjusted during training.

In a feed-forward neural network architecture, the information flows unidirectionally from input to output layers. Training is typically conducted using the back-propagation algorithm, wherein the error between the predicted output and the actual observed value is calculated at the output layer. This error is then propagated backward through the network to update the connection weights using gradient descent or its variants [25]. The network’s goal is to minimize a predefined loss function, most commonly the Mean Squared Error (MSE), through successive iterations.
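The training procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' MATLAB implementation: the single hidden layer, the sigmoid activation, the linear output neuron, the learning rate, and the function names (`init_network`, `forward`, `network_mse`, `train_epoch`) are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_inputs, n_hidden, seed=0):
    """Create small random weights for one hidden layer and one linear output."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Propagate one input vector from the input layer to the output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"]
    return hidden, output


def network_mse(net, X, y):
    """Mean squared error of the network over a dataset."""
    return sum((forward(net, x)[1] - t) ** 2 for x, t in zip(X, y)) / len(X)


def train_epoch(net, X, y, lr=0.1):
    """One pass of gradient descent, one sample at a time (assumed variant)."""
    for x, t in zip(X, y):
        hidden, out = forward(net, x)
        # Error signal at the output layer (derivative of the squared error).
        delta_out = 2.0 * (out - t)
        # Propagate the error backward through the sigmoid hidden layer,
        # using the output weights as they were during the forward pass.
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(net["W2"], hidden)]
        for j, h in enumerate(hidden):
            net["W2"][j] -= lr * delta_out * h
        net["b2"] -= lr * delta_out
        for j, d in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                net["W1"][j][i] -= lr * d * xi
            net["b1"][j] -= lr * d


# Toy usage with made-up data: the training MSE should fall over the epochs.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.1, 0.5, 0.5, 0.9]
net = init_network(n_inputs=2, n_hidden=3, seed=1)
mse_before = network_mse(net, X, y)
for _ in range(500):
    train_epoch(net, X, y, lr=0.1)
mse_after = network_mse(net, X, y)
```

The hidden-layer error terms are computed before the output weights are updated, so that each gradient refers to the same forward pass.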

To prevent overfitting, a condition where the network learns noise instead of the underlying patterns, the dataset is divided into three subsets: a training set for weight adjustment, a validation set for hyperparameter tuning and early stopping, and a testing set for unbiased evaluation of predictive performance [26]. Overfitting is detected when the validation error begins to increase while training error continues to decrease, signaling that the model is memorizing the training data rather than generalizing.
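The early-stopping rule can be illustrated with a short Python sketch. It assumes a `patience` parameter (the number of epochs without improvement that are tolerated before stopping), which is a common convention but is not specified in this study; the function name and the error values are hypothetical.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Return (best_epoch, best_error) under a simple patience rule.

    val_errors: validation error recorded after each training epoch.
    Training stops once the validation error has failed to improve
    for `patience` consecutive epochs.
    """
    best_error = float("inf")
    best_epoch = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            break  # validation error keeps rising: likely overfitting
    return best_epoch, best_error


# Made-up validation curve: improves until epoch 2, then degrades.
curve = [0.30, 0.20, 0.15, 0.16, 0.17, 0.18, 0.10]
best = early_stopping_epoch(curve, patience=3)
```

With this made-up curve, training halts at epoch 5 and the weights from epoch 2 would be retained; the late improvement at epoch 6 is never reached, which is the usual trade-off of a patience-based rule.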

Determining the optimal number of hidden neurons remains a major challenge in ANN design. While the sizes of the input and output layers are dictated by the dimensions of the input and output variables, the number of neurons in the hidden layers is often established empirically via trial-and-error to balance model complexity and generalization ability. Riad et al. [27] suggested using the mean squared error on validation data as a criterion for selecting the best network configuration.

2.2. Method

This study employs a feed-forward back-propagation ANN implemented using a custom MATLAB R2024a script. Historical water consumption time series data, including lagged variables, were integrated as input features to model temporal dependencies and forecast future consumption patterns.

Two levels of temporal resolution were explored:

Daily Forecasting Analysis: Developed to predict total daily water consumption one day ahead, this model incorporates a lag of one day (i.e., the previous day’s consumption, Q_{d−1}) to account for autocorrelation in usage patterns.
Hourly Forecasting Analysis: Designed for short-term prediction at hourly intervals, three lag configurations were tested to evaluate the impact of different temporal dependencies: 1 h (Q_{h−1}), 24 h (Q_{h−24}), and 168 h (Q_{h−168}, corresponding to one week).
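The lag configurations listed above amount to pairing each observation with the value recorded a fixed number of steps earlier. A minimal Python sketch, assuming the consumption series is a plain list ordered in time with no gaps (the function name is hypothetical):

```python
def make_lagged_samples(series, lags):
    """Pair each target value with its lagged predecessors.

    series: consumption values ordered in time (hourly or daily).
    lags:   list of lags in time steps, e.g. [1], [24] or [168] for hourly data.
    Returns (inputs, targets); the first max(lags) values have no complete
    history and are therefore dropped.
    """
    max_lag = max(lags)
    inputs, targets = [], []
    for t in range(max_lag, len(series)):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return inputs, targets


# Synthetic hourly series (values equal their own index, for readability).
hourly = [float(i) for i in range(200)]
x_1h, y_1h = make_lagged_samples(hourly, [1])      # Q_{h-1}
x_168h, y_168h = make_lagged_samples(hourly, [168])  # Q_{h-168}
```

Note that longer lags shorten the usable record: with a 168 h lag, the first week of data serves only as history.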

For daily predictions, the input vector includes:

Day of the Week indicators (DW_i): Seven binary variables (Monday through Sunday) capturing weekly patterns.
Holiday indicators (HO): Accounting for extended breaks such as Christmas or spring vacations.
Special Day indicators (SD): Capturing isolated events (e.g., New Year’s Day, Labor Day).

Each variable is encoded as binary (1 if true, 0 otherwise), ensuring no implicit prioritization of particular days. For hourly analysis, additional input vectors representing the hour of the day (HD_i), a set of 24 binary features, are included to identify intraday patterns.
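The binary encoding of the calendar inputs can be sketched as follows. The function name and the way holidays and special days are supplied (as sets of dates) are assumptions for illustration; the actual calendars used in the study are not given.

```python
from datetime import date, datetime


def calendar_features(ts, holidays=frozenset(), special_days=frozenset(),
                      hourly=False):
    """Encode a timestamp as binary calendar indicators.

    Layout: 7 day-of-week flags (Monday first), 1 holiday flag (HO),
    1 special-day flag (SD) and, for hourly models, 24 hour-of-day flags.
    """
    day_of_week = [1 if ts.weekday() == i else 0 for i in range(7)]
    holiday = [1 if ts.date() in holidays else 0]
    special = [1 if ts.date() in special_days else 0]
    features = day_of_week + holiday + special
    if hourly:
        features += [1 if ts.hour == h else 0 for h in range(24)]
    return features


# Example: 1 May 2024 (Labor Day, a Wednesday) at 13:00.
labor_day = {date(2024, 5, 1)}
row = calendar_features(datetime(2024, 5, 1, 13), special_days=labor_day,
                        hourly=True)
```

Under this layout a daily input row has 9 calendar entries and an hourly row has 33, before any lagged consumption value is appended.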

In the case of daily forecasts, the analysis was restricted to a 1-day lag. Autocorrelation analysis indicated that this lag provided the strongest and most consistent predictive signal, whereas longer lags (e.g., 7 or 14 days) showed irregular correlations across the different Automated Meter Reading (AMR) points. Furthermore, the 1-day lag is consistent with the operational focus of the study, which aims to provide short-term forecasts that can directly support utility tasks such as pump scheduling, pressure management, and daily resource allocation.
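For reference, a sample autocorrelation coefficient at a given lag can be computed as in the sketch below. The estimator shown (lagged covariance divided by the total variance of the series) is a standard choice and is assumed here; the exact estimator used in the study is not stated, and the weekly pattern is synthetic.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag (in time steps)."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((v - mean) ** 2 for v in series)
    covariance_sum = sum((series[t] - mean) * (series[t - lag] - mean)
                         for t in range(lag, n))
    return covariance_sum / variance_sum


# Synthetic daily series with a strict weekly cycle (5 busy days, 2 quiet days).
weekly = [10.0, 10.0, 10.0, 10.0, 10.0, 2.0, 2.0] * 8
r_lag7 = autocorrelation(weekly, 7)
r_lag3 = autocorrelation(weekly, 3)
```

Comparing such coefficients across candidate lags and across meters is one way to check whether a lag gives a consistent signal before it is added to the input vector.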

Prior to modeling, a pre-processing step is performed to identify and exclude anomalies caused by leaks, sensor faults, or other irregularities. Outlier detection uses Chebyshev’s inequality, filtering values exceeding three standard deviations from the mean, thus ensuring model robustness to noise [28].
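A minimal sketch of this filtering step is given below. By Chebyshev's inequality, at least 1 − 1/k² of any distribution lies within k standard deviations of the mean, i.e., at least about 88.9% for k = 3. The function name, the use of the population standard deviation, and the sample values are assumptions for illustration.

```python
import statistics


def filter_outliers(values, k=3.0):
    """Keep only values within k standard deviations of the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    lower, upper = mean - k * std, mean + k * std
    return [v for v in values if lower <= v <= upper]


# Made-up daily readings with one suspicious spike (e.g., a leak or sensor fault).
readings = [10.0] * 30 + [100.0]
cleaned = filter_outliers(readings)
```

Because the mean and standard deviation are themselves affected by extreme values, the threshold becomes less sensitive when several large anomalies are present in the same record.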

The cleaned dataset is divided randomly into three parts: 70% for training, 15% for validation, and 15% for testing. All input and target variables are normalized to the [0, 1] interval using Equation (1) to eliminate scale effects:

\bar{X} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}

(1)

where \bar{X} is the normalized value of the input or target variable X, and X_min and X_max are the minimum and maximum of the observed values, respectively. Normalization prevents variables with larger numerical ranges from being implicitly prioritized and removes arbitrary scale effects when comparing observations.
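The normalization of Equation (1) and the random 70/15/15 partition can be sketched as follows. The function names, the fixed random seed, and the rounding of subset sizes are assumptions made for the example.

```python
import random


def min_max_normalize(values):
    """Scale values to [0, 1] following Equation (1); assumes max > min."""
    x_min, x_max = min(values), max(values)
    return [(v - x_min) / (x_max - x_min) for v in values]


def random_split(samples, train=0.70, val=0.15, seed=0):
    """Shuffle samples and split them into training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train * len(shuffled))
    n_val = round(val * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


scaled = min_max_normalize([2.0, 4.0, 6.0])
train_set, val_set, test_set = random_split(range(100))
```

Predictions made on the normalized scale can be mapped back to physical units by inverting Equation (1), i.e., X = \bar{X} (X_max − X_min) + X_min.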

The weights of the neural network are adjusted to minimize the mean squared error (MSE) on the training set. The MSE is computed using Equation (2):

\mathrm{MSE} = \frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}

(2)

where y_i and ŷ_i are the actual and predicted values, respectively, and N is the total number of observations. The mean of the observed values, ȳ, is used in Equation (4).

The performance of the ANN prediction models is evaluated by both statistical and graphical criteria. The statistical approach consists of the root mean square error (RMSE) and the coefficient of determination (R²). These statistical coefficients are calculated using Equations (3) and (4) respectively:

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}

(3)

R^{2} = 1 - \frac{\sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{N} {(y_{i} - \bar{y})}^{2}}

(4)

The coefficient of determination (R²) defined in Equation (4) measures the share of the variance in the observed values that is reproduced by the model. Values close to 1 indicate close agreement between predicted and observed consumption, whereas a value of 0 corresponds to a model that performs no better than predicting the mean ȳ; negative values are possible when the model performs worse than that. The optimal value for RMSE is 0.

For the graphical criterion, predicted values are plotted against observed values; perfect agreement corresponds to all points lying on the 45-degree line y = x.
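Equations (2) to (4) translate directly into code. The sketch below is a plain-Python transcription with hypothetical function names and made-up values; it is not taken from the authors' script.

```python
import math


def mse(y, y_hat):
    """Mean squared error, Equation (2)."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean square error, Equation (3)."""
    return math.sqrt(mse(y, y_hat))


def r_squared(y, y_hat):
    """Coefficient of determination, Equation (4)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]
good_fit = r_squared(observed, [1.5, 2.0, 2.5, 4.0])   # close to 1
poor_fit = r_squared(observed, [4.0, 3.0, 2.0, 1.0])   # below 0
```

The second example shows that R² computed from Equation (4) can fall below zero when predictions are worse than simply using the mean of the observations.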

2.3. Data Collection

In this study, water consumption data were obtained from a smart metering system implemented on the water supply network of the Scientific Campus at Lille University, situated in northern France. The network spans approximately 15 km and consists primarily of aging grey cast iron pipes, many of which have been in place for over six decades. It provides water to a population of around 25,000 individuals, including students, faculty, staff, and visitors.

To enable real-time monitoring and detailed analysis, the network is equipped with 93 smart meters using Automated Meter Reading (AMR) technology. These meters were installed across 80 buildings and 13 main supply points, allowing for hourly measurement of water usage. The recorded data are transmitted daily to a central system for processing and analysis [29]. The monitoring system supports efficient management of the network by identifying water losses due to leaks, unauthorized usage, or inefficiencies associated with the old infrastructure.

To enable a comprehensive and multi-scale analysis of water consumption patterns, this study utilizes data collected from distinct functional zones within the monitored network. These include a university restaurant (B1), an engineering academic building (B2), student residential facilities (B3), and the aggregated District Metered Area (DMA). The inclusion of these diverse sectors allows for the investigation of consumption behaviors across different building typologies and user profiles. Such heterogeneity in the dataset enhances the generalizability of the findings and enables the development of more robust and context-aware predictive models.

2.4. Data Analysis

The dataset employed in this study includes hourly water consumption records acquired from four distinct Automated Meter Reading (AMR) nodes within a university campus. These nodes include three specific buildings namely, the engineering school (B2), the university restaurant (B1), and a student residential facility (B3), in addition to one general AMR covering the entire District Metered Area (DMA), which encompasses multiple academic, administrative, and residential units. This configuration facilitates a multiscale assessment of consumption behavior, spanning both localized and aggregated water demand patterns.

An analysis of water consumption trends was conducted over the span of a year. Monthly aggregated consumption profiles, illustrated in Figure 1 and Figure 2, show patterns aligned with institutional activity cycles. Specifically, a marked reduction in water consumption is observed during the summer period, which coincides with the academic recess. This decline is particularly prominent in B3 (student residence) and B1 (university restaurant), reflecting the direct impact of occupancy rates and operational schedules on demand. Conversely, during peak academic months, both buildings exhibit elevated consumption, indicative of full-capacity usage and intensified water-dependent activities.

Figure 1 presents the monthly water usage profiles of the three individual buildings. B3 consistently records the highest consumption, corresponding with its residential function and continuous occupancy. B2 demonstrates moderate but relatively stable usage, indicative of regular academic activity. In contrast, B1 displays the lowest overall consumption with notable fluctuations reflecting variations in service utilization over time.

Figure 2 displays the consumption trend for the DMA. As an aggregate of multiple facilities, the DMA shows smoother temporal variations due to the averaging effects of diverse demand profiles. While seasonal troughs are still obvious, the overall amplitude of variation is reduced relative to individual buildings, highlighting the dampening effect of scale on demand variability.

The selected monitoring points represent distinct user profiles within the network. B1 is a university restaurant serving approximately 1250 meals per day during weekday lunch hours. B2 corresponds to the engineering school, hosting about 1450 engineering students, 172 academic/research staff, 70 administrative personnel and 230 guest lecturers. The school operates within two main buildings covering a total area of 23,000 m². B3 is a student dormitory comprising 113 housing units. Finally, the DMA covers a heterogeneous urban sector that includes teaching, research, residential and administrative buildings, thus reflecting a highly diverse consumption profile.

Table 1 presents the minimum, maximum, average, and standard deviation of daily and hourly water consumption across all AMRs. The dataset spans a full year of monitoring, providing a strong foundation for analyzing seasonal and temporal variations in consumption. As expected, the DMA registers the highest average daily water demand (388.413 m³/day) and the largest standard deviation (15.871 m³/day), reflecting the cumulative and variable nature of campus-wide consumption. Among the individual AMRs, B3 (student residence) shows the highest average daily flow (45.852 m³/day), followed by B2 (engineering school) with 17.807 m³/day, and B1 (restaurant) with the lowest average of 4.979 m³/day. These values align with the expected functional characteristics of the buildings: the residential building demonstrates continuous usage, the engineering school shows moderate but steady demand, and the restaurant displays relatively low and periodic consumption. The standard deviation values further support these observations, indicating the variability of water use patterns in each facility. These statistics are critical for the subsequent modeling phase, as they reveal the essential dynamics of each data source, including baseline demand and peak loads.

3. Results

3.1. Daily Prediction Analysis

The ANN models were trained and tested using one year of daily water consumption data from four Automated Meter Reading (AMR) points: DMA (general meter), B1 (University Restaurant), B2 (Engineering School), and B3 (Students’ Residence). Two model configurations were compared:

Baseline model: without historical consumption as an input variable.
Lag model: incorporating the previous day’s consumption (lag = 1 day) in the input matrix.

The statistical accuracy measures are presented in Table 2. Across all AMRs, the inclusion of the 1-day lag substantially improved model performance, indicating a strong autocorrelation in daily water demand. For the DMA, the baseline model achieved an R² of 0.490 in testing, whereas the lag model improved to 0.827, reducing the RMSE from 0.196 to 0.110. For B1, the baseline model already achieved high accuracy (R² = 0.900), reflecting the regular and predictable nature of institutional restaurant demand. However, adding the lag feature yielded a marginal but measurable improvement (R² = 0.902). In B2, the baseline model showed moderate performance (R² = 0.606), which increased markedly to 0.808 with the lag input. Similarly, B3 showed low R² values without lag (0.420), but improved substantially when past-day consumption was included (0.806).
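As a quick check on the size of the DMA improvement, the relative RMSE reduction and the change in the unexplained share of variance (1 − R², following Equation (4)) can be computed directly from the values quoted above:

\frac{0.196 - 0.110}{0.196} \approx 0.439, \qquad 1 - R^{2}: \; 1 - 0.490 = 0.510 \;\rightarrow\; 1 - 0.827 = 0.173

That is, the 1-day lag cuts the testing RMSE by roughly 44% and lowers the unexplained share of variance from about 51% to about 17%.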

The visual comparisons between actual and predicted daily consumption for representative cases further confirm these statistical findings. For example, Figure 3 illustrates the performance for B2 building, where the predicted values closely follow the 1:1 reference line in both training and testing, with minimal dispersion, highlighting the model’s strong generalization capacity. In contrast, Figure 4 shows the B3 building results, where adding lagged consumption data leads to a closer alignment of predicted and actual values during testing, highlighting the important role of temporal autocorrelation in modeling residential water use. These visual outcomes, together with the statistical metrics, demonstrate that the lag-enhanced ANN configuration can robustly capture daily demand variability across heterogeneous building types.

3.2. Hourly Prediction Analysis

To assess the models’ capacity for near real-time forecasting, hourly consumption data were used with the following input configurations:

No lag: only hour of day, day of week, holiday and special day indicators.
Lag = 1 h: adding the consumption of the previous hour.
Lag = 24 h: adding the consumption from the same hour on the previous day.
Lag = 168 h: adding the consumption from the same hour one week prior.

Table 3 summarizes the performance metrics for each AMR and configuration. Across all sites, the inclusion of a 1-h lag consistently produced the best results.

For the DMA, the no-lag model achieved an R² of 0.548, whereas the 1-h lag model improved it to 0.939, with the RMSE reduced from 0.110 to 0.041. Incorporating the 1-h lag variable substantially improves the alignment between predicted and observed values, as shown in Figure 5, reinforcing the strong temporal dependency of aggregated water demand at the DMA level. This configuration accurately captured daily cycles, peak-hour demands, and nocturnal minima.

B1 also showed strong performance without lag (R² = 0.854), with further improvement under the 1-h lag (R² = 0.928), which reproduced both peak lunchtime demands and low-consumption night periods (Figure 6).

In B2, the no-lag model failed to capture peak demands and exhibited binary-like outputs differentiating only between working and non-working days. Lags of 1 h (R² = 0.865) and 24 h (R² = 0.844) improved temporal profiles, though extreme peaks (e.g., 4.93 m³) remained underestimated (Figure 7).

For B3, the best performance was also obtained with the 1-h lag (R² = 0.709), as shown in Figure 8, indicating a strong short-term memory effect in residential hourly consumption.

4. Discussion

The present study evaluated the performance of Artificial Neural Network (ANN) models for daily and hourly water demand forecasting, with particular focus on the effect of incorporating historical consumption values (lagged inputs). Across all AMRs and temporal resolutions, the inclusion of lagged consumption noticeably improved predictive accuracy, verifying the autocorrelation structure in urban water demand time series. The results of this study confirm the established understanding that incorporating lagged consumption features is not merely beneficial but essential for accurate water demand forecasting [9,13,30]. Building on this foundation, this research systematically evaluates and optimizes different lag structures (1 h, 24 h, and 168 h for hourly data; 1 day for daily data), thereby demonstrating how structured lag selection can significantly enhance model accuracy across heterogeneous user types.

In the daily models, the incorporation of the previous day’s consumption (lag = 24 h) produced substantial improvements in R², especially for AMRs with irregular usage patterns (DMA, B3). This is consistent with Boudhaouia and Wira [31], who demonstrated that including prior-day demand is particularly beneficial in contexts with high variability and intermittent peaks. By contrast, the improvement was marginal for the university restaurant (B1), which operates on highly repetitive schedules; here, exogenous calendar-based variables (day of week, holidays) already explain much of the variability, as also noted by Sardinha-Lourenço et al. [32] in their analysis of institutional demand forecasting.

For hourly forecasting, the 1-h lag consistently outperformed other configurations, producing the highest R² values and lowest RMSE across all AMRs. This finding aligns with Herrera et al. [9], who reported that short-term memory effects dominate in high-frequency demand signals, with autocorrelation coefficients declining rapidly after the first few hours. The strong performance of the 1-h lag model can be attributed to the persistence of consumption patterns within short intervals: the demand at any given moment is often a direct continuation of the preceding hour’s behavior, particularly in settings with stable occupancy such as residential buildings (B3) and institutional facilities (B1, B2). The weak performance of the 168-h lag model suggests that weekly periodicity, while present, is less deterministic than immediate consumption history. Factors such as special events, operational schedule changes, or weather fluctuations may disrupt weekly patterns, rendering them less reliable for real-time forecasting.

Despite the overall strong performance, peak demands, such as the 4.93 m³ event observed in B2, were systematically underestimated. This is a common limitation in ANN-based demand forecasting, as highlighted by Donkor et al. [33], where extreme events are often underrepresented in training data and thus inadequately captured by models optimized for average performance. The underestimation of peaks has operational implications: while mean errors are low, utilities relying solely on such forecasts may underestimate short-term capacity needs. This could be addressed by incorporating event-based variables, weather predictors, or demand amplification factors during historically peak-prone periods [23].

From a utility management perspective, these results highlight the operational value of near-real-time forecasting models that exploit short-term consumption memory. For control applications such as pump scheduling, pressure management, or leakage detection, models with 1-h lag inputs offer a practical balance between computational simplicity and predictive performance. In longer-term planning contexts, daily forecasting with 24-h lag inputs can support demand allocation, resource planning, and tariff structuring, especially in systems serving diverse customer types.

The findings of this study also complement the outcomes of the BWDF [22], which demonstrated that no single model consistently dominated across all DMAs, forecast horizons, and performance metrics. While ANN-based models performed competitively, the benchmark highlighted the superior robustness of hybrid approaches integrating machine learning and time-series methods. In this context, the present study contributes by showing that structured lag optimization enhances ANN accuracy across heterogeneous user types, thereby offering a practical refinement consistent with the comparative framework of forecasting approaches emphasized in the BWDF.

While the ANN models in this study achieved high accuracy, several limitations remain. The absence of meteorological and operational variables, the rarity of extreme events in the training dataset, and the potential lack of model generalizability across different contexts represent key limitations. Future research should explore hybrid approaches combining ANN with statistical decomposition, integrate weather and event-based predictors, and develop probabilistic forecasting schemes to quantify uncertainty and better capture rare peak demand episodes.

5. Conclusions

This study developed feed-forward backpropagation Artificial Neural Network (ANN) models to forecast short-term water demand at both daily and hourly resolutions for a District Metered Area (DMA) and three university buildings with distinct consumption profiles: an institutional restaurant (B1), an engineering school (B2), and a student residence (B3). One year of high-resolution smart metering data was used to evaluate model performance under two configurations: a baseline without historical consumption inputs and an optimized temporal lag approach incorporating prior consumption values. Results demonstrate that the inclusion of lag features substantially enhanced predictive accuracy across all sites. At the daily scale, adding a 1-day lag improved testing R² values from 0.490 to 0.827 for the DMA, 0.900 to 0.902 for B1, 0.606 to 0.808 for B2, and 0.420 to 0.806 for B3. At the hourly scale, the inclusion of a 1-h lag produced similarly strong gains, with final testing R² values of 0.944 (DMA), 0.940 (B1), 0.901 (B2), and 0.738 (B3), accurately reproducing both peak and off-peak demand patterns. These results confirm that temporal autocorrelation—particularly at short time lags—is a dominant driver of water demand variability and that ANN-based models can effectively leverage this property for operational forecasting. The proposed approach is computationally efficient, adaptable to diverse consumption contexts, and suitable for real-time deployment in smart water network management. Potential applications include near real-time leakage detection through deviations from forecasted demand, gap-filling in telemetry datasets, and improved operational planning. Future research should integrate additional exogenous drivers such as climatic, demographic, and event-based factors to further enhance peak demand capture, especially in residential settings with higher behavioral variability.

Author Contributions

Conceptualization, E.F. and I.S.; methodology, E.F. and I.S.; software, E.F.; validation, I.S.; formal analysis, E.F.; data curation, E.F.; writing—original draft preparation, E.F.; writing—review and editing, I.S.; visualization, E.F.; supervision, I.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data are unavailable due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Monthly water consumption of B3, B2 and B1 for a whole year.

Figure 2. Monthly water consumption of the general meter DMA for a whole year.

Figure 3. Comparison between the actual and the ANN predicted consumption values for B2 in training and testing for the daily analysis with a lag of 1 day.

Figure 4. Comparison between the actual and the ANN predicted consumption values for B3 in training and testing for the daily analysis with a lag of 1 day.

Figure 5. Comparison between the actual and the ANN predicted consumption values for DMA in training and testing for hourly analysis with a lag of 1 h.

Figure 6. Comparison between the actual and the ANN predicted consumption values for B1 in training and testing for hourly analysis with a lag of 1 h.

Figure 7. Comparison between the actual and the ANN predicted consumption values for B2 in training and testing for hourly analysis with a lag of 1 h.

Figure 8. Comparison between the actual and the ANN predicted consumption values for B3 in training and testing for hourly analysis with a lag of 1 h.

Table 1. Descriptive statistics of the data from the four AMRs used in the ANN models.

AMR	Min (m³/day)	Min (m³/h)	Max (m³/day)	Max (m³/h)	Mean (m³/day)	Mean (m³/h)	Std. Dev. (m³/day)	Std. Dev. (m³/h)
B1	0.000	0.000	12.000	2.160	4.979	0.206	3.761	0.371
B2	0.460	0.000	58.700	7.980	17.807	0.761	13.139	1.004
B3	11.100	0.220	87.080	9.640	45.852	1.912	11.602	1.037
DMA	97.700	1.020	693.410	47.600	388.413	15.871	145.361	7.835

Table 2. Statistical accuracy measures of the ANN models at training and testing sets for the daily analysis.

AMR	Set	Baseline RMSE	Baseline R²	Lag = 1 Day RMSE	Lag = 1 Day R²
DMA	Training	0.182	0.406	0.112	0.814
DMA	Testing	0.196	0.490	0.110	0.827
B1	Training	0.123	0.845	0.089	0.918
B1	Testing	0.100	0.900	0.094	0.902
B2	Training	0.143	0.620	0.112	0.831
B2	Testing	0.139	0.606	0.110	0.808
B3	Training	0.121	0.313	0.068	0.815
B3	Testing	0.115	0.420	0.068	0.806

Table 3. Statistical accuracy measures of the ANN models at training and testing sets for the hourly analysis.

AMR	Set	No Lag RMSE	No Lag R²	Lag = 1 h RMSE	Lag = 1 h R²	Lag = 24 h RMSE	Lag = 24 h R²	Lag = 168 h RMSE	Lag = 168 h R²
DMA	Training	0.109	0.562	0.039	0.944	0.062	0.857	0.079	0.774
DMA	Testing	0.110	0.548	0.041	0.939	0.070	0.825	0.085	0.738
B1	Training	0.064	0.861	0.043	0.940	0.054	0.900	0.063	0.863
B1	Testing	0.066	0.854	0.046	0.928	0.059	0.884	0.068	0.852
B2	Training	0.070	0.679	0.039	0.901	0.063	0.743	0.071	0.688
B2	Testing	0.084	0.596	0.047	0.865	0.072	0.684	0.069	0.684
B3	Training	0.074	0.547	0.005	0.738	0.083	0.658	0.087	0.625
B3	Testing	0.074	0.555	0.006	0.709	0.091	0.585	0.092	0.576
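
For readers who want to reproduce the two accuracy measures reported in Tables 2 and 3, the sketch below computes RMSE and R² from paired actual and predicted values. It assumes the coefficient-of-determination form of R² (one minus the ratio of residual to total sum of squares); this section does not state which definition the authors used, and the squared Pearson correlation is a common alternative. The function names and sample values are illustrative only. The tabulated RMSE values are small relative to the raw consumption in Table 1, which suggests they were computed on scaled data, but that is an inference rather than something stated here.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical scaled consumption values, for illustration only.
observed = [0.2, 0.4, 0.6, 0.8]
forecast = [0.25, 0.35, 0.65, 0.75]
print(rmse(observed, forecast), r_squared(observed, forecast))
```

Under this definition a forecast that always returns the mean of the observations scores R² = 0, so the baseline values in Tables 2 and 3 can be read as the share of variance explained beyond that trivial predictor.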

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
