# A Probabilistic Short-Term Water Demand Forecasting Model Based on the Markov Chain


## Abstract


## 1. Introduction

## 2. Markov Chain Based Demand Forecasting

#### 2.1. Overview

#### 2.2. The Markov Chain

[…] $c_i$ (with $i = 1, \dots, N$) of the process (see Figure 1).

The probability $p_i(t)$ that, at a generic time $t$, the process will be in a generic class $c_i$ can be defined as:

$$p_i(t) = P\left[X(t) \in c_i\right] \qquad (1)$$

where $p_i(t)$ represents the $i$-th component of the row vector $\mathbf{p}(t) = [p_1(t), p_2(t), \dots, p_N(t)]$, which contains the probabilities that the process is, at the generic time $t$, in each of the $N$ classes. It follows that $\sum_{i=1}^{N} p_i(t) = 1$ for every $t$.

[…] $c_i$ at time $t$ to a class $c_j$ at the next time $t + \Delta t$, the process undergoes a transition, which is associated with a probability $\pi_{ij}(t)$, called the transition probability. The probabilities associated with every possible transition from $t$ to $t + \Delta t$ are the components of the transition matrix $\Pi(t) \in \mathbb{R}^{N \times N}$:

$$\Pi(t) = \begin{bmatrix} \pi_{11}(t) & \pi_{12}(t) & \cdots & \pi_{1N}(t) \\ \pi_{21}(t) & \pi_{22}(t) & \cdots & \pi_{2N}(t) \\ \vdots & \vdots & \ddots & \vdots \\ \pi_{N1}(t) & \pi_{N2}(t) & \cdots & \pi_{NN}(t) \end{bmatrix} \qquad (2)$$

Every row of $\Pi(t)$ corresponds to the starting class of the process, and every column to the class of arrival: for instance, the probability $\pi_{ij}(t)$ of being in class $c_j$ at time $t + \Delta t$, starting from class $c_i$ at the preceding time $t$, is placed in the $i$-th row and $j$-th column of the matrix.

The transition matrix $\Pi(t)$ can vary at every step of the process; a process with this behaviour is generally referred to as a non-homogeneous Markov chain. A homogeneous Markov chain is based, by contrast, on the assumption that the transition probabilities are independent of $t$. This condition implies the existence of a single transition matrix, $\Pi$, which remains constant as $t$ varies and is characteristic of the entire process.

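[…] $\mathbf{p}(t)$ the corresponding probability vector of the process and $\Pi(t)$ the corresponding transition matrix, the probability vector of the process at time $t + \Delta t$, $\mathbf{p}^{for}(t + \Delta t)$, can be estimated as:

$$\mathbf{p}^{for}(t + \Delta t) = \mathbf{p}(t)\,\Pi(t) \qquad (3)$$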
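The one-step estimate described above is a product between a row vector and a row-stochastic matrix. The following is a minimal sketch of that product in plain Python; the function name `one_step_forecast` and the three-class numbers are illustrative assumptions, not values from the case studies.

```python
from typing import List


def one_step_forecast(p: List[float], Pi: List[List[float]]) -> List[float]:
    """Return the row vector p(t) * Pi(t).

    p  : probabilities of being in each of the N classes at time t
    Pi : N x N transition matrix, row = starting class, column = arrival class
    """
    n = len(p)
    # Each row of Pi must be a probability distribution over arrival classes.
    for row in Pi:
        if len(row) != n or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("Pi must be N x N with rows summing to 1")
    # Component j sums, over all starting classes i, p_i(t) * pi_ij(t).
    return [sum(p[i] * Pi[i][j] for i in range(n)) for j in range(n)]


# Hypothetical 3-class example (numbers are made up for illustration).
Pi = [
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
]
p_t = [0.5, 0.5, 0.0]
p_next = one_step_forecast(p_t, Pi)
```

Because every row of the matrix sums to 1, the resulting vector is again a probability vector.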

#### 2.3. Demand Forecasting Model

[…] $c_1, c_2, \dots, c_N$ into which the entire range of variability of the water demand can be divided.

[…] $c_i$, of the water demand at the current time $t$ and the transition matrix referred to $t$ are known, the Markov chain allows us to define, in probabilistic terms, what its state will be one step $\Delta t$ into the future. In fact, Equation (3) can be used to estimate the probability vector $\mathbf{p}^{for}(t + \Delta t)$, where, in a real-time application of the model, $\mathbf{p}(t)$ refers to an actually observed value and is thus composed of $N - 1$ null values and a value of 1 in the position of the class to which the demand $q(t)$ belongs. The transition matrix is a parameter of the model that can be estimated from the observed water demands used in the model calibration phase, as detailed below. Since it is a calibrated quantity, it will henceforth be indicated as $\widehat{\Pi}(t)$. The forecast can be extended up to $k\Delta t$ ahead using Equation (4), thus estimating the probabilities of the demand falling within each class at the time $t + k\Delta t$, $\mathbf{p}^{for}(t + k\Delta t)$, from the estimate made one time step earlier and the corresponding transition matrix.
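The recursion described above can be sketched as follows: the observed demand is mapped to its class, encoded as a vector of $N - 1$ zeros and a single 1, and then propagated step by step. This is only an illustration under stated assumptions: the class edges, the matrix values and the helper names (`class_index`, `one_hot`, `step`, `forecast`) are hypothetical, and a single fixed matrix is used at every step for simplicity.

```python
from bisect import bisect_right
from typing import List


def class_index(q: float, edges: List[float]) -> int:
    """Index of the class containing q; edges holds N + 1 sorted class ends."""
    i = bisect_right(edges, q) - 1
    # Values outside the calibrated range are clipped to the first/last class.
    return max(0, min(i, len(edges) - 2))


def one_hot(i: int, n: int) -> List[float]:
    """Probability vector of an observed state: N - 1 zeros and a single 1."""
    return [1.0 if j == i else 0.0 for j in range(n)]


def step(p: List[float], Pi: List[List[float]]) -> List[float]:
    """One application of the row-vector times matrix product."""
    n = len(p)
    return [sum(p[i] * Pi[i][j] for i in range(n)) for j in range(n)]


def forecast(q_now: float, edges: List[float],
             Pi_hat: List[List[float]], K: int) -> List[List[float]]:
    """Probability vectors for t + dt, ..., t + K*dt, each built on the previous one."""
    p = one_hot(class_index(q_now, edges), len(edges) - 1)
    out = []
    for _ in range(K):
        p = step(p, Pi_hat)
        out.append(p)
    return out


# Hypothetical setup: three classes of width 10 (arbitrary demand units).
edges = [0.0, 10.0, 20.0, 30.0]
Pi_hat = [
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
]
forecasts = forecast(12.5, edges, Pi_hat, K=2)
```

With an observed starting state, the first forecast is simply the row of the matrix that corresponds to the observed class; uncertainty then spreads across classes as the lag grows.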

[…]$^{for}(t + k\Delta t)$ at a generic time $t + k\Delta t$ in the following manner [28].

[…]$_i$ (with $i = 1, \dots, N$), represented in the vector $\mathbf{m} = [m_1, m_2, \dots, m_N]$, is computed using the components $p_i^{for}(t + k\Delta t)$ (with $i = 1, \dots, N$) of the probability vector predicted for the time $t + k\Delta t$ as weights:

$$q^{for}(t + k\Delta t) = \sum_{i=1}^{N} p_i^{for}(t + k\Delta t)\, m_i$$
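The weighted average described above can be sketched in a few lines. The sketch assumes that $m_i$ is a representative value of class $c_i$ (class midpoints are used here purely for illustration); the function name `point_forecast` and the numbers are hypothetical.

```python
from typing import List


def point_forecast(p_for: List[float], m: List[float]) -> float:
    """Weighted average of the class values m_i, with the forecast probabilities as weights."""
    if len(p_for) != len(m):
        raise ValueError("p_for and m must have one entry per class")
    return sum(p_i * m_i for p_i, m_i in zip(p_for, m))


# Hypothetical values: class midpoints and a forecast probability vector.
m = [5.0, 15.0, 25.0]
p_for = [0.36, 0.31, 0.33]
q_for = point_forecast(p_for, m)
```

Because the weights sum to 1, the result always lies between the smallest and largest class value.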

[…] $\mathbf{p}^{for}(t + k\Delta t)$ in the case of short-term water demands (for example, to obtain hourly water demand forecasts for the next $K = 24$ h). Water demands are generally characterised by periodic patterns on different time scales. Considering, for example, the hourly water demands over the course of a day, they follow a pattern that tends to reflect the type and habits of the users served. In the case of residential users, the demand shows morning and evening peaks, reduced demand during the night and variable demand in the afternoon hours; water use may also differ depending on whether it is a weekday or a holiday [2]. In the evolution of demand over time, it is thus possible to distinguish different phases (rising, falling, etc.), each characterised by probabilities of demand transitioning from one class to another that vary from phase to phase. Since the time series is characterised by periodic patterns, an approach based on a non-homogeneous Markov chain appears appropriate. Alternatively, prior to the application of the model, the demand time series can be normalised (brought to zero mean and unit variance), thus creating the conditions for the use of an approach based on a homogeneous Markov chain.

#### 2.3.1. Non-Homogeneous Markov Chain (NHMC) Model

[…] $f_1, f_2, \dots, f_F$, corresponding to the different rising and falling phases of the pattern. Incidentally, the assumption that each phase is characterised by a single transition matrix implies that the process is described as a sequence of different homogeneous Markov processes.

[…] $f_1, f_2, \dots, f_F$, the corresponding transition matrices $\widehat{\Pi}_{f_1}, \widehat{\Pi}_{f_2}, \dots, \widehat{\Pi}_{f_F}$ can be estimated using the observed calibration data. The generic component $\widehat{\pi}_{f_w,ij}$ (with $w = 1, \dots, F$ and $i,j = 1, \dots, N$) of the transition matrix $\widehat{\Pi}_{f_w}$ is estimated, during the model calibration phase, by counting the transitions from $c_i$ to $c_j$ between successive pairs of times for which the starting time belongs to the phase $f_w$, and then dividing by the total number of transitions that start inside the phase $f_w$ and have $c_i$ as the starting class, i.e.:

$$\widehat{\pi}_{f_w,ij} = \frac{n_{f_w,ij}}{\sum_{l=1}^{N} n_{f_w,il}}$$

where $n_{f_w,ij}$ is the number of transitions from class $c_i$ to class $c_j$ between consecutive times for which the starting time is inside the phase $f_w$. It should be noted that, as the number of phases $F$ increases, the accuracy of the estimated transition matrices tends to decrease, because fewer data are available for estimating each matrix. Therefore, the number of phases $F$ adopted should depend not only on the trend in demand, but also on the number of observed data available for calibrating the model.
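The counting procedure described above can be sketched as follows. The sketch assumes that each calibration time step has already been assigned a class index and a phase index; the function name `estimate_phase_matrices` is hypothetical, and the uniform row used for starting classes never observed in a phase is an assumption of this example, not something stated in the text.

```python
from typing import List


def estimate_phase_matrices(classes: List[int], phases: List[int],
                            n_classes: int, n_phases: int) -> List[List[List[float]]]:
    """Estimate one transition matrix per phase by counting observed transitions.

    classes[t] : class index of the demand observed at time step t
    phases[t]  : phase index to which time step t belongs
    """
    counts = [[[0] * n_classes for _ in range(n_classes)] for _ in range(n_phases)]
    for t in range(len(classes) - 1):
        w = phases[t]  # a transition is attributed to the phase of its STARTING time
        counts[w][classes[t]][classes[t + 1]] += 1

    matrices = []
    for w in range(n_phases):
        matrix = []
        for i in range(n_classes):
            total = sum(counts[w][i])
            if total == 0:
                # Assumption of this sketch: no data for this row, fall back to uniform.
                matrix.append([1.0 / n_classes] * n_classes)
            else:
                matrix.append([c / total for c in counts[w][i]])
        matrices.append(matrix)
    return matrices


# Tiny made-up calibration series: two classes, two phases.
classes = [0, 1, 1, 0, 1, 0]
phases = [0, 0, 0, 1, 1, 1]
matrices = estimate_phase_matrices(classes, phases, n_classes=2, n_phases=2)
```

The example also illustrates the trade-off noted in the text: with the series split across two phases, each row is estimated from only one or two transitions.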

[…] $f_w$, in which the starting time $t$ falls, i.e.:

$$\mathbf{p}^{for}(t + \Delta t) = \mathbf{p}(t)\,\widehat{\Pi}_{f_w}$$

[…] $\mathbf{p}^{for}$ at the different time lags will change. The transition matrix used for each forecast "moves" with the forecast, instead of being fixed and equal to the one corresponding to $t$ (i.e., the start time of the forecast). Thus, for every time lag $k$, the water demand at the generic time $t + k\Delta t$, which is based on the forecast made one time step earlier for $t + (k - 1)\Delta t$, is estimated on the basis of the transition matrix associated with the phase to which the time $t + (k - 1)\Delta t$ belongs (rather than the one corresponding to the time $t$).
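The "moving" matrix selection described above can be sketched as follows, assuming hourly steps and phases defined by the hour of the day at which each one begins. The phase start hours, the matrices and the names `phase_of_hour` and `forecast_nhmc` are hypothetical choices made for this illustration.

```python
from bisect import bisect_right
from typing import List


def phase_of_hour(hour: int, phase_starts: List[int]) -> int:
    """Phase index for an hour of the day; phase_starts are sorted start hours.

    Hours before the first start wrap around to the last phase (overnight phase).
    """
    return (bisect_right(phase_starts, hour) - 1) % len(phase_starts)


def step(p: List[float], Pi: List[List[float]]) -> List[float]:
    n = len(p)
    return [sum(p[i] * Pi[i][j] for i in range(n)) for j in range(n)]


def forecast_nhmc(p0: List[float], hour0: int, matrices: List[List[List[float]]],
                  phase_starts: List[int], K: int) -> List[List[float]]:
    """Forecast K hourly steps, picking at each lag the matrix of the phase
    that contains the STARTING hour of that step, i.e. t + (k - 1) hours."""
    p = p0
    out = []
    for k in range(1, K + 1):
        start_hour = (hour0 + k - 1) % 24
        Pi = matrices[phase_of_hour(start_hour, phase_starts)]
        p = step(p, Pi)
        out.append(p)
    return out


# Hypothetical two-phase day: phase 0 from 06:00, phase 1 from 18:00 (wrapping overnight).
phase_starts = [6, 18]
matrices = [
    [[0.9, 0.1], [0.5, 0.5]],  # phase 0
    [[0.2, 0.8], [0.1, 0.9]],  # phase 1
]
# Start at 17:00 in class 0: lag 1 uses phase 0, lag 2 (starting at 18:00) uses phase 1.
out = forecast_nhmc([1.0, 0.0], hour0=17, matrices=matrices,
                    phase_starts=phase_starts, K=2)
```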

#### 2.3.2. Homogeneous Markov Chain (HMC) Model

[…] $q^{norm}(t)$ is the corresponding normalised value, and $\mu_{work/non\_work}^{h}$ and $\sigma_{work/non\_work}^{h}$ are, respectively, the mean and the standard deviation of the data observed in the calibration phase in the $h$-th hour of the day corresponding to the time $t$ at which the original datum $q(t)$ occurs, with a distinction being made between working days (work) and non-working days (non_work).
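A minimal sketch of this hour-by-hour normalisation is given below, assuming the usual standardisation $q^{norm}(t) = (q(t) - \mu)/\sigma$ with $\mu$ and $\sigma$ taken for the matching hour and day type. The function names, the use of the population standard deviation and the sample values are assumptions of this example.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Stats = Dict[Tuple[int, bool], Tuple[float, float]]


def fit_hourly_stats(demands: List[float], hours: List[int],
                     is_working: List[bool]) -> Stats:
    """Mean and standard deviation of calibration data per (hour, day type)."""
    groups = defaultdict(list)
    for q, h, w in zip(demands, hours, is_working):
        groups[(h, w)].append(q)
    # Assumption: population standard deviation; the text does not specify which.
    return {key: (mean(values), pstdev(values)) for key, values in groups.items()}


def normalise(q: float, hour: int, working: bool, stats: Stats) -> float:
    """Standardise q with the statistics of its own hour and day type."""
    mu, sigma = stats[(hour, working)]
    if sigma == 0:
        raise ValueError("zero standard deviation for this hour and day type")
    return (q - mu) / sigma


# Made-up calibration data: two night-time and two morning observations on working days.
demands = [10.0, 14.0, 20.0, 24.0]
hours = [3, 3, 8, 8]
is_working = [True, True, True, True]
stats = fit_hourly_stats(demands, hours, is_working)
```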

[…]$_{ij}$ from $c_i^{norm}$ to $c_j^{norm}$ (with $i,j = 1, \dots, N$) between pairs of successive times within the entire calibration dataset, and dividing by the total number of transitions that have class $c_i^{norm}$ as the starting class. The transition matrix $\widehat{\Pi}$ thus estimated is used to estimate the probability that the normalised future water demand falls in each of the normalised classes, following the same approach as previously described for the NHMC model (see Equations (7) and (8)); in this case, however, the transition matrix does not change in time. The vector $\mathbf{p}^{for}(t + k\Delta t)$ (with $k = 1, \dots, 24$) therefore provides an estimate of the probability that the normalised water demand will fall into each of the normalised classes $c_i^{norm}$ (with $i = 1, \dots, N$). This information must then be brought back to the original space by de-normalising the class ends using the mean $\mu_{work/non\_work}^{h}$ and standard deviation $\sigma_{work/non\_work}^{h}$ previously defined at the time of normalisation, relating to the $h$-th hour of the day (working or non-working) corresponding to the time $t + k\Delta t$ considered. For example, with $cl_i^{norm}$ and $cu_i^{norm}$ representing, respectively, the lower and upper ends of the $i$-th normalised class $c_i^{norm}$, the corresponding de-normalised lower and upper ends $cl_i$ and $cu_i$ are given by:

$$cl_i = cl_i^{norm}\,\sigma_{work/non\_work}^{h} + \mu_{work/non\_work}^{h}, \qquad cu_i = cu_i^{norm}\,\sigma_{work/non\_work}^{h} + \mu_{work/non\_work}^{h}$$

[…] $c_i$ (which in the normalised space is unique and independent of the time considered) will have a width that varies according to the hour and type of day in which the considered time occurs. In particular, the width increases as the standard deviation $\sigma_{work/non\_work}^{h}$ increases. Therefore, for example, the classes corresponding to the hours of peak demand (for example, 7 in the morning), which are characterised by high variability in water use, will be much wider than those corresponding to night-time hours, which are typically characterised by low variability.
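The effect of de-normalisation on class width can be illustrated with a short sketch. It assumes the inverse of the standardisation, that is, each normalised class end is multiplied by $\sigma$ and shifted by $\mu$ for the hour and day type of the forecast time; the function name and all numbers are hypothetical.

```python
from typing import List


def denormalise_edges(norm_edges: List[float], mu: float, sigma: float) -> List[float]:
    """Map normalised class ends back to demand units for one hour and day type."""
    return [e * sigma + mu for e in norm_edges]


# The same normalised class ends, de-normalised for two hypothetical hours.
norm_edges = [-1.0, 0.0, 1.0]
night = denormalise_edges(norm_edges, mu=5.0, sigma=0.5)   # low variability
peak = denormalise_edges(norm_edges, mu=20.0, sigma=4.0)   # high variability

night_width = night[1] - night[0]
peak_width = peak[1] - peak[0]
```

The same normalised class is eight times wider at the peak hour than at night in this example, simply because the standard deviation is eight times larger.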

## 3. Case Studies

- DMA1: from 24 March 2011 to 19 December 2011 (270 days)
- DMA2: from 4 May 2011 to 31 October 2011 (181 days)
- DMA3: from 11 April 2011 to 20 November 2011 (224 days)

[…]$10^{-5}$ and $2.135 \times 10^{-5}$ Hz, corresponding to 24 and 48 h. Even though periodicities are not completely removed by normalisation, the power spectral densities of the normalised series are much smoother and the dominant frequencies at 24 and 48 h are less evident, leading to time series to which the HMC can be applied effectively, as shown in the subsequent analysis of the numerical results. Similar considerations apply to DMA2 and DMA3.
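The kind of frequency analysis referred to above can be sketched with a plain discrete Fourier transform. This is not the procedure used in the study: it is a direct (slow) DFT on a synthetic hourly series with a purely daily cycle, and the function name `dominant_frequency` is hypothetical. It only shows how a dominant frequency in Hz maps to a period in hours, a 24 h cycle corresponding to 1/86,400 Hz.

```python
import cmath
import math
from typing import List, Tuple


def dominant_frequency(series: List[float], dt_seconds: float = 3600.0) -> Tuple[float, float]:
    """Return (frequency in Hz, period in hours) of the strongest non-zero DFT component."""
    n = len(series)
    mu = sum(series) / n
    x = [v - mu for v in series]  # remove the mean so the zero frequency does not dominate
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    freq_hz = best_k / (n * dt_seconds)
    return freq_hz, 1.0 / (freq_hz * 3600.0)


# Synthetic hourly demand over 10 days with a single 24 h cycle (illustrative only).
series = [10.0 + 3.0 * math.sin(2.0 * math.pi * t / 24.0) for t in range(240)]
freq_hz, period_h = dominant_frequency(series)
```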

## 4. Results and Discussion

[…] $q^{obs}(t)$ is the observed water demand at the time instant $t$, $\mu_{q^{obs}}$ is the mean value of the observed demands, $q^{for}(t|t - k\Delta t)$ is the flow rate forecast for time $t$ made $k\Delta t$ earlier and, finally, $nd$ is the number of data in the forecasted time series.
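Using the quantities defined above, the Nash–Sutcliffe index reported in Figure 5 can be sketched in its standard form, one minus the ratio between the sum of squared forecast errors and the sum of squared deviations of the observations from their mean. The function name and the sample values are hypothetical.

```python
from typing import List


def nash_sutcliffe(q_obs: List[float], q_for: List[float]) -> float:
    """Nash-Sutcliffe index: 1 is a perfect forecast, 0 matches forecasting the observed mean."""
    if len(q_obs) != len(q_for) or not q_obs:
        raise ValueError("series must be non-empty and of equal length")
    nd = len(q_obs)
    mu = sum(q_obs) / nd
    squared_errors = sum((o - f) ** 2 for o, f in zip(q_obs, q_for))
    squared_deviations = sum((o - mu) ** 2 for o in q_obs)
    return 1.0 - squared_errors / squared_deviations


# Made-up observed and forecast demands for a single forecast lag.
q_obs = [10.0, 12.0, 14.0, 16.0]
q_for = [11.0, 12.0, 13.0, 17.0]
ns = nash_sutcliffe(q_obs, q_for)
```

In an evaluation like the one described, the index would be computed separately for each lag $k$, pairing each observation with the forecast issued $k\Delta t$ earlier.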

## 5. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. Donkor, E.A.; Mazzucchi, T.A.; Soyer, R.; Roberson, A.J. Urban Water Demand Forecasting: A Review of Methods and Models. J. Water Resour. Plan. Manag. **2014**, 140, 146–159.
2. Alvisi, S.; Franchini, M.; Marinelli, A. A short-term, pattern-based model for water-demand forecasting. J. Hydroinform. **2007**, 9, 39–50.
3. Herrera, M.; Torgo, L.; Izquierdo, J.; Pérez-Garcìa, R. Predictive models for forecasting hourly urban water demand. J. Hydrol. **2010**, 387, 141–150.
4. Bakker, M.; Vreeburg, J.H.G.; van Schagen, K.M.; Rietveld, L.C. A fully adaptive forecasting model for short-term drinking water demand. Environ. Model. Softw. **2013**, 48, 141–151.
5. Arandia, E.; Ba, A.; Eck, B.; McKenna, S. Tailoring seasonal time series models to forecast short-term water demand. J. Water Resour. Plan. Manag. **2016**, 142, 1–10.
6. Tian, D.; Martinez, C.J.; Asce, A.M.; Asefa, T.; Asce, M. Improving Short-Term Urban Water Demand Forecasts with Reforecast Analog Ensembles. J. Water Resour. Plan. Manag. **2016**, 142.
7. Jain, A.; Varshney, A.K.; Joshi, U.C. Short-term water demand forecast modelling at IIT Kanpur using artificial neural networks. Water Resour. Manag. **2001**, 15, 299–321.
8. Babel, M.S.; Shinde, V.R. Identifying Prominent Explanatory Variables for Water Demand Prediction Using Artificial Neural Networks: A Case Study of Bangkok. Water Resour. Manag. **2011**, 25, 1653–1676.
9. Adamowski, J.; Fung Chan, H.; Prasher, S.O.; Ozga-Zielinski, B.; Sliusarieva, A. Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada. Water Resour. Res. **2012**, 48, W01528.
10. Campisi-Pinto, S.; Adamowski, J.; Oron, G. Forecasting Urban Water Demand Via Wavelet-Denoising and Neural Network Models. Case Study: City of Syracuse, Italy. Water Resour. Manag. **2012**, 26, 3539–3558.
11. Odan, F.K.; Fernanda, L.; Reis, R. Hybrid Water Demand Forecasting Model Associating Artificial Neural Network with Fourier Series. J. Water Resour. Plan. Manag. **2012**, 138, 245–256.
12. Dos Santos, C.C.; Pereira Filho, A.J. Water Demand Forecasting Model for the Metropolitan Area of Sao Paulo, Brazil. Water Resour. Manag. **2014**, 28, 4401–4414.
13. Romano, M.; Kapelan, Z. Adaptive water demand forecasting for near real-time management of smart water distribution systems. Environ. Model. Softw. **2014**, 60, 265–276.
14. Al-Zahrani, M.A.; Abo-Monasar, A. Urban Residential Water Demand Prediction Based on Artificial Neural Networks and Time Series Models. Water Resour. Manag. **2015**, 29, 3651–3662.
15. Shvartser, L.; Shamir, U.; Feldman, M. Forecasting hourly water demands by pattern recognition approach. J. Water Resour. Plan. Manag. **1993**, 119, 611–627.
16. Alvisi, S.; Franchini, M. Assessment of the predictive uncertainty within the framework of water demand forecasting by using the model conditional processor (MCP). Urban Water J. **2017**, 14, 1–10.
17. Hutton, C.J.; Kapelan, Z. A probabilistic methodology for quantifying, diagnosing and reducing model structural and predictive errors in short term water demand forecasting. Environ. Model. Softw. **2015**, 66, 87–97.
18. Azadeh, A.; Neshat, N.; Hamidipour, H. Hybrid Fuzzy Regression-Artificial Neural Network for Improvement of Short-Term Water Consumption Estimation and Forecasting in Uncertain and Complex Environments: Case of a Large Metropolitan City. J. Water Resour. Plan. Manag. **2012**, 138, 71–75.
19. Bai, Y.; Wang, P.; Li, C.; Xie, J.; Wang, Y. A multi-scale relevance vector regression approach for daily urban water demand forecasting. J. Hydrol. **2014**, 517, 236–245.
20. Froelich, W. Forecasting Daily Urban Water Demand Using Dynamic Gaussian Bayesian Network. In Proceedings of the 11th International Conference on Beyond Databases, Architectures and Structures, Ustroń, Poland, 26–29 May 2015; Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D., Eds.; Springer International Publishing: Cham, Germany, 2015; pp. 333–342.
21. Froelich, W.; Magiera, E. Forecasting Domestic Water Consumption Using Bayesian Model. In 8th KES International Conference on Intelligent Decision Technologies (KES-IDT 2016)—Part II; Czarnowski, I., Caballero, A.M., Howlett, R.J., Jain, L.C., Eds.; Springer International Publishing: Cham, Germany, 2016; pp. 337–346.
22. Magiera, E.; Froelich, W. Application of Bayesian Networks to the Forecasting of Daily Water Demand. In Proceedings of the 7th KES International Conference on Intelligent Decision Technologies (KES-IDT 2015), Palace, Italy, 17–19 June 2015; Neves-Silva, R., Jain, L.C., Howlett, R.J., Eds.; Springer International Publishing: Cham, Germany, 2015; pp. 385–393.
23. Todini, E. A model conditional processor to assess predictive uncertainty in flood forecasting. Int. J. River Basin Manag. **2008**, 6, 123–137.
24. Cutore, P.; Campisano, A.; Kapelan, Z.; Modica, C.; Savìc, D. Probabilistic prediction of urban water consumption using the SCEM-UA algorithm. Urban Water J. **2008**, 5, 125–132.
25. Vrugt, J.A.; Gupta, H.V.; Bouten, W.; Sorooshian, S. A Shuffled Complex Evolution Metropolis algorithm for optimization and uncertainty assessment of hydrologic model parameters. Water Resour. Res. **2003**, 39, 1201.
26. Morcous, G. Performance Prediction of Bridge Deck Systems Using Markov Chains. J. Perform. Constr. Facil. **2006**, 20, 146–155.
27. Yu, G.; Hu, J.; Zhang, C.; Zhuang, L.; Song, J. Short-term traffic flow forecasting based on markov chain model. In Proceedings of the Intelligent Vehicles Symposium, Columbus, OH, USA, 9–11 June 2003; pp. 208–212.
28. Carpinone, A.; Giorgio, M.; Langella, R.; Testa, A. Markov chain modeling for very-short-term wind power forecasting. Electr. Power Syst. Res. **2015**, 122, 152–158.
29. Yapo, P.; Sorooshian, S.; Gupta, V. A Markov chain flow model with application to flood forecasting. Water Resour. Res. **1993**, 29, 2427–2436.
30. Benjamin, J.R.; Cornell, C.A. Probability, Statistics, and Decision for Civil Engineers; McGraw-Hill: New York, NY, USA, 1970; pp. 321–352.
31. Cooley, J.W.; Lewis, P.; Welch, P. The Fast Fourier Transform and its Applications. IEEE Trans. Educ. **1969**, 12, 28–34.
32. Gelažanskas, L.; Gamage, K. Forecasting Hot Water Consumption in Residential Houses. Energies **2015**, 8, 12702–12717.
33. Nash, J.E.; Sutcliffe, J.V. River Flow Forecasting Through Conceptual Models Part I—A Discussion of Principles. J. Hydrol. **1970**, 10, 282–290.

**Figure 1.** Reference diagram of a Markov process showing the N classes into which the domain of existence of the variable X(t) is divided, and the probabilities referred to each class at the time t + 1.

**Figure 2.** Estimated posterior probability vector $\mathbf{p}^{for}$ (**a**) for one step ahead and (**b**) for k steps ahead, highlighted using a shade of grey for each class.

**Figure 3.** Average daily pattern of hourly demands during working (left-hand column, (**a**,**c**,**e**)) and non-working days (right-hand column, (**b**,**d**,**f**)) and initial and final ends of the time phases $f_1$, $f_2$, $f_3$, and $f_4$ relating to DMA1 (row 1), DMA2 (row 2) and DMA3 (row 3).

**Figure 4.** Comparison of (**a**) observed and (**c**) normalised hourly demands for a one-week period in DMA1 and the frequency analysis of the entire (**b**) observed and (**d**) normalised time series of DMA1.

**Figure 5.** Nash–Sutcliffe index (NS) values obtained by the non-homogeneous Markov chain (NHMC), homogeneous Markov chain (HMC), artificial neural networks (ANN) and naïve models applied to the calibration (left-hand column, (**a**,**c**,**e**)) and validation (right-hand column, (**b**,**d**,**f**)) data for DMA1 (row 1), DMA2 (row 2) and DMA3 (row 3).

**Figure 6.** Probabilistic demand forecasts obtained using the HMC model applied to DMA1 at the following times: (**a**) 1 a.m. (with forecasts made from 2 a.m. to 7 a.m.); (**b**) 6 a.m. (with forecasts from 7 a.m. to 12 p.m.); (**c**) 1 p.m. (with forecasts from 2 p.m. to 7 p.m.) and (**d**) 6 p.m. (with forecasts from 7 p.m. to 12 a.m.).

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Gagliardi, F.; Alvisi, S.; Kapelan, Z.; Franchini, M.
A Probabilistic Short-Term Water Demand Forecasting Model Based on the Markov Chain. *Water* **2017**, *9*, 507.
https://doi.org/10.3390/w9070507
