Overview, Comparative Assessment and Recommendations of Forecasting Models for Short-Term Water Demand Prediction

Water consumption patterns vary stochastically over the day and the week. Therefore, to continually provide water to consumers with appropriate quality, quantity and pressure, water utilities require accurate and appropriate short-term water demand (STWD) forecasts. In view of this, an overview of forecasting methods for STWD prediction is presented. Based on that, a comparative assessment of the performance of alternative forecasting models from the different methods is conducted. Time series models (i.e., autoregressive (AR), moving average (MA), autoregressive-moving average (ARMA), and ARMA with exogenous variable (ARMAX)) introduced by Box and Jenkins (1970), a feed-forward back-propagation neural network (FFBP-NN), and a hybrid model (i.e., combined forecasts from ARMA and FFBP-NN) are compared with each other for a common set of data. The Akaike information criterion (AIC), originally proposed by Akaike (1974), is used to estimate the quality of each short-term forecasting model. Furthermore, the Nash–Sutcliffe (NS) model efficiency coefficient proposed by Nash and Sutcliffe (1970), the root mean square error (RMSE) and the mean absolute percentage error (MAPE) are the forecasting statistics used to assess the predictive performance of the models. Lastly, as regards the selection of an accurate and appropriate STWD forecasting model, this paper provides recommendations and future work based on the forecasts generated by each of the predictive models considered.


Introduction
The most crucial factor in the planning, operation and management of water distribution systems (WDS) is the satisfaction of consumer demand. The stochastic nature of water demand during the day and week is influenced by several factors; namely, climatic and geographic conditions, commercial and social conditions of people, population growth, industrialisation, technical innovation, cost of supply, and condition of WDS [1–4]. Therefore, water utilities need accurate and appropriate short-term water demand (STWD) forecasts in order to continually satisfy consumers with quality water in adequate volumes, and at reasonable pressures [5–7]. STWD forecasting is an important component of the successful operation, management, and optimisation of any existing WDS. As a result, the selection of an accurate and appropriate STWD forecasting model is useful for [1,6,8–16]:
• explaining day-to-day demand variations
• minimising the operating cost of pumping stations
• pinpointing possible network failures (e.g., water leaks and pipe bursts)
• helping utilities plan and manage water demands for near-term events
• optimising daily operations of the infrastructure (e.g., pump scheduling, control of reservoir volumes, pressure management, and water conservation programmes)
In the light of the above, the first objective of this paper is to present an overview of forecasting methods for STWD prediction. Based on that, the second objective is to conduct a comparative assessment of the performance of alternative forecasting models from the different methods. As regards the selection of an accurate and appropriate model, the third objective of the paper is to present recommendations and future work for the forecasts generated by the forecasting models considered.

Overview of STWD Forecasting Methods
In this section, the overview of univariate time series (UTS), time series regression (TSR), artificial neural network (ANN), and hybrid methods for STWD prediction is presented (see also Table 1).

UTS Forecasting Methods
UTS methods forecast future water demand based on past observations and associated error terms [17,18].
Exponential smoothing, autoregressive (AR), moving average (MA), autoregressive-moving average (ARMA), autoregressive integrated moving average (ARIMA) and seasonal ARIMA (SARIMA) are examples of UTS forecasting models. These models are useful for short-term operational forecasts. However, they may not be the most accurate alternative when weather changes are likely to occur in the underlying determinants of water demands [11,18]. Furthermore, it is discussed in [11] that stochastic process models (i.e., AR, MA, ARMA, and ARIMA) are used since exponential smoothing models sometimes cease to be adequate when time series data exhibit more complex profiles. Based on that, to achieve the second objective of this paper, the model processes of AR(p), MA(q), and ARMA(p, q) are respectively considered as given in Equations (1)–(3) [17,18].
where p and q are the model orders, φ is the autoregressive parameter, θ is the moving average parameter, µ is the mean value of the process, and ε_t is the forecast error at time t. Y_t is the observed value of demand at time t, k indexes the historical periods (lags), and Y_{t−k} and ε_{t−k} are respectively the observation and the forecast error at time t−k.
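The displayed forms of Equations (1)–(3) are not reproduced in this text. A standard way of writing the three processes, using only the symbols defined above, is sketched here. The plus sign on the moving average terms is an assumed convention (Box and Jenkins write these terms with a minus sign), and the subscript k on φ and θ is added for illustration.

```latex
\begin{align}
Y_t &= \mu + \sum_{k=1}^{p} \varphi_k Y_{t-k} + \varepsilon_t
      && \text{AR}(p) \tag{1} \\
Y_t &= \mu + \sum_{k=1}^{q} \theta_k \varepsilon_{t-k} + \varepsilon_t
      && \text{MA}(q) \tag{2} \\
Y_t &= \mu + \sum_{k=1}^{p} \varphi_k Y_{t-k}
          + \sum_{k=1}^{q} \theta_k \varepsilon_{t-k} + \varepsilon_t
      && \text{ARMA}(p, q) \tag{3}
\end{align}
```

Setting q = 0 in the ARMA(p, q) form recovers AR(p), and setting p = 0 recovers MA(q).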

Time Series Regression (TSR) Forecasting Methods
Unlike the UTS models, TSR forecasting models consider the effects of exogenous variables, because they generate forecasts based on the relationship between water demand and its determinants [19–21]. TSR models include multiple linear regression (MLR), multiple nonlinear regression (MNLR), ARMA with exogenous variable (ARMAX) and ARIMA with exogenous variable (ARIMAX). Among others, the ARMAX(p, q, b) model is considered to achieve the second objective of this paper. Equation (4) is useful in a case where the demand at time t is influenced by MA and AR terms, in addition to exogenous variables and their autoregressive terms [11].
where b is the number of exogenous input terms (a single exogenous variable is considered for the ARMAX model). Additionally, β_k and x_{t−k} are respectively the coefficient and the observed value of the exogenous variable at lag k.
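The displayed form of Equation (4) is not reproduced in this text. One common way of writing an ARMAX(p, q, b) process with the symbols used in this section is sketched here. The plus sign on the moving average terms and the summation limits are assumptions, not taken from the source.

```latex
\begin{equation}
Y_t = \mu + \sum_{k=1}^{p} \varphi_k Y_{t-k}
          + \sum_{k=1}^{q} \theta_k \varepsilon_{t-k}
          + \sum_{k=1}^{b} \beta_k x_{t-k}
          + \varepsilon_t \tag{4}
\end{equation}
```

With b = 0 the exogenous sum vanishes and the expression reduces to an ARMA(p, q) process.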

ANN Forecasting Methods
For the FFBP-NN model given in Equation (5), p is the number of hidden nodes, h is the number of input nodes, f is a sigmoid transfer function, α_j is the vector of weights from the hidden to the output nodes, β_ij are the weights from the input to the hidden nodes, and α_0 and β_0j are the weights of the arcs leaving from the bias terms.
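The displayed form of Equation (5) is not reproduced in this text. A widely used expression for a single-hidden-layer feed-forward network that matches the symbols listed above is sketched here. Taking the h inputs to be lagged demand values Y_{t−i}, and the logistic form of f, are assumptions.

```latex
\begin{equation}
Y_t = \alpha_0 + \sum_{j=1}^{p} \alpha_j \,
      f\!\left( \beta_{0j} + \sum_{i=1}^{h} \beta_{ij} Y_{t-i} \right)
      + \varepsilon_t,
\qquad f(z) = \frac{1}{1 + e^{-z}} \tag{5}
\end{equation}
```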

Hybrid Forecasting Methods
Forecasting with hybrid models (i.e., combined forecasts from two or more predictive models) has found wide application [6,11,24,30–34], since it can lead to better forecasting performance. For instance, Equation (6) is applied in a case where forecasts from different models are combined in order to obtain a hybrid forecast. As regards achieving the second objective of this paper, the combined forecast is obtained by using a UTS model (i.e., ARMA) and an ANN model (i.e., FFBP-NN).
where Ŷ_{i,t} is the predicted value of the time series at time t using the i-th model, β_0 is the regression intercept, and the β_i coefficients are determined by optimisation or least squares regression to minimise the mean square error (MSE) between the hybrid forecast Ŷ_t and the actual data [11].
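As an illustration of the combination step, the following minimal sketch fits the weights of a linear combination Ŷ_t = β_0 + Σ_i β_i Ŷ_{i,t} by ordinary least squares, which is one of the two options named above. It is not the authors' MATLAB implementation. The function names and the numbers are hypothetical, and the normal equations are solved with a small hand-written Gaussian elimination so that the example needs only the Python standard library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: forecasts may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_hybrid_weights(model_forecasts, observed):
    """Return [beta_0, beta_1, ..., beta_n] minimising the in-sample MSE."""
    # Design matrix rows: [1, forecast of model 1, ..., forecast of model n].
    rows = [[1.0] + [f[t] for f in model_forecasts]
            for t in range(len(observed))]
    k = len(rows[0])
    # Normal equations (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, observed)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def hybrid_forecast(betas, model_forecasts):
    """Combine the individual forecasts with the fitted weights."""
    n_steps = len(model_forecasts[0])
    return [betas[0] + sum(b * f[t] for b, f in zip(betas[1:], model_forecasts))
            for t in range(n_steps)]


def mse(observed, forecast):
    return sum((y - f) ** 2 for y, f in zip(observed, forecast)) / len(observed)


# Made-up numbers for illustration only (not data from the paper).
observed = [10.0, 12.0, 15.0, 13.0, 11.0, 9.0]
arma_like = [9.0, 12.5, 14.0, 13.5, 10.0, 9.5]   # stand-in for an ARMA forecast
nn_like = [11.0, 11.0, 16.0, 12.0, 12.5, 8.0]    # stand-in for an FFBP-NN forecast

betas = fit_hybrid_weights([arma_like, nn_like], observed)
combined = hybrid_forecast(betas, [arma_like, nn_like])
```

On the data used for fitting, the combined forecast cannot have a larger MSE than either individual forecast, because each individual forecast is itself a special case of the linear combination. This guarantee does not carry over to unseen test data.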

| Forecasting Methods and Models | Quantitative Assessment of Forecast Accuracy | Forecast Purpose |
| --- | --- | --- |
| UTS models [18,27,29]: MA, AR, ARIMA, exponential smoothing, ARMA, SARIMA | It can exhibit more complex profiles. However, it does not account for the effect of exogenous variables (e.g., weather data or price) [11]. | Useful for short-term operational forecasts (i.e., to minimise the operating cost of pumping stations, etc.) |
| TSR models [1,25,26]: MNLR, ARMAX, MLR and ARIMAX | TSR models produce forecasts on the basis of the relationship between water demand and its determinants (e.g., weather data, income, demographics) [19]. | Useful for better prediction of daily water demand [24]. Relevant for setting water rates, revenue forecasting, and financial planning exercises. |
| ANN models | | Useful for a better prediction of peak daily water demand. To inform optimal operating policy as well as pumping and maintenance scheduling. |

Presentation and Discussion of Results
In this paper, ARIMA-based models (i.e., AR, MA, ARMA, ARMAX) together with the widely used non-parametric forecasting model, FFBP-NN, have been compared with each other and against the hybrid model, a combination of two or more forecasting models (i.e., ARMA and FFBP-NN), for a common set of data (see Figure 1). Figure 1a shows the average water consumption for the 24 h of each day for a city in south-eastern Spain, obtained from all the available data provided in [1]. The predictive models considered in this paper were used to forecast hourly water demands. In addition, an average weekly dataset of 168 h was used, and based on that, the proportions of data used for training and testing were 60% and 40% respectively.
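The 60%/40% partition of the 168 hourly values can be sketched as follows. The text does not state how the split was drawn, so a chronological split (earliest hours for training) is assumed here. The function name and the placeholder series are hypothetical.

```python
def chronological_split(series, train_fraction=0.6):
    """Split a time series into leading training and trailing test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    # Round to the nearest whole observation; order in time is preserved.
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


# Placeholder weekly profile of 168 hourly values (not data from the paper).
hourly_demand = [float(hour % 24) for hour in range(168)]
train, test = chronological_split(hourly_demand)
```

Keeping the observations in time order avoids fitting a model on values that occur after those it is tested on.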
Figure 1a shows a similar behaviour during the early morning (e.g., all curves grow from 6:00 a.m. until 10:00 a.m.). In addition, from 10:00 a.m. to 4:00 p.m., all the curves show decreasing and increasing trends (except on weekends). According to [1], temperature is the main factor that influences multiple sources of water consumption (e.g., showers, garden watering, etc.). Hence, Figure 1b shows the single exogenous variable considered for the ARMAX model.
The results shown in Figures 2–5 were obtained by computing Equations (1)–(6) in MATLAB. Figures 2–4 show the forecasts generated by AR and MA (see Figure 2), ARMA and ARMAX (see Figure 3), as well as FFBP-NN and the hybrid model (see Figure 4). Figure 5a–c shows the comparative assessment of the predictive performance of these models using forecasting statistics, namely the root mean square error (RMSE), the mean absolute percentage error (MAPE), and the Nash–Sutcliffe (NS) coefficient [39]. This assessment was achieved by computing Equations (7)–(9). In addition, the estimate of the relative quality of AR, MA, ARMA, ARMAX, and FFBP-NN is shown in Figure 5d; it was obtained by applying the Akaike information criterion (AIC) [40], which is based on Equation (10). The forecasts presented in this paper were generated using the best model order, as determined by the AIC. Figures 2a,c, 3a,c, and 4a,c were obtained using the training dataset, whereas the test dataset was used to obtain Figures 2b,d, 3b,d, and 4b,d.
Based on the application of AIC [40], a model process of AR(p = 2) was used to generate the forecasts for the training and test datasets shown in Figure 2a,b. A model process of MA(q = 3) was used to obtain the forecasts presented in Figure 2c,d. The forecasts shown in Figure 3a,b were obtained with a model process of ARMA(p = 1, q = 1), and the results of Figure 3c,d with a model process of ARMAX(p = 1, q = 1, b = 1). A model order of three was used to obtain the results shown in Figure 4a,b. The neural network was a feed-forward network with one hidden layer (10 hidden neurons), and its weights were trained using a Levenberg–Marquardt optimisation-based backpropagation algorithm. The training was stopped using validation data (15% of the training dataset). This process was performed 10 times (i.e., 10 cross-validation runs), and the feed-forward neural network with the best predictive accuracy was selected to compensate for neural network training variations. Lastly, the optimal weighting of the hybrid forecast obtained using ARMA and FFBP-NN, as shown in Figure 4c,d, was achieved by linear least squares optimisation.
The RMSE and MAPE were used to evaluate the forecasting accuracy of the predictive models, and NS was used to estimate their forecasting power. The results of Figures 2b,d, 3b,d, and 4b,d show that the hybrid model was the best forecasting model for STWD prediction (i.e., RMSE = 0.82, MAPE = 3.56%, NS = 0.98), followed by ARMAX (i.e., RMSE = 1.03, MAPE = 3.86%, NS = 0.95), ARMA (i.e., RMSE = 1.85, MAPE = 7.63%, NS = 0.91), MA (i.e., RMSE = 2.59, MAPE = 11.42%, NS = 0.81), AR (i.e., RMSE = 2.67, MAPE = 11.59%, NS = 0.8), and FFBP-NN (i.e., RMSE = 2.8, MAPE = 12.31%, NS = 0.78). In addition, the plots of RMSE, MAPE, and NS versus model order are presented in Figure 5a–c. Figure 5d shows that the AIC value for ARMAX is the smallest among AR, MA, ARMA, ARMAX, and FFBP-NN. This implies that the quality of the ARMAX model is estimated to be the best of these. The predictive accuracy of all models decreases as the model order increases; the FFBP-NN model in particular showed a marked decrease in accuracy compared to the other models. Owing to the additional piece of information (i.e., relative temperature) shown in Figure 1b, the results in Figures 3 and 5 show that ARMAX(1,1,1) provided a better forecast than ARMA(1,1). Overall, based on the forecasting statistics considered in this paper, the comparative assessment shows that for STWD forecasting, the hybrid model (combined forecast from ARMA and FFBP-NN) was the best model, followed by ARMAX, ARMA, MA, AR, and FFBP-NN.
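The displayed forms of Equations (7)–(10) are not reproduced in this text. The usual definitions of the four statistics are sketched here. The number of observations n is a symbol introduced for illustration, and the least-squares form of the AIC is an assumption chosen because the text defines it through RSS and k.

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\mathrm{MSE}}
               = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( Y_t - \hat{Y}_t \right)^2} \tag{7} \\
\mathrm{MAPE} &= \frac{100}{n} \sum_{t=1}^{n}
                 \left| \frac{Y_t - \hat{Y}_t}{Y_t} \right| \tag{8} \\
\mathrm{NS}   &= 1 - \frac{\sum_{t=1}^{n} \left( Y_t - \hat{Y}_t \right)^2}
                          {\sum_{t=1}^{n} \left( Y_t - \mu_Y \right)^2} \tag{9} \\
\mathrm{AIC}  &= n \ln\!\left( \frac{\mathrm{RSS}}{n} \right) + 2k \tag{10}
\end{align}
```

Lower RMSE, MAPE, and AIC values and an NS value closer to 1 indicate a better model.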

In Equations (7)–(10), Y_t is the observed value and Ŷ_t is the forecast value at time t, and µ_Y is the mean of the observations. RSS is the residual sum of squares of the fitted model, and k is the number of estimated parameters in the model.
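The four statistics can be computed directly from paired observations and forecasts. The sketch below uses their usual definitions and the least-squares form of the AIC, n ln(RSS/n) + 2k, which is an assumption because the original equations are not shown. The function names and sample numbers are hypothetical, and MAPE is undefined if any observation is zero.

```python
import math


def rmse(observed, forecast):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(observed, forecast)) / n)


def mape(observed, forecast):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    n = len(observed)
    return 100.0 / n * sum(abs((y - f) / y) for y, f in zip(observed, forecast))


def nash_sutcliffe(observed, forecast):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((y - f) ** 2 for y, f in zip(observed, forecast))
    tss = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - rss / tss


def aic_least_squares(observed, forecast, n_params):
    """AIC for a least-squares fit: n * ln(RSS / n) + 2k (assumed form)."""
    n = len(observed)
    rss = sum((y - f) ** 2 for y, f in zip(observed, forecast))
    return n * math.log(rss / n) + 2 * n_params


# Made-up numbers for illustration only (not data from the paper).
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 39.0]

scores = {
    "RMSE": rmse(observed, forecast),
    "MAPE": mape(observed, forecast),
    "NS": nash_sutcliffe(observed, forecast),
    "AIC": aic_least_squares(observed, forecast, n_params=2),
}
```

When candidate model orders are compared on the same data, the order with the smallest AIC is preferred, which is how the text describes the orders being chosen.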

Recommendations of STWD Forecasting Models and Future Work
As regards the selection of accurate and appropriate forecasting models for STWD prediction, this section of the paper presents recommendations and future work based on the forecasts generated by AR, MA, ARMA, ARMAX, FFBP-NN, and hybrid models.
Concerning UTS forecasting models (i.e., AR, MA, and ARMA), the results obtained in Figures 2 and 3a,b show that ARMA is the best predictive model. It is useful for STWD operational forecasts to minimise the operating cost of pumping stations [1,6,15,16,18]. However, a major criticism of UTS predictive models is their failure to account for the effects of changing exogenous variables on future water demand [11,18]. Compared with UTS models, a TSR model (i.e., ARMAX) is preferred since it offers a straightforward framework for quantifying the effects of exogenous variables (e.g., weather data, demographics) [11,19,24–26]. Figure 3d shows that the forecast generated by ARMAX is useful for better prediction of daily water demand and for setting water rates.
It is discussed in the scientific literature that ANN models (i.e., FFBP-NN) are designed to detect complex nonlinear relationships that may be hard to summarise. They are also reported to be useful for a better prediction of peak daily water demand to inform optimal operating policy as well as pumping and maintenance scheduling [1,5,24,26–29,35]. Nonetheless, they require greater computational resources than most STWD forecasting methods [11]. Compared with AR, MA, ARMA, ARMAX, and the hybrid model, the results obtained show that the forecasting performance of FFBP-NN was the poorest [24,25]. However, by combining the forecasts generated by FFBP-NN and ARMA, the best forecasting performance was obtained, as shown in Figure 4d. This suggests that a hybrid forecast generated from ARMAX and FFBP-NN may outperform the combination of ARMA and FFBP-NN. Hybrid forecasting is necessary for operational purposes because it is useful for real-time near-optimal control of WDS [11,33–38].
This study shows that UTS models (i.e., ARMA), TSR models (i.e., ARMAX), and the hybrid model (combined forecast from two or more models such as ARMA and FFBP-NN) may be considered accurate and appropriate models for STWD prediction. However, these models are not applicable in more general decision problem frameworks, since they cannot be used to understand and analyse the overall level of uncertainty in future demand forecasts. Therefore, much more attention needs to be given to probabilistic forecasting methods for STWD prediction, since the best single-valued forecasts obtained by the hybrid model do not guarantee reliable and robust decisions, which can only be obtained via Bayesian decision approaches requiring the estimation of the full predictive density [11,15,41–47]. Furthermore, given that the main objective of WDS management is to guarantee short-term user demand, alternative approaches to predicting a future expected value as described in this paper will be analysed in the future. These approaches [15,42], based on the Bayesian maximisation of an "expected utility function", require forecasting the entire predictive density instead of the sole expected value, and can guarantee more reliable and robust decisions.

Conclusions
The main objective of WDS management is to guarantee short-term user demand, which implies making real-time rational decisions based on the best available information on future user demand. Deterministic forecasts such as the ones described in this paper are insufficient to provide the predictive probability distribution of future demand, conditional upon models' forecasts, which can be regarded as the maximum information to be used in any educated decision-making process.
The selection of an accurate and appropriate STWD forecasting model is useful for the subsequent assessment of such a predictive probability distribution. As a result, this paper overviews the forecasting methods and models for STWD prediction, assesses the forecasting performances of AR, MA, ARMA, ARMAX, FFBP-NN, and the hybrid model from the different methods overviewed, and provides recommendations and future work for the forecasts generated by these predictive models.
Furthermore, the forecasts generated by AR, MA, ARMA, ARMAX, FFBP-NN, and the hybrid model (i.e., combined forecast using ARMA and FFBP-NN) have been compared with each other for a common set of data. AIC is used to estimate the quality of each model, and forecasting statistics, namely RMSE, MAPE, and the NS model efficiency coefficient, are used to assess the predictive performance of these models. The comparative assessment of the forecasting models shows that ARMA, ARMAX, and the hybrid model may be considered the best conditioning candidates for the assessment of the predictive probability distribution of future demands.
In a subsequent paper, we will show how to derive the above-mentioned predictive probability distribution conditional on one or more predictive models as the fundamental tool for estimating expected benefits (or expected losses) to be maximised (or minimised), within a Bayesian decision-making framework.

Figure 1. (a) Daily water demand profile and (b) single exogenous variable "relative temperature".

Figure 4. Forecasts generated using (a,b) FFBP-NN model and (c,d) hybrid model. The hybrid forecast was obtained by the combined forecast from ARMA and FFBP-NN.

Table 1. Brief summary of short-term water demand (STWD) forecasting methods and models.