Hourly Urban Water Demand Forecasting Using the Continuous Deep Belief Echo State Network

Effective and accurate water demand prediction is an important part of the optimal scheduling of a city water supply system. A novel deep architecture model called the continuous deep belief echo state network (CDBESN) is proposed in this study for the prediction of hourly urban water demand. The CDBESN model uses a continuous deep belief network (CDBN) as the feature extraction algorithm and an echo state network (ESN) as the regression algorithm. The new architecture can model actual water demand data with fast convergence and global optimization ability. The prediction capacity of the CDBESN model is tested using historical hourly water demand data obtained from an urban waterworks in Zhuzhou, China. The performance of the proposed model is compared with those of ESN, continuous deep belief neural network, and support vector regression models. The correlation coefficient ($r^2$), normalized root-mean-square error (NRMSE), and mean absolute percentage error (MAPE) are adopted as assessment criteria. Forecasting results obtained in the testing stage indicate that the CDBESN model has the largest $r^2$ value of 0.995912 and the smallest NRMSE and MAPE values of 0.027163 and 2.469419, respectively. The proposed model clearly outperforms the models it is compared with in prediction accuracy, owing to the good feature extraction ability of CDBN and the excellent feature learning ability of ESN.


Introduction
Precise short-term prediction of urban water demand provides guidance for the planning and management of water resources and plays an important role in the economic operation of a water supply system. Therefore, various water demand prediction models, such as support vector regression (SVR) [1,2], random forests regression [3], artificial neural network (ANN) [4], Markov chain model [5], and hybrid models [6-9], have been widely developed in the past few decades. Research regarding water demand prediction generally focuses on ANN-based methods, which are nonparametric data-driven approaches suitable for building nonlinear mappings from input to output variables and for estimating nonlinear continuous functions with arbitrary accuracy [10].
For example, Jain et al. [11] compared ANNs, a time series model, and a regression model for the weekly water demand prediction of the Indian Institute of Technology in Kanpur, India, and found that the ANNs outperformed the two other methods. Adamowski [12] applied an ANN model to forecast peak daily urban water demands and achieved a high prediction accuracy. Bennett et al. [13] used an ANN model to predict the residential water end-use demand and confirmed that the ANN model is a useful predictive tool. Al-Zahrani and Abo-Monasar [14] developed a hybrid approach of a time series model and ANNs to forecast the daily water demand of Al-Khobar City. The hybrid approach provided better prediction than single ANN or time series models. Although the ANN model performs favorably for water demand prediction, it still suffers from some disadvantages, including the difficulty of selecting optimal network parameters, a high propensity for becoming trapped in local minima, and a poor global search capability. These difficulties may lead to specific problems, such as overfitting.

Water 2019, 11, 351
The deep belief network (DBN) model, which was proposed by Hinton et al. [15], is a deep learning algorithm based on a probability generative model. Unlike the ANN model, DBN effectively avoids overfitting problems with a distinctive unsupervised training method. A DBN model has many hidden layers that are constructed by stacking numerous restricted Boltzmann machines (RBMs). A DBN extracts the latent features of the training dataset by using a greedy layer-wise unsupervised learning method. Specifically, layer-by-layer independent training is implemented to pre-train the initial network weights, with each layer acquiring the features of the previous layer; finally, the network returns the features of the training sample. Following independent training, the weights are fine-tuned using a back-propagation (BP) learning algorithm to achieve a powerful nonlinear expressive capacity. In recent years, the DBN model has been successfully applied in many fields, such as natural language understanding [16], image classification [17,18], fault diagnosis [19], financial prediction [20], load prediction [21], and flow prediction [22,23]. Moreover, DBN models have demonstrated remarkable potential for time series prediction [24]. Kuremoto et al. [25] developed a three-layer DBN model to predict time series. Qin et al. [26] developed a combined approach based on DBN and an autoregressive integrated moving average model for red tide time series prediction. Xu et al. [27] constructed a continuous deep belief neural network (CDBNN) to forecast a daily water demand time series. However, DBN or CDBNN models that use BP learning algorithms to adjust parameters converge slowly and easily fall into local optima, thereby resulting in an unsatisfactory prediction accuracy [28,29].
A new recurrent neural network model called the echo state network (ESN), which was proposed by Jaeger et al. [30-32], has a large, sparse, recursively connected reservoir and a linear output. The reservoir serves as an echo for storing historical information. The input and the internal connection weights of the reservoir remain unchanged after the initial setting. Only the output weight must be solved by the linear regression method. Therefore, training the ESN model becomes a task of linear regression. The learning algorithm is simple, the calculation speed is fast, and the solution is unique and globally optimal; moreover, the algorithm shows an excellent performance in nonlinear time series modeling and prediction [33-35]. Sun et al. [29] introduced the ESN algorithm to a DBN and proposed the deep belief echo state network model for time series forecasting. However, the DBN model composed of RBMs can only reconstruct symmetric analog data [36]. Actual water demand data are continuous; therefore, an advanced deep learning architecture is needed for effectively forecasting the urban water demand.
In this study, a hybrid deep architecture continuous deep belief ESN (CDBESN) model, which is composed of continuous DBN (CDBN) and ESN models, is proposed and applied to forecast the hourly urban water demand. In this new architecture, the CDBN model in the bottom layer is used to extract features of the original water demand data, and the ESN model in the top layer is adopted for feature regression. This method can process real continuous data and also avoid the local optimum and slow convergence caused by BP learning algorithms.
The rest of this paper is organized as follows. Section 2 details the methodology of the CDBN, ESN, and CDBESN models. Section 3 presents the study area, data, and the performance evaluation indexes. Section 4 discusses the CDBESN model, forecasting results, and comparisons with other models. Finally, Section 5 presents the conclusions.

Continuous Deep Belief Network
Chen and Murray found that an RBM with binary random units can only reconstruct symmetric analog data, and they developed a continuous RBM (CRBM) [36] whose visible and hidden layer units have continuous states, thereby enabling the CRBM to process real continuous data. Multiple CRBMs are stacked to form a CDBN model, which can deal with continuous data and be used to extract features from original water demand data. The structure of the CDBN model is shown in Figure 1.

[Figure 1. Structure of the CDBN model, showing the input layer, hidden layers, and output layer.]

In Figure 1, a typical CDBN model is built by stacking $l$ CRBMs, which form an input (visible) layer, an output layer, and $l-1$ hidden layers. Here, $W_l$ represents the weight matrix between layers $l$ and $l-1$. A typical CRBM, marked by blue dashed lines in Figure 1, is constructed from a hidden layer $h$ and a visible layer $v$. Symmetric weighted connections exist between the two layers, but no connections are present within a layer.

For a CRBM, $s_j$ and $s_i$ denote the states of the hidden layer unit $j$ and the visible layer unit $i$, respectively; moreover, $w_{ij}$ denotes the weight connecting units $j$ and $i$. A group of samples is randomly chosen as input data, and the update rule of the states $s_j$ of the hidden layer units is given as follows:

$$ s_j = \varphi_j\Big(\sum_i w_{ij} s_i + \sigma N_j(0,1)\Big) \tag{1} $$

with

$$ \varphi_j(x_j) = \theta_{\min} + (\theta_{\max} - \theta_{\min}) \frac{1}{1 + \exp(-a_j x_j)} \tag{2} $$

where $N_j(0,1)$ stands for a Gaussian unit with zero mean and unit variance, $\sigma$ denotes a constant, and $\varphi_j(x_j)$ represents a sigmoid function with asymptotes at $\theta_{\min}$ and $\theta_{\max}$. The noise-control parameter $a_j$ controls the slope of the sigmoid function and thus the nature of the stochastic behavior of the unit [36].

$s_j$ is used to compute the single-step states $s_i'$ of the visible layer units:

$$ s_i' = \varphi_i\Big(\sum_j w_{ij} s_j + \sigma N_i(0,1)\Big) \tag{3} $$

with

$$ \varphi_i(x_i) = \theta_{\min} + (\theta_{\max} - \theta_{\min}) \frac{1}{1 + \exp(-a_i x_i)} \tag{4} $$

where, as before, $N_i(0,1)$, $\varphi_i(x_i)$, and $a_i$ represent a Gaussian unit, a sigmoid function, and a noise-control parameter, respectively.

$s_i'$ is used to compute the states $s_j'$ of the hidden layer units:

$$ s_j' = \varphi_j\Big(\sum_i w_{ij} s_i' + \sigma N_j(0,1)\Big) \tag{5} $$

With the minimizing contrastive divergence algorithm [37] introduced into the CRBM as a simple training rule, the weights $w_{ij}$ and the parameters $a_i$ and $a_j$ are updated as follows:

$$ \Delta w_{ij} = \eta_w \big( \langle s_i s_j \rangle - \langle s_i' s_j' \rangle \big) \tag{6} $$

$$ \Delta a_j = \frac{\eta_a}{a_j^2} \big( \langle s_j^2 \rangle - \langle s_j'^2 \rangle \big) \tag{7} $$

$$ \Delta a_i = \frac{\eta_a}{a_i^2} \big( \langle s_i^2 \rangle - \langle s_i'^2 \rangle \big) \tag{8} $$

where $\eta_w$ and $\eta_a$ stand for the learning rates of the weights and noise-control parameters, respectively; $s_j'$ and $s_i'$ represent the states of a single-step sample of the hidden layer unit $j$ and the visible layer unit $i$, respectively; and $\langle \cdot \rangle$ denotes the average over the training dataset.
The next training process is carried out after the change of the weight matrix is minimal or the preset maximum training time is reached. Such conditions indicate that the current CRBM training is completed, and its outputs are used as the inputs of the following CRBM. The aforementioned training process is repeated until all CRBMs of the CDBN model are trained completely, at which point the training of the CDBN model ends.
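The CRBM update rules described above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the class name `CRBM`, the default values of `sigma`, `eta_w`, and `eta_a`, the weight initialization, the absence of bias units, and the lower bound placed on the noise-control parameters are all assumptions made for illustration. The inputs are assumed to be scaled into the interval between the two sigmoid asymptotes.

```python
import numpy as np


def phi(x, a, theta_min, theta_max):
    """Sigmoid with asymptotes theta_min and theta_max; a sets the slope."""
    return theta_min + (theta_max - theta_min) / (1.0 + np.exp(-a * x))


class CRBM:
    """Minimal continuous RBM trained with one-step contrastive divergence."""

    def __init__(self, n_visible, n_hidden, sigma=0.2, theta_min=0.0,
                 theta_max=1.0, eta_w=0.05, eta_a=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        # Assumed initialization: small random weights, unit slopes.
        self.W = self.rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
        self.a_v = np.ones(n_visible)   # noise-control parameters a_i
        self.a_h = np.ones(n_hidden)    # noise-control parameters a_j
        self.sigma = sigma
        self.theta = (theta_min, theta_max)
        self.eta_w, self.eta_a = eta_w, eta_a

    def hidden(self, v):
        """Noisy hidden states for a batch v of shape (n_samples, n_visible)."""
        noise = self.sigma * self.rng.standard_normal((len(v), self.W.shape[1]))
        return phi(v @ self.W + noise, self.a_h, *self.theta)

    def visible(self, h):
        """Noisy reconstructed visible states for a batch of hidden states."""
        noise = self.sigma * self.rng.standard_normal((len(h), self.W.shape[0]))
        return phi(h @ self.W.T + noise, self.a_v, *self.theta)

    def train_step(self, v):
        """One update of W, a_j, and a_i; returns the largest weight change."""
        h = self.hidden(v)       # s_j
        v1 = self.visible(h)     # s_i' (single-step reconstruction)
        h1 = self.hidden(v1)     # s_j'
        n = len(v)
        dW = self.eta_w * (v.T @ h - v1.T @ h1) / n
        da_h = self.eta_a / self.a_h**2 * ((h**2).mean(0) - (h1**2).mean(0))
        da_v = self.eta_a / self.a_v**2 * ((v**2).mean(0) - (v1**2).mean(0))
        self.W += dW
        # Assumption: keep slopes positive so the 1/a^2 factor stays finite.
        self.a_h = np.maximum(self.a_h + da_h, 0.1)
        self.a_v = np.maximum(self.a_v + da_v, 0.1)
        return float(np.abs(dW).max())
```

In this sketch, a CDBN would be obtained by training one `CRBM` until `train_step` returns a small value (or a maximum number of steps is reached) and then feeding its `hidden(...)` outputs to the next `CRBM` as inputs.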

Echo State Network
The ESN model is a novel large-scale recurrent neural network, whose core is a reservoir layer consisting of numerous randomly generated and sparsely connected neurons [29]. The structure of the ESN model is shown in Figure 2. In the figure, the ESN model consists of an input layer, an output layer, and a reservoir layer. Here, $W_{in}$ represents the weight matrix of the input layer, $W$ is the internal weight matrix of the reservoir, $W_o$ is the weight matrix of the output layer, and $W_b$ is the feedback weight matrix. The values of $W_{in}$, $W$, and $W_b$ are randomly produced during the initialization process and are not changed after generation. Only the value of $W_o$ must be adjusted during the training process of the reservoir.


[Figure 2. Structure of the ESN model, showing the input layer, reservoir, and output layer.]

$M$, $N$, and $L$ denote the numbers of the input, reservoir, and output units, respectively. The input vector $u(t)$, reservoir state vector $z(t)$, and output vector $y(t)$ can be expressed as follows:

$$ u(t) = [u_1(t), u_2(t), \ldots, u_M(t)]^T $$

$$ z(t) = [z_1(t), z_2(t), \ldots, z_N(t)]^T $$

$$ y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T $$

The reservoir state $z(t+1)$ and network output $y(t+1)$ at time $t+1$ are updated in accordance with the following rules:

$$ z(t+1) = f_1\big(W_{in} u(t+1) + W z(t) + W_b y(t)\big) $$

$$ y(t+1) = f_2\big(W_o z(t+1)\big) $$

where $f_1(\cdot)$ and $f_2(\cdot)$ are the activation functions of the reservoir and output, respectively. In this study, $f_1(\cdot)$ is selected as the hyperbolic tangent, and $f_2(\cdot)$ is the identity function.

To eliminate the influence of the random initial state of the reservoir, a small number of the initial reservoir states are discarded. The remaining reservoir states are collected column-wise into a matrix $Z$, and the corresponding desired outputs are collected into a target output matrix $Y$. Then, the output weight matrix $W_o$ is computed using a linear regression approach by minimizing the error between the network output and the desired output $Y$:

$$ \min_{W_o} \left\| W_o Z - Y \right\| $$

where $\|\cdot\|$ stands for the Euclidean norm. The weight matrix $W_o$ of the output layer can typically be computed using the Moore-Penrose pseudoinverse:

$$ W_o = Y Z^{\dagger} $$

where $Z^{\dagger} = Z^T (Z Z^T)^{-1}$. At this point, ESN training is completed, and the model can be used for specific problems, such as time series modeling.
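As an illustration of why ESN training reduces to linear regression, the following NumPy sketch builds a fixed random reservoir, discards the initial (washout) states, and solves for the output weights with a pseudoinverse. It is a simplified sketch under stated assumptions, not the configuration used in the study: the feedback weights are set to zero, reservoir states are stored as the columns of `Z`, and the function names, weight ranges, and `density` value are hypothetical.

```python
import numpy as np


def init_reservoir(n_in, n_res, spectral_radius=0.9, density=0.1, seed=0):
    """Random fixed input and reservoir weights (assumed uniform in [-0.5, 0.5])."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    W *= rng.random((n_res, n_res)) < density   # keep roughly `density` of the links
    # Rescale so that the largest absolute eigenvalue equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W


def collect_states(U, W_in, W, washout):
    """Drive the reservoir with inputs U of shape (T, n_in); drop the washout."""
    z = np.zeros(W.shape[0])
    states = []
    for u in U:
        z = np.tanh(W_in @ u + W @ z)   # f1 = tanh; feedback term omitted here
        states.append(z)
    return np.array(states[washout:]).T   # shape (n_res, T - washout)


def train_readout(Z, Y):
    """Least-squares output weights: W_o = Y Z^+ (identity output activation)."""
    return Y @ np.linalg.pinv(Z)
```

A one-step-ahead forecast is then `W_o @ z` for the current reservoir state `z`. Because only `train_readout` involves fitting, no iterative gradient descent is required.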

CDBESN Model
The CDBN and ESN models are integrated to construct a new deep architecture CDBESN model for the prediction of hourly urban water demand. The structure of the CDBESN model is shown in Figure 3. The model consists of a CDBN with $l$ CRBMs in the bottom layer and an ESN in the top layer. Accordingly, the learning process of the CDBESN model includes two stages: feature extraction and regression.

[Figure 3. Structure of the CDBESN model, showing the continuous deep belief network with its input, hidden, and output layers.]

In the first stage, a CDBN model is trained in a greedy layer-by-layer unsupervised learning approach and applied to learn the latent nonlinear features of the original hourly urban water demand data. The output states of the last CRBM in the CDBN model are the most representative features learned from the hourly water demand data. In the second stage, these features learned by the CDBN model are used as the input of the ESN model for regression. Finally, the trained CDBESN model can be applied for the prediction of hourly water demand.

Study Area and Data Collection
In this study, 7800 hourly water demand records were collected from an urban waterworks of Zhuzhou, China from 1 January 2016 to 21 November 2016. The waterworks has a capacity of 15,000 m³/h and supplies water to about 600,000 urban residents and factories in a region with an area of about 500 km². The original hourly water demand data were divided into two parts: 84% of the data (the first 6552 hourly records, from 1 January 2016 to 30 September 2016) were used to train the CDBESN model, and the remaining 16% were used as the testing dataset. Figure 4 shows the original hourly water demand records obtained from the urban waterworks.

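The chronological split described above, together with the construction of lagged input vectors for single-step prediction, can be sketched as follows. The function names `chronological_split` and `make_windows` are hypothetical, and the exact windowing used in the study is not stated in the text; the sketch assumes that each input vector holds the `n_lags` most recent hourly values and that the target is the next hourly value.

```python
import numpy as np


def chronological_split(series, n_train):
    """Split a time series into a leading training part and a trailing test part."""
    series = np.asarray(series, dtype=float)
    return series[:n_train], series[n_train:]


def make_windows(series, n_lags):
    """Build (X, y) pairs: X[k] holds n_lags consecutive values, y[k] the next one."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[k:k + n_lags] for k in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y
```

With 7800 records and `n_train = 6552`, the test part contains 1248 records, which is 16% of the data.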

Performance Indices
In the experiments, the correlation coefficient ($r^2$), normalized root-mean-square error (NRMSE), and mean absolute percentage error (MAPE) were employed to measure the prediction accuracy of the hourly urban water demand forecasting model. The respective equations were defined as follows:

$$ r^2 = \frac{\Big[\sum_{t=1}^{n} \big(y(t) - \bar{y}\big)\big(\hat{y}(t) - \bar{\hat{y}}\big)\Big]^2}{\sum_{t=1}^{n} \big(y(t) - \bar{y}\big)^2 \, \sum_{t=1}^{n} \big(\hat{y}(t) - \bar{\hat{y}}\big)^2} $$

[Equation for NRMSE: not recoverable from the source text.]

$$ \mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y(t) - \hat{y}(t)}{y(t)} \right| $$

where $y(t)$ and $\hat{y}(t)$ are the actual data and prediction data, respectively; $\bar{y}$ and $\bar{\hat{y}}$ are the means of the actual data and prediction data, respectively; and $n$ is the number of prediction data. $r^2$ describes the linearity between the actual data and prediction data, NRMSE signifies the total accuracy of the prediction, and MAPE represents an unbiased estimator for assessing the predictive ability of a model. A large $r^2$ and small NRMSE and MAPE values indicate that the model has superior predictive capability.
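The three indices can be computed with a few lines of NumPy, as in the sketch below. The function names are hypothetical. The squared Pearson correlation and the percentage form of MAPE follow the verbal definitions given above; the normalization used for NRMSE (root of the squared-error sum divided by the sum of squared actual values) is an assumption, because the text does not state which normalization the study used.

```python
import numpy as np


def r_squared(y, y_hat):
    """Squared Pearson correlation between actual and predicted values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    dy, dp = y - y.mean(), y_hat - y_hat.mean()
    return float((dy @ dp) ** 2 / ((dy @ dy) * (dp @ dp)))


def nrmse(y, y_hat):
    """Normalized RMSE. ASSUMPTION: normalized by the sum of squared actual values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.sum((y - y_hat) ** 2) / np.sum(y ** 2)))


def mape(y, y_hat):
    """Mean absolute percentage error, in percent (requires nonzero actual values)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))
```

Note that `r_squared` measures only linear agreement: a prediction that is a scaled and shifted copy of the actual data still scores 1, which is why the error-based indices are reported alongside it.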

CDBESN Modeling
The modeling process of CDBESN selects the optimal parameters of the CDBN and ESN models. The numbers of input layer units, hidden layers, and hidden layer units are the major parameters in the CDBN architecture. Currently, no mature theory guides the selection of the numbers of input layer units, hidden layers, and hidden layer units. Thus, an experiment was conducted to determine the three parameters in this study. The number of input layer units ranged from 3 to 10, which corresponds to the number of actual historical data related to the prediction data. The number of hidden layer units was set to 5, 10, 15, 20, or 25, and the number of hidden layers was varied from 1 to 3. The update approach and initial values of $w_{ij}$ in Equation (1) need to be set. A set of random initial values of $w_{ij}$ was used in the first CRBM, and the weight matrix was adjusted repeatedly until it stabilized. Then, the next CRBM's weight matrix was initialized by using the previously trained CRBM's weight matrix, and layer-wise training was performed until all CRBMs were trained completely. Fixed values of the parameters $\theta_{\min}$ and $\theta_{\max}$ in Equations (2) and (4) were adopted and set to the minimum and maximum values of the original hourly water demand data, respectively. The constant $\sigma$ in Equations (3) and (5), the learning rates $\eta_w$ in Equation (6) and $\eta_a$ in Equations (7) and (8), and the noise-control parameters $a_j$ and $a_i$ in Equations (7) and (8), respectively, were also considered and were determined by using a fivefold cross-validation strategy.
The output states of the last CRBM in the CDBN model were used as the input states of the ESN model. Single-step prediction was utilized, and the model was set with one output unit. The weight matrices $W_{in}$, $W$, and $W_b$ were randomly initialized and remained constant until the ESN training was complete. The relevant optimal parameters of the reservoir were determined by the grid search method and fivefold cross-validation method.
The three evaluation criteria ($r^2$, NRMSE, and MAPE) were used to assess the learning performance of the CDBESN model with different parameters and to select the parameters with the best learning performance. According to the method described above, the optimal architecture of the CDBN is 10-5-10; that is, 10 input layer units, 5 units in the first hidden layer, and 10 units in the second hidden layer. The optimal parameters of the ESN are the reservoir units $N = 1000$, the spectral radius $\lambda = 0.9$, and the leaking rate $\alpha = 0.3$. The three performance indexes of the CDBESN model for the hourly water demand prediction in the training stage are $r^2 = 0.995753$, NRMSE = 0.027649, and MAPE = 2.354166.
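To give a sense of the size of the architecture search described above, the sketch below enumerates candidate CDBN architectures from the stated ranges. It assumes that the number of units may differ from one hidden layer to the next (as the selected 10-5-10 architecture suggests); the constant and function names are hypothetical, and the text does not say whether every combination was actually evaluated.

```python
from itertools import product

INPUT_UNITS = range(3, 11)            # 3 to 10 input layer units
HIDDEN_UNITS = (5, 10, 15, 20, 25)    # candidate sizes for each hidden layer
HIDDEN_LAYERS = range(1, 4)           # 1 to 3 hidden layers


def candidate_architectures():
    """Yield tuples (n_input, h_1, ..., h_k) for every combination in the grid."""
    for n_in in INPUT_UNITS:
        for depth in HIDDEN_LAYERS:
            for hidden in product(HIDDEN_UNITS, repeat=depth):
                yield (n_in, *hidden)
```

Under these assumptions the grid contains 8 × (5 + 25 + 125) = 1240 architectures, and the reported optimum corresponds to the tuple `(10, 5, 10)`. Each candidate would then be scored by cross-validation on the training data.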

Prediction and Results
Figure 5 depicts the prediction results of the hourly water demand data by the proposed CDBESN model in the training stage. As shown in Figure 5a, the prediction data accurately follow the changes of the actual hourly water demand data. Figure 5b plots the correlations between the prediction data and the actual data for the training data. Evidently, the prediction data are in good agreement with the actual data. Figure 6 presents the forecasting results of the proposed CDBESN in the testing stage. Figure 6a shows that the periodicity and trends of the prediction and actual hourly water demand data are successfully matched. As displayed in Figure 6b, the correlations between the prediction data and the actual data in the testing stage show good agreement. This match further confirms that the CDBESN model has a satisfactory feature extraction ability and prediction performance. The three performance indexes of the CDBESN for the hourly water demand prediction in the testing stage are $r^2 = 0.995912$, NRMSE = 0.027163, and MAPE = 2.469419.

Comparison Experiment
The predictive ability of the CDBESN model was further evaluated by comparisons with the corresponding performance of the ESN, CDBNN, and SVR models using the same dataset. The ESN model is introduced in Section 2.2, and the numbers of input, reservoir, and output units, as well as the values of the spectral radius, were set similarly to those of the ESN in the CDBESN model. The CDBNN model [27], which consists of CDBN and BP neural networks, uses the same modeling method as the CDBN in the CDBESN model to select the numbers of units and hidden layers. The sigmoid activation function is applied to all hidden layers, and the linear transfer function to the output layer. The BP algorithm is used to adjust the parameters. Finally, the structure of the CDBNN is set to 8-15-10-1. The SVR model is widely applied for water demand forecasting [1]. The insensitive loss function and kernel function are selected by using the particle swarm optimization algorithm, and the inputs utilized are similar to those in the CDBESN model.
Figure 7 plots the forecasting results of the hourly water demand with the ESN, CDBNN, and SVR models in the testing stage. Figure 7a,c,e show the prediction data and actual data of the hourly water demand using the ESN, CDBNN, and SVR models, respectively. Figure 7b,d,f present the scatter plots of the prediction data and actual data with the ESN, CDBNN, and SVR models, respectively. Notably, the ESN, CDBNN, and SVR models follow the trends of the actual hourly water demand data. However, the values of $r^2$ shown in the figure reveal that the CDBESN model slightly outperforms the comparison models in predicting the hourly water demand during the testing stage.
The performance evaluation indexes $r^2$, NRMSE, and MAPE are employed to estimate the forecasting performances of the ESN, CDBNN, and SVR models by using the same testing dataset, as shown in Table 1. The CDBESN model has the best predictive performance, having the largest $r^2$ value and the smallest NRMSE and MAPE values among all models. Compared with the ESN, CDBNN, and SVR models, the proposed CDBESN model shows increases in $r^2$ of approximately 0.27%, 0.53%, and 1.12%; reductions in NRMSE of 21.91%, 33.28%, and 55.05%; and reductions in MAPE of 25.18%, 36.20%, and 56.55%. The CDBESN approach also has a higher prediction accuracy than the other comparison models in predicting the hourly water demand during the testing stage, partly because of the excellent feature extraction capabilities of the CDBN model and the good regression performance of the ESN model in the new deep learning architecture.

Figure 4. Original hourly water demand data from 1 January 2016 to 21 November 2016.

Figure 5. Forecasting results and scatter plots by the CDBESN model for the training data: (a) shows the forecasting results, and (b) shows the scatter plots.

Figure 6. Forecasting results and scatter plots by the CDBESN model for the testing data: (a) shows the forecasting results, and (b) shows the scatter plots.

Figure 7. Prediction results and scatter plots of the hourly water demand in the testing stage using different models: (a) and (b) for ESN, (c) and (d) for CDBNN, and (e) and (f) for SVR.

Conclusions
In this study, a new CDBESN model is proposed for the prediction of the original hourly urban water demand. The model is constructed by integrating a CDBN-based feature extraction model and an ESN-based regression model. The CDBN model is a stack of multiple CRBMs with continuous state values, which can deal with actual hourly water demand data. The ESN model replaces the BP algorithm of the traditional CDBN model for regression, and can thus effectively overcome the local optimum and slow convergence of the classical BP learning algorithm. The original hourly water demand records obtained from an urban waterworks in Zhuzhou, China are used to evaluate the proposed CDBESN model. The forecasting performance of the CDBESN model is compared with those of the ESN, CDBNN, and SVR models. Three performance evaluation indexes, namely, $r^2$, NRMSE, and MAPE, are used to estimate the forecasting performances of these models. The empirical results show that the proposed CDBESN model predicts the hourly urban water demand of the urban waterworks in Zhuzhou, China more accurately than the other models. The excellent performance of the proposed CDBESN model is due to the powerful feature extraction capacity of the CDBN model and the good feature regression ability of the ESN model.

Table 1. Forecasting results of the CDBESN, ESN, CDBNN, and SVR models in the testing stage.