Hybridised Artificial Neural Network Model with Slime Mould Algorithm: A Novel Methodology for Prediction of Urban Stochastic Water Demand

Abstract: Urban water demand prediction based on climate change is always challenging for water utilities because of the uncertainty that results from a sudden rise in water demand due to stochastic patterns of climatic factors. For this purpose, a novel combined methodology is proposed. First, data pre-processing techniques were employed to decompose the time series of water and climatic factors by using empirical mode decomposition and to identify the best model input via tolerance to avoid multi-collinearity. Second, the artificial neural network (ANN) model was optimised by an up-to-date slime mould algorithm (SMA-ANN) to predict the medium-term stochastic signal of monthly urban water demand. Ten climatic factors over 16 years were used to simulate the stochastic signal of water demand. The results reveal that SMA outperforms a multi-verse optimiser and a backtracking search algorithm based on error scale. The performance of the hybrid model SMA-ANN is better than that of the stand-alone ANN based on a range of statistical criteria. Generally, this methodology yields accurate results, with a coefficient of determination of 0.9 and a mean absolute relative error of 0.001. This study can assist local water managers to efficiently manage the present water system and plan extensions to accommodate the increasing water demand.


Introduction
Security of municipal water is fundamental to gain a sustainable environment in modern cities, especially under the impact of global warming and socio-economic variables. Additionally, most cities are located close to freshwater sources to ensure the prosperity of both industry and agriculture. For the programmes [5]. Altunkaynak and Nigussie [9] stated that the artificial neural network is a preferred option to simulate water demand because it can deal with non-linear time series.
Different metaheuristic optimisation algorithms could be applied to tackle a range of problems in various application domains. The main advantages of optimisation algorithms are their ability to select the optimal values of system parameters under different conditions and their time-saving qualities. Recently, the multi-verse optimiser (MVO), proposed by Mirjalili et al. [23] to solve various optimisation problems, has been used, for example, for energy management in smart cities [24] and multi-level image segmentation [25]. Additionally, the backtracking search algorithm (BSA) has been utilised to tackle several optimisation issues, such as predicting urban water demand from previous water consumption data [26], photovoltaic models [27] and power signals [28]. Moreover, the slime mould algorithm (SMA), proposed by Li et al. [29], has been used in several optimisation problems, such as the spring design problem [30], photovoltaic models [31] and image segmentation [32], but has not been investigated in the urban water sector.
In addition, the literature has emphasised the importance of using data pre-processing to improve the quality of time series and to determine the best independent variables. More attention has recently focused on data cleaning. Therefore, several signal pre-treatment techniques have been employed to clean and/or detect the trend, seasonal and stochastic components of water consumption time series, such as singular spectrum analysis (SSA) [33,34], wavelet transform (WT) [20,35], variational mode decomposition (VMD) [36] and empirical mode decomposition (EMD) [37,38]. Another significant aspect of data pre-processing is selecting the best independent variables, using methods such as principal component analysis (PCA) [39,40], mutual information (MI) [41,42] and variance inflation factor (VIF) [22,43].
Despite the fact that different techniques and approaches were used to forecast the future water demand, water companies still face challenges in estimating the accurate water demand, especially with the influence of climatic factors and their implications for future water demand. Therefore, additional research studies are required to accurately estimate the growing water demand [8].
In this context, the main contributions of the current research are:
1. The employment of 10 climatic factors over 16 years to assess the impact of climate change on urban water demand.
2. Development and analysis of a new hybrid algorithm, SMA-ANN, for the water demand optimisation problem, and choosing the optimal hyperparameters of the ANN approach.
3. The application of two hybrid algorithms, MVO-ANN and BSA-ANN, for analysing and validating the proposed SMA-ANN algorithm.
4. Using the novel methodology, which contains data pre-processing techniques (EMD and tolerance) and the hybrid SMA-ANN algorithm, to simulate the monthly stochastic pattern of water demand based on the best scenario of climatic factors over 16 years.
5. Minimising the uncertainty by applying three metaheuristic algorithms for more validation, and using the ANN (stand-alone) to confirm the results of the SMA-ANN model. Additionally, employing 10 climatic factors that give scientific insight (i.e., to what extent climate change has driven water demand) for policymakers to achieve sustainability.
To the best of the authors' knowledge, the present study explores a novel methodology for the first time: the effects of climate change on the monthly stochastic pattern of urban water demand. The structure of the research is organised as follows: case study and data used are presented in Section 2; the proposed methodology for predicting monthly stochastic water demand is described in Section 3; Section 4 provides the analysis and compares the obtained results; and finally, Section 5 presents the final conclusions with some considerations of the study.

Case Study and Data Used
The suggested methodology was applied to the observed water consumption and climatic factor data relating to the South East Water (SEW) utility, which is one of three retail water utilities that purchase water wholesale from the Melbourne Water company in Melbourne, Australia. The sources of freshwater are 11 large storage facilities, which are refilled regularly by stormwater harvesting [44]. The urban water system network of SEW utility serves more than 1.7 million individuals in a 3640 km² area, and the company has approximately 72,700 customers, categorised into residential, industrial and commercial [45].
The collected data comprise monthly urban water consumption (megalitre, ML), maximum temperature (Tmax) (°C), minimum temperature (Tmin) (°C), mean temperature (Tmean) (°C), solar radiation (Srad) (MJ/m²), potential evapotranspiration (FA-O56) (mm), vapour pressure (VP) (hPa), rainfall (Rain) (mm), evaporation (Eva) (mm), maximum relative humidity (RHmax) (%) and minimum relative humidity (RHmin) (%) from 2000 to 2015. Figure 1 shows the time series and box plot of monthly water consumption for SEW utility. The figure reveals the decrease in water consumption due to drought, and water-conserving policies and initiatives. After that, the consumption increased, possibly because restrictions were eased after the impact of the drought lessened. It may also be due to the strategies that Melbourne Corporation pursued by upgrading the dams and relying on other resources, such as water desalination and water recycling [44].

Proposed Methodology
There is a relatively small body of literature that is concerned with the impact of climate change only on the municipal water demand. Accordingly, this paper proposes a novel combined methodology for investigating the impact of climate change on water demand. It could be divided into five main categories: data pre-processing, slime mould algorithm (SMA), artificial neural network (ANN), hybrid metaheuristic algorithm-based artificial neural network and model evaluation. Figure 2 shows the structure of the proposed methodology to predict monthly stochastic data of water demand based on climatic factors.


Data Pre-Processing
Recent developments in urban water predictive methodologies have highlighted the need to apply different data pre-processing techniques, which can be classified into normalisation, cleaning and selection of the best model input [5,46]. In accordance with Tabachnick and Fidell [47], the natural logarithm was used to normalise all raw time series of water and climatic factors to reduce both the impact of outliers and the multi-collinearity between independent factors.
Zubaidi et al. [8] mention that the relation between water demand time series and climatic time series is stochastic. Stochastic models will also offer a better reflection of reality and insight into the system's dynamics [48]. Hence, an empirical mode decomposition (EMD) approach was applied to decompose the original time series of the dependent and independent variables into trend, seasonal, stochastic and noise components, and then to isolate the stochastic component. EMD is used in the analysis of various problems, such as machinery fault diagnosis [49] and biomedical signal analysis [50].

EMD is also used for analysing geodetic data [51]. In that study, the authors used EMD to analyse the natural variability of sea level and its effect, among other factors, on the sea level trend. Recently, Chu and Huang [52] utilised EMD to synthesise and generate flow data and to increase the number of flow time series for the same time period, for use in the simulation of a water supply system. This technique decomposes a time series into a number of time-domain components called intrinsic mode functions (IMFs). The latter must have two properties:

• The maximum difference between the number of local maxima and minima is one.
• The mean value of an IMF is zero.
For a time series $x(t)$, the extraction of IMFs could be described briefly in the following steps [51]:
1. Set $h_{i,k-1}(t) = x(t)$, where $i$ and $k$ refer to the IMF number and the iteration number for finding the accurate $i$th IMF, respectively.
2. Identify all the maxima and minima points of the series $h_{i,k-1}(t)$.
3. Connect the maxima points by cubic spline interpolation and do the same for the minima points. The linked maxima points are called the upper envelope, $U_{i,k-1}(t)$, while the linked minima points are called the lower envelope, $L_{i,k-1}(t)$.
4. The mean of the upper and lower envelopes is found using this formula:
$$m_{i,k-1}(t) = \frac{U_{i,k-1}(t) + L_{i,k-1}(t)}{2}$$
5. Form the following formula:
$$h_{i,k}(t) = h_{i,k-1}(t) - m_{i,k-1}(t)$$
The component $h_{i,k}(t)$ is provisionally taken as the first IMF. To determine the first IMF accurately, $h_{i,k}(t)$ is considered as a new signal, and its upper envelope, lower envelope and their mean (i.e., $U_{i,k}(t)$, $L_{i,k}(t)$ and $m_{i,k}(t)$) are calculated. The new component is checked to see whether it has the IMF properties or not. If it does, then it is identified as an IMF. If not, the process is repeated until the IMF properties are obtained. The number of repetitions needed to identify an IMF is the number of iterations, denoted by $k$, while the IMF number is denoted by $i$.
6. When the $i$th IMF is obtained, the residue is obtained:
$$res_i(t) = h_{i,0}(t) - h_{i,k}(t)$$
where $h_{i,0}(t)$ is the signal from which the $i$th IMF was extracted (with $h_{1,0}(t) = x(t)$). The residue $res_i$ is now treated as the signal $h_{i+1,k-1}$ and steps 2-6 are repeated until no more IMFs can be extracted.
The EMD process above was applied to all dependent and independent variables in this study.
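The sifting steps above can be illustrated with a minimal Python sketch. It is not the implementation used in this study. As simplifying assumptions, the envelopes are built by linear rather than cubic spline interpolation, each envelope is held constant beyond its first and last extremum, and a fixed number of sifting iterations (the hypothetical parameter `sift_iters`) replaces the check of the IMF properties. All function names and the synthetic test signal are illustrative.

```python
import math

def local_extrema(h):
    """Indices of the interior local maxima and minima of the sequence h (step 2)."""
    maxima = [i for i in range(1, len(h) - 1) if h[i - 1] < h[i] >= h[i + 1]]
    minima = [i for i in range(1, len(h) - 1) if h[i - 1] > h[i] <= h[i + 1]]
    return maxima, minima

def envelope(idx, h):
    """Envelope through the points (i, h[i]) for i in idx (step 3).
    Assumption: piecewise-linear interpolation stands in for the cubic spline."""
    out, j = [], 0
    for t in range(len(h)):
        if t <= idx[0]:
            out.append(h[idx[0]])
        elif t >= idx[-1]:
            out.append(h[idx[-1]])
        else:
            while idx[j + 1] < t:
                j += 1
            t0, t1 = idx[j], idx[j + 1]
            w = (t - t0) / (t1 - t0)
            out.append((1 - w) * h[t0] + w * h[t1])
    return out

def sift_once(h):
    """Steps 2-5: subtract the mean of the upper and lower envelopes from h."""
    maxima, minima = local_extrema(h)
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema: no further IMF can be extracted
    upper, lower = envelope(maxima, h), envelope(minima, h)
    return [x - 0.5 * (u + l) for x, u, l in zip(h, upper, lower)]

def emd(x, max_imfs=8, sift_iters=10):
    """Return (imfs, residue); the residue is updated as in step 6."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        h = sift_once(residue)
        if h is None:
            break
        for _ in range(sift_iters - 1):
            nxt = sift_once(h)
            if nxt is None:
                break
            h = nxt
        imfs.append(h)
        residue = [r - c for r, c in zip(residue, h)]
    return imfs, residue

# Synthetic monthly-like signal (illustrative only): two oscillations plus a trend.
signal = [math.sin(2 * math.pi * t / 12) + 0.5 * math.sin(2 * math.pi * t / 3) + 0.01 * t
          for t in range(192)]
imfs, residue = emd(signal)
```

By construction, the extracted IMFs and the final residue sum back to the original signal, which gives a simple consistency check for any EMD implementation.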
Regarding the selection of the best model input, Pallant [52] recommended using a tolerance method to choose the independent variables that have a tolerance value of more than 0.1, because values less than 0.1 indicate the presence of multi-collinearity.
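The tolerance of a predictor is commonly defined as $1 - R_j^2$, where $R_j^2$ is the coefficient of determination obtained by regressing that predictor on all the other predictors (it is the reciprocal of the VIF). A minimal sketch of this calculation follows, using ordinary least squares via the normal equations. The function names and the toy data are illustrative assumptions and are not taken from this study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def tolerance(columns, j):
    """Tolerance of predictor j: 1 - R^2 from regressing it on the other predictors."""
    y = columns[j]
    others = [c for i, c in enumerate(columns) if i != j]
    n = len(y)
    # Design matrix rows: an intercept followed by the other predictors.
    rows = [[1.0] + [c[t] for c in others] for t in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y[t] for t, r in enumerate(rows)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return ss_res / ss_tot  # equals 1 - R^2

# Toy predictors (illustrative): x3 is almost a copy of x1, x2 is unrelated to x1.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [2.0, -1.0, -2.0, -1.0, 2.0]
x3 = [-2.1, -0.9, 0.0, 1.1, 1.9]
tol_x1 = tolerance([x1, x2, x3], 0)  # below 0.1, so x1 would be flagged as collinear
```

In this toy case, dropping `x3` restores the tolerance of `x1` to 1, which mirrors the one-by-one removal of collinear factors.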

Slime Mould Algorithm (SMA)
The SMA is one of the recent nature-inspired algorithms. It is a mathematical model that simulates the propagation wave of slime mould when forming the optimal path for connecting food sources. This model adaptively simulates the process of producing negative and positive feedback during the propagation wave. The algorithm has been incorporated into different optimisation problems, including engineering ones. The two main stages in the SMA algorithm are called approaching food and wrap food.

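a. Approaching food
In this stage, the slime mould approaches food based on its odour in the air, and this behaviour is mathematically described as follows [29]:
$$\vec{X}(t+1) = \begin{cases} \vec{X_b}(t) + \vec{vb} \cdot \left(\vec{W} \cdot \vec{X_A}(t) - \vec{X_B}(t)\right), & r < p \\ \vec{vc} \cdot \vec{X}(t), & r \geq p \end{cases}$$
where $\vec{vb}$ is a parameter which ranges from $-a$ to $a$. $\vec{vc}$ represents a parameter which decreases from one to zero in a linear form. $\vec{X_b}$ represents the individual location currently found with the highest odour concentration. $t$ is the current iteration. $\vec{X}$ is the location of the slime mould. $\vec{X_A}$ and $\vec{X_B}$ are randomly selected individuals from the mould. $\vec{W}$ is the weight of the slime mould. The formula of $p$ can be represented as follows:
$$p = \tanh\left|S(i) - DF\right|$$
where $S(i)$ represents the fitness of $\vec{X}$ and $DF$ represents the best fitness over all the iterations.
As mentioned above, $\vec{vb}$ ranges from $-a$ to $a$, and $a$ can be described as follows:
$$a = \operatorname{arctanh}\left(-\frac{t}{max\_t} + 1\right)$$
where $max\_t$ is the maximum number of iterations. The $\vec{W}$ formula can be described as follows:
$$\vec{W}(SmellIndex(i)) = \begin{cases} 1 + r \cdot \log\left(\dfrac{bF - S(i)}{bF - wF} + 1\right), & \text{condition} \\ 1 - r \cdot \log\left(\dfrac{bF - S(i)}{bF - wF} + 1\right), & \text{others} \end{cases}$$
$$SmellIndex = \operatorname{sort}(S)$$
where the condition indicates that $S(i)$ ranks in the first half of the population. $r$ denotes the random value within the interval [0, 1]. $bF$ represents the optimal fitness obtained in the current iterative process. $wF$ represents the worst fitness value obtained in the current iterative process. $SmellIndex$ refers to the sequence of sorted fitness values.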

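b. Wrap food
In this stage, the behaviour of the slime mould in contracting its venous structure is mathematically described as follows [29]:
$$\vec{X^*} = \begin{cases} rand \cdot (UB - LB) + LB, & rand < z \\ \vec{X_b}(t) + \vec{vb} \cdot \left(\vec{W} \cdot \vec{X_A}(t) - \vec{X_B}(t)\right), & r < p \\ \vec{vc} \cdot \vec{X}(t), & r \geq p \end{cases}$$
where $LB$ and $UB$ are the lower and upper boundaries of the search range, $rand$ and $r$ are random parameters ranging from 0 to 1, and $z$ is the probability of re-initialising an individual at a random location. Further details of the SMA can be found in Li et al. [29]. In this study, the SMA algorithm is combined with the ANN model to determine the optimum parameters of the ANN model (see Section 3.4).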
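To make the two stages concrete, the following is a minimal Python sketch of a basic SMA loop in the spirit of Li et al. [29], applied to a toy objective. It is an illustration under stated assumptions, not the code used in this study: the switching probability `z = 0.03`, the base-10 logarithm in the weight, the drawing of vc from [-b, b] with b shrinking linearly from one to zero, the clipping to the bounds and the sphere test function are all assumptions made here.

```python
import math
import random

def sma_minimise(f, lb, ub, dim, popsize=30, max_t=200, z=0.03, seed=0):
    """Minimise f over [lb, ub]^dim; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    eps = 1e-12  # guards the division when all fitness values coincide
    X = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(popsize)]
    W = [[1.0] * dim for _ in range(popsize)]
    best_x, DF = None, math.inf  # X_b and the best fitness over all iterations
    for t in range(1, max_t + 1):
        X = [[min(max(v, lb), ub) for v in x] for x in X]  # keep inside bounds
        S = [f(x) for x in X]
        smell_index = sorted(range(popsize), key=lambda i: S[i])
        bF, wF = S[smell_index[0]], S[smell_index[-1]]
        if bF < DF:
            DF, best_x = bF, X[smell_index[0]][:]
        # Weights: the better half of the population is amplified, the rest damped.
        for rank, i in enumerate(smell_index):
            for j in range(dim):
                term = rng.random() * math.log10((bF - S[i]) / (bF - wF - eps) + 1)
                W[i][j] = 1 + term if rank < popsize / 2 else 1 - term
        a = math.atanh(1 - t / max_t)  # vb is drawn from [-a, a]
        b = 1 - t / max_t              # vc is drawn from [-b, b] (assumption)
        new_X = []
        for i in range(popsize):
            if rng.random() < z:  # random re-initialisation inside the bounds
                new_X.append([rng.uniform(lb, ub) for _ in range(dim)])
                continue
            p = math.tanh(abs(S[i] - DF))
            row = []
            for j in range(dim):
                if rng.random() < p:
                    A, B = rng.randrange(popsize), rng.randrange(popsize)
                    vb = rng.uniform(-a, a)
                    row.append(best_x[j] + vb * (W[i][j] * X[A][j] - X[B][j]))
                else:
                    row.append(rng.uniform(-b, b) * X[i][j])
            new_X.append(row)
        X = new_X
    return best_x, DF

def sphere(x):
    """Toy objective with its minimum of zero at the origin."""
    return sum(v * v for v in x)

best_x, best_f = sma_minimise(sphere, -5.0, 5.0, dim=2, popsize=20, max_t=50, seed=1)
```

Because the best solution is tracked across all iterations, the returned fitness never worsens as `max_t` grows for a fixed seed sequence of evaluations up to that point.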

Artificial Neural Network (ANN)
In recent years, there has been an increasing interest in using the ANN model to predict urban water demand, because it is capable of accurately simulating nonlinear time series. Additionally, Rahim et al. [5] reported that different multi-layer feedforward neural networks (MLFFNNs) have been successfully developed utilising propagation networks in water demand estimation models for the short and medium terms. The Levenberg-Marquardt (LM) backpropagation algorithm was employed for training the ANN model because it can effectively simulate any independent/dependent map [53]. The structure of the ANN model can be categorised into four layers: the input layer, two hidden layers and an output layer. The input layer contains the independent variables (climatic factors), the output layer involves the dependent variable (water demand) and the hidden layers comprise the hidden neurons, which are responsible for simulating the nonlinear relationship between water consumption and climatic factors. As in Zubaidi et al. [43], the tan-sigmoid activation function was used in both of the hidden layers and the linear activation function was used in the output layer. In this research, the total data were randomly divided into training (70%), testing (15%) and validation (15%) datasets [54]. Following González Perea et al. [55], the ANN model was integrated with a metaheuristic algorithm to determine the optimal hyperparameters of the ANN model, which include the learning rate coefficient (LR) and the number of hidden neurons in the first (N1) and second (N2) hidden layers.

Hybrid Metaheuristic Algorithm-Based Artificial Neural Network
In the ANN technique, before carrying out the training, testing and validation stages, it is important to set two types of hyperparameter: the learning rate coefficient (LR) and the number of hidden neurons in hidden layers one and two (N1 and N2, respectively). These hyperparameters govern how well the model maps the nonlinear relationship between the stochastic signals of water consumption and climatic factors. Determining them by a trial and error procedure may not yield the optimum solution. For this purpose, the ANN model is hybridised with the slime mould algorithm (SMA-ANN), as a form of automated machine learning, to select the best LR, N1 and N2 for the ANN model. Additionally, two extra metaheuristics were hybridised with the ANN, the multi-verse optimiser (MVO-ANN) and the backtracking search algorithm (BSA-ANN), to assess and validate the results of the SMA-ANN algorithm. Five population sizes (10, 20, 30, 40 and 50) with 200 iterations were employed for each hybrid algorithm to select the population size (popsize) that offers the lowest value of the fitness function (root mean square error, RMSE).
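One way to couple a metaheuristic with the ANN is to let each candidate solution encode (LR, N1, N2) and to use the RMSE of the resulting network as its fitness. The sketch below shows only this coupling, under assumptions: the search ranges in `BOUNDS` are hypothetical (the ranges used in this study are not stated here), and `train_and_predict` is a placeholder for whatever routine trains the network and returns its predictions.

```python
import math

# Hypothetical search ranges for the three hyperparameters (assumptions).
BOUNDS = {"LR": (0.01, 1.0), "N1": (1, 20), "N2": (1, 20)}

def decode(position):
    """Map a point of the unit cube [0, 1]^3 to (LR, N1, N2)."""
    u = [min(max(v, 0.0), 1.0) for v in position]
    lr_lo, lr_hi = BOUNDS["LR"]
    lr = lr_lo + u[0] * (lr_hi - lr_lo)
    n1 = round(BOUNDS["N1"][0] + u[1] * (BOUNDS["N1"][1] - BOUNDS["N1"][0]))
    n2 = round(BOUNDS["N2"][0] + u[2] * (BOUNDS["N2"][1] - BOUNDS["N2"][0]))
    return lr, n1, n2

def rmse(actual, predicted):
    """Root mean square error, the fitness function named in the text."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def make_fitness(train_and_predict, targets):
    """Wrap a user-supplied ANN trainer as the fitness function of an optimiser.

    train_and_predict(lr, n1, n2) is a placeholder: it should train the network
    with those hyperparameters and return its predictions for `targets`."""
    def fitness(position):
        lr, n1, n2 = decode(position)
        return rmse(targets, train_and_predict(lr, n1, n2))
    return fitness
```

Any of the three optimisers (SMA, MVO or BSA) can then minimise `fitness` over the unit cube, and repeating the run for each population size reproduces the comparison described above.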

Model Evaluation
In this research, several statistical performance criteria were employed to evaluate the suggested methodology, because no single performance criterion is appropriate for every application. The performance criteria employed in this research are categorised into absolute, relative and dimensionless errors [36]. The absolute error contains the mean absolute error (MAE, Equation (7)) and mean square error (MSE, Equation (8)). The relative error comprises the mean absolute relative error (MARE, Equation (9)). The dimensionless error contains the coefficient of determination (R², Equation (10)). In addition, a Bland-Altman scatterplot is used to graphically represent the upper and lower limits of agreement, with (actual data − simulated data) on the y-axis and ((actual data + simulated data)/2) on the x-axis. Moreover, Augmented Dickey-Fuller (ADF) and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests are used to examine the stationarity of the stochastic component for dependent and independent variables.
$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|C_i - P_i\right| \qquad (7)$$
$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(C_i - P_i\right)^2 \qquad (8)$$
$$MARE = \frac{1}{N}\sum_{i=1}^{N}\frac{\left|C_i - P_i\right|}{C_i} \qquad (9)$$
$$R^2 = \left[\frac{\sum_{i=1}^{N}\left(C_i - \bar{C}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{N}\left(C_i - \bar{C}\right)^2 \sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}}\right]^2 \qquad (10)$$
where $C_i$: measured water consumption, $P_i$: predicted water demand, $\bar{C}$: mean of measured water consumption, $\bar{P}$: mean of predicted water demand, $N$: length of data.
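The four criteria can be computed directly from paired series of measured and predicted values. The following sketch assumes the usual textbook definitions of MAE, MSE, MARE and the squared-correlation form of R²; the function names and the two-point example are illustrative only.

```python
import math

def mae(c, p):
    """Mean absolute error."""
    return sum(abs(ci - pi) for ci, pi in zip(c, p)) / len(c)

def mse(c, p):
    """Mean square error."""
    return sum((ci - pi) ** 2 for ci, pi in zip(c, p)) / len(c)

def mare(c, p):
    """Mean absolute relative error (measured values must be non-zero)."""
    return sum(abs(ci - pi) / ci for ci, pi in zip(c, p)) / len(c)

def r_squared(c, p):
    """Coefficient of determination as the squared Pearson correlation."""
    c_bar, p_bar = sum(c) / len(c), sum(p) / len(p)
    cov = sum((ci - c_bar) * (pi - p_bar) for ci, pi in zip(c, p))
    var_c = sum((ci - c_bar) ** 2 for ci in c)
    var_p = sum((pi - p_bar) ** 2 for pi in p)
    return (cov / math.sqrt(var_c * var_p)) ** 2

# Two-point example (illustrative): measured [2, 4], predicted [1, 5].
measured, predicted = [2.0, 4.0], [1.0, 5.0]
scores = (mae(measured, predicted), mse(measured, predicted),
          mare(measured, predicted), r_squared(measured, predicted))
```

For this example MAE = 1, MSE = 1 and MARE = (1/2 + 1/4)/2 = 0.375, while R² = 1 because the two series are perfectly linearly related even though they disagree in value. This is why R² is reported alongside the error measures rather than instead of them.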

Preparation of Dependent and Independent Variables
Firstly, data on water consumption and 10 climatic factors were normalised and cleaned according to Section 3.1. Figure 3 shows the normalised and cleaned water data. Figure 3A shows that the variance of the seasonal periods along the time series had been reduced in comparison to Figure 1A. Figure 3B shows that the time series, after the normalisation and treating of the outliers, tended to follow a normal distribution in comparison with Figure 1B.
Then, the EMD approach was applied to analyse the normalised data of water consumption and all climatic factors to unravel the stochastic components of each time series. Figure 4 presents the normalised and cleaned data for water consumption and its decomposed components, including trend, seasonal, stochastic and noise. The Augmented Dickey-Fuller and Kwiatkowski-Phillips-Schmidt-Shin tests were used to assess the stationarity of the stochastic component for each factor (i.e., these two tests are used to test and select the stochastic signal). It can also be seen that other components (trend and seasonal) represented the deterministic signal, which was driven by socio-economic factors.
Table 1 presents the correlation coefficients between water consumption and climatic factors time series in the raw and stochastic stage. The table shows that data pre-processing techniques produced significant improvements in the quality of the data, such as increasing the correlation coefficient between water consumption and maximum temperature time series (from 0.63 to 0.93). Additionally, the correlation coefficient between the stochastic signal of water consumption and climatic factors over 16 years confirmed the relation between water consumption and climatic factors.
In the final stage of data pre-processing, a tolerance method was used to locate the best scenario of independent (climatic) factors that could accurately simulate the stochastic component of water demand, omitting redundant factors to avoid multi-collinearity. The tolerance values for all of the climatic factors in the initial stage were less than the minimum limit of acceptance (i.e., it should be more than 0.1) except for Rain, which had a tolerance value of 0.43. So, the climatic factors that showed multi-collinearity were removed one by one until the tolerance values of the selected model exceeded 0.1, as shown in Table 2. The latter shows that Tmax, RHmin and Rain were selected as the best scenario of independent factors based on the tolerance value. As presented in Table 2, the tolerance value for each climatic factor was more than 0.1, which means the multi-collinearity assumption was not violated.
Another graphical technique can show the significance of data pre-processing. Figure 5 presents the box plot of the stochastic components of water consumption, Tmax, RHmin and Rain. It can be seen that there were no outliers within the data, and the median was zero for water consumption and Tmax, and nearly zero for the RHmin and Rain factors. The stochastic components of all factors showed a normal distribution, which was confirmed by the Kolmogorov-Smirnov test with significance values (Sig.) of more than 0.05 (i.e., the Sig. values were 2 for both minimum relative humidity and rainfall factors). Additionally, Figure 5 reveals how the EMD technique enhanced the normal distribution of water consumption compared to the distribution of water consumption in Figure 3B.
After preparing the stochastic signals of the dependent and independent factors, the data were organised into three sets (as mentioned in Section 3.3): training (70%, 134 data points), testing (15%, 29 data points) and validation (15%, 29 data points) to build and assess the prediction model. The division ensured that each dataset contained data drawn from along the whole time series (i.e., the values of the key statistical parameters, such as the maximum, minimum and standard deviation, were very comparable across the three datasets).
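The set sizes follow from the length of the record: 16 years of monthly data give 192 points, and 70%, 15% and 15% of 192 round to 134, 29 and 29. A minimal sketch of such a random division is given below; the function name, the rounding rule and the seed are assumptions for illustration, since the exact sampling procedure is not described here.

```python
import random

def split_indices(n, train=0.70, test=0.15, seed=0):
    """Randomly divide range(n) into training, testing and validation index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(train * n)
    n_test = round(test * n)
    train_idx = sorted(idx[:n_train])
    test_idx = sorted(idx[n_train:n_train + n_test])
    val_idx = sorted(idx[n_train + n_test:])  # the remainder forms the validation set
    return train_idx, test_idx, val_idx

# 16 years of monthly values
train_idx, test_idx, val_idx = split_indices(16 * 12)
```

Comparing the minimum, maximum and standard deviation of the three resulting subsets is a quick way to confirm that a given random draw is representative.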

Model Configuration
The ANN model needed to be integrated with the metaheuristic algorithm to determine the optimum hyperparameters of the ANN model, including LR, N1 and N2. Thus, the SMA algorithm was hybridised with the ANN model, and the results, for more validation, were compared with the MVO-ANN and BSA-ANN algorithms. Each algorithm was run five times based on population sizes (10, 20, 30, 40 and 50 popsize) with 200 iterations to increase the solution range, as depicted in Figure 6. It can be noticed that a popsize of 50 offered the best solution for all hybrid algorithms based on the fitness function (RMSE) (i.e., offered the lowest RMSE value).
To scrutinise and validate the influence of integrating the ANN model with the SMA algorithm, the performance of the stand-alone ANN technique was inspected. Thus, wide scenarios of the trial and error process were used to select the hyperparameters of the ANN technique. The outcomes showed that the values of the hyperparameters were LR = 0.6, N1 = 2 and N2 = 3.

Performance Evaluation
After determining the hyperparameters of the ANN model, the ANN model became ready to simulate the monthly stochastic signal of municipal water demand. The ANN model was implemented several times to get a better network (weights) that could accurately predict the water demand. Different types of statistical tests were used to assess the ability of the model to forecast water demand based on climatic factors (validation stage).
Three statistical metrics were applied to examine the ability of the model to generalise data in the validation stage and compare the results of the SMA-ANN with the ANN model. Table 3 provides three metrics, MAE, MSE (absolute error) and MARE (relative error), to assess the non-linear dependency between the actual and simulated water demand for both models. According to Dawson,et al. [56], both models showed good accuracy, but the SMA-ANN could predict water demand rather well based on the MARE value. In addition, Figure 8 presents the coefficient of determination (R 2 ) for the SMA-ANN and ANN models. The values of R 2 delivered information for the linear relationship between the actual water consumption (Target, ML) and predicted water demand (Output, ML) for both models. Similar to the error tests (absolute and relative), both models offered good results according to Dawson, et al. [56]. However, the value of R 2 for the SMA-ANN model was 0.9, which is more accurate than that of the ANN model (0.87). Additionally, the scatter data for the SMA-ANN model were falling closer to the ideal line than the scatter data for the ANN model. To scrutinise and validate the influence of integrating the ANN model with the SMA algorithm, the performance of the ANN technique was inspected. Thus, wide scenarios of the trial and error process were used to select the hyperparameters of the ANN technique. The outcomes presented that the values of hyperparameters were LR = 0.6, N1 = 2 and N2 = 3.

Figure 9 shows the Bland-Altman plots for the SMA-ANN and ANN models. The SMA-ANN model has a mean of −0.002998 ML with limits of agreement of −0.03358 and 0.02759 ML, while the corresponding values for the ANN model were −0.003439, −0.03536 and 0.02947 ML for the mean, lower and upper limits of agreement, respectively. Additionally, good agreement was observed for the SMA-ANN model because up to 97% of the data were scattered between the limits of agreement, whereas the proportion was 90% for the ANN model. Overall, the results revealed that the SMA-ANN model had limits of agreement closer to the mean and a higher agreement percentage than the ANN model.
Additionally, no trend was visible in the scattered data in either Figure 8 or Figure 9, and the randomness of the residual data was assessed and confirmed using the ADF test. This indicated that the tolerance method had been successfully used to select the best model input. The results emphasised that determining the ANN's hyperparameters through hybridisation with the SMA algorithm was better than doing so by the trial-and-error procedure. In the former process (i.e., SMA-ANN), the parameters were determined automatically, whereas in the latter they were determined manually.
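The Bland-Altman quantities reported above (mean difference, limits of agreement and the share of points inside the limits) can be sketched as follows. The limits are computed with the conventional mean ± 1.96 standard deviations of the differences, which is assumed, not confirmed, to be the rule used for Figure 9. The sign convention (predicted minus observed) and the sample values are likewise assumptions.

```python
import statistics


def bland_altman(observed, predicted, z=1.96):
    """Return (bias, lower limit, upper limit, fraction of points within limits).

    Differences are taken as predicted minus observed (assumed convention);
    limits of agreement are bias +/- z times the sample standard deviation.
    """
    diffs = [p - o for o, p in zip(observed, predicted)]
    bias = statistics.mean(diffs)
    spread = statistics.stdev(diffs)
    lower, upper = bias - z * spread, bias + z * spread
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside


# Hypothetical target/output values in ML, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.9, 5.0]

bias, lower, upper, inside = bland_altman(observed, predicted)
print(f"bias={bias:.4f} ML, limits=({lower:.4f}, {upper:.4f}) ML, inside={inside:.0%}")
```

Narrower limits and a larger fraction of points inside them correspond to the closer agreement described for the SMA-ANN model.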
All the statistical tests examined and validated the SMA-ANN model and, for further examination, ADF and KPSST tests were used to check the stationarity of the simulated stochastic signal of water demand and to support the residual analysis. The results showed that the simulated time series of water demand was stationary and that the residual data were normally distributed, as assessed by the Kolmogorov-Smirnov test. Furthermore, a graphical test was used to confirm the SMA-ANN model by comparing the observed and simulated water time series in the validation stage, as shown in Figure 10, which shows that the model could closely follow the trend and cycles of the observed stochastic time series based on the scale of error. There were several slight deviations in the simulated time series that may have come from the influence of fluctuations in the climatic factors. However, based on both the scale of error and the result of the Bland-Altman analysis for the SMA-ANN, the error could be considered statistically insignificant.
The most interesting conclusions that could be drawn from the above results were: (1) The EMD technique had a significant role in decomposing the raw data to select the stochastic signals of the dependent and independent variables. Additionally, the tolerance method was effective in determining the best scenario of independent factors. (2) The optimal hyperparameters of the ANN model were determined based on the novel hybrid model, SMA-ANN, which outperformed both the MVO-ANN and BSA-ANN algorithms based on the RMSE value. (3) The novel combined methodology, which comprised data analytics and machine learning, could effectively simulate the stochastic signal of water demand with respect to climatic factors. (4) Automated machine learning outperformed the trial-and-error procedure based on several statistical tests. (5) Using three metaheuristic algorithms to build the prediction model, validating the results against the ANN (stand-alone) and employing 10 climatic factors decreased the uncertainty and increased the forecasting range. (6) The hybridisation of the model, as well as the way the training, testing and validation samples were categorised, presented a promising application of the developed model for covering unknown extreme events, particularly when it was applied to predict data that had not been used before in the model configuration. (7) The research provided important scientific insights for managers and policymakers in the SEW utility to manage the water supply system under sudden increases in water demand due to the variability of the stochastic pattern of climatic factors. They could, for example, feed the model with predicted climatic factors to forecast water demand for the medium term (i.e., future). After that, they could compare the future water demand with the capacity of the water system and decide whether the water system is capable of working successfully under extreme events. (8) The obtained results confirmed the association between climate change and water demand for the medium term.

Conclusions
In this study, the potential of a novel methodology coupling data pre-processing and automated machine learning for monthly stochastic urban water demand prediction based on several climatic factors was investigated. Data for water consumption and 10 climatic factors for the South East Water utility in Melbourne were utilised for building and assessing the proposed methodology. Data pre-processing techniques were used to analyse and select the stochastic signals of the water consumption and climatic factor time series by the EMD approach, and to detect the best independent variables by the tolerance method. The automated machine learning included the ANN model, which was integrated with the SMA algorithm to find the optimal hyperparameters of the ANN model. The results highlighted the importance of data pre-processing to prepare the stochastic pattern of the dependent and independent variables and to select the best scenario of independent variables. Additionally, the SMA-ANN was found to be superior to both the BSA-ANN and MVO-ANN algorithms based on RMSE as an objective function. Moreover, the hybrid model, SMA-ANN, was more accurate than the ANN (stand-alone) approach according to different statistical tests. Furthermore, the outcomes indicated that the suggested methodology can be successfully applied in regions that suffer from climate change (i.e., drought), such as Melbourne. Consequently, the South East Water utility can take advantage of this study's findings to establish effective strategies for optimised system operation and to maintain a balance between water requested and delivered. The findings also help to establish appropriate pricing plans, schedule new changes in the network and optimise operating procedures, such as those for pumps, to enhance the water quality and to reduce the uncertainty. Based on recent literature, severe weather will probably become more common in the future. Thus, there is an urgent need for more studies that use the same or different data analytics and artificial intelligence techniques to simulate the stochastic component of urban water demand based on climatic factors for regions that suffer from climate change. However, the availability of reliable data for water consumption and climatic factors for the medium or long term is considered a principal limitation of this methodology.