Urban water security is essential for building a resilient environment in smart cities, particularly under the stress of climate change and socio-economic factors [1]. Moreover, cities located close to water resources are driven by all kinds of industries; hence, a lack of water is considered a classic problem for decision makers [3]. Since the last century, gradual changes in freshwater resources have been observed [5]. Recent studies have shown that climate change plays a key role in the availability of freshwater resources because of the potential decrease in rainfall [6]. Specifically, climate change has been shown to adversely affect freshwater resources in city centers, which in turn affects the sustainable development of water availability and, consequently, socio-economic activities [7]. In addition, several studies have shown that freshwater resources are generally adversely affected by pollution [8].
Different regions of the world have been facing water scarcity, which suggests that the gap between water supply and demand is likely to widen in the future. In 2010, the European Environment Agency reported that municipal water consumption is driven by complex interactions between anthropogenic and natural system factors at multiple spatial and temporal scales [10]. In Gauteng Province, Republic of South Africa, the municipal water delivered has been less than the demand. This imbalance is due to climate change and reduced rainfall, as well as human-related factors such as economic expansion and population growth. The lack of freshwater resources and the increase in water demand have put pressure on the municipal water supply system. This highlights the importance of water demand prediction as an effective approach for optimizing the operation and management of the system, or for planning its future expansion or reduction under variable climate and socio-economic factors [2].
House-Peters and Chang [15], Donkor et al. [16], Ghalehkhondabi et al. [17] and de Souza Groppo et al. [18] stated that different methods and models have been applied in previous studies to predict municipal water demand, including traditional, Artificial Intelligence (AI), and hybrid AI models. Traditional models, such as time-series analysis and regression [19], were the first to be employed in water demand simulation. However, traditional approaches lacked accuracy when forecasting water demand, which can cause significant issues in the operation and management of the water supply system. Additionally, the growing impact of climate change and urbanization causes high uncertainty, which makes prediction and forecasting more complex and has motivated researchers to develop their models further [21], including through the use of AI techniques.
Data-driven techniques have far-ranging applications, such as wastewater [22], water demand [24], and groundwater levels [26]. These techniques include the support vector machine (SVM) [27], extreme learning machine (ELM) [24], and random forest (RF) [28]. Another AI technique is the Artificial Neural Network (ANN) [29], a powerful technique that has been widely used in hydraulic modelling in recent years. It can deal with complex and nonlinear relationships between inputs and outputs [30]. In many scenarios, ANN results have been superior to those of conventional models, for example, in Mouatadid and Adamowski [32] and Guo et al. [33]. However, there are cases where conventional methods performed as well as or even better than ANN in terms of accuracy, such as Li et al. [27]. This can have several causes, for example, the model becoming trapped in a local minimum instead of reaching the global minimum, which leads to a sub-optimal solution [34], or an unsuitable network design or set of training hyperparameters [35]. Hence, to avoid these drawbacks, the ANN model has been combined with other approaches, such as heuristic algorithms [36], and different hybrid models have been proposed.
A hybrid model contains two or more techniques: one works as the primary model, while the others act as pre-processing or post-processing approaches [37]. Hybrid models have been used to simulate municipal water demand using different techniques and in different scenarios, and the results have shown that these models are robust and insightful, e.g., Altunkaynak and Nigussie [38], Seo et al. [24], Pacchin et al. [39], Ebrahim Banihabib and Mousavi-Mirkalaei [2] and Rasifaghihi et al. [40].
Eggimann et al. [41] reviewed various data pre-processing techniques that have been used for municipal water management. Their review reveals that data pre-processing offers an important potential advantage for optimizing the performance of prediction models. It has been applied successfully in different areas of study, e.g., monthly rainfall forecasting [42], irrigation water prediction [43] and urban water demand prediction [24].
Various optimization techniques have been applied to solve problems in engineering applications. Optimization algorithms aim to find optimal values for the parameters of a system under various conditions [44]. Lately, the crow search algorithm (CSA), a recently proposed metaheuristic algorithm, has been used to tackle a variety of engineering optimization problems [45]. CSA has been applied in different engineering sectors, such as the optimization of energy problems [45], economic environmental dispatch [46], the selection of the optimal conductor size in radial distribution networks [47], water demand prediction [48] and constrained engineering problems [49]. In this study, the CSA is hybridized with the ANN model to select the best hyperparameters of the ANN model.
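To make the idea of a crow search concrete, the following is a minimal sketch of the commonly described update rule, not the implementation used in this study. Each crow keeps a memory of its best position; at every step it follows the memory of a randomly chosen crow unless that crow is "aware", in which case it jumps to a random position. The function name `crow_search`, its parameter names and default values, and the toy objective are all assumptions made for illustration. In a CSA-ANN setting, the objective would instead train a network with the candidate hyperparameters and return a validation error such as RMSE.

```python
import random

def crow_search(objective, bounds, n_crows=20, n_iter=200,
                flight_length=2.0, awareness_prob=0.1, seed=0):
    """Simplified crow search; returns (best_position, best_fitness, history)."""
    rng = random.Random(seed)

    def random_position():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def feasible(pos):
        return all(lo <= p <= hi for p, (lo, hi) in zip(pos, bounds))

    positions = [random_position() for _ in range(n_crows)]
    memory = [p[:] for p in positions]            # best position seen by each crow
    memory_fit = [objective(p) for p in memory]
    history = []                                  # best fitness after each iteration

    for _ in range(n_iter):
        for i in range(n_crows):
            j = rng.randrange(n_crows)            # crow i picks a crow j to follow
            if rng.random() >= awareness_prob:
                # Crow j is unaware: move towards (or past) its remembered position.
                r = rng.random()
                candidate = [x + r * flight_length * (m - x)
                             for x, m in zip(positions[i], memory[j])]
            else:
                # Crow j is aware: crow i is sent to a random position.
                candidate = random_position()
            if feasible(candidate):               # infeasible moves are discarded
                positions[i] = candidate
                fit = objective(candidate)
                if fit < memory_fit[i]:
                    memory[i], memory_fit[i] = candidate[:], fit
        history.append(min(memory_fit))

    best = min(range(n_crows), key=memory_fit.__getitem__)
    return memory[best], memory_fit[best], history

# Toy stand-in for a validation error surface (assumption, not an ANN).
def toy_objective(params):
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2

best_pos, best_fit, history = crow_search(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because each crow's memory is only overwritten by a better position, the best fitness recorded in `history` never increases from one iteration to the next.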
From the application viewpoint, another significant consideration is the selection of the model inputs that best drive the dependent variable [50]. Several techniques have been applied in different studies, such as principal component analysis (PCA) [52], variance inflation factor (VIF) [21] and mutual information (MI) [54]. In this study, the mutual information technique was used to select the best model input scenario from several lags of historical observed water consumption data.
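As an illustration of how mutual information can rank candidate lags, the sketch below uses a simple histogram (plug-in) estimator. The estimator, the number of bins, the function name `mutual_information` and the toy series are assumptions chosen for brevity; they are not the procedure or the data of this study.

```python
import math
from collections import Counter

def mutual_information(x, y, bins=4):
    """Histogram (plug-in) estimate of I(X;Y) in nats for equal-length sequences."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0           # constant series: one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(bx, by))                  # joint bin counts
    px, py = Counter(bx), Counter(by)             # marginal bin counts
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

# Hypothetical monthly series, made up for illustration only.
consumption = [52, 55, 61, 58, 50, 47, 45, 48, 53, 57, 62, 59,
               51, 48, 44, 47, 54, 58, 63, 60, 52, 46, 45, 49]

# Score each lag k by the MI between consumption[t - k] and consumption[t].
scores = {lag: mutual_information(consumption[:-lag], consumption[lag:])
          for lag in range(1, 7)}
ranked_lags = sorted(scores, key=scores.get, reverse=True)
```

Lags at the top of `ranked_lags` share the most information with the current value and would be the first candidates for the model input scenario. Histogram estimates are sensitive to the bin count, so a nearest-neighbour estimator such as scikit-learn's `mutual_info_regression` may be preferable on real data.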
According to the literature review, most studies focus on short-term water demand estimation, while only a few deal with medium- to long-term prediction. Lately, various studies, such as [33], have employed historical water consumption data as the single input in their short-term prediction models.
However, managers of water utilities and policymakers still face a challenge: it is uncertain whether the capacity of the water system can cope with potentially rapid growth in urban water demand driven by socio-economic, demographic and climate factors. Moreover, as mentioned previously, only a few studies have considered medium-term municipal water demand based on previous water consumption. These problems motivated us to propose an approach that refines existing approaches and provides managers with more accurate, scientifically grounded insights into future water demand, thereby reducing uncertainty.
The main objectives of this research study are:
To improve the quality of the data and to choose the best model input scenario by applying data pre-processing techniques.
To select the optimum values of the ANN hyperparameters by using the hybrid Backtracking Search Algorithm and Artificial Neural Network (BSA-ANN) technique, and to evaluate how BSA-ANN performs in comparison with the CSA-ANN algorithm.
To assess the performance of the novel methodology in predicting medium-term municipal water demand from several time lags of observed water consumption.
To reduce the uncertainty for decision makers by using a novel and refined model, which involves data pre-processing methods (to improve the quality of data and select the model input), and employing a more sophisticated approach for model prediction (using combined techniques to enhance the accuracy of results, and the stand-alone ANN to confirm the results of the hybrid model).
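The lagged-input setup named in these objectives can be illustrated with a minimal sketch: past consumption values become the model inputs, the current value becomes the target, and RMSE scores a forecast. The helper names `make_lagged_pairs` and `rmse`, the chosen lags, and the short series are hypothetical and serve only to show the data layout.

```python
import math

def make_lagged_pairs(series, lags):
    """Build inputs X (one row of lagged values per time step) and targets y."""
    max_lag = max(lags)
    X, y = [], []
    for t in range(max_lag, len(series)):
        X.append([series[t - lag] for lag in lags])   # values observed `lag` steps back
        y.append(series[t])                           # value to be predicted
    return X, y

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Made-up series for illustration only.
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
X, y = make_lagged_pairs(series, lags=[1, 2])

# Persistence baseline: predict that each value equals the previous one (lag 1).
persistence = [row[0] for row in X]
baseline_error = rmse(y, persistence)
```

Any predictive model, including an ANN, can be trained on `X` and `y` in this layout, and its RMSE can then be compared against a simple baseline such as the persistence forecast.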
Based on the literature review, this research is thought to be the first study to use this novel combined methodology, which includes data pre-processing and automated machine learning, to forecast municipal water demand using several lagged values of water consumption as model input. As such, it implicitly considers the effect of all climate, demographic and socio-economic factors.
In this manuscript, the performance of a novel combined model, which includes signal pre-treatment, mutual information and the BSA-ANN technique, was assessed for estimating monthly municipal water demand based on previous water consumption. Historical monthly water consumption data covering ten years from the Gauteng province, South Africa, were used to build and evaluate the predictive model. The outcomes show that data pre-processing is a crucial step for enhancing the quality of the data before feeding them into the model, by denoising the time series and selecting the best model input scenario. Moreover, the hybrid BSA-ANN algorithm can be successfully applied to select optimum ANN hyperparameters, and it outperforms the CSA-ANN algorithm in terms of the fitness function (RMSE). In addition, the stand-alone ANN model was used to decrease the uncertainty by validating the outcomes of the hybrid model (BSA-ANN). The results also confirm the suitability of the combined model for forecasting water demand from the historical water consumption of a region subject to variability in climate and socio-economic factors, such as the Gauteng province. The advantages of the proposed methodology are that it is easy to implement, offers high accuracy with less uncertainty, saves time, and remains applicable when data on climate and socio-economic factors are missing (i.e., when information on the factors that drive water demand has been lost). Hence, these results can inform Rand Water (i.e., its decision makers and managers), helping this water utility company to better manage the existing municipal water system and to better plan for extensions in response to increasing consumption, which would lead to better service and better management of resources in the Gauteng province.
Therefore, taking into consideration all the benefits mentioned above, we recommend that additional studies be conducted in other regions with similar or different climatic and socio-economic factors, or in regions that lack data on climatic and socio-economic factors but have reliable water consumption data. Moreover, based on the outputs of the current study, we recommend exploring different data pre-processing techniques and several hybrid models for simulating municipal water demand from historical water consumption in other cities around the world, because no single method outperforms all others for predicting water demand.