1. Introduction
Wetlands are among the most significant ecological systems in the world. The Ramsar convention defines wetlands as ‘areas of marsh, fen, peat land or water, whether natural or artificial, permanent or temporary, with water that is static or flowing, fresh, brackish or salt, including areas of marine water the depth of which at low tide does not exceed six meters’ [
1]. Wetlands play a major role in ecological systems [
2]. They are among the most productive ecosystems and have multiple functions. Wetlands provide habitats for flora and fauna by maintaining a remarkable level of biodiversity [
3]. These ecosystems significantly contribute to ecological rejuvenation and biodiversity conservation [
4]. Wetlands store floodwater, thereby mitigating flood risk in downstream areas [
5]. Wetlands are effective at trapping sediment and heavy metals from surface runoff; hence, their role is significant in terms of nutrient retention and the purification of water flowing through these ecosystems [
6]. Therefore, wetlands are considered to be the kidneys of the environment. Wetlands are important in mitigating the impact of climate change [
7]. Records show that wetlands globally store 44 million tons of CO₂ per year [
8]. Wetlands can influence precipitation patterns and atmospheric temperatures [
9]. Additionally, wetlands are important in socio-economic terms [
10]. They directly or indirectly support the living conditions of humans. Wetlands provide recreational opportunities [
11]. They attract many tourists and thereby play a key role in economic development [
12]. Furthermore, coastal and inland wetlands are responsible for two-thirds of global fish harvest [
10]. Therefore, wetlands play a prominent role in environmental, social, and economic terms.
On the other hand, wetlands are one of the world’s most endangered ecosystems [
13]. Most wetlands have been drained for agricultural and industrial purposes [
14]. It is impossible to provide an accurate figure for the global areal extent of wetlands; however, different organizations and researchers have estimated it. The UNEP World Conservation Monitoring Centre estimated the world’s wetland coverage at 570 million hectares (5.7 million km²), which is about 6% of the world’s land cover [
15]; however, these sensitive areas should not be neglected, given their importance. Changes in wetlands take place because of natural circumstances as well as anthropogenic activities. Anthropogenic activities can be unintentional, but the majority are intentional. Unintentional activities stem from a lack of knowledge of the importance of wetlands, whereas intentional activities stem from negligence and the low value placed on wetlands [16]. Wetlands are highly affected by anthropogenic activities such as conversion to agricultural and aquacultural land, vegetation clearance, the construction of dams and other water management structures, and human settlements [
17]. Furthermore, the rapid growth of human populations and urbanization are other major threats to wetlands [
18]. In addition, insufficient inflows and lower-quality runoff due to urbanization, together with excessive consumption of water for agricultural purposes, have resulted in poor water quality in wetlands [
19]. According to Xu et al. [
20], wetland degradation takes place due to pollution (54%), biological resources use (53%), natural system modification (53%), and agriculture and aquaculture (42%). Research studies have shown that, from 1985 to 2010, the loss rate of wetlands was estimated to be 16.57 mi²/year (42.91 km²/year) [
21]. Therefore, we should take the necessary actions to protect this precious gift of nature.
Wetland water level fluctuations are important for hydrological systems [
22]. In addition, wetland water levels affect the chemical and biological characteristics of soil, ecological functions of the wetlands, etc. [
23]. Therefore, it is vital to understand and quantify the processes which affect water level fluctuations in wetlands. Wetland water levels mainly depend on the water holding capacity of the wetland, water inflows, and water outflows. Water inflows include precipitation, upstream water flow to the wetland, and groundwater flow, whereas the outflows include evaporation loss, downstream water flow from the wetland, etc. In addition, the antecedent moisture content in soil is also responsible for short-term water level variations [
24]. Hydro-climatic data such as precipitation, temperature, evaporation, relative humidity, wind speed, etc., as well as geological data such as soil permeability, moisture, etc., can be considered important factors in determining wetland water level [
25]. Moreover, wetland water levels are affected by several anthropogenic activities, such as commercial developments, drainage schemes, the extraction of minerals and peat, construction of dams and dikes, etc. [
14]. However, measuring wetland water levels is limited in many countries [
26]. Therefore, water level prediction is important for the proper management and protection of wetlands and their surrounding areas.
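The inflow and outflow terms listed in this paragraph can be combined into a simple storage balance. The sketch below is illustrative only: the function name, the choice of units (mm of water depth per day over the wetland area), and the sample values are assumptions, not quantities from this study.

```python
def storage_change(precip, upstream_in, groundwater_in, evap, downstream_out):
    """Net change in wetland storage for one time step (all terms in mm/day).

    Inflows: precipitation, upstream flow, groundwater flow.
    Outflows: evaporation loss, downstream flow.
    """
    inflows = precip + upstream_in + groundwater_in
    outflows = evap + downstream_out
    return inflows - outflows


# Hypothetical daily values (mm/day), chosen only to illustrate the arithmetic.
delta = storage_change(precip=12.0, upstream_in=5.0, groundwater_in=1.0,
                       evap=4.0, downstream_out=6.0)
print(delta)  # 8.0
```

A positive value indicates rising storage (and hence a rising water level), while a negative value indicates falling storage.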
Colombo, Sri Lanka, is noted as a unique capital city with several wetlands. However, due to severe urbanization, flash floods are common in Colombo city. The wetlands in the Colombo area act as flood detention basins. Therefore, it is essential to predict wetland water levels. However, other than some water quality analyses of the Colombo wetland system, no research has been conducted on predicting water levels. This paper presents an initial study to predict the water level of the Colombo flood detention basin from surrounding meteorological parameters, including rainfall, evaporation, temperature, relative humidity, and wind speed. This is the first study to predict wetland water levels as a function of meteorological parameters in Sri Lanka. The Intergovernmental Panel on Climate Change rates Sri Lanka as one of the countries most affected by the changing climate. Therefore, this research has the potential to help address upcoming climate-related issues in the capital of Sri Lanka, which is frequently flooded.
2. Artificial Neural Networks (ANN) to Predict Wetland Water Levels
Water level predictions can be conducted using both physical and data-driven approaches [
27]. Physically based approaches are complex and require significant time to develop and run [
26]. Physical methods have the disadvantage that they require a thorough knowledge of hydrological processes and a wide variety of data, including inflows, outflows, bathymetry, meteorology, etc. [
28]. Due to the limitations of traditional methods, machine learning techniques have recently gained attention [
29]. Machine learning techniques, such as Artificial Neural Networks (ANNs), have many advantages, such as their simplicity in terms of implementation, their rapid running speed, rapid convergence, and strong adaptability [
30].
Interest in the use of ANNs for water level prediction models has increased in recent years, as they can identify complex non-linear relationships within the raw data [
31]. In 1943, McCulloch and Pitts introduced the concept of the Artificial Neural Network [
32]. An ANN is a computational method that mimics the human brain, which consists of several interconnected neurons [
33]. The neurons are organized into two or more layers with weighted connections [
34]. A simple application of an ANN in water level prediction is shown in
Figure 1. It consists of an input layer, a hidden layer, and an output layer, which are initially used to train the neural network on known datasets and then to forecast unknown outputs from the known inputs.
As shown in
Figure 1, each neuron receives a set of known variables (referred to as ‘X’ values in
Figure 1). In the case of water level predictions, the X variables can be the meteorological data for a selected time frame and the water levels of the catchment in the same time frame. The hidden layer consists of a set of neurons, which apply weighted connections to each input parameter. Let X₁ and X₂ be the independent variables and W₁ and W₂ be their respective weights; the hidden layer then forms the weighted sum W₁X₁ + W₂X₂. In addition, a bias ‘b’ is added, which shifts the weighted sum before it is passed to the activation function. The activation function converts the input signal to the output signal. Equation (1) represents a basic mathematical expression of an artificial neural network.
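The weighted sum, bias, and activation described above can be sketched for a single neuron as follows. This is a minimal illustration rather than the network used in this study: the sigmoid activation, the function names, and the numeric weights are all assumptions.

```python
import math


def sigmoid(z):
    """Logistic activation; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x, w, b, activation=sigmoid):
    """Return activation(w1*x1 + w2*x2 + ... + b) for one neuron."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x))
    return activation(weighted_sum + b)


# Hypothetical inputs X1, X2 with weights W1, W2 and bias b.
y = neuron_output(x=[0.5, 1.0], w=[0.4, -0.2], b=0.0)
print(y)  # 0.5, because 0.4*0.5 + (-0.2)*1.0 + 0.0 = 0 and sigmoid(0) = 0.5
```

During training, the weights and the bias are the quantities that the learning algorithm adjusts.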
The availability of the data is a key element for constructing a learning algorithm in neural networks [
35]. In wetland water level predictions, meteorological data such as precipitation, temperature, relative humidity, wind speed, etc., can be considered independent variables, while hydrological data such as previous water levels can be considered dependent variables [
26].
Dadaser-Celik and Cengiz [
22] predicted water levels in the Sultan Marshes wetland in Turkey. Climatic variables including precipitation, air temperature, and evapotranspiration were used in their study. Model training was conducted using the conjugate training backpropagation method. The developed model was tested for its accuracy using the root mean square error (RMSE) and the coefficient of determination (R²) values. Furthermore, they conducted sensitivity analysis by considering the relative importance of each variable in the ANN model. It was concluded that the ANN model was the most sensitive to the previous months’ water levels.
Altunkaynak [
36] used neural networks to forecast water levels in Lake Van, in Turkey, which has an accompanying wetland area. The model was trained using a backpropagation algorithm. The study suggested that artificial neural networks gave accurate results, even though the relationships between parameters such as rainfall and consecutive water levels were complex. The neural network was trained in three different cases with different arrangements of the input nodes, and all three models produced fairly similar results. Traditional methods were found to be more complex and less reliable than neural network models. The relative error of the neural network model was below 10%, which was considered acceptable.
Choi et al. [
26] showcased the importance of predicting wetland water levels; however, they also pointed out the difficulty of that process due to data limitations. They predicted the water level of the largest wetland in South Korea, the Upo wetland, using several machine learning techniques, including ANN, decision trees (DT), Random Forest (RaF), and support vector machines (SVM). The dependent variable was the daily water level over seven years, from 2009 to 2015, while the independent variables were meteorological data and upstream water level data. The correlation coefficient (CC), root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE) were the three statistical indicators used to evaluate the models’ performance. These indicators demonstrated excellent prediction accuracy.
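The three indicators named above have standard definitions, which can be sketched in a few lines. The function names and the sample series are assumptions made for illustration; the values are not data from any of the cited studies.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted series."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def correlation_coefficient(obs, pred):
    """Pearson correlation coefficient (CC)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(ss_o * ss_p)


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_o = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - residual / variance


# Hypothetical water levels in metres.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
print(round(rmse(observed, predicted), 3))                     # 0.05
print(round(correlation_coefficient(observed, predicted), 3))  # 0.976
print(round(nash_sutcliffe(observed, predicted), 3))           # 0.95
```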
Artificial neural networks were used to investigate water level variations in the Kerala Vembanad Wetland [
37]. The input parameters were rainfall and river discharge data, as well as the previous day’s water levels. The output was the one-day-ahead water level at the selected stations. The model results were expressed in terms of several numerical indices, such as the correlation coefficient and the root mean square error. Furthermore, they found that neural networks failed to accurately predict water levels when there was no information on wetland storage conditions. Therefore, previous water level inputs needed to be considered as an initial requirement of the model. Saha et al. [
38] predicted wetland water depth and area using artificial neural networks and a non-linear regression model. They used Landsat satellite images of wetlands in the Atreyee River basin during the pre- and post-monsoon seasons to map wetlands in that river basin. In conclusion, they stated that both models performed well in terms of accuracy. Overall, researchers have placed their trust in Artificial Intelligence (AI) because it is a data-driven approach rather than a physical process-based model.
4. Case Study
As stated, the Colombo flood detention basin was selected as the study area for this research. Colombo has a tropical monsoon climate. The mean annual temperature in the Colombo region varies between 26.5 °C and 28.5 °C, and the region receives an average annual rainfall of 2300 mm. Colombo city is highly vulnerable to flooding, and frequent floods due to heavy rainfall have occurred during the last two decades. Due to various developments, the flood detention basin capacity has been reduced by 30% [
45]. The primary causes of flooding can be identified as increased surface runoff due to accelerated development, the shrinking of retention areas, and a lack of capacities in the canal networks and the wetlands. Therefore, flood risk in the Colombo district can be addressed by implementing strategies to manage and protect the Colombo flood detention area.
Figure 2 presents the study area in Colombo, Sri Lanka. The Colombo flood detention area consists of three wetlands/marshes: Kotte marsh, Kolonnawa marsh, and Heen marsh (refer to
Figure 2). The altitude of the study area is approximately 3.0 m above the mean sea level. Metro Colombo Basin has a basin cover of 105 km², and total wetland coverage in the Colombo metropolitan region is 20 km². There are six water level monitoring stations located within the basin: G1: Diyawanna Lake; G2: Kotte North Canal; G3: Kotte Canal; G4: Kimbulawala bridge; G5: Heen Canal; G6: Dematagoda Canal (
Figure 2).
A prediction model was developed using meteorological data and the previous year’s water levels. According to the literature, the majority of related studies have used meteorological time series as the input data to the model [
26,
27,
46]. Meteorological data were collected from the Department of Meteorology, Sri Lanka, while the water levels were collected from the Land Development Corporation, Sri Lanka. Recent meteorological data were available for the study; however, recent water levels were either unavailable or inconsistently recorded due to poor maintenance. In addition, some water level measuring points had been removed by illegal settlers. Daily rainfall, daily evaporation, daily minimum temperature, daily maximum temperature, daily relative humidity during the day and at night, and daily average wind speed were used as the independent variables of the model. These meteorological factors affect water level fluctuations in wetlands through several processes. Rainfall is considered the primary factor affecting the water balance at both spatial and temporal scales. As the infiltration rate in wetlands is low, they are sensitive to severe and sudden rainfall. Therefore, wetlands are highly vulnerable to changes in the atmosphere. As temperature increases, rainfall patterns change and evaporation increases, altering the wetlands. When temperature increases, relative humidity decreases for a given amount of atmospheric moisture, because warmer air can hold more water vapor. In addition, high wind speeds result in more evaporation from water bodies, reducing their water levels.
Figure 3 presents the temporal variation of meteorological parameters for the selected time span. Significant variations can be found from year to year.
Figure 3a shows rainfall variation from 2004 to 2012. Within the selected time frame, Colombo received its highest annual rainfall in 2010 (3370 mm) and its lowest in 2011 (1775 mm). Evaporation in the Colombo area reaches values as high as 1250 mm/year. The recorded maximum temperature was 31 °C and the minimum temperature was 24.4 °C. The relative humidity during the daytime was lower than at nighttime: it varied from 74% to 77% during the day and from 86% to 88% at night. The maximum average wind speed was highest in 2011 and lowest in 2006.
Figure 4 shows water level data for the six monitoring stations (G1: Diyawanna Lake, G2: Kotte North Canal, G3: Kotte Canal, G4: Kimbulawala bridge, G5: Heen Canal, and G6: Dematagoda Canal). For most of the period, the water level remains below 1 m. During the rainy season, sudden changes in the water levels can be seen. Morning and evening water level readings were obtained at the six stations in the study area (Figure 4), and their daily average was used for the model’s development.
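The daily averaging of the morning and evening readings can be sketched as below. How gaps in the record were treated is not stated here, so the handling of missing readings (using the single available reading, or returning `None` when both are absent) is an assumption, as are the function name and sample values.

```python
def daily_mean_level(morning, evening):
    """Average the morning and evening water levels (m).

    None marks a missing reading. If only one reading exists it is used as is;
    if both are missing, None is returned (an assumed convention).
    """
    readings = [r for r in (morning, evening) if r is not None]
    if not readings:
        return None
    return sum(readings) / len(readings)


# Hypothetical readings for three days at one station.
days = [(0.40, 0.60), (None, 0.70), (None, None)]
daily = [daily_mean_level(m, e) for m, e in days]
```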
Data processing was conducted to make neural network training more efficient, which leads to better model performance and speeds up the learning process. The ranges of the input data did not differ greatly from one another; therefore, the data were not normalized. It was also assumed that the data had a Gaussian distribution. Outliers in the data sets were identified at the beginning, and corrections were made. The available data were randomly split into three subsets to avoid overfitting the model. Model training was conducted using 70% of the data, while model testing was conducted using 15% of the data. The remaining 15% of the data was used for model validation.
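A random 70/15/15 partition of this kind can be sketched as follows. The function name, the fixed seed, and the sample count are assumptions for illustration; the software actually used for the split is not specified here.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly partition range(n_samples) into train, validation, and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains (about 15%)
    return train, val, test


train_idx, val_idx, test_idx = split_indices(200)
print(len(train_idx), len(val_idx), len(test_idx))  # 140 30 30
```

The three index lists are disjoint and together cover every sample, so no record is used in more than one subset.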
The seasonality of the time series improved the performance of the model. As four seasons can be observed in Sri Lanka, analysis was performed on a seasonal timescale. The first inter-monsoon is from March to April, and the southwest monsoon is from May to September. The second inter-monsoon is from October to November, and the northeast monsoon is from December to February. Colombo is on the west coast of Sri Lanka and experiences heavy rainfall, mostly from May to September. Nevertheless, Colombo is exposed to rainfall throughout the year, as it has a tropical monsoon climate. In fact, global warming affects climate patterns worldwide.
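The seasonal grouping described above can be expressed as a simple month-to-season lookup. The month ranges come from the text; the function names and the sample records are assumptions.

```python
SEASON_BY_MONTH = {
    3: "first inter-monsoon", 4: "first inter-monsoon",
    5: "southwest monsoon", 6: "southwest monsoon", 7: "southwest monsoon",
    8: "southwest monsoon", 9: "southwest monsoon",
    10: "second inter-monsoon", 11: "second inter-monsoon",
    12: "northeast monsoon", 1: "northeast monsoon", 2: "northeast monsoon",
}


def season_of(month):
    """Return the Sri Lankan season name for a calendar month (1 to 12)."""
    return SEASON_BY_MONTH[month]


def group_by_season(records):
    """Group (month, value) pairs into a dict keyed by season name."""
    groups = {}
    for month, value in records:
        groups.setdefault(season_of(month), []).append(value)
    return groups


# Hypothetical (month, daily water level in m) records.
sample = [(3, 0.41), (6, 0.78), (10, 0.66), (1, 0.35), (4, 0.44)]
by_season = group_by_season(sample)
```

Each seasonal subset can then be used to train and evaluate a separate model.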
The Levenberg–Marquardt (LM) and Scaled Conjugate (SC) algorithms were used as the training algorithms of the neural network model. A trial-and-error procedure was followed to obtain the optimal neural network structure. The network’s most complex computations are carried out by the hidden layer. As neural network models are sensitive to the number of neurons in the hidden layer, a rigorous analysis was carried out in which the number of hidden neurons was varied from 1 to 40. The coefficient of correlation (R) and the mean squared error (MSE) were used as the performance indices of the models.
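The trial-and-error search over hidden-layer sizes can be sketched as a loop that keeps the size with the lowest validation MSE. The `evaluate` callable is a hypothetical stand-in for training a network and returning its validation MSE; the toy error curve below is constructed only so that the loop has something to search and is not a result from this study.

```python
def select_hidden_neurons(evaluate, candidates=range(1, 41)):
    """Return (best_n, best_mse), where evaluate(n) gives the validation MSE
    of a network with n hidden neurons."""
    best_n, best_mse = None, float("inf")
    for n in candidates:
        mse = evaluate(n)
        if mse < best_mse:
            best_n, best_mse = n, mse
    return best_n, best_mse


def toy_validation_mse(n):
    """Made-up error curve with its minimum at 8 neurons (illustration only)."""
    return 0.002 + 0.0001 * (n - 8) ** 2


best_n, best_mse = select_hidden_neurons(toy_validation_mse)
print(best_n)  # 8
```

In practice, each candidate size would usually be trained several times with different random initial weights, because a single run can be unrepresentative.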
5. Results and Discussion
As explained in the preceding section, the daily water levels at the Colombo flood detention area were simulated with inputs of daily rainfall, evaporation, maximum and minimum temperatures, relative humidity, and average wind speed. Seasonal time series analysis was conducted to obtain the best performance from the models. The best outcomes were obtained when there were 12 neurons in the hidden layer.
Seasonality is a characteristic of a time series in which the data experience regular, predictable changes that recur every year. Therefore, conducting the analysis on a seasonal basis was an important element of this research.
Figure 5a–d shows the scatter plots derived from the analysis based on the Levenberg–Marquardt algorithm for the first inter-monsoon period (March to April). The plots show strong correlations for the training, validation, testing, and all (combination) modes, with coefficients of correlation greater than 95%. The training results show the strongest correlation (97%).
Figure 6a–d shows the scatter plots for the southwest monsoon season (May to September) based on the LM algorithm. Strong correlations can be observed in all forms, although the testing results are lower than the rest. While the other forms show correlations above 95%, the testing results show an 88% coefficient of correlation, which is still strong but is the lowest of the four.
When the second inter-monsoon period (October to November) is considered based on the LM algorithm, as shown in
Figure 7a–d, strong coefficients of correlation (more than 95%) can be observed in all forms. The validation results show the strongest correlation of all forms (98%).
As shown in
Figure 8a–d, the northeast monsoon period (December to February) analysis based on the LM algorithm also shows strong correlations (about 90%), although they are not as strong as in the other seasons. Of all the forms, the validation results show the strongest correlation (93%).
Comparing all of the seasons, the second inter-monsoon period shows the strongest correlation between observed and predicted wetland water levels, with a level of correlation greater than 96%, as per the scatter plots shown in Figure 5, Figure 6, Figure 7 and Figure 8. The first inter-monsoon period shows the next strongest correlation, followed by the southwest monsoon period and, finally, the northeast monsoon period. Overall, all of the seasons show solid correlations between observed and predicted wetland water levels in the Colombo flood detention basin, with correlation levels of 88% or more.
Results similar to those of the LM algorithm were obtained in the scaled conjugate (SC) analysis. As presented in Figure 9a–d, the SC analysis for the first inter-monsoon period (March to April) shows strong correlations between observed and predicted wetland water levels. All forms score at least 86%, and the training results show the highest correlation (93%) in that season. The other forms also show strong correlations: 86%, 91%, and 92% for validation, testing, and all (the combination of all forms), respectively.
Figure 10a–d depicts the SC analysis conducted for the southwest monsoon period (May to September). Strong correlations (91% or more) are obtained for all forms. The testing results show the highest correlation (96%), while the training results show a similarly strong relationship (95%) between predicted and observed wetland water levels. The validation results show the lowest coefficient of correlation (91%) in the southwest monsoon period, but this cannot be considered a weak correlation.
Similarly, high levels of correlation were obtained when analysis was conducted for the second inter-monsoon period (October to November) on the basis of the SC algorithm (
Figure 11a–d). A 97% level of correlation was achieved by both training and validation results, while the testing results showed the lowest coefficient of correlation among the other forms, scoring an 89% level of correlation.
Comparatively moderate correlations were obtained for the analysis conducted for the northeast monsoon period (December to February) data, based on the SC algorithm (
Figure 12a–d). The testing results showed the lowest correlation (74%), compared to the other forms, which scored about 87–88%. However, the northeast monsoon results should not be regarded as weak, as every form scored at least 74%, which can be considered a moderately strong correlation.
When comparing all four seasons based on the SC algorithm, the strongest correlation (about 96%) was observed for the second inter-monsoon period, followed by the southwest monsoon (about 94%), the first inter-monsoon (about 92%), and, finally, the northeast monsoon (about 86%).
When considering both algorithms together, a broadly similar pattern was observed. The strongest coefficients of correlation were obtained for the second inter-monsoon period and the weakest for the northeast monsoon period, with the first inter-monsoon and southwest monsoon periods in between. Therefore, the strongest agreement between observed and predicted wetland water levels for the Colombo flood detention basin was found during the second inter-monsoon period.
Figure 13a–d depicts the model validation performances for the LM algorithm in all four seasons. The northeast monsoon achieved the lowest mean square error of the four models (0.0014), taking only five epochs to reach the best validation performance. The second inter-monsoon showed the highest mean square error (0.01), which was reached in four epochs. The mean square errors in all four seasons were close to zero, indicating good validation performance.
When considering the validation performances based on the SC algorithm, as shown in Figure 14a–d, the northeast monsoon showed the lowest mean square error of 0.002, taking 39 epochs to reach the best validation performance. The second inter-monsoon results showed the highest mean square error of 0.01, taking 34 epochs to achieve the best validation performance. Across all of the seasons, the plots showed low mean square error values, which indicates good validation performance.
With respect to the above plots, it can be clearly observed that the number of epochs taken by each of the seasonal results is lower in the LM algorithm compared to the SC algorithm.
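The epoch counts quoted above refer to the point of best validation performance. Assuming this is the epoch at which the validation MSE is lowest, as is typical of such performance plots, it can be located as sketched below; the function name and the error curve are hypothetical.

```python
def best_validation_epoch(val_mse_per_epoch):
    """Return (epoch, mse) for the lowest validation MSE; epochs count from 0.

    If several epochs tie, the earliest one is returned.
    """
    best_epoch = min(range(len(val_mse_per_epoch)),
                     key=val_mse_per_epoch.__getitem__)
    return best_epoch, val_mse_per_epoch[best_epoch]


# Hypothetical validation curve: the error falls, then starts to rise again.
curve = [0.050, 0.020, 0.008, 0.004, 0.003, 0.0035, 0.004]
print(best_validation_epoch(curve))  # (4, 0.003)
```

Stopping at this epoch, rather than training further, is a common safeguard against overfitting.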
In both analyses, the mean square error values are close to zero, which is a sign of good validation performance. The performance plots show low mean square error values at the end of the training phases, which indicates that the desired outputs and the artificial neural network’s outputs for the training sets are very close to each other. The following table (
Table 1) shows a summary of all the results obtained for each of the seasons using both algorithms.
Table 2 displays the results of the uncertainty analysis. The d-factors for the first inter-monsoon, southwest monsoon, second inter-monsoon, and northeast monsoon water levels are all low, indicating that the model is reasonably accurate in predicting wetland water level data.
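A d-factor is commonly computed as the average width of the 95% prediction uncertainty band divided by the standard deviation of the observations. The sketch below assumes that definition applies here and uses the population standard deviation; the function name, the band limits, and the observations are hypothetical.

```python
import math


def d_factor(lower, upper, observed):
    """Average uncertainty band width divided by the std. dev. of observations."""
    n = len(observed)
    mean_width = sum(u - l for l, u in zip(lower, upper)) / n
    mean_obs = sum(observed) / n
    # Population standard deviation (an assumed convention).
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    return mean_width / std_obs


# Hypothetical observed levels (m) with a band of +/- 0.05 m around them.
obs = [0.2, 0.4, 0.6, 0.8]
low = [o - 0.05 for o in obs]
high = [o + 0.05 for o in obs]
print(round(d_factor(low, high, obs), 3))
```

Under this definition, a smaller d-factor means a narrower uncertainty band relative to the natural variability of the observations.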
The results showcase the importance of predicting wetland water levels in the Colombo flood detention wetland. The prediction model shows good computational accuracy and robustness. Therefore, planners can use it as an initial model with which to understand the behavior of the wetland water levels. Sri Lanka is noted as one of the countries most affected by climate change, which is particularly important because it is a densely populated island in the Indian Ocean. In addition, the case study was carried out in the main strategic area of the country, Sri Lanka’s administrative and commercial capital. Therefore, prediction of wetland water levels in the surrounding areas is essential.
However, the accuracy of the prediction model depends entirely on the quality of the data used to train the ANN model. Sri Lanka has a dense rain gauge network; however, other meteorological data are lacking. Therefore, the spatial distribution of meteorological data can influence the prediction model, as the Colombo flood detention wetland covers a wide area. Furthermore, the meteorological data were mostly recorded on a daily basis, so the finer temporal resolution of extreme weather events is not captured. Such events can unfold on an hourly basis and were not recorded, which could also affect the results of the study.
Furthermore, one of the major difficulties faced in conducting this research was the unavailability of continuous water level records for the Colombo wetlands. This is due to several reasons. Poor maintenance of the water gauges was one of the major issues identified during the research. Furthermore, there have been several developments and reclamations (legal as well as illegal) associated with the wetlands in the Colombo region. Due to the ignorance and negligence of the relevant authorities and the community, the wetlands are still not properly managed.
As already stated, this is the first study on wetland water level prediction in Sri Lanka. Therefore, there are no local studies with which to compare the results obtained from this analysis. However, the literature contains some similar studies that reported similar performance. Choi et al. [
26] used similar independent variables to predict the wetland water levels of the Upo wetland and found acceptable results. However, these studies should not be used for a comparative analysis within the Sri Lankan context. The climate patterns in Sri Lanka, including monsoons and other extreme events, differ from those in other countries. In addition, the Colombo flood detention wetland is unique in its topography and soil conditions.
6. Conclusions
This study demonstrates the applicability of artificial neural networks for modeling daily water levels in wetlands. The Colombo flood detention area, which consists of three marshes/wetlands, was selected as the study area. It is an area highly vulnerable to flooding due to the rapid shrinkage of the wetlands in that area. Daily meteorological data were selected as the inputs to the neural network model, including rainfall, evaporation, minimum and maximum temperatures, relative humidity, and average wind speed. Two neural network training algorithms, namely the Levenberg–Marquardt algorithm and the Scaled Conjugate algorithm, were used to train the models. Model training was conducted using 70% of the input data, and model testing and validation each used 15% of the data. Analysis was performed on a seasonal timescale by considering the four seasons in Sri Lanka. It was found that the model gave the best results with 12 neurons in the hidden layer.
Model results revealed that both the Levenberg–Marquardt algorithm and the Scaled Conjugate algorithm performed well in simulating wetland water levels. However, the performance indicators were better under the LM algorithm, and the coefficient of correlation values likewise favored it. Therefore, it can be concluded that the LM algorithm produces better results for modeling the wetland water level than the SC algorithm.
We selected the Colombo flood detention area as the case study area because it can be considered a critical wetland in Sri Lanka. As the capital city of Sri Lanka, Colombo is undergoing rapid development, while wetland coverage in the Colombo region is being rapidly reduced. Therefore, we have tried to emphasize the value of the Colombo flood detention area and the importance of developing a model to predict its water levels. This study can be considered a foundational one, and we recommend further developing the model to address the issues mentioned above.
Therefore, the prediction model can be used for future forecasting analyses within a local context. The availability of meteorological data is essential for such analyses. The model can be used to predict short-term water levels in the Colombo flood detention basin. Given future climate data, such as modeled Representative Concentration Pathway and Shared Socioeconomic Pathway climatic data, the model could also be used to predict long-term wetland water levels. However, these modeled data have to be bias-corrected before they are fed into the prediction model. The results can be used effectively to support policy decisions on the management and conservation of wetlands.