A Multi-Step Approach for Optically Active and Inactive Water Quality Parameter Estimation Using Deep Learning and Remote Sensing

Abstract: Water is a fundamental resource for human survival, but the consumption of water that is unfit for drinking leads to serious diseases. Access to high-resolution satellite imagery provides an opportunity for innovation in the techniques used for water quality monitoring. With remote sensing, water quality parameter concentrations can be estimated based on the band combinations of the satellite images. In this study, a hybrid remote sensing and deep learning approach for forecasting multi-step parameter concentrations was investigated for the advancement of the traditionally employed water quality assessment techniques. Deep learning models, including a convolutional neural network (CNN), fully connected network (FCN), recurrent neural network (RNN), multi-layer perceptron (MLP), and long short term memory (LSTM), were evaluated for multi-step estimations of an optically active parameter, i.e., electric conductivity (EC), and an inactive parameter, i.e., dissolved oxygen (DO). The estimation of EC and DO concentrations can aid in the analysis of the levels of impurities and oxygen in water. The proposed solution will provide information on the necessary changes needed in water management techniques for the betterment of society. EC and DO parameters were taken as independent variables with dependent parameters, i.e., pH, turbidity, total dissolved solids, chlorophyll-α, Secchi disk depth, and land surface temperature, which were extracted from Landsat 8 data from the years 2014-2021 for the Rawal stream network. The bidirectional LSTM obtained the best results, with a root mean square error (RMSE) of 0.2 (mg/L) for DO and an RMSE of 281.741 (μS/cm) for EC. The results suggest that a hybrid approach provides efficient and accurate results in feature extraction and evaluation of multi-step forecasts of both optically active and inactive water quality parameters.


Introduction
Water is an essential resource for human survival on Earth. However, water quality deterioration is a common occurrence due to various anthropogenic activities, including the improper disposal of sewage and other waste materials, construction and poor agricultural practices [1,2]. Water bodies can also be physically affected by natural factors such as the erosion of soil [3]. It is important to continuously monitor any deterioration in quality and plan for appropriate recovery mechanisms such as the use of aerators, linings, biological treatments, embankments, etc. The most commonly used parameters for analyzing water quality include physico-chemical parameters such as pH, conductivity and turbidity. These parameters are usually gathered manually and later tested in laboratories to measure water quality, which can be a tedious and time-consuming task. In Pakistan, as in most other countries, these traditional methods and tools are used for collecting and analyzing water samples [4][5][6]. Moreover, this requires human intervention and depends on the ready availability of data collection sites. Overall, this can lead to delayed action in response to events, leading to a deterioration in water quality. Traditionally, water quality estimation studies focus on predicting the water quality index (WQI) value, which is a multi-classification problem. However, water quality indices are biased as they are developed for a specific place and use a limited number of parameters. Thus, such indices are not applicable to all water types as they are dependent on the core physico-chemical water parameters, the location and the frequency of data sampling. With the recent advancements in remote sensing technology, a more generic approach can be used for acquiring timely data and increasing coverage in assessing water quality for any drinking water reservoir [7][8][9][10]. 
In remote sensing, the water quality is monitored by measuring the parameters that change the spectral properties of water bodies upon their interaction with light. These are known as the optically active constituents of water. On the other hand, there also exist components that do not show any direct detectable signals but can be estimated as they show high correlations with the detectable water quality parameters and these are referred to as the optically inactive parameters of water [11,12]. However, remote sensing alone does not have the capability to assess the water quality with precise and accurate results. Thus, modern techniques involving the combination of remote sensing and AI for accurate and timely water quality forecasts can be a more useful approach [13,14].
As for multi-step forecasting, researchers have been looking for more suitable models as the state-of-the-art artificial neural networks (such as MLP) directly consider each time point independently and discard much of the information in historical data in order to make a prediction at each time step [15,16]. Here, deep-learning-based regression models have been proven to be more effective as compared to machine learning models in solving complex regression problems such as multi-output multivariate time series forecasting [17]. The traditional models lack the ability to capture real-world dependencies, whereas deep neural networks such as recurrent neural network (RNN) and long short term memory (LSTM) models can be very powerful in this regard [18,19]. This is especially true for multi-output problems, where temporal dependencies need to be detected to make future forecasts, as in the case of weather forecasting [20].
The use of deep learning on remote sensing data for water quality parameter estimation is very limited. However, the work on water quality estimation through remote sensing has been utilized in this study. Due to the availability of various satellite images, water quality parameters have been investigated and various researchers have proposed different estimation algorithms for calculating water quality parameters. These studies have used satellites including Landsat, Sentinel and MODIS. Most of the studies have focused on optically active parameters, such as Chl-α [21,22], temperature [11], turbidity [23] and total suspended solids [24,25]. The reflection characteristics of optically active variables have allowed researchers to estimate parameters using semi-empirical/semi-analytical methods. These methods are used to establish patterns between the band wavelengths and the water quality parameters and to derive formulas for parameter estimations. For example, turbidity is calculated using bands 2 to 5 [26] and wavelength bands of 645 nm and 859 nm [27] of Landsat 8 images. Chl-α is extracted from images of Sentinel-2A [28]. However, parameters with weak optical characteristics are also important for assessing the water environment. Such water quality parameters can be derived from the optically active parameters [29]. Optically inactive parameters are also retrieved through remote sensing [30]. For example, DO, an optically inactive parameter, is retrieved through regression methods that relate the remote sensing data to field data based on the ratio of Bands 2 and 4 [31].
With the advent of artificial intelligence, machine learning is gradually being applied on remote sensing data. The use of machine learning techniques for water quality parameter estimation is traditionally carried out with models such as support vector machines (SVMs) [32]. Similarly in [33], 12 water quality parameters including DO, EC, nitrate, nitrite, pH, turbidity, etc., were extracted from the Karun River and the water quality index (WQI) was estimated with the use of a M5 Model Tree classifier that exhibited an RMSE of 1.412 and an MAE of 0.0274, in combination with the Gamma test technique, which was applied to the acquired data for data reduction purposes. An artificial neural network (ANN) model in combination with a linear regression model was used to extract total phosphorus and total nitrogen concentrations from Landsat 8 images [34]. Other regression-based models, including evolutionary polynomial regression, have been used to predict DO, biochemical oxygen demand (BOD) and chemical oxygen demand (COD) with nine independent variables i.e., pH, turbidity, nitrite, nitrate nitrogen, phosphate, calcium, magnesium, sodium and EC, giving RMSE values of 4.417, 4.999 and 5.557 for DO, COD and BOD, respectively [35]. A deep neural network (DNN) was proposed, using multiple hidden layers between the input and output layers and this network performed well in resolving complex problems with high accuracy [36]. Deep-learning-based regression models are very effective as compared to traditional models in solving complex regression problems such as the forecasting of water quality parameters. A CNN model was used to estimate the concentrations of phycocyanin and chl-α using airborne hyperspectral imagery [37]. 
In [38], deep-learning-based regression models were applied to remote sensing images of the Guanhe river in China to estimate optically inactive water quality parameters (zinc, the permanganate index, total nitrogen, and total phosphorus) with a coefficient of determination (R²) greater than 0.6. A hybrid approach using a traditional model (ARIMA) and neural network model was investigated for water quality time series prediction, resulting in RMSE values of 0.039, 0.063, and 0.051 for water temperature, boron and DO, respectively [39]. A regression convolutional neural network (RegCNN) was proposed for multi-step wastewater treatment prediction with an MSE of 0.05 [40].
The literature has revealed that, overall, the use of remote sensing techniques for the estimation of water quality parameters is a much faster and economical method, with minor concerns regarding the accuracy of the parameters retrieved. In addition, the studies have discussed the importance of deep learning models in multi-step water quality forecasts. However, less work has been conducted on utilizing the combination of both techniques for water quality monitoring. Thus, in this study, an approach utilizing both remote sensing and deep learning techniques applied to optically active and inactive water quality parameter estimation was investigated.
In this study, data were acquired for the stream network of the Rawal watershed. The Rawal watershed area consists of land as well as water streams. Hence, the stream network was extracted from the Rawal watershed using GIS tools. A digital elevation model (DEM) was created with Shuttle Radar Topography Mission (SRTM) data to extract the stream network. A total of eight water quality parameters were extracted from Landsat 8 (Collection 1 Level 1(C1 L1)) images for the period from 2014 to 2021. Amongst these eight parameters, six were optically active and two were optically inactive parameters. The optically active water quality parameters included "turbidity", "total dissolved solids (TDS)", "electric conductivity (EC)", "Chlorophyll-α (chl-α)", "Secchi disk depth (SDD)" and "land surface temperature (LST)". The optically inactive parameters were "pH" and "dissolved oxygen (DO)". Out of the eight parameters, seven were taken as dependent variables to estimate the future concentrations of the inactive parameter 'DO', which was considered an independent variable. Similarly, 'EC' was considered an independent variable amongst the eight parameters, whereas the remaining seven parameters were taken as dependent variables. The estimation of the EC and DO concentrations was chosen as these parameters are crucial in monitoring water quality. EC and DO help to identify the level of impurities and the level of oxygen in the water bodies, which can help analyze the survival of fish and other aquatic organisms. In addition, to analyze the performance of deep learning models on multivariate multi-step forecasts; various deep learning models including a convolutional neural network (CNN), fully connected network (FCN), recurrent neural network (RNN), multi-layer perceptron (MLP) and five variants of LSTMs [41] that included vanilla, stacked, bidirectional, convolutional and CNN LSTMs were evaluated. 
This study was limited to the satellite imagery collected for the years 2014 to 2021 that covered the Rawal watershed area. Moreover, the optically active and inactive water quality parameters, i.e., EC and DO, were estimated for current and future events, using different water quality parameters with deep learning models. The study revealed that LSTMs demonstrated significantly good performance in multi-step forecasting for both optically active and inactive (EC and DO) parameters. The major contributions of this study are as follows:
1. The extraction of the stream network for the Rawal watershed from the SRTM DEM.
2. The extraction of a total of eight water quality parameters, six optically active and two optically inactive water quality parameters, by applying estimated band equations on Landsat 8 satellite imagery for the Rawal watershed stream network pertaining to the years 2014-2021.
3. The application of deep learning models for current and future multi-step forecasting of an optically active parameter, i.e., EC, and an optically inactive parameter, i.e., DO, using optically active/inactive water quality parameters. The analysis conducted using the deep learning models demonstrated the decline in water quality over the eight-year period and revealed that the factors that have contributed to the deterioration in water quality include seasonal variations and other environmental variables.
The value of using a remote sensing and machine learning approach was that it led to some important conclusions, including the identification of (i) the fact that the quality of water declined over the eight-year period, as well as (ii) the factors that contributed to this deterioration in water quality. In this study we aimed to find practical methods to analyze the factors affecting the water quality and to investigate the changes needed in the traditional water quality monitoring techniques for the betterment of society on a global scale. This will improve the socio-economic environment, which is dependent on an appropriate standard of water quality for its development, which may include activities such as agricultural operations. Therefore, the proposed solution can be used as a guideline for applications in other drinking water reservoirs besides the current study area. The hybrid deep learning and remote sensing approach can promote innovation in state-of-the-art water quality management and assessment techniques.
The paper is organized as follows. Section 2 covers the proposed methodology for the extraction of the optically active and inactive water quality parameters and the application of deep learning models is discussed. The results of the deep learning models are elaborated in Section 3. In Section 4, the conclusions and future works in this area of research are presented.

Materials and Methods
In this paper, we introduce a deep-learning-based multi-step forecasting approach for estimating one optically active and one optically inactive water quality parameter, i.e., EC and DO, for the study area of the Rawal watershed stream network. Figure 1 illustrates the methodology employed for creating the desired model. The process is divided into four main steps. These steps and the respective methods used are discussed in this section.

Study Area
The Rawal watershed covers an area of 272 sq km within longitudes 73°03'-73°24' E and latitudes 33°41'-33°54' N [42]. The watershed area is surrounded by highly populated places, which results in water quality deterioration due to anthropogenic activities such as improper sewage disposal. Water is received from 4 major streams and 43 small streams. The Rawal watershed encompasses land, as well as the water tributaries. Thus, to extract the parameter values from only the water bodies, the study area was enhanced by producing a stream network using SRTM DEM data, and this can be seen in Figure 2. Stream Network: The production of the stream network required the latest map of the Rawal watershed area. However, due to construction and development over the years, the most recent map of the watershed did not show a high-resolution image of the area of the water streams. To overcome this issue, GIS tools were utilized to extract only the water bodies from the Rawal watershed. The resultant stream network was produced using the SRTM data. The SRTM images of the desired area were mosaicked together using ArcGIS software [43]. Later, flow direction and flow accumulation were calculated with the Hydrology toolset in ArcGIS software to produce the required DEM. This tool helped to model the flow of water across the Rawal DEM. The Rawal stream network can be seen in Figure 2.

Data Acquisition
The data were acquired in the form of multi-spectral satellite images from Landsat 8 (C1 L1) satellite data from the archives of the United States Geological Survey (USGS) [44]. The Landsat images were observed for the years 2014 to 2021, comprising a total of 327 images. However, in the data preprocessing phase, 167 out of 327 satellite images were found to show or cover the Rawal lake area. These preprocessed satellite images were used to perform band calculations to acquire water quality parameters from only the water streams located in the watershed. A stream network was produced using SRTM DEM data and then both optically active and inactive parameters, including pH, turbidity, DO, TDS, EC, chl-α, SDD and LST, were extracted. Five thousand sample points were retrieved from each satellite image after the calculation of the water quality parameters, i.e., 820,848 data points in total, as seen in Table 1. Figure 3 shows the acquisition process for a single Landsat image and the features extracted for a single data point. Each parameter selected for this study plays a key role in the monitoring of water health [45]. For example, the LST parameter is responsible for many water-borne processes. Similarly, high and low values of pH determine the usability of water. pH values in the range of 6.5 to 8 are considered ideal for the productivity of fish and other aquatic organisms. EC is an important indicator of pollution or some other discharge in the water body. On the other hand, the turbidity and SDD parameters signify the clarity of water, which can determine the depth at which photosynthesis can take place in the water body. Thus, aquatic organisms are dependent on water turbidity for survival, as highly turbid water can impact the level of DO, which will affect the growth rate. The Chl-α parameter indicates the presence of algae growth, which is essential for photosynthesis and oxygen production.
Another important parameter is DO, which has the highest significance amongst the other variables, as all respiring organisms are dependent on it for their survival. Moreover, the Landsat 8 remote sensor, used to retrieve data on these water quality parameters, has 11 bands with spatial resolutions of 15-100 m. The parameters that were successfully retrieved based on the band calculations used in previous studies include the pH, turbidity, DO, TDS, EC, chl-α, SDD and LST. These eight parameters were then reproduced for the selected study area.

Water Quality Parameter Extraction from Landsat Images
Landsat 8 (C1 L1) satellite data for an eight-year time period, i.e., 2014 to 2021, were used to extract the optically active and inactive water quality features, using different band combinations. These satellite images have 11 bands at spatial resolutions of 30 m (Bands 1-7, 9), 100 m (Bands 10, 11), and 15 m (Band 8). A total of 167 images were retrieved and 5000 samples were extracted from the stream network for each image. To extract each water quality feature from the images, an estimation algorithm was applied, which involved the following steps.

1. Conversion of Digital Numbers (DN) to Top-Of-Atmosphere (TOA) Reflectance: Preprocessing of the satellite images comprised operations including atmospheric or geometric correction and normalization. The first step of retrieving the water quality features involved the conversion of the DNs, i.e., the pixel values in the satellite image, to TOA reflectance values. TOA reflectance values include contributions from clouds, atmospheric aerosols and gases. The DNs are converted to TOA reflectance values using rescaling coefficients and parameters found in the metadata file provided with the data and using the following expression:

$$R_x = M_P \times Q_{cal} + A_P \tag{1}$$

In Equation (1), $R_x$ is the TOA reflectance for band number x; $M_P$ is REFLECTANCE_MULT_BAND_x; $Q_{cal}$ is the standard pixel value (DN) of band x; and $A_P$ is REFLECTANCE_ADD_BAND_x, where x is band 2, 3, 4, 5 or 6. The values REFLECTANCE_MULT_BAND_x and REFLECTANCE_ADD_BAND_x are stored in the metadata file supplied with each image. REFLECTANCE_MULT_BAND_x is the multiplicative reflectance correction value to be applied to each input band and its default value for Landsat 8 is 0.00002. Similarly, REFLECTANCE_ADD_BAND_x is the additive reflectance correction value to be applied to each input band and its default value is −0.1 [46].

2. Application of the Estimated Equations: The optically active/inactive features were then calculated by applying the algorithms given in Table 2. These methods were selected as they performed the best amongst others for the selected study area. Band math analysis was applied to the images using the Google Earth Engine. A total of 0.82 M sample points for every feature were extracted.
Each feature was calculated based on different band combinations. The optically inactive pH feature used a combination of bands 3, 4 and 6. The optically active turbidity feature was extracted with bands 3, 4 and 5. Bands 1 and 5 were used to extract EC and TDS. A combination of bands 2 and 4 was used to extract DO and SDD. Finally, bands 2, 3, 4 and 5 were used to retrieve chl-α.

3. Evaluation of the Equation: The methods were evaluated by comparing them with the observed ground parameters for the study area. The best-performing method on the selected study area was selected for extracting the sample points.

LST (entry recovered from Table 2): $L_{10\lambda} = M_L \times Q_{cal} + A_L$
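The DN-to-reflectance conversion in step 1 can be illustrated with a minimal NumPy sketch. It assumes the default Landsat 8 coefficients quoted in the text (0.00002 and −0.1); in practice these would be read per band from the metadata file. The function name and the small input patch are hypothetical and serve only as an illustration, not as the processing chain used in the study.

```python
import numpy as np

# Default Landsat 8 rescaling coefficients quoted in the text; in practice
# they are read per band from the metadata file supplied with each scene.
REFLECTANCE_MULT_DEFAULT = 2.0e-5
REFLECTANCE_ADD_DEFAULT = -0.1


def dn_to_toa_reflectance(dn, mult=REFLECTANCE_MULT_DEFAULT, add=REFLECTANCE_ADD_DEFAULT):
    """Apply R_x = M_P * Q_cal + A_P element-wise to an array of digital numbers."""
    q_cal = np.asarray(dn, dtype=np.float64)
    return mult * q_cal + add


# Hypothetical 2 x 2 patch of digital numbers for one band (not real scene data).
band_dn = np.array([[7500, 8200], [9100, 10000]])
band_toa = dn_to_toa_reflectance(band_dn)
```

The same function can be applied to each band that enters the band combinations of step 2, passing the band-specific coefficients where they differ from the defaults.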

Deep Learning Models
In this study, various deep learning models, including MLP, CNN, FCN, RNN, and five variants of LSTMs, were considered for the comparison of their estimations of the water quality parameter concentrations. For time series problems, deep learning models such as RNN and LSTM can discover dependence in the historical data with the patterns in their networks. Deep networks such as CNN are used for image and video classification problems but they can also be used for sequential data. In the following, we introduce the parameters and the structures of the deep learning models used in this study.

Multi-Layer Perceptron (MLP)
In this study, the MLP model was made up of three dense layers. The first layer had 128 neurons and the second layer had 64 neurons, each followed by a rectified linear unit (ReLU) activation. The ReLU activation function was used as it is fast, simple and works well with deep neural networks, compared to other activation functions. The second layer was followed by a dropout layer, a regularization layer used in deep neural networks to avoid overfitting [52]. An optimizer was used to minimize the loss function to an acceptable level.
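A forward pass through such an MLP can be sketched in plain NumPy as follows. The layer widths (128 and 64) and the ReLU and dropout placement follow the description above; the input size (eight features at three lag steps, flattened), the single output, the dropout rate of 0.2 and the random weight initialization are assumptions made only for illustration, since the text does not state them.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def init_dense(n_in, n_out):
    """Small random weights and zero biases for one dense layer (illustrative init)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def mlp_forward(x, params, drop_rate=0.2, training=False):
    """Dense(128) + ReLU -> Dense(64) + ReLU -> Dropout -> Dense(output)."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h = relu(x @ w1 + b1)
    h = relu(h @ w2 + b2)
    if training:
        # Inverted dropout: zero a random subset of units and rescale the rest.
        keep = rng.random(h.shape) >= drop_rate
        h = h * keep / (1.0 - drop_rate)
    return h @ w3 + b3


# Assumed sizes: 8 features x 3 lag steps flattened to 24 inputs, one output value.
n_inputs, n_outputs = 24, 1
params = [init_dense(n_inputs, 128), init_dense(128, 64), init_dense(64, n_outputs)]
batch = rng.random((4, n_inputs))
prediction = mlp_forward(batch, params)
```

Dropout is only active when `training=True`, so repeated inference calls on the same batch return identical values.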

Convolutional Neural Network (CNN)
A one-dimensional CNN was employed for estimating water parameter concentrations in this study, and this did not differ much from a regular CNN model [53,54] other than the fact that the convolutional hidden layer operated on one-dimensional sequential data [55]. In this study, the first convolutional layer was followed by a second layer and then a pooling layer that summarized the features by filtering the output of the preceding convolutional layers. The convolutional and pooling layers were followed by a flatten layer to reduce the input to a single one-dimensional vector. Then a dense fully connected layer was implemented to interpret the extracted features.
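The layer sequence described above (two convolutional layers, a pooling layer, a flatten step and a dense layer) can be sketched in NumPy as follows. The kernel width, number of filters, pooling size and window length are not given in the text, so the values below are assumptions chosen only to make the shapes easy to follow.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1d_relu(x, kernels, bias):
    """Valid 1D convolution followed by ReLU.

    x: (steps, channels); kernels: (width, channels, filters).
    """
    width = kernels.shape[0]
    steps_out = x.shape[0] - width + 1
    out = np.empty((steps_out, kernels.shape[2]))
    for t in range(steps_out):
        # Contract the (width, channels) window against every filter.
        out[t] = np.tensordot(x[t:t + width], kernels, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)


def max_pool1d(x, size=2):
    """Non-overlapping max pooling along the time axis."""
    steps_out = x.shape[0] // size
    return x[:steps_out * size].reshape(steps_out, size, x.shape[1]).max(axis=1)


def cnn_forward(window, layers):
    (k1, b1), (k2, b2), (w_out, b_out) = layers
    h = conv1d_relu(window, k1, b1)
    h = conv1d_relu(h, k2, b2)
    h = max_pool1d(h, size=2)
    flat = h.reshape(-1)  # flatten to a single one-dimensional vector
    return flat @ w_out + b_out  # dense layer interpreting the extracted features


# Assumed sizes: 8 time steps, 8 features, 16 filters, kernel width 2.
n_steps, n_features, n_filters, width = 8, 8, 16, 2
layers = [
    (rng.normal(0.0, 0.1, (width, n_features, n_filters)), np.zeros(n_filters)),
    (rng.normal(0.0, 0.1, (width, n_filters, n_filters)), np.zeros(n_filters)),
    (rng.normal(0.0, 0.1, (3 * n_filters, 1)), np.zeros(1)),  # 8 -> 7 -> 6 steps, pooled to 3
]
window = rng.random((n_steps, n_features))
estimate = cnn_forward(window, layers)
```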

Fully Connected Network (FCN)
The FCN model employed for this study was the same as the architecture originally proposed by Wang et al. in 2017 [56], composed of three convolutional blocks, in which each convolution is followed by a batch normalization fed to a ReLU activation function with a slight change in the pooling layer where instead of taking the average, the results are fed to a max pooling layer. Finally, this is followed by a dense layer to obtain the final output.

Recurrent Neural Network (RNN)
The RNN employed in this study for predicting multi-step water quality parameter concentrations is known as the stacked RNN. It uses a combination of multiple recurrent neural networks [57]. The model had 2 layers; the first RNN layer was followed by a dropout layer and then the second RNN layer, followed by the final dense layer.

Long Short Term Memory (LSTM) and Its Variants
LSTM was first proposed by Hochreiter and Schmidhuber in 1997 [58], and it is popular due to its internal self-looped cell that captures the dynamic characteristics of a time series problem. Five variants of LSTM models were evaluated in this study. The vanilla LSTM (V-LSTM) is the most commonly used LSTM, proposed by Graves and Schmidhuber. It has a single hidden layer with forget gates and an output layer. The stacked LSTM (S-LSTM) is simply an LSTM model that has multiple hidden layers, each stacked one on top of another. All layers use the output of the previous layer as their input. The final output is passed on to a fully connected layer for classification. The bidirectional LSTM (Bi-LSTM) learns both forwards and backwards, as proposed in [59]. This model is capable of accessing long-range context in both directions. The model is trained using back-propagation through time (BPTT) [60]. The Conv-LSTM model has units that directly read the convolutional input. Conv-LSTM tends to preserve spatial information, which can help in the reconstruction of data. CNN-LSTM is a hybrid of the CNN model with an LSTM backend. This hybrid model uses a CNN to interpret subsequences of input and passes them together to the LSTM model to interpret. Each input is passed through a convolutional and max pooling layer.
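To make the bidirectional idea concrete, the following NumPy sketch runs one LSTM over a window in the forward direction and a second LSTM over the reversed window, then concatenates the two final hidden states. It uses the standard LSTM gate equations; the number of units (16), the window shape (3 steps, 8 features) and the random weights are assumptions for illustration and do not reflect the trained models of this study.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(n_features, n_units):
    """Weights for the input, forget, cell and output gates, stacked column-wise."""
    w = rng.normal(0.0, 0.1, size=(n_features + n_units, 4 * n_units))
    b = np.zeros(4 * n_units)
    return w, b


def lstm_last_state(sequence, w, b):
    """Run an LSTM over sequence (steps, features) and return the final hidden state."""
    n_units = b.size // 4
    h = np.zeros(n_units)
    c = np.zeros(n_units)
    for x_t in sequence:
        z = np.concatenate([x_t, h]) @ w + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # forget old state, add new candidate
        h = sigmoid(o) * np.tanh(c)
    return h


def bilstm_features(sequence, fwd, bwd):
    """Concatenate the final states of a forward pass and a backward pass."""
    h_fwd = lstm_last_state(sequence, *fwd)
    h_bwd = lstm_last_state(sequence[::-1], *bwd)
    return np.concatenate([h_fwd, h_bwd])


# Assumed sizes: window of 3 time steps, 8 features, 16 units per direction.
n_steps, n_features, n_units = 3, 8, 16
fwd = init_lstm(n_features, n_units)
bwd = init_lstm(n_features, n_units)
window = rng.random((n_steps, n_features))
features = bilstm_features(window, fwd, bwd)
```

A dense output layer applied to `features` would then produce the one-step or three-step estimate; a stacked LSTM would instead feed the full sequence of hidden states of one layer into the next.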

Training of the Deep Learning Model
Once the water quality features were extracted from the clipped Rawal stream network, the data were prepared for the multi-step forecasting problem. This step involved data preprocessing and normalization, before outputting data for the training of the deep model. The procedure is described below.

1. Data Preparation: The dataset was converted into a time series dataset by setting the timestamp column as the index.
2. Normalization: All features in the dataset were normalized in the range of 0 to 1, in a process referred to as min-max normalization. For every water quality feature, the values were in different units. For example, the pH of water was mostly in the range of 6 to 9. Similarly, the EC of water lay in the range of 400 μS/cm to 1000 μS/cm. Thus, to bring uniformity into the dataset, the values were normalized for each feature in the range of 0 to 1.
3. Series to Supervised: The dataset was then further transformed for a supervised learning problem by splitting the input sequence, i.e., the input data at the current time (t) were split into a three-dimensional shape (samples, time steps, features) for a multiple input multi-step time series, where a lag time (t − n) and further time steps (t + 1, t + 2, ..., t + n) were defined for features (n).
4. Training and test sets: The transformed dataset was then split into training and test sets. Here, the last 2 years' worth of data (220,845 samples) were selected as the test set and 5 years' worth of data (600,000 samples) were selected as the training set.
5. Model Parameters: The parameters (neurons, epochs, and hidden layers) of the deep model were initialized. Here, each deep model had a different set of parameters with a difference in dropout layers and hidden layers.
6. Model Evaluation: The model was evaluated based on three error metrics, i.e., root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). RMSE and MAE give the error in the same units as the predicted variable and MAPE is given as a percentage (%).
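The normalization and splitting steps above can be sketched with NumPy as follows. The sketch follows the order given in the list (normalize all features, then split chronologically without shuffling). The function names and the five example rows are hypothetical; the pH and EC values are only chosen to lie in the ranges mentioned in step 2.

```python
import numpy as np


def min_max_normalize(values):
    """Scale each column to [0, 1]; also return the per-column minimum and maximum."""
    values = np.asarray(values, dtype=np.float64)
    col_min = values.min(axis=0)
    col_max = values.max(axis=0)
    # Guard against division by zero for constant columns.
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (values - col_min) / span, (col_min, col_max)


def chronological_split(values, n_train):
    """Keep the time order: the first n_train rows train, the rest test."""
    return values[:n_train], values[n_train:]


# Hypothetical time-ordered rows: [pH, EC in uS/cm].
data = np.array([[6.0, 400.0], [7.5, 700.0], [9.0, 1000.0], [7.0, 550.0], [8.0, 850.0]])
scaled, (col_min, col_max) = min_max_normalize(data)
train, test = chronological_split(scaled, n_train=3)

# Mapping scaled values back to physical units, e.g., before reporting RMSE in uS/cm.
restored = scaled * (col_max - col_min) + col_min
```

Keeping the stored minimum and maximum makes it possible to express model outputs in the original units of each parameter.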

Results and Discussion
The aim of this study was to explore the use of different deep learning models in current and multi-step parameter estimation for both optically active and inactive water quality parameters, i.e., EC and DO. The results and findings are discussed in detail in this section. The models were assessed in terms of three error metrics, i.e., the root mean square error (RMSE), mean absolute error (MAE) and the mean absolute percentage error (MAPE). The RMSE and MAE both measure the error in the same units as the predicted variable. On the other hand, the MAPE indicates the error margin in the model forecast and is expressed as a percentage (%). Moreover, time series forecasting problems involve temporal dependencies. To preserve these dependencies, the data were split at a fixed point in time without shuffling. Hence, the training was performed on 0.6 M samples in their original order. A sample of the features calculated from the Landsat 8 images for the year 2021 is depicted in Figure 5 and the last twenty samples are shown in Table 3. The results of the deep learning algorithms (the CNN, FCN, RNN, MLP and LSTM variants) are assessed and each model's performance is compared on the basis of the lowest RMSE reached with the same number of epochs.
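For reference, the three metrics can be written in a few lines of NumPy using their standard definitions. The observed and predicted values in the example are made up for illustration and are not results from this study. Note that MAPE is undefined when an observed value is zero.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the predicted variable."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the predicted variable."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def mape(y_true, y_pred):
    """Mean absolute percentage error (%); y_true must not contain zeros."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))


# Made-up observed and predicted DO values (mg/L), for illustration only.
observed = [8.0, 10.0]
predicted = [7.0, 12.0]
scores = {
    "rmse": rmse(observed, predicted),
    "mae": mae(observed, predicted),
    "mape": mape(observed, predicted),
}
```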
The regression time series problem was framed in the following two formulations:

1. Predict the DO and EC at the current time event (t) given the eight water quality features at the prior time steps, that is, a lag time period of three (t − 3, t − 2, t − 1).
2. Predict the DO and EC for the next three events (t + 1, t + 2, t + 3) based on the eight water quality features at the prior time steps with a lag time period of one (t − 1).

Next, the results for both of these formulations are discussed. The LSTM variants showed exemplary performance as compared to the other deep learning models.
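The two formulations can be framed as sliding windows over the time-ordered feature table, as in the sketch below. The function `make_windows` and its `offset` argument are hypothetical names introduced here; the second call takes the notation of formulation 2 literally (inputs at t − 1, targets at t + 1 to t + 3), which is an assumption about how the windows were built. The toy arrays stand in for the eight features and one target parameter.

```python
import numpy as np


def make_windows(features, target, n_lag, n_out, offset=0):
    """Build inputs of shape (samples, n_lag, n_features) and targets (samples, n_out).

    For a reference time t, inputs cover t - n_lag .. t - 1 and
    targets cover t + offset .. t + offset + n_out - 1.
    """
    features = np.asarray(features, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    x, y = [], []
    last_t = len(target) - offset - n_out  # last t with a complete target window
    for t in range(n_lag, last_t + 1):
        x.append(features[t - n_lag:t])
        y.append(target[t + offset:t + offset + n_out])
    return np.array(x), np.array(y)


# Toy series: 10 time steps, 8 features; the target is simply the step index.
toy_features = np.arange(80.0).reshape(10, 8)
toy_target = np.arange(10.0)

# Formulation 1: lags t-3, t-2, t-1 -> value at t.
x_now, y_now = make_windows(toy_features, toy_target, n_lag=3, n_out=1, offset=0)

# Formulation 2: lag t-1 -> values at t+1, t+2, t+3.
x_future, y_future = make_windows(toy_features, toy_target, n_lag=1, n_out=3, offset=1)
```

The resulting input arrays have the three-dimensional shape (samples, time steps, features) expected by recurrent and convolutional models.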
Predictions of current event parameters: For current predictions of the optically active and inactive parameters, the last three lag events (t − 3, t − 2, t − 1) were used to predict the current time event (t). Figure 6 displays the results for the current EC predictions. It can be seen that S-LSTM outperformed the other deep models, followed by the Bi-LSTM, with RMSE values of 281.689 and 281.811 (μS/cm), respectively. Overall, the LSTM variants displayed a much better performance for the current time event prediction task. This shows that the LSTM-dominated variants outperformed the LSTM-integrated ones. On the other hand, FCN and RNN models exhibited high RMSE values of up to 301 (μS/cm). Figure 7 displays the results for the current DO prediction task. The best results were achieved with V-LSTM and Conv-LSTM, with RMSE values of 0.197 and 0.198 (mg/L), respectively. Here, the LSTM variants showed a better performance when compared with other deep models, with V-LSTM giving a MAPE of only 0.109%. For DO prediction, the RNN model demonstrated a high RMSE of 0.242 (mg/L).

Predictions of future event parameters: For multi-step forecasts, a lag time period of one (t − 1) was used to predict the next three events, i.e., t + 1, t + 2, and t + 3. For the future time event predictions of the optically active and inactive parameters, EC and DO, Bi-LSTM performed the best among the LSTM variants. For DO, V-LSTM and Bi-LSTM showed the minimum RMSE values of 0.2 and 0.199 (mg/L), respectively. Other variants, such as CNN-LSTM and Conv-LSTM, showed much better results than other deep models for the multi-step forecasting of DO, as shown in Table 4. The RNN model exhibited a high RMSE of 0.238 (mg/L). For EC, the best results were shown by two variants of LSTM as well, i.e., S-LSTM and Bi-LSTM, with RMSE values of 281.93 and 281.741 (μS/cm), respectively, as seen in Table 5.
Thus, for both current and future water quality forecasts, the LSTM variants showed much better results than the other deep models, and Bi-LSTM was the best performer among the LSTM variants. For EC, FCN and CNN showed high RMSE values of 296.46 and 294.38 (μS/cm), respectively.
Note(s): 1 The lowest MAPE retrieved. 2 The second lowest RMSE retrieved. 3 The lowest MAE retrieved. 4 The lowest RMSE retrieved.
Note(s): 1 The lowest MAPE retrieved. 2 The lowest MAE retrieved. 3 The second lowest RMSE retrieved. 4 The lowest RMSE retrieved.
Figures 8 and 9 show a year-wise comparison of the actual and predicted values from the Bi-LSTM model, the best-performing deep learning model, for DO and EC, respectively, for the years 2020 and 2021. Figure 8 shows that, for each time step, the error margin for the DO predictions was very low. However, for EC, the forecasts for October through December 2020 were less accurate, as seen in Figure 9. This could be because EC varies between the summer and winter seasons. EC values in winter are generally lower than those in summer due to the high evaporation losses in summer and the increased drainage water inflow [61]. Moreover, the year-wise analysis showed a decline in water quality over the eight-year period, as reflected in the decline in the observed concentrations of EC and DO. This decline can be attributed to seasonal variations and other environmental variables [62].

Conclusions
Rawal Lake is the main source of drinking water for the residents of Islamabad and Rawalpindi. However, the lake water is unfit for drinking, as it receives untreated sewage and other wastewater due to the increase in population. Water quality assessments are conventionally carried out manually and in laboratories, which is time-consuming. Advancements in remote sensing and related technologies can make water quality monitoring simpler and more robust. In this study, eight water quality features for the years 2014 to 2021 were calculated using Landsat 8 images of the study area, the Rawal stream network, which was extracted with SRTM DEM data using hydrological GIS tools. Six optically active water quality parameters, i.e., turbidity, Chlα, SDD, TDS, EC, and LST, and two optically inactive parameters, i.e., DO and pH, were taken as inputs to estimate the water quality parameters for current and future events.
The experiments were limited to predicting one optically active and one optically inactive water quality parameter, i.e., EC and DO, respectively. The multi-step water quality forecasts were made using different deep learning models, i.e., CNN, FCN, MLP, RNN, and five variants of the LSTM model. These comprised LSTM-dominated and LSTM-integrated versions: vanilla, stacked, bi-directional, convolutional, and a CNN-LSTM hybrid. The models were then compared on the basis of the lowest RMSE achieved. The results showed that the LSTM variants displayed the best performance in the current and future multi-step parameter estimations for both optically active and inactive parameters, with the bi-directional LSTM emerging as the leading variant among them. Moreover, the LSTM-dominated variants performed better than the LSTM-integrated versions for the observed problem.
The proposed approach, combining remote sensing and machine learning, indicated that the water quality declined over the eight-year period, as observed through the concentrations of the water quality variables. These concentrations are in turn affected by seasonal variations and other environmental variables. Thus, in future work, additional water quality parameters can be used for multi-step water quality parameter estimations and forecasts. Environmental variables, which may include air quality parameters, slope, soil type, and the geology and lithology of the study area, can also be considered when examining water quality parameters.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: