Effects of Meteorological Parameters on Surface Water Loss in Burdur Lake, Turkey over 34 Years Landsat Google Earth Engine Time-Series

The current work aims to examine the effect of meteorological parameters as well as the temporal variation on the Burdur Lake surface water body areas in Turkey. The data for the surface area of Burdur Lake was collected over 34 years between 1984 and 2019 by Google Earth Engine. The seasonal variation in the water bodies area was collected using our proposed extraction method and 570 Landsat images. The reduction in the total area of the lake was observed between 206.6 km2 in 1984 to 125.5 km2 in 2019. The vegetation cover and meteorological parameters collected that affect the observed variation of surface water bodies for the same area include precipitation, evapotranspiration, albedo, radiation, and temperature. The selected meteorological variables influence the reduction in lake area directly during various seasons. Correlations showed a strong positive or negative significant relationship between the reduction and the selected meteorological variables. A factor analysis provided three components that explain 82.15% of the total variation in the data. The data provide valuable references for decision makers to develop sustainable agriculture and industrial water use policies to preserve water resources as well as cope with potential climate changes.


Introduction
The availability of water resources is necessary for human economic prosperity, agriculture, drinking, and the promotion of sustainable development [1][2][3]. Climate change, in combination with urbanization, population growth, and water consumption is projected to intensify the strain on natural water resources [4,5]. Interannual and intra-annual changes in surface waters can be dramatically affected by climate change and human activity with far-reaching implications for habitats and human society [6,7].
Climate change has adverse effects on the water bodies in Turkey and in the rest of the countries [8][9][10][11]. The Burdur Lake basin, located in the Turkish Mediterranean region, has been subjected to significant environmental degradation, including increased emissions and a decline in water levels [12]. It is critical for the future development of the area to determine the origin of this disaster and whether it is natural (climate, seismic) or anthropogenic [13]. To ensure the river basin's long-term economic and social development, as well as the ecosystem's resilience, the spatial-temporal variance characteristics of surface waters need to be precisely mapped [14]. Former research used several algorithms and different data sources to map the surface water [15]. Due to the high frequency, low cost, and repeatability of measurements, satellite-based methods have advantages in surface water mapping over conventional methods [16]. Recently, Landsat, IKONOS, QuickBird MODIS, and Sentinel [17,18] satellite images were used to explore and evaluate the areas of surface water at global, continental, and regional scales.
In the meantime, several satellite-based methods for detecting surface water have been established. There are two types of surface water detection algorithms: thematic water surface extraction algorithms and classification algorithms. Classification algorithms consist of Maximum Likelihood [19], Artificial Neural Networks [20][21][22], Support Vector Machine [23][24][25], Random Forest [26,27], Decision Tree [28], Naïve Bayes [29,30], and Logistic Regression [30]. However, since these approaches require human experience for the manual tasks involved in the data processing, skills to pick samples and train algorithms as well as a significant technological infrastructure such as high-performance computing machines capable of delivering the necessary processing, analysis, and storage platform, they are challenging to use to rapidly map water bodies using multitemporal images over a wide area, a large region, or on a global scale [29,31]. In contrast, satellite spectral bands and various types of water indices are the two main components of thematic water surface extraction algorithms. Multiband water indices such as the Normalized Difference Water Index (NDWI) exploit the variations in reflectance between the visible and infrared bands of the electromagnetic spectral spectrum. The low reflection and high absorption of long wavelengths of electromagnetic energy in water, especially in the near-infrared sections of the electromagnetic spectrum, are used in most water indices. Meanwhile, water indices have been successfully utilized for mapping surface water using remotely-sensed data. The main advantages of water indices are the precise, quick, easy, and reproducible mapping of surface water to investigate the dramatic interannual and intra-annual surface water volatility [31]. Xu utilized the Modified Normalized Difference Water Index (MNDWI) to enhance water features. The study reported that zero threshold works much better for the MNDWI than the NDWI [32]. However, the short-wave infrared bands (SWIR) are not available in Landsat 1, 2, 3, and 4.
Although several satellites have advanced capabilities to map water bodies such as Sentinel-1 SAR imagery, which does not have the cloud limitation [33], these satellites have only been available for a few years. Landsat imagery is considered as the most suitable images to extract the NDWI maps for a long period [34]. In addition, the Normalized Difference Vegetation Index (NDVI) was employed because it is well known and easy to apply. Furthermore, much of the research used these indices because of their reliable results. Moreover, the NDVI is used to determine land cover changes [35][36][37], drought monitoring [38], and forest health tracking [39], etc.; the NDWI is used to detect the water surface and its changes [40].
Water bodies have significant interannual and intra-annual differences in surface area. Therefore, there are many unknowns when using a single image to extract the regions of surface water [41]. Although some former studies in the literature that analyzed the surface area of Burdur Lake [15,42], all the available articles have utilized few images to analyze the degradation in the area. Since surface water has interannual water variability, describing long patterns of surface water variability with several years of images taken at the same time can result in inconsistent areas of surface water and uncertainties in the outcomes. In addition, it is challenging to identify the suitable period in a year for a single image and the same date's image over various years. Meanwhile, the areas of surface water could be mapped during rainy seasons, while the permanent water area could be detected during dry seasons. Choosing an acceptable time to take satellite images for various reasons can be difficult. Therefore, using all available images within a year helps to prevent these mapping challenges, inconsistencies, and reduce such uncertainty. As a result, a wide-ranging investigation is needed to examine the continuous shift of surface water areas using sufficient photos taken over several seasons and years.
Among the current satellites, Landsat images have the longest-running terrestrial satellite records and the highest consistency spectrum. As a result, Landsat images have been successfully utilized in mapping the surface water and studying the effects of humancaused factors and climate change. Since January 2008, petabytes of long-term Landsat image files have been available for free download due to the open-data policy of Landsat [43,44]. Simultaneously, some cloud computing systems have been created to handle vast amounts of geospatial data. Google Earth Engine (GEE) is an example of a cloud computing platform that can efficiently handle big data due to the capabilities of parallel computing. GEE is made up of multipetabyte remote sensing data that has been preprocessed and made accessible and usable [45,46]. Recently, the GEE platform has been widely employed to extract the crop planting areas [47,48], forests [35,[49][50][51][52], coastal tidal flats [53,54], surface water bodies [55][56][57] from Landsat images at a national or regional scale.
This paper aims to examine the spatial and temporal variation in the Burdur Lake surface water body areas from 1984 to 2019 as well as to determine the key driving influencers (vegetation cover, precipitation, evapotranspiration, albedo, radiation, and temperature) that affect the observed variation of surface water bodies. In this paper, the degradation and seasonal variation in the Burdur Lake surface water area were extracted and analyzed over 34 years utilizing the GEE cloud computing platform and using the time series of Landsat imagery. The mapping of the Burdur Lake surface water for such a long period with this continuation are not yet established. The statistical relationship between meteorological variables and the reduction in the area of the lake is presented and assessed. The correlation matrix and factor analysis between the selected variables and the degradation of the surface area were assessed and discussed. The research provides valuable references for decision makers to develop sustainable agriculture and industrial water use policies to preserve water resources as well as to cope with potential climate changes.

Study Area
The study area is located in the southwest of Turkey, which lies in the administrative area of the Burdur and Isparta provinces. Geographically, it is located in the Mediterranean region and is under the rain shadow of the Taurus Mountains. The lake is fed from the rainfall of the surrounding watershed with a total area of 3211 km 2 as shown in Figure 1. The annual average temperature of the area is 13.2 • C, and the annual total precipitation of the area is 428.3 mm [58]. According to the Köppen classification, the area is Csa, which represents lukewarm in the winter and very hot in the summer [59].

Data Source (Landsat5 TM, 7 ETM+, 8 OLI Data, NDVI Data, and Climate Data)
Burdur Lake is covered by Paths 178 and 179 and Row 34. By utilizing the GEE Landsat-hosted dataset, 570 scenes have been obtained from 1984 to 2019.
The 3-hourly Global Land Data Assimilation System (ERA5-Land) product [63], which ingests ground-based observational and satellite data, was used to collect the annual cumulative precipitation, annual cumulative evapotranspiration, albedo, radiation, and annual average temperature. The data of the vegetation cover was obtained from the combined 16 days MODIS NDVI through the GEE platform [64]. The MODIS-NDVI data was only available from 2000 to 2019.

Surface Water Body Extraction
The detailed methodology consists of three stages as illustrated in Figure 2: Firstly, there are the Landsat 5, 7, and 8 images listed from the GEE datasets across the Burdur area since 1984 to 2019. A total of 883 images were collected from Landsat 5, 612 images were collected from Landsat 7, and 425 images were gathered from Landsat 8. Although the GEE provides the possibility of filtering and removing the low-quality images by shadows, snow, and clouds, all the images found were masked based on the study area. Analyzing the maximum obtainable images enhances the time series continuity. Therefore, during the exploration of the Landsat images, several Landsat tiles (tiles within Paths 178 and 179 and Row 34 based on the Landsat World Reference System WRS-2) were found mainly cloudy but not cloudy within the area of Burdur Lake. Thus, these Landsat tiles were utilized to support the continuity of the extracted time series. Secondly, the NDWI index was calculated from the various Landsat images in the GEE platform. Due to the limited 5 min of time processing and limited 5000 processing points provided by the GEE platform or temporal and spatial analysis, batch processing was implemented using Python within QGIS using the GEE plugin to overcome this problem. The NDWI method delineates the body of surface water by extracting the normalized difference between a reflected green visible band and a near-infrared band [65]. The NDWI index enhances the presence of surface water bodies and eliminates the presence of terrestrial vegetation and soil features. The NDWI index is calculated by using Equation (1): where theNDWI is the normalized difference water index, Green is the reflected green visible light and the NIR is the reflected near-infrared energy. Based on the data used in this analysis, the surface water bodies were extracted by Equations (2)-(4).
The postprocessing operations were performed in the second stage in the ArcGIS environment using a model builder iterator. The NDWI images were calculated on a pixel-by-pixel basis, then different thresholds were defined between the range of <0 and 0.1 to 1 based on different Landsat images and based on the visual interpretation. When the condition was met, the pixels were categorized as surface water and assigned 1 value. When the condition was not met, the pixels were categorized as non-surface water and assigned 0 value (5). Most of the threshold values were assigned 0 except some of the images were assigned 0.1 due to the complex image digital values. The different threshold values were assigned to distinguish the normal water bodies from water with high sand content and mudflats with high water content.
To extract the final vector polygon of Burdur Lake we proceeded as follows: (1) The results from the binary NDWI images were smoothed using a low-pass filter (kernel 3 by 3 pixels) to remove the noisy pixels and reduce the implication of abnormal cells. (2) The smoothed images again enhanced the subdued edges by utilizing a high-pass filter (kernel 3 by 3 pixels). (3) The results from the enhanced images were converted to a GIS vector format, and only the polygons of Burdur Lake were stored. The lake polygons were separated from other discrete plates by spatially intersecting them with the general diagonal line of Burdur Lake. A model builder with a vector iterator was built in ArcGIS, then ran over the polygons to extract only the Burdur Lake polygons. (4) Based on the visual interpretation using the satellite image as a base image for verification, several polygons were excluded from the list due to high gaps in the polygon area. Then, the area of each polygon was calculated as shown in Figure 3. (5) The monthly base time series were formed in Microsoft Excel using the PivotTable. The timeline of the collected data was classified into four seasons using the meteorological seasons in the Northern Hemisphere. The meteorological autumn (September, October, and November), winter (December, January, and February), spring (March, April, and May), and summer (June, July, and August) [66].

Accuracy Assessment of Extracted Surface Water Data
This section aims to prove the efficiency of the proposed method of surface water body extraction and illustrates the quality of the extracted data of the lake borders. The process involves 3 datasets as follows: (1) The proposed dataset uses the raw Landsat images to extract the lake borders to monitor the degradation process. To generate a base for an accuracy assessment and to avoid any statistical misrepresentation of the location of assessment points, pixelby-pixel accuracy assessment methods were employed. By applying the pixel-by-pixel accuracy assessment method, hundreds of thousands of pixels were taken around the coastline of the lake. A rectangle was drawn over the lake area and clipped the lake area from the Sentinel satellite image. The clipped image consists of 3588 rows and 3210 columns, which lead to producing 4,035,395 pixels. These pixels were converted to points. The points were used as a base to extract the pixels values from the NDWI images. The confusion matrix was calculated again using 4,035,395 pixels. The confusion matrix was generated to compute the User Equation (5), Producer Equation (6), and Overall Accuracy Equation (7) [67]. In addition, Cohen's Kappa index was calculated to measure the agreement among the two datasets [68]: Producer accuracy = TP Overall Accuracy = TN + TP TN + TP + FP + FN * 100% (7) where TP = true positive value, FP = false positive value, TN = true negative value, and FN = false negative value.

Statistical Analysis
The extracted results from the Landsat images were analyzed using descriptive statistics and the Pearson correlation analysis [69]. Furthermore, a factor analysis was used to identify the sources of variation in the data. A factor analysis (FA) is a multivariate technique used to produce new uncorrelated variables using the original data values. A principal components analysis (PCA) is usually used to extract the factors from the correlation matrix of the selected variables. The extracted components are usually rotated to reduce the contribution of less significant variables [70,71]. The process will result in a few factors that account for the same amount of information as the original dataset.

Accuracy Assessment of Surface Water Map
The extracted results from the Landsat images were compared point-to-point with the Sentinel-2 images over the last four years in our time series duration (2016, 2017, 2018, and 2019). Table 1 presents each water and non-water point extracted from Sentinel-2 images and their User's Accuracy, Producer's Accuracy, as well as Overall Accuracy and the number of points in each category. The Overall Accuracy of our extraction methodology was more than 99%, the Producer's Accuracy for non-water areas and water areas was more than 99%, and the User's Accuracy was more than 99% for non-water areas and water areas over the years 2016, 2017, 2018, and 2019. The obtained results show a high degree of accuracy for the proposed extraction method results in this study and the benchmark data. This high degree of accuracy is a supportive indicator of the quality of the extracted data and the employed methodology where the dissimilarity is reasonable since there should be a margin of error even in the benchmark dataset extraction methodology as well as the proposed method.
Ultimately, this assessment reflects the effectiveness and the simplicity of the proposed method, triggers the potential for publishing the data on an open-source platform, and paves the way for applying the same principle to other attractive geographical areas for further investigation.

Spatial and Temporal Changes of Burdur Lake Area from 1984 to 2019
The analysis of long-term, remotely-sensed data of Burdur Lake has shown that the water body of the lake has dramatically decreased from 1984 to 2019. The trend of the decrease in the lake area was observed by utilizing Landsat images at roughly five-year intervals. The follow-up of the changes in the area covering the lake was examined within seven periods as 1984-1990, 1995, 2000, 2005, 2010, 2015, 2019 as shown in Figure 4. As a result, the areal changes of Burdur Lake were determined, considering the periods as 206. 6, 192.2, 172.9, 158, 156.8, 144.2, 135.8, 125.5 km 2 , respectively.
Burdur Lake lies in a southwest-northeast direction, and the depth of the lake is shallower on the northeast side. For this reason, the maximum areal changes of the lake had taken place by 1984-2000, which is the first three periods on the northeast edge of the water body [72,73]. The radical reduction in the lake area led to a severe loss of wetland habitat, resulting in the drying up of shallow areas of great importance to waterfowl. Figure 5 shows the trend and temporal variation in the lake area between 1984 and 2019. It is obvious that during the thirty-four years, the lake area continually decreased except for the period between 2003 and 2004. Unfortunately, after 2006, the area continued to decrease with an almost fixed rate of reduction. The trend of the annual area shows the same reduction pattern after 2006 until 2019. Figure 5 indicates that the decrease in the lake area is ongoing and will lead to further reduction. According to the general trend of the lake, the annual reduction in the area is 2.35 km 2 per year. In case the lake conditions persist, starting in 2019, the lake will lose half of its area in eight and a half years.  Moreover, the seasonal and annual redaction variation in the lake area between 1985 and 2019 are presented in Figure 6. Generally, the reduction in the lake area happened most of the time except the years 1985, 1998, 2002, 2003, 2004, and 2005, when the area of Burdur Lake increased annually to 0.39, 0.15, 1.4, 2.24, 0.83, and 0.36 km 2 , respectively. Between 2003 and 2006, the decrease in the area stopped and started to recover. The lake area slightly increased from 154 to 158 km 2 . The worst redaction was in 2011, where the area was reduced by 6.23 km 2 in one year. Figure 6 also presents the seasonal variation in redactions. The two subfigures show that the redactions happened during the summer and autumn seasons, while in the winter and spring seasons, the lake gained some area over a few years especially after the year 2000. These outcomes indicate the strong linkage between the lake area and the rainy-snowy seasons as well as the source of lake water.

Descriptive Statistics of Climatological Data, Vegetation, and Temporal Area
The results for temperature, evaporation, precipitation, albedo, radiation, and the reduction of the lake area are summarized as a minimum, maximum, mean, and standard deviation ( Table 2). The lowest average value was observed for evaporation, which was −0.002, and the highest average value was observed for radiation, which was 1.6478000. Furthermore, the standard deviation was low and acceptable, except radiation was high, which was 6,449,530. This was due to the seasonal variation as well as cloudy times. The collected data were further analyzed based on the season. The boxplot for the reduction of the area of the lake and selected variables are presented in Figure 7 showing the first, second (median), and third quartiles with minimum and maximum values. Figure 7 exhibits a very big fluctuation for the selected variables during the various season and within each season as well. The higher changes in the area of the lake may be attributed to the high changes in all selected meteorological parameters, especially in winter and spring. In summary, the selected variables behave differently during each season and between various seasons.

Correlation Analysis
The relationship between selected variables (temperature, evaporation, precipitation, albedo, radiation, and NDVI) of the lake area was studied to understand the behavior of each variable in the presence of other variables. The correlation matrix for the coefficient of correlation is presented in Table 3. The correlation matrix showed a strong positive or negative significant relationship (p-value < 0.01 or p-value < 0.05) as presented in Table 3 between the reduction and the selected variables. Furthermore, a strong correlation was exhibited between various variables in the presence of other variables. In summary, the selected variables influence the reduction in the lake area directly during various seasons.

Factor Analysis
The collected data were further analyzed to identify the amount of explained variation in the data. A factor analysis is a multivariate method used to describe the relationship between various selected variables which are highly correlated with a smaller number of new, uncorrelated variables called factors. A factor analysis is considered as a data reduction method that proposes how many variates are important to explain the observed variances in the data. The analysis was provided only three components which explain 82.15% of the total variation in the data. The contribution of each variable to each factor is represented by the coefficient associated with each variable in Equations (8) The first factor (F1) explains more than 45% (45.06) of the total variation in the data, this contribution is mainly due to the difference between temperature, radiation, and from the other side evaporation and precipitation. The second factor (F2) explains 18.90% which is mainly due to the difference between vegetation cover (NDVI) and albedo, whilst the third factor (F3) explains only 18.19% of the total variation in the data which is due to the reduction in the area of the lake. The contribution of the selected variables in explaining the total variation during various seasons is represented in a boxplot as shown in Figure 8 showing the effect during various seasons. The values of the factors behave differently during various seasons. Furthermore, winter exhibited the highest fluctuation for factors 2 and 3 which is due to the NDVI, albedo, and reduction of the lake area. The lake is fed by rainfall of the basin, the rivers that flow into the lake, and groundwaters. Since there is no water flow out of the lake, water losses occur only through evaporation and infiltration [72]. The water level of the lake has decreased by 14 m, and the total volume of the lake has decreased by 38%. A high correlation was found between the decrease in the lake level and the temperature and total evaporation [73].  According to some studies, the decreasing trend of the lake's water level is rooted in increasing human activity in the Burdur Lake basin. Especially dams constructed on rivers feeding the lake to irrigate agricultural lands in the basin reduce the amount of water feeding the lake. Besides, excessive water consumption of the marble quarries in the basin is among the important reason for the water level decrease [74]. According to another study, the use of rivers feeding the lake for mains water and the use of groundwater for agriculture caused the water level of the lake to decrease [75]. In other words, the reasons for the decrease in the water level of the lake are the increase in water consumption due to the expansion of urban areas and agricultural lands in harmony with the population increase. In addition, because the area where the lake is located is tectonically active, the direction and drying of the underground waters due to the earthquakes may have caused the water level of the lake to decrease [76]. Figure 9 shows the relationship between the reduction in the area of the lake with the NDVI.

Conclusions
In this work, the relationship between selected meteorological variables and the reduction of the lake area was studied for 34 years between 1984 and 2019 to understand the behavior of each variable in the presence of other variables. A strong positive or negative correlation showed a significant relationship (p-value < 0.01 or p-value < 0.05) between the reduction of the lake area and the selected variables. Furthermore, a strong correlation is exhibited between various variables in the presence of other variables. In summary, the selected variables influence the reduction of lake area directly during various seasons. A factor analysis provided three components that explain 82.15%% of the total variation in the data. The contribution of each variable to each factor is represented by the coefficient associated with each variable. The first factor explains more than 45% (45.06) of the total variation in the data, this contribution is mainly due to the difference between temperature, radiation, and, from the other side, evaporation and precipitation. The second factor explains 18.90% which is mainly due to the variation of the vegetation cover (NDVI) and albedo, whilst the third factor explains only 18.19% of the total variation in the data, which is due to the reduction in the area of the lake. The values of the factors behave differently during various seasons. Furthermore, human activities such as agriculture and industry may contribute to the degradation rate of lake areas which may be considered in future work. In addition, the lake body area as a unit of measurement alone expresses the relative deterioration of the water volume in the lake, but the difference in the depth of the lake from one place to another can lead to a margin of error in the area of the lake body. Thus, solutions to this problem can be developed in future studies by linking the area extracted from satellite images with depth data, if available.