Monitoring the Condition of Wetlands in the Syr Darya Floodplain—How Healthy Are the Tugai Forests in Kazakhstan?

: Tugai wetlands, including the forests of Populus euphratica Oliv. and P. pruinosa Schrenk, are major biodiversity hotspots within semi-arid and arid ecozones. However, for over a century, Central Asian river systems have been severely affected by dam regulation, water withdrawals for large-scale irrigated agriculture, and deforestation. To support sustainable use and protection of this threatened forest type, we provide information on the distribution and degradation status of Tugai wetlands in the Syr Darya ﬂoodplain using Normalized Difference Vegetation Index (NDVI) time series from Landsat 7 and Moderate Resolution Imaging Spectroradiometer (MODIS). An accuracy assessment conﬁrmed the validity of the MODIS-based wetland map, with an overall accuracy of 78.6%. This was considerably better than the Landsat product, mainly due to the greater temporal frequency of the MODIS time series. We further calculated trends and breakpoints between 2001 and 2016 using the BFAST algorithm. We found negative trends for nearly a third of the wetlands. Breakpoint detection showed major stress events in the years 2001, 2009, and 2016. Our study revealed the temporal and spatial distribution and vitality of an endangered forest ecosystem that has rarely been studied thus far. Climate change may accelerate the destabilization of the Tugai forests at the Syr Darya ﬂoodplain.


Introduction
Tugai forests, often referred to as desert riparian forests [1] or Central Asian riparian floodplain forests [2], are biodiversity hotspots in the arid ecozones of Eurasia [3].The term Tugai stands for a composition of wetlands dominated by Euphrates poplar (Populus euphratica Oliv.) or Desert poplar (Populus pruinosa Schrenk), mixed with shrub and reed communities consisting of Tamarix, Elaeagnus, Salix, and Phragmites [3,4].As a component of the UNESCO and RAMSAR sites in Central Asia, they are of great importance for international wetland conservation [5].They are highly relevant for the local populations' well-being as they provide a wide range of ecosystem services, such as the provision of food, fiber, and fodder, as well as fulfilling aesthetic, educational, and recreational functions [3,6].In addition, the Tugai forests help to combat desertification by regulating climate and flooding and by reducing wind erosion.
Although many of these threats are known, protection and restoration efforts are still insufficient.Thus, large-scale management strategies and controls are of great importance.However, monitoring the remaining Tugai forests through site visits is challenging due to their dispersed geographical distribution.Hence, spatial and temporal information obtained from optical satellite imagery could substantially support monitoring tasks.
Because of the limitations of pre-existing maps derived from satellite-based image time series, novel methods for mapping and monitoring the Tugai forests in Central Asia are urgently needed.Vegetation changes can be captured on many spatial-temporal scales and resolutions using remote sensing products [33][34][35].Numerous tools and algorithms have been developed, e.g., phenological indices [36], anomaly identification [37], and the calculation of long-term vegetation trends, shifts, and breaks [38].In the highly degraded Tugai forests at the Tarim Basin, the Breaks For Additive Season and Trend (BFAST) algorithm [38] has been successfully applied to map forest disturbances and breakpoints in vegetation signal trends [39].By using a dense Landsat time series spanning 14 years, the authors were able to unveil important relationships between water availability and the forest's vitality [39,40].However, such time series approaches have not yet been applied in our study region.
Our study aims to monitor the health status of the Tugai forests in the Syr Darya floodplain with optical time series data.The main objectives of our study were: (1) to compare the suitability of Landsat 7 (L7) and Moderate Resolution Imaging Spectroradiometer (MODIS) data for mapping wetlands in Central Asian drylands and (2) to capture the degradation status of the remaining Tugai forests at the Syr Darya floodplain.

Study Area
With a catchment area of 783,000 km 2 and a length of 3018 km, the Syr Darya is the second largest river in Central Asia and supplies large parts of the Aral Sea Basin with water [41].Our study area (Figure 1) comprised the lower and middle reaches of the Syr Darya from the Chardara Reservoir (236 m above sea level) to the river delta close to the North Aral Sea (45 m above sea level).Overall, our study area covered 53,326 km 2 and spanned a river length of 1350 km.We selected the study site for its known distribution of Populus euphratica and P. pruinosa, which mainly exist in the riparian zones of the Syr Darya flatlands [42].According to the MODIS-based land cover classification by Klein et al. [43], the major land cover classes at the study area in 2009 were sparse vegetation (28.2%), irrigated (22.7%), grassland (18.7%), and bare area with salt flats (14.9%).The broadleaved class, which mostly consists of riparian forests, was estimated to cover only 4.8% of the study area.The arid climatic conditions, characterized by hot summers and cold winters, correspond to the cold arid desert (BWk) and cold arid steppe climates (BSk), according to the Köppen-Geiger Climate Classification [44].Most run-off comes from melting water in the upper mountain regions.As the average temperature in Central Asia is predicted to rise faster than the global average [39], increasing amounts of melting water from the mountain glaciers can be expected.Nonetheless, the climate in the study area is expected to be generally drier and hotter, which leads to desertification and threatens the local population.With the building of extensive canal networks and river dams during the Soviet era (Chardara Reservoir, Aydar Reservoir, and Kayrakkum Reservoir), large-scale irrigation farming was introduced in the Syr Darya catchment region [45].The regulation of water by river dams since the 1960s has fundamentally changed the natural run-off of the Syr Darya, resulting in reduced water quantities during the vegetation season.Additionally, after the independence of the bordering countries, energy production from the dams during the winter season resulted in abrupt water release, leading to overflow events in certain years [45,46].As a consequence, the new Koksaray Reservoir in Kazakhstan was built in 2008 for water storage and to prevent the riparian zones from large-scale damage caused by further flooding events.With the introduction of irrigated agriculture and river water control by dams, large amounts of water became unavailable for natural wetlands, which has led to long-term changes and degradation since the Soviet era.Today, irrigated agriculture for cotton production is still the major use of land in the region [47].The agricultural zones (i.e., cultivated wetlands) comprise complex patterns of active and abandoned irrigation cropland, reeds, small lakes, and shrublands.More pristine wetlands with different types of degradation can be found closer to the main stream of the river.Figure 2 shows different vegetation types found in the study area.We chose to use NDVI time series for our study as it includes spectral information of near-infrared and red light spectra, which are sensitive to chlorophyll at vegetated surfaces [50].Hence, biophysical parameters can be used to describe vegetation stress and phenological changes on the spatial and temporal levels.For time series application in vegetation monitoring, Mayr et al. [51] discovered that studies using NDVI products from different multispectral sensors were frequently found to be robust.Using NDVI data from the MODIS and L7 missions allows us to obtain complementary information about the wetland distribution and to evaluate the suitability of time-series-based methods across different spatial and temporal resolutions.
Google Earth Engine (GEE) [52] provides equally preprocessed 8-day NDVI time series products from both sensors.We used the cloud computing platform and its JavaScriptbased application programming interface (API) for downloading time series vectors for the study area between 2001 and 2016 (Table 1).We post-processed the tabular data with R 3.4.1 [53].Our methodological workflow is outlined in Figure 3. First, we created a 250 m regular cell grid for the study area, resulting in a list of 840,000 points, and downloaded the NDVI time series vectors of MODIS and L7 from GEE.Second, we excluded observations outside the vegetation period from March to October.This step reduced the percentage of observations impacted by clouds and snow, which mostly appear in the study area during winter.Consequently, the mean observation numbers per grid cell were 192 for the L7 data and 444 for the MODIS data (Figure 4).These time series data were used for wetland mapping and mask creation.Third, for trend and breakpoint detection using the BFAST algorithm, gap free time series data was mandatory [38].Thus, the remaining gaps in the time series, e.g., caused by cloudy observations, were filled by linear interpolation.The final datasets included equidistant and gap-free 8-day NDVI time series vectors with 30 time steps per year.

Validation Data
For the derivation of ground truth data, we searched for validation sampling sites at the lower and middle reaches of the Syr Darya in a field campaign in 2017.We decided on three 1500 km 2 large regions, which are displayed in the study area map (Figure 1): (a) The sampling region Kyzylorda is characterized by large-scale irrigation farming.We found few, small-scaled, and strongly degraded Populus euphratica forest stands which were highly impacted by fuel wood extraction and grazing and exhibited limited rejuvenation.Other vegetation subclasses, like Elaeagnus shrubland and Phragmites reeds, were more common.(b) The sampling region Zhanakorgan was dominated by large-scale irrigation farming.It comprised larger Populus pruinosa forest stands and few Elaeagnus shrublands.Although there were signs of partial rejuvenation in the forests, the impact of grazing and wood extraction was still high.(c) The sampling region Syr Darya-Turkestan Nature Park also comprised large-scale irrigation farming, but was less degraded.Additionally, it featured large Populus pruinosa forest stands close to the river course.Although the conservation status of the forests forbids wood extraction and grazing, the forest stands showed limited rejuvenation.
After deciding on the sampling sites, we derived validation points through visual interpretations of very high spatial resolution Digital Globe images via Google Earth (Table 2).WorldView imagery from the years 2015 to 2016 was used, which was provided by Maxar technologies.The initial validation point dataset consisted of 2916 samples belonging to the main classes of wetland and non-wetland.To attain more validation points for the smaller wetland subclasses of forest1 and forest2, we collected additional 492 validation points outside of the sampling regions.To create the wetland mask, we used linear regression analysis between a phenological reference profile [54] (NDVI_reference) and the NDVI time series vectors of each spatial cell (NDVI_observed) (Figure 5).NDVI_reference is the mean time series vector of ten healthy and natural Tugai forest stands at the Syr Darya-Turkestan Nature Park.NDVI_observed are the original time series vectors of each cell in the 250 m grid.We chose Pearson's correlation coefficient r [55] as the regression parameter for finding statistical relationships, where the value 1 indicates a strong positive linear correlation, 0 indicates no linear correlation, and −1 indicates a strong negative linear correlation.Because linear regression requires normally distributed input data, we conducted the Shapiro-Wilk test [56] to test the hypothesis of normal distribution within the data.The test is particularly suitable for data with a small sample size, as in our case with a maximum of 480 observations per time series vector.For all the individual time series vectors in the MODIS and L7 datasets, we observed a p value larger than 0.05, which means the hypothesis of normal distribution was not rejected.Next, the linear regression analysis was conducted for all 840,000 time series vectors from the MODIS and L7 datasets.
After the linear regression analysis, a threshold for Pearson's r was chosen to create a wetland mask aiming to delineate most wetlands.We then evaluated the MODIS and L7 masks created in this step against the validation dataset.The wetland mask with the highest overall accuracy (OA) was used as a spatial boundary for further trend and breakpoint analysis (Figure 3).

Trend and Breakpoint Detection
To calculate long-term vegetation changes (i.e., trends) and abrupt changes (i.e., breakpoints) in the NDVI time series, it is necessary to decompose them into the trend, seasonal, and noise component using the R package bfast (version 1.6.1)[38].From the trend component, we calculated the linear regression slope values to find positive and negative trends of the NDVI time series within the wetland mask.These long-term trends were interpreted as gains and losses of vegetation activity within the observed time frame of 16 years.If a regression slope was less than −0.0001, we defined it as a decreasing trend.In the opposite case, a regression slope larger than +0.0001 was considered a positive trend.Regression slope values between these values were interpreted as stable vegetation activity.From the trend component, we also calculated the breakpoints within the NDVI time series.They showed major years with abrupt changes in vegetation.For easier interpretability, we set the number of breakpoints to a maximum of two for the entire time series.We then visualized the breakpoints by their detection year.

Wetland Masks and Accuracy Assessment
Figure 6 shows the MODIS and L7 maps derived from the linear regression analysis.The correlation coefficient r ranges between −0.57 and +0.91 in the MODIS product and between −0.68 and +0.86 in the L7 product.In both maps, high values (r > +0.6) were found close to the main course and tributaries of the Syr Darya.Middle values (r +0.5 to +0.6) were found at agricultural belts and the river delta.Low values (r < +0.5) were found away from the river course.The correlation coefficients among the validation samples from the wetland class (i.e., riparian forests, shrubs, and reeds) reached a mean value of +0.61 in the MODIS and +0.55 in the L7 products.The non-wetland classes (i.e., farmlands, drylands, and settlements) had mean values of +0.15 for MODIS and +0.23 for L7.Nevertheless, the subclass cropland1 (i.e., actively irrigated farmland) often reached r values larger than +0.5 in both products.Thus, achieving full separation between irrigated farmlands and natural wetlands using the chosen method was not possible.As a trade-off, we then chose the correlation coefficient value of r = +0.5 as the threshold for wetland masking to ensure that major parts of the natural wetlands were included and agricultural lands were partially excluded.Hence, for MODIS, we quantified 13,864 km 2 of wetlands or 26.9% of the study area.For L7, we quantified 5216 km 2 of wetlands or 11.0% of the study area.The confusion matrices in the Appendix (see Tables A2 and A3) show the classification accuracies of the MODIS and L7 wetland masks, considering the validation dataset.Although the L7 map achieved higher producer's accuracies (PA) for the non-wetland classes (cropland1 (84.9%) and cropland2 (80.4%)), the MODIS classification performed better for the wetland classes (forest1 (75.1%) and forest2 (95.5%)).For the full validation dataset, the MODIS classification yielded an OA of 78.6%, whereas the L7 classification produced an OA of 63.0%.Because of the higher OAs and PAs for the wetland subclasses, the subsequent analysis was applied to the MODIS wetland mask.

Trend and Breakpoint Detection
Figure 7 and the detail maps (Figure A3) show the results of trend detection for the cells within the MODIS wetland mask.The regression slope values in the study area ranged between −0.00053 and +0.00042, with a mean value of −0.00000 for the whole wetland area.Larger patches of decreasing trends were observed in most wetlands close to the main course of the Syr Darya.Few spots with positive trends were found at the river delta and at a tributary north-east of Kyzylorda.Mixed values were found for the wetlands at the large cotton belt in the southern study area.We quantified an area measuring 4296 km 2 with a decreasing trend, constituting 29.7% of the entire wetland area.Positive trends were quantified for an area of 1242 km 2 or 8.5% of the wetlands.A total of 61.8% of the wetlands were classified as stable.

Wetland Detection and Masking
Uncertainties relating to the large-scale monitoring of wetlands with remote sensing products in Central Asia are driven by large data gaps in the Landsat program [57] and the lack of ground truth information regarding the distribution of Tugai forests [58].We completed a wetland mask for estimating the recent distribution of Tugai wetlands at the middle and lower reaches of the Syr Darya using MODIS and L7 NDVI time series and reference data from field observations.Despite large differences in the observation numbers of the input time series, we were able to distinguish between wetlands and non-wetlands in the research area with an OA of 78.6%.MODIS data, with a much higher observation number, led to more precise classification results compared to L7.The main class (wetlands) was better separated using the MODIS product, with 81.5% PA, compared to the L7 product, with 41.8% PA.However, we found wetland classification uncertainties and misclassifications in both mapping products.The uncertainties are mainly explained by the phenological similarities between the irrigated croplands (e.g., cotton belts) and the Tugai wetlands (e.g., forests, reeds, and shrubs).Furthermore, misclassification in the MODIS product can be explained by mixed pixel effects due to the coarse spatial resolution of 250 m.For instance, we observed that cells in dry Tugai forest stands have a much weaker NDVI signals than stands closer to the river, which leads to lower PAs for the wetland subclasses.In contrast, the Tugai wetlands within the irrigated cotton belts are better distinguished using the L7 product due to its higher spatial precision.
Given the expanding archive of the Sentinel and Landsat missions and advancements in data fusion approaches, such as Landsat-Sentinel-2 or Landsat-MODIS [59,60], a higher quality of mapping products is expected, which could support local forest and conservation authorities in their monitoring duties.However, large-scale ground truth data is also needed to improve monitoring of the Tugai forests.

Trend and Breakpoint Detection
We found that the high temporal density of the MODIS time series data substantially benefitted trend and breakpoint detection in our study area.Many statistical tools and algorithms for vegetation disturbance modeling require temporally dense time series with preferably high spatial resolution [61].For the wetlands, we found a decreasing trend in NDVI for up to one-third of the area, which implies ongoing degradation of the riparian zones at the Syr Darya.We found that flooding events have a positive influence on the Tugai wetlands.For example, in the winter of 2004, massive overflow occurred at the Syr Darya middle reaches [46].This event led to a pronounced positive trend and breakpoint at that location over the following years.Positive vegetation trends were also detected in some areas of the Syr Darya river delta.This pattern may be the result of successful water discharges during the study period, which has been a common political goal following the drying up of the Aral Sea in the 1960s.
In many parts of Central Asia, water overuse occurs due to economic reasons [62].We assume that ongoing water scarcity, intensive agricultural land use, and the extraction of fuel wood will remain the main drivers for degradation of the Tugai wetlands.Climate change will accelerate water scarcity for the riparian forests, especially during summer droughts.The temperature diagrams from the climate stations in Kyzylorda and Kazalinsk (Figure A1) at the Syr Darya provide clear evidence of rapid temperature increases over the past four decades [63,64].By 2050, an increase in the region's annual mean temperature by 2.0 to 3.6 • C is expected [65].Although the annual amount of precipitation may not significantly change, a shift in the seasonal rainfall distribution is projected, with a significant decrease in precipitation in summer and autumn [66].

Conclusions
Due to their scattered distribution, the Tugai forests in Kazakhstan have remained understudied, particularly using remote sensing approaches.Hence, the past and recent health status of these ecosystems has remained largely unknown until now.Based on the example of the Syr Darya floodplain, we demonstrated a novel approach for precise and cost-efficient wetland mapping and monitoring in Central Asian drylands.Using MODIS NDVI time series data, we could detect trends and abrupt changes in vegetation vitality at the regional level.These metrics facilitate the monitoring of this highly threatened biodiversity hotspot.
Our work aimed to achieve two main objectives: (1) Comparing the suitability of L7 and MODIS data for mapping wetlands in the Central Asian drylands.Both satellite missions show advantages regarding their different temporal and spatial resolutions.For long-term vegetation monitoring in Central Asia, the MODIS data is more applicable due to its higher repetition rate.However, we found linear regression analysis, based on a phenological reference profile and the observed NDVI time series, to be a very useful tool for wetland detection with both data sources.(2) Capturing the degradation status of the remaining Tugai forests at the Syr Darya floodplain.Considering the wetland mask derived from MODIS data, we found a decreasing trend for 29.7% of the wetland area.The breakpoint detection revealed two main events which occurred in the years 2001/2002 and 2016.These events contributed to ongoing degradation of the remaining Tugai forests at the Syr Darya.The adverse health status of the riparian ecosystem is exacerbated by activities such as water extraction for irrigation farming, fuel wood extraction, overgrazing, and the lack of natural flooding.Without further protective measures, restoration of the ecosystem to stable conditions is becoming increasingly unattainable.

Figure 1 .
Figure 1.Study area and field sampling sites in the lower and middle reaches of the Syr D map: land use and land cover from Klein et al. [43].Overview map: main rivers of the Aral in Central Asia.

Figure 1 .
Figure 1.Study area and field sampling sites in the lower and middle reaches of the Syr Darya.Base map: land use and land cover from Klein et al. [43].Overview map: main rivers of the Aral Sea Basin in Central Asia.

Figure 2 .
Figure 2. Vegetation types within Tugai wetlands at the Syr Darya: Elaeagnus shrubs (top left Phragmites reeds mixed with Tamarix shrubs (top right), Populus euphratica reforestation site (bottom left), and Populus pruinosa forest at Syr Darya-Turkestan State Regional Park (bottom right).Thes photographs were taken during a field survey in September 2017.

Figure 2 .
Figure 2. Vegetation types within Tugai wetlands at the Syr Darya: Elaeagnus shrubs (top left), Phragmites reeds mixed with Tamarix shrubs (top right), Populus euphratica reforestation site (bottom left), and Populus pruinosa forest at Syr Darya-Turkestan State Regional Park (bottom right).These photographs were taken during a field survey in September 2017.

2. 2 .
Data 2.2.1.Satellite Imagery Since area-wide and high resolution satellite imagery in our study area is missing from the 1980s and 1990s, our study time frame starts with the year 2000, for which data from the multispectral MODIS and L7 missions are available.The MODIS mission, operated by the National Aeronautics and Space Administration (NASA), was carried out by two satellites: Terra (1999) and Aqua (2002).These satellites incorporate 36 spectral bands, including visible and infra-red light spectra [48].It was designed to collect continuous global data for long-and short-term changes in the Earth system.It provides imagery with spatial resolutions of 250 m (bands 1-2), 500 m (bands 3-7), and 1000 m (bands 8-36) [48].The L7 mission is part of the Landsat Earth observation program, which has been run by NASA since 1972, and is operated by a single satellite with the Enhanced Thematic Mapper Plus (ETM+) instrument, incorporating seven spectral bands in the visible and infra-red ranges [49].The spatial resolution of L7 imagery reaches 30 m (bands 1-5, 7) and 60 m (band 6) [49].

Figure 4 .
Figure 4. Number of observations per 250 m grid cell from 2001 to 2016 for MODIS (MCD43A4_NDVI) (top) and Landsat (LE7_L1T_8DAY_NDVI) time series (bottom) after vegetation season, filtering from March to October.

Forests 2023 , 12 Figure 5 .
Figure 5. Example of the linear-regression-analysis-based wetland detection for a single NDVI time series vector (2001-2016) of a wetland cell.r is the Pearson's correlation coefficient between the observed time series vector (blue) and the reference time series vector (green).The red line is the regression line of both vectors.

Figure 5 .
Figure 5. Example of the linear-regression-analysis-based wetland detection for a single NDVI time series vector (2001-2016) of a wetland cell.r is the Pearson's correlation coefficient between the observed time series vector (blue) and the reference time series vector (green).The red line is the regression line of both vectors.

Figure 7 .
Figure 7. Linear regression slope values of the trend component of the 8-day MODIS NDVI time series (2001-2016).

Figure 8
Figure 8 shows the year of the first and second breakpoints within the trend components.According to the BFAST algorithm, almost all the wetland cells underwent at least one breakpoint within the study time frame.Most of the breakpoints were observed for the years 2001, 2002, and 2009 (first breakpoint) and in the years 2013, 2014, and 2016 (second breakpoint).The detail maps (Figure A4) further reveal spatial relationships between the observed vegetation trends and breakpoints.The spots with positive trends at the river delta and at the tributary northern to Kyzylorda match with the breakpoint events around the year 2009.The zones with positive trends at the southern cotton belt are related to a breakpoint event in 2016.On the other hand, the areas with negative trend slopes are more closely associated with breakpoints in 2001 and 2002.

Figure A1 .
Figure A1.Yearly average temperature up to the year 2000 for two climate stations at the Syr Darya, left: Kyzylorda [63] and right: Kazalinsk [64].

Figure A1 .
Figure A1.Yearly average temperature up to the year 2000 for two climate stations at the Syr Darya, left: Kyzylorda [63] and right: Kazalinsk [64].

Figure A3 .
Figure A3.Linear regression slope values of the trend component of the 8-day NDVI time series (2001-2016) for the wetlands at the Syr Darya delta (left), Kyzylorda (middle), and Shardara (right).

Table 1 .
Image collections from the Google Earth Engine used in this study.Original Source: United States Geological Survey.
s 2023, 14, x FOR PEER REVIEW 3 o

Table 2 .
Validation point dataset (n = number of samples per subclass).

Table A2 .
Confusion matrix of the MODIS wetland classification (rows) and the validation dataset (columns).PA = Producer's Accuracy for the subclasses.PA2 = Producer's Accuracy for the main classes.OA = Overall Accuracy.

Table A3 .
Confusion matrix of the Landsat 7 wetland classification (rows) and the validation dataset (columns).PA = Producer's Accuracy for the subclasses.PA2 = Producer's Accuracy for the main classes.OA = Overall Accuracy.