Estimation of Photosynthetic and Non-Photosynthetic Vegetation Coverage in the Lower Reaches of Tarim River Based on Sentinel-2A Data

Estimating the fractional coverage of photosynthetic vegetation (fPV) and non-photosynthetic vegetation (fNPV) is essential for assessing vegetation growth conditions in arid areas and for monitoring environmental change and desertification. The aim of this study was to quantitatively estimate fPV, fNPV and the fractional coverage of bare soil (fBS) in the lower reaches of the Tarim River. Field data were acquired in September 2020 to obtain fPV, fNPV and fBS. First, six photosynthetic vegetation indices (PVIs) and six non-photosynthetic vegetation indices (NPVIs) were calculated from Sentinel-2A image data. The PVIs were the normalized difference vegetation index (NDVI), ratio vegetation index (RVI), soil adjusted vegetation index (SAVI), modified soil adjusted vegetation index (MSAVI), reduced simple ratio index (RSR) and global environment monitoring index (GEMI). The NPVIs were the normalized difference index (NDI), normalized difference tillage index (NDTI), normalized difference senescent vegetation index (NDSVI), soil tillage index (STI), shortwave infrared ratio (SWIR32) and dead fuel index (DFI). We then established linear regression models between each PVI and fPV, and between each NPVI and fNPV. Finally, we applied the GEMI-DFI model to analyze the spatial and seasonal variation of fPV and fNPV in the study area in 2020. The results showed that GEMI had the best correlation with fPV (R² = 0.59), while DFI had the best correlation with fNPV (R² = 0.45). The estimation accuracies (R²) of fPV, fNPV and fBS obtained with the GEMI-DFI model were 0.69, 0.58 and 0.43, respectively. The estimated fPV and fNPV were consistent with the phenological development of the vegetation in the study area. The study concluded that applying the GEMI-DFI model to fPV and fNPV estimation is of value for monitoring the spatial and seasonal variation of vegetation and its ecological functions in arid areas.


Introduction
Vegetation cover is an important part of the ecosystem and plays an important role in soil and water conservation, restraining desertification, protecting biodiversity and other ecological services and functions [1,2]. From a functional point [...] resolution of 10 m and a revisit period of five days. Therefore, Sentinel-2 data, with their high spatial and temporal resolution, are the focus for estimating fPV and fNPV in the study area. The objective of this study was to quantitatively estimate the seasonal variation in fPV, fNPV and the fractional coverage of bare soil (fBS) in the lower reaches of the Tarim River based on Sentinel-2A image data. The study is divided into the following parts: (1) selecting the best PVI and NPVI as estimators of fPV and fNPV according to the correlation of the PVIs with fPV and of the NPVIs with fNPV in linear regression models; (2) building a pixel unmixing model based on Sentinel-2A image data and quantitatively evaluating the estimated fPV and fNPV against field sampled data; and (3) mapping fPV, fNPV and fBS for different periods in the lower reaches of the Tarim River and analyzing their seasonal variation.

Remote Sens. 2021, 13, x FOR PEER REVIEW 3 of 17

Study Area
The study area was located between the Dashkol Reservoir and Tetima Lake (39.5-40.59° N, 87.56-88.46° E) in the lower reaches of the Tarim River (Figure 1), and was surrounded by the Taklamakan Desert in the west and the Kuruktagh Desert in the northeast [29]. The annual precipitation of the study area ranged from 20 to 50 mm, whereas the annual potential evaporation varied from 2500 to 3000 mm [30,31]. The total annual solar radiation varied from 5692 to 6360 MJ m⁻², with 2780 to 2980 h of annual sunshine [32]. The water supply of the vegetation mainly depended on water flowing from the upper reaches of the river. The vegetation was mostly distributed on the floodplain along the riverbank and was composed of trees, shrubs and herbs. Dominant trees included Populus euphratica and Elaeagnus angustifolia; shrubs included Tamarix ramosissima, Lycium ruthenicum and Halimodendron halodendron; and herbs included Phragmites australis, Alhagi sparsifolia, Poacynum hendersonii and Karelinia caspia [2].

Field Data
The fractional coverage samples of the PV, NPV and bare soil (BS) were collected at the Dashkol Reservoir, Gancaochang, Bozkol, Arghan, Kurgan and Tetima Lake. All samples were collected from 25 to 29 September 2020, when PV, NPV and BS existed simultaneously. Data collection and processing steps were as follows. First, we defined 10 m × 10 m squares, aligned to the north and covered by a homogeneous vegetation distribution. The four corners and the center of each square were located with GPS. Second, a smaller square of 1 m × 1 m was positioned randomly 3-5 times within the larger square, and a digital camera was used to take a vertical picture from 1.6 m above each sample square. Each image was processed in ENVI 5.3: the photos were divided into NPV, PV and BS categories, and training samples were established for each category for supervised classification. Finally, we calculated the fPV, fNPV and fBS of all 1 m × 1 m squares in each 10 m × 10 m square and took their average to represent the fPV, fNPV and fBS of the 10 m × 10 m square. Figure 2 shows the field square acquisition and classification processing.
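The aggregation from classified photos to plot-level fractions can be illustrated with a minimal sketch. It assumes that each classified photo is summarized by pixel counts per class; the function names and the sample counts are hypothetical and are not taken from the study.

```python
from statistics import mean

def fractions_from_counts(n_pv, n_npv, n_bs):
    """Convert per-class pixel counts of one classified photo into cover fractions."""
    total = n_pv + n_npv + n_bs
    if total == 0:
        raise ValueError("photo contains no classified pixels")
    return n_pv / total, n_npv / total, n_bs / total

def plot_fractions(quadrats):
    """Average the (fPV, fNPV, fBS) tuples of the 1 m x 1 m quadrats of one plot."""
    if not quadrats:
        raise ValueError("at least one quadrat is required")
    f_pv = mean(q[0] for q in quadrats)
    f_npv = mean(q[1] for q in quadrats)
    f_bs = mean(q[2] for q in quadrats)
    return f_pv, f_npv, f_bs

# Hypothetical pixel counts (PV, NPV, BS) for three quadrats in one 10 m x 10 m plot
counts = [(300, 500, 200), (200, 600, 200), (400, 400, 200)]
quadrats = [fractions_from_counts(*c) for c in counts]
plot = plot_fractions(quadrats)
```

Averaging the quadrat fractions, rather than pooling all pixels, gives each 1 m × 1 m quadrat the same weight regardless of photo size.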

Remote Sensing Data
Sentinel-2A Level-1C data were used in this study. Images were acquired for seven periods in 2020: 5 April, 25 May, 24 July, 18 August, 12 September, 2 October and 6 November. The data, which were radiometrically calibrated and geometrically corrected, were downloaded from the ESA website (https://scihub.copernicus.eu/home). Image preprocessing was carried out in SNAP 6.0 (Sentinel Application Platform), provided by ESA, and in the ENVI 5.3 software platform. The SNAP 6.0 Sen2Cor (Sentinel to Correction) plug-in was used to perform atmospheric correction on all L1C data, which resulted in L2A surface reflectance data in 9 bands (Band 2, Band 3, Band 4, Band 5, Band 6, Band 7, Band 8a, Band 11 and Band 12). The 9 bands were combined into a single multiband image using composite bands in Quantum GIS (QGIS) and then resampled to 10 m pixels. The images were mosaicked and clipped to the study area. Water features were masked using the modified normalized difference water index (MNDWI) combined with a threshold (threshold value of −0.08).
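The water-masking step can be sketched per pixel as follows. The sketch uses the standard MNDWI definition, (Green − SWIR1) / (Green + SWIR1), and assumes that the green and SWIR1 reflectances come from Sentinel-2 Band 3 and Band 11 and that pixels with MNDWI above the threshold are treated as water. The text gives only the threshold value, so the band choice and the direction of the comparison are assumptions.

```python
WATER_THRESHOLD = -0.08  # threshold value reported in the text

def mndwi(green, swir1):
    """Modified normalized difference water index for one pixel."""
    denom = green + swir1
    if denom == 0:
        return 0.0  # avoid division by zero for empty pixels
    return (green - swir1) / denom

def is_water(green, swir1, threshold=WATER_THRESHOLD):
    """Assumption: MNDWI above the threshold marks the pixel as water."""
    return mndwi(green, swir1) > threshold

# Hypothetical surface reflectances
water_pixel = is_water(green=0.08, swir1=0.02)   # MNDWI = 0.6
desert_pixel = is_water(green=0.15, swir1=0.35)  # MNDWI = -0.4
```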

Methods
The general workflow of this study was as follows: (1) construct linear regression models between the PVIs and fPV, and between the NPVIs and fNPV, to select the optimal PVI and NPVI; (2) analyze the feasibility of the selected PVI and NPVI for constructing the response space; (3) apply the pixel linear unmixed model and determine the end member values of the PV, NPV and BS based on the Sentinel-2A image data; (4) quantitatively evaluate the estimation accuracy of fPV and fNPV using the field sampled data; and (5) map time series of fPV, fNPV and fBS for the lower reaches of the Tarim River and analyze the seasonal variation. The flowchart is shown in Figure 3.

Table 1 (PVIs). Vegetation Index, Equation, Citation. [...] Pinty et al., 1992.
Notes: R_Red, R_NIR, R_SWIR1 and R_SWIR2 represent the reflectance of the Red band, Near Infrared band, Shortwave Infrared 1 band (1600 nm) and Shortwave Infrared 2 band (2100 nm), corresponding to Bands 4, 8, 11 and 12 of the Sentinel-2A data, respectively. L is a soil adjustment factor, set empirically to L = 0.5.

Table 2 (NPVIs). Vegetation Index, Equation, Citation. NDI (normalized difference index), McNairn et al., 1993; [...] Deventer et al., 1997.

PVIs and NPVIs were calculated using ArcGIS 10.5 software. Linear regression models were constructed between each PVI and the measured fPV, and between each NPVI and the measured fNPV, and were used to evaluate the accuracy of the various indices. Given the small number of samples, we used leave-one-out cross-validation (LOOCV), which is suitable for small sample sizes. In LOOCV, each sample is excluded in turn, the regression model is fitted to all remaining samples, and the fitted model is used to predict the excluded sample. The benefits of LOOCV are its ability to detect outliers and to provide nearly unbiased estimates of the prediction error [4,37]. The performance of these models was assessed by the coefficient of determination (R²), the root mean square error of leave-one-out cross-validation (RMSECV) and the regression significance (p). In these metrics, n is the number of sample plots, x_i is the measured value of sample plot i, y_i is the estimated value of sample plot i, x̄ is the mean of the measured values and ȳ is the mean of the estimated values.
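The leave-one-out procedure can be sketched for a single index as follows. This is a minimal illustration that assumes an ordinary least-squares line and the usual definition of RMSECV as the root mean square of the held-out prediction errors. The function names and sample values are hypothetical, and the code does not reproduce the ArcGIS workflow used in the study.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loocv_rmse(index_values, measured):
    """Hold out each plot in turn, refit on the rest, and predict the held-out plot."""
    n = len(index_values)
    squared_errors = []
    for i in range(n):
        train_x = index_values[:i] + index_values[i + 1:]
        train_y = measured[:i] + measured[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predicted = slope * index_values[i] + intercept
        squared_errors.append((predicted - measured[i]) ** 2)
    return sqrt(sum(squared_errors) / n)

# Hypothetical index values and field-measured fractions for five plots
gemi_values = [0.10, 0.20, 0.30, 0.40, 0.50]
fpv_measured = [0.05, 0.15, 0.20, 0.35, 0.40]
rmsecv = loocv_rmse(gemi_values, fpv_measured)
```

Running the same function for every candidate index and comparing the resulting RMSECV values is one way to rank the indices.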

Linear Unmixed Model
Guerschman [3] hypothesized that the NDVI and CAI could resolve the fractions of the PV, NPV and BS when the NDVI and fPV were linearly related (as were the CAI and fNPV). This is reflected in the scatterplot of the NDVI against the CAI, in which the feature space forms a triangle: the PV is situated on the right side of the triangle, with a high NDVI and an intermediate CAI value; the NPV is located in the upper left corner, with a low NDVI and a high CAI value; and the BS is in the lower left corner, with low NDVI and CAI values. Cao [24] used the DFI in place of the CAI to represent the NPV and constructed the NDVI-DFI linear unmixed model, which was successful in estimating the fractions of the PV, NPV and BS. Therefore, we applied the GEMI-DFI linear unmixed model to assess the fractions of the PV, NPV and BS in the study area (Figure 4).

The PV is expressed by the global environmental monitoring index (GEMI) and the NPV by the dead fuel index (DFI); the GEMI formula is given in Table 1 and the DFI formula in Table 2. The relative proportions of each fractional coverage for any Sentinel-2A image pixel were found by solving the equations:

$G_S = f_{PV} G_{PV} + f_{NPV} G_{NPV} + f_{BS} G_{BS}$
$D_S = f_{PV} D_{PV} + f_{NPV} D_{NPV} + f_{BS} D_{BS}$
$f_{PV} + f_{NPV} + f_{BS} = 1$

where G_S and D_S are the GEMI and DFI values of the given Sentinel-2A image pixel; fPV, fNPV and fBS are the fractional coverage of the PV, NPV and BS; G_PV, G_NPV and G_BS are the GEMI values of the end members; and D_PV, D_NPV and D_BS are the DFI values of the end members. The last equation forces the values of fPV, fNPV and fBS to sum to unity. If this condition cannot be met with all fractions between 0 and 1, the pixel has a negative value or a value higher than 1 in at least one end member. When that occurred, the following correction was applied:

$C_y = C_y / (C_y + C_z)$ (8)

where C_x is the fraction that is not within the specified range (C_x < −0.2 or C_x > 1.2), and C_y and C_z are the fractions of the other two end members. If C_x lies between −0.2 and 0, C_x is set to 0; if C_x lies between 1 and 1.2, C_x is set to 1; in these two cases, only C_y and C_z are recalculated.
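The three-end-member system can be solved in closed form for each pixel. The sketch below substitutes fBS = 1 − fPV − fNPV and solves the remaining 2 × 2 system with Cramer's rule. The end member values are hypothetical, and `correct_fractions` is a simplified stand-in for the correction described above, not the authors' exact procedure: it rejects fractions below −0.2 or above 1.2, clips the rest to [0, 1] and rescales them to sum to one, which coincides with Equation (8) when a single fraction is slightly negative.

```python
def unmix_pixel(g_s, d_s, endmembers):
    """Solve for (fPV, fNPV, fBS) from a pixel's GEMI (g_s) and DFI (d_s).

    endmembers maps "PV", "NPV" and "BS" to (GEMI, DFI) tuples.
    """
    g_pv, d_pv = endmembers["PV"]
    g_npv, d_npv = endmembers["NPV"]
    g_bs, d_bs = endmembers["BS"]
    # Substitute fBS = 1 - fPV - fNPV to obtain a 2 x 2 linear system
    a11, a12, b1 = g_pv - g_bs, g_npv - g_bs, g_s - g_bs
    a21, a22, b2 = d_pv - d_bs, d_npv - d_bs, d_s - d_bs
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("end members are collinear in GEMI-DFI space")
    f_pv = (b1 * a22 - a12 * b2) / det
    f_npv = (a11 * b2 - b1 * a21) / det
    return f_pv, f_npv, 1.0 - f_pv - f_npv

def correct_fractions(fractions, tolerance=0.2):
    """Simplified correction (an assumption): reject, clip, then rescale."""
    if any(f < -tolerance or f > 1.0 + tolerance for f in fractions):
        return None  # too far outside the triangle to be corrected
    clipped = [min(max(f, 0.0), 1.0) for f in fractions]
    total = sum(clipped)
    return tuple(c / total for c in clipped)

# Hypothetical end member values (GEMI, DFI)
ENDMEMBERS = {"PV": (0.9, 5.0), "NPV": (0.2, 25.0), "BS": (0.1, 10.0)}
fractions = unmix_pixel(0.53, 12.0, ENDMEMBERS)
```

With these hypothetical end members, a pixel with GEMI 0.53 and DFI 12.0 corresponds to a mixture of 50% PV, 30% NPV and 20% BS.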

Determination of the GEMI and DFI End Member Value
The choice of pure end members is key to the success of the PV and NPV model inversion. The GEMI-DFI linear unmixed model requires pure end members of the PV, NPV and BS in order to obtain the corresponding end member values, which are then used to solve for the fractional coverage of each end member. We employed the pixel purity index (PPI) method to determine the pure end members. First, the Sentinel-2A images of each period were subjected to the minimum noise fraction (MNF) transform to reduce the image dimensionality. The first 6 bands of each image period were selected for calculation, and the number of iterations was set to 5000 to generate the PPI. Second, the GEMI and DFI were calculated for the images of each period. Finally, the pixels near the vertices of the triangular feature space were regarded as pure pixels.
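The idea behind the PPI can be shown with a much-simplified sketch: pixels are projected onto many random directions, and each pixel is scored by how often it falls at either extreme of a projection. This is not ENVI's implementation, it skips the MNF transform, and it uses a two-dimensional toy data set; the function name and the sample points are hypothetical.

```python
import random

def pixel_purity_counts(pixels, iterations=5000, seed=0):
    """Count how often each pixel is an extreme point of a random projection."""
    rng = random.Random(seed)
    dims = len(pixels[0])
    counts = [0] * len(pixels)
    for _ in range(iterations):
        # Random direction; normalization is not needed to find the extremes
        direction = [rng.gauss(0.0, 1.0) for _ in range(dims)]
        scores = [sum(d * v for d, v in zip(direction, p)) for p in pixels]
        counts[scores.index(min(scores))] += 1
        counts[scores.index(max(scores))] += 1
    return counts

# Toy feature space: three vertices of a triangle plus two interior (mixed) pixels
pixels = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.3, 0.3), (0.2, 0.5)]
counts = pixel_purity_counts(pixels, iterations=500)
```

Pixels that lie strictly inside the triangle never receive a count, which is why high-count pixels near the vertices are taken as candidate pure end members.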

Model Evaluation
To quantitatively evaluate the model's estimation ability, the coefficient of determination (R²), root mean square error (RMSE) and mean error (ME) were applied, where n is the number of sample plots, x_i is the measured value of sample plot i and y_i is the estimated value of sample plot i.
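The three metrics can be sketched as follows, assuming their usual definitions: RMSE as the root mean square of (y_i − x_i), ME as the mean of (y_i − x_i), and R² as the squared Pearson correlation between measured and estimated values. The sign convention for ME (estimated minus measured) and the form of R² are assumptions; the sample values are hypothetical.

```python
from math import sqrt

def rmse(measured, estimated):
    """Root mean square error between measured (x) and estimated (y) values."""
    n = len(measured)
    return sqrt(sum((y - x) ** 2 for x, y in zip(measured, estimated)) / n)

def mean_error(measured, estimated):
    """Mean of (estimated - measured); negative values indicate underestimation."""
    n = len(measured)
    return sum(y - x for x, y in zip(measured, estimated)) / n

def r_squared(measured, estimated):
    """Squared Pearson correlation between measured and estimated values."""
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(estimated) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, estimated))
    sxx = sum((x - mean_x) ** 2 for x in measured)
    syy = sum((y - mean_y) ** 2 for y in estimated)
    return sxy ** 2 / (sxx * syy)

# Hypothetical measured and estimated fractions for three plots
measured = [0.1, 0.2, 0.4]
estimated = [0.2, 0.3, 0.5]
```

In this example every estimate is 0.1 too high, so ME and RMSE both equal 0.1 while R² equals 1, which shows why a bias measure is reported alongside the correlation.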

PVI and NPVI Index Optimization
In this study, six PV indices (NDVI, RVI, SAVI, MSAVI, RSR and GEMI) were selected to build linear regression models with fPV (Table 3 and Figure 5). The PVIs exhibited different performance, with R² ranging from 0.33 to 0.59 and RMSECV ranging from 0.752 to 0.1283. The GEMI index showed the highest correlation, with an R² of 0.59 and an RMSECV of 0.752 (p < 0.05). The RVI, MSAVI and RSR indices had low correlations (all R² lower than 0.50 and RMSECV higher than 0.8). The GEMI index was therefore used as the PVI to estimate the fPV of the study area. The GEMI values were normalized before further calculation.

We selected six NPVI indices (NDI, NDTI, NDSVI, DFI, STI and SWIR32) to construct linear regression models with fNPV (Table 4 and Figure 6). The correlation between the NPVIs and fNPV was lower than that between the PVIs and fPV. This might be caused by the similar spectral characteristics of the NPV and BS [24]. The DFI index showed the best performance, with an R² of 0.45 and an RMSECV of 0.2111 (p < 0.05). The STI, NDTI, SWIR32, NDSVI and NDI indices performed worse than the DFI, with R² ranging from 0.02 to 0.43 and RMSECV from 0.2611 to 0.2627. Therefore, the DFI index was selected to assess the fNPV of the study area.
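For reference, the two selected indices can be computed per pixel as in the sketch below. The formulas are the commonly published forms of GEMI (Pinty and Verstraete, 1992) and DFI (Cao et al., 2010). They are not taken from this article's Tables 1 and 2, so they should be checked against those tables. The reflectance values are hypothetical, and the normalization of GEMI mentioned above is not included because its method is not specified.

```python
def gemi(red, nir):
    """Global environment monitoring index (commonly published form)."""
    eta = (2.0 * (nir ** 2 - red ** 2) + 1.5 * nir + 0.5 * red) / (nir + red + 0.5)
    return eta * (1.0 - 0.25 * eta) - (red - 0.125) / (1.0 - red)

def dfi(red, nir, swir1, swir2):
    """Dead fuel index (commonly published form)."""
    return 100.0 * (1.0 - swir2 / swir1) * red / nir

# Hypothetical surface reflectances (Red, NIR, SWIR1, SWIR2)
pixel = {"red": 0.10, "nir": 0.40, "swir1": 0.30, "swir2": 0.20}
gemi_value = gemi(pixel["red"], pixel["nir"])
dfi_value = dfi(pixel["red"], pixel["nir"], pixel["swir1"], pixel["swir2"])
```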

The Feasibility of the GEMI-DFI Model
We obtained the GEMI-DFI response space by calculating the GEMI and DFI values for the seven Sentinel-2A images (Figure 7). The GEMI-DFI response space of the first six images was basically triangular, which was consistent with the theoretical conceptual model. This showed that the GEMI-DFI linear unmixed model could be used to estimate the fractional coverage of the PV, NPV and BS. The response space of the last image did not seem to conform to the triangular shape. The selected images covered the growing season from spring to fall (April to November): as vegetation leafed out and became active, GEMI values increased while DFI values declined. The inverse was seen with the onset of fall, as vegetation began to senesce and go dormant.

Evaluation of the fPV and fNPV Estimation Accuracy
Considering that the field data were collected on 25-29 September 2020, the Sentinel-2A data acquired on 2 October 2020 were selected. We used the GEMI-DFI linear unmixed model (combined with the PPI) to obtain the spatial distribution of fPV, fNPV and fBS. The BS proportion in the study area was the largest, and the PV and NPV were mainly distributed along the downstream river and Tetima Lake. To show the model results for fPV and fNPV clearly, we selected four areas with an even vegetation distribution for display: a (Dashkol), b (Chiwinkol), c (Bozkol) and d (Tetima Lake). The fNPV of the four regions was greater than the fPV, indicating that in October the vegetation had entered the end of the growing season, and the increase in NPV during this period in turn increased the proportion of fNPV (Figure 8).

We could conclude from Figure 9 that the estimated fPV has the best correlation with the measured fPV, with an R² of 0.69 and an RMSE of 0.07 (p < 0.05). The estimated and measured fNPV have a lower correlation, with an R² of 0.58 and an RMSE of 0.17 (p < 0.05). For BS, the measured and estimated values have the lowest correlation, with an R² of 0.43 and an RMSE of 0.17 (p < 0.05). For the PV, the fitted line lies below the 1:1 reference line and the ME is −2.67%, which shows that overall the estimated fPV is lower than the measured fPV. This is because the Sentinel-2A image was acquired later than the field data, resulting in lower fPV estimates. For the fNPV, the ME is 3.14%, which means that the estimated fNPV is higher than the measured fNPV. In general, the estimation accuracy of fNPV and fBS is lower than that of fPV, reflecting the fact that the PV is easier to resolve than the NPV. The NPV and BS have similar spectral reflectance characteristics, resulting in a greater probability of misclassification between them.

Seasonal Variation of the fPV and fNPV
In this study, region b was chosen as the representative area for seasonal variation. It was located in the Chiwinkol wetland, which was less disturbed by human activities and could better indicate the natural seasonal alternation of the PV and NPV. We selected Sentinel-2A image data from seven periods and used the GEMI-DFI linear unmixed model to estimate fPV, fNPV and fBS for each period (Figure 10). We then calculated the mean values of fPV, fNPV and fBS in region b (Figure 11). The changes in fPV, fNPV and fBS showed that the seasonal variation of fPV and fNPV estimated by the GEMI-DFI model conformed to the characteristics of the vegetation phenology. In April, most of the vegetation had not yet broken dormancy, so there was a large amount of NPV: the GEMI value was low, the DFI value was high, the fNPV was 0.93 and the fPV was 0.01. In May, as the vegetation emerged and began actively growing, the GEMI value increased, the DFI value decreased, the fPV rose to 0.17 and the fNPV declined to 0.69. As the growing season progressed, GEMI increased and peaked in August, while DFI followed an inverse trend and fell to its lowest point. Correspondingly, fPV reached its maximum value of 0.44 and fNPV its lowest value of 0.45. In September, the vegetation began to yellow, GEMI fell and DFI rose. By the beginning of October, most of the vegetation had already withered, the fNPV was dominant again (at 0.81) and the fPV had decreased further to 0.17. The fBS value remained relatively stable throughout the whole year.

Discussion
This study explored the performance of several Sentinel-2A based PVIs by constructing linear regression models in the lower reaches of the Tarim River. The GEMI index has the best correlation with the fPV, which is consistent with the conclusion of Liu [38] on vegetation monitoring in this area. The GEMI index shows greater advantages in detecting low-coverage vegetation. In the regression analysis between the NPVIs and fNPV, the correlation between the DFI index and fNPV is better than those of the NDI, NDTI, NDSVI, STI and SWIR32. Among them, the NDI and NDSVI indices mainly use Band 11 (SWIR1: 1565~1655 nm), combined with one other band (the NIR band for NDI, the Red band for NDSVI), and do not use Band 12 (SWIR2: 2100~2280 nm). Previous research has demonstrated that the cellulose absorption signature (located at around 2100 nm in the SWIR region) has a strong discriminatory ability between the NPV, PV and BS [8], so Band 12 is a good choice to distinguish the PV, NPV and BS. Compared with the DFI index, STI and SWIR32 simply perform ratio calculations. The DFI index uses more bands for the calculation, including the Red, NIR, SWIR1 and SWIR2 bands. This might reduce the impact of the BS background to some extent and might further improve NPV detection [4]. Ji [39] similarly concluded that the red-edge and NIR bands of the Sentinel-2 data are effective in improving the accuracy of the fNPV estimates. How the red-edge band of the Sentinel-2 data could be used to estimate vegetation information accurately still needs further exploration.
This study used the PPI method, which is widely employed to select pure endmembers [14,37], to extract the pure endmembers from the image data. This eliminates the discrepancies between the image data and the measured data and ensures that both have the same spatial scale. We were able to estimate the f PV and f NPV of the study area in different periods by means of the GEMI-DFI model, revealing the changes in photosynthetic and non-photosynthetic vegetation at different growth stages. The NPV proportion is greater at the beginning and at the end of the growing season, while the PV proportion peaks in the middle of the growing season. This is consistent with the findings of Wang [40], who estimated f PV and f NPV in the typical grasslands of Xilingol, and indicates that the GEMI-DFI model is feasible for estimating f PV and f NPV in the lower reaches of the Tarim River. The accuracy of the f NPV estimation (R2 = 0.58) is slightly lower than that of the f PV estimation (R2 = 0.69). As NPV and BS have similar spectral reflectance characteristics in the visible bands [3,41], the NPV estimation is more susceptible to the influence of the BS background, which could lead to confusion between NPV and BS. Both the GEMI and DFI values are influenced by a variety of factors, such as the vegetation type and vegetation structure [17,24]. Therefore, future studies on PV and NPV should give more consideration to these influencing factors.
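The idea behind a three-endmember linear unmixing model in the GEMI-DFI space can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: each pixel's (GEMI, DFI) pair is assumed to be a weighted sum of the PV, NPV and BS endmember values, with weights that sum to one. The function name `unmix` and the endmember values in the usage example are hypothetical.

```python
def unmix(pixel, pv, npv, bs):
    """Solve for (f_pv, f_npv, f_bs) given (GEMI, DFI) pairs.

    Assumes pixel = f_pv*pv + f_npv*npv + f_bs*bs with f_pv + f_npv + f_bs = 1,
    which is the barycentric-coordinate problem for a triangle whose
    vertices are the three endmembers.
    """
    (x, y), (x1, y1), (x2, y2), (x3, y3) = pixel, pv, npv, bs
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if abs(det) < 1e-12:
        raise ValueError("endmembers are collinear; the triangle is degenerate")
    f_pv = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    f_npv = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    f_bs = 1.0 - f_pv - f_npv
    return f_pv, f_npv, f_bs


if __name__ == "__main__":
    # Hypothetical endmember (GEMI, DFI) values, for illustration only.
    pv, npv, bs = (0.8, 5.0), (0.3, 25.0), (0.2, 10.0)
    print(unmix((0.53, 12.0), pv, npv, bs))
```

Pixels that fall outside the endmember triangle yield negative fractions under this formulation. How such pixels are handled (for example, by clipping and renormalizing) is a design choice that this sketch leaves open.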
The overall vegetation coverage in the study area is low, as vegetation is mostly distributed along the watercourses. Higher spatial resolution imagery is therefore required to obtain more detailed information on the vegetation type [42]. For this reason, Sentinel-2A data with a resolution of 10 m were used, which might reduce the spatial heterogeneity of mixed pixels and improve the estimation accuracy through better acquisition of pure pixels. Nevertheless, some uncertainty remains in the estimation of PV, NPV and BS from multispectral data. Vegetation and its surrounding components form a complex system [37], and the vegetation passes through stages of greening, heading, flowering, maturing and yellowing [43]. These conditions affect the GEMI and DFI values, thereby affecting the accuracy of the model in estimating f PV, f NPV and f BS. In addition, the field data collection carries its own uncertainties. The time at which the photos were taken does not match the time of the satellite overpass, and the visual interpretation of the PV, NPV and BS classes and the accuracy of the geometric alignment are inevitably influenced by subjective factors, resulting in errors in the f PV, f NPV and f BS estimates [18]. This study relies on the coverage obtained from field samples as verification data. Because vegetation types and structures are diverse, their spectral characteristics differ, so spectral data should be combined to accurately estimate the changing trends of different vegetation types. Drones are also an effective means of improving the estimation of vegetation coverage. They can capture vegetation coverage over a large area and reduce the estimation error, and should be considered in future research.

Conclusions
In this study, we constructed a GEMI-DFI linear unmixing model based on Sentinel-2A image data to estimate the fractional coverage of PV, NPV and BS in the lower reaches of the Tarim River, and evaluated its accuracy against field-measured data. The main conclusions are as follows: We established linear regression models between the PVIs and f PV, and between the NPVIs and f NPV. GEMI has a significant linear correlation with f PV (R2 = 0.59), and DFI has a significant linear correlation with f NPV (R2 = 0.45). We used the GEMI and DFI indices to construct a linear unmixing model. The response space is triangular, which conforms to the basic assumption of the linear unmixing model.
The GEMI-DFI linear unmixing model could effectively estimate f PV and f NPV, but the estimation accuracy for f PV is higher than that for f NPV. Improving the accuracy of the f NPV estimation is the focus of future work.
Given the limited number of samples, more vegetation coverage data need to be collected in future work. Drones can serve as a platform for obtaining vegetation coverage over a wide area, thereby improving the accuracy of the f PV, f NPV and f BS estimates.