A 2001–2015 Archive of Fractional Cover of Photosynthetic and Non-Photosynthetic Vegetation for Beijing and Tianjin Sandstorm Source Region

Fractional covers of photosynthetic vegetation (PV) and non-photosynthetic vegetation (NPV) are key indicators for land degradation surveillance in the drylands of China. However, no well-validated, multispectral-based products are available. To fill this gap, we selected the Beijing and Tianjin Sandstorm Source Region as the study area and used the linear spectral mixture model to generate the fractional cover of PV, NPV, and bare soil from MODIS NBAR data for 2001 to 2015, with endmember spectra retrieved from a field-measured endmember spectral library. The unmixing results were validated through comparison with field samples. The results show that the adopted method yields reasonably accurate estimates of the fractional cover of photosynthetic vegetation (R² = 0.6297, RMSE = 0.2443) and non-photosynthetic vegetation (R² = 0.3747, RMSE = 0.2568). The dataset could provide key data support for users in land degradation surveillance.
Data Set: https://figshare.com/s/b7fc746acbc986a20bf2
Data Set License: CC-BY

Fractional cover of vegetation plays a key role in resisting wind and water erosion, thus it has been widely used as the indicator for land degradation monitoring and assessment. From a functional perspective, vegetation can be categorized as photosynthetic (green leaves) and non-photosynthetic (wood, senescent material, and litter) material [1]. Unlike the well-developed multiple datasets of the fractional cover of PV, the datasets of the fractional cover of NPV are relatively rare, mainly because of the difficulty in differentiating NPV from the soil background, particularly when multispectral sensors are considered. However, NPV is common and widely distributed in the drylands due to the scarce, variable rainfall and low soil fertility. Therefore, simultaneously acquiring the fractional cover of PV and NPV would provide new insights for land degradation surveillance and land management.
In this context, an archive of fractional cover of PV and NPV was computed for the Beijing and Tianjin Sandstorm Source Region (BTSSR), with a monthly temporal resolution covering 2001 through 2015 and a spatial resolution of 500 m. The BTSSR, which includes 75 counties in Beijing, Tianjin, Hebei, Shanxi, and Inner Mongolia, has a total area of 458,000 km², of which approximately 101,200 km² is desertified, mainly in the Otindag and Horqin Sandy Lands, where land degradation has been attributed to overgrazing, excessive reclamation, deforestation, and climate change (Figure 1). The study area includes arid, semi-arid, dry sub-humid, and semi-humid climates, with annual precipitation ranging from 200 mm in the northwest to 600 mm in the southeast.
So far, remote sensing is the only practical means of acquiring vegetation cover at large scales, which would otherwise be costly and labor-intensive [2][3][4]. Over the past several decades, vegetation indices, which exploit the difference between visible and near-infrared (NIR) reflectance, have been widely used for estimating the fractional cover of vegetation. However, these indices are only sensitive to the amount of photosynthetic vegetation (PV), as well as its turgidity and greenness [5]. In contrast, the retrieval of NPV cover has scarcely been investigated at a large scale. The cellulose absorption index (CAI), computed from hyperspectral data, has proven to be an effective method for resolving NPV cover [6][7][8], and linear unmixing of hyperspectral bands affected by cellulose and lignin has also been used successfully to retrieve NPV cover [9] and to monitor degradation and desertification [10]. However, these methods rely on hyperspectral sensors and are therefore difficult to apply to land degradation surveillance, because hyperspectral data are rarely available for large regions. For multispectral sensors, some spectral indices sensitive to NPV, such as the normalized difference senescent vegetation index (NDSVI) [11], a ratio of Moderate Resolution Imaging Spectroradiometer (MODIS) bands 7 and 6 [1], and the dead fuel index (DFI) [12], were proposed in different environments. However, these indices are area-specific and have not been well validated in other environments.
Spectral mixture analysis (SMA) provides another promising method for retrieving PV and NPV cover from multispectral imagery. [17]. The above-mentioned applications show that SMA with MODIS data is an effective means of retrieving PV and NPV cover simultaneously. However, there are no available and well-validated datasets in China, especially for the drylands. Thus, the PV and NPV cover datasets for the BTSSR, based on AUTOMCU with MODIS data, were produced in order to provide support for land degradation surveillance.

Data and Metadata
The data cover land ranging from 38°50′ to 46°40′ N and from 109°30′ to 120°30′ E. The archive starts in January 2001 and ends in December 2015. It contains one raster layer per month for each of these variables: fractional cover of PV (%), fractional cover of NPV (%), and fractional cover of bare soil (%).

Metadata
The archive is managed in a Geographic Information System (GIS). All the layers are in the GeoTIFF format. The valid data range spans from 0 to 100. Table 1 describes the relevant metadata fields.

Dataset
The archive is organized in three large compressed files, one per variable. File names are according to the following conventions:

Remotely Sensed Data
We used MODIS Nadir Bidirectional Reflectance Distribution Function (BRDF) Adjusted Reflectance (NBAR) data acquired between January 2001 and December 2015. The MODIS NBAR product (MCD43A4) combines daily Terra and Aqua satellite overpasses to produce a 500 m resolution composite image every eight days, and as such offers a high temporal frequency [18].
We acquired all data via the Google Earth Engine (https://code.earthengine.google.com) and used it for all processing steps [19]. The NDVI was calculated for each layer, and for each month the values with the maximum NDVI were selected to compose a monthly NBAR dataset. The data were spatially subset to the study area dimensions.
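The maximum-NDVI monthly compositing described above can be sketched outside Google Earth Engine with NumPy. This is only an illustrative sketch, not the processing chain used for the archive: the function name, the array layout (time, band, rows, cols), and the band order (index 0 for red, index 1 for NIR) are assumptions, and pixels with missing data or with red + NIR equal to zero are not handled.

```python
import numpy as np

def max_ndvi_composite(stack, red=0, nir=1):
    """Pick, per pixel, the acquisition with the highest NDVI.

    stack: array of shape (time, band, rows, cols) holding all NBAR
           layers of one month (assumed layout, not from the paper).
    Returns an array of shape (band, rows, cols).
    """
    stack = np.asarray(stack, dtype=float)
    r = stack[:, red]
    n = stack[:, nir]
    ndvi = (n - r) / (n + r)                 # NDVI for every layer
    best = np.argmax(ndvi, axis=0)           # (rows, cols) index of max NDVI
    idx = best[None, None, :, :]             # broadcast over the band axis
    return np.take_along_axis(stack, idx, axis=0)[0]

# Two acquisitions, two bands (red, NIR), one pixel.
month = np.array([
    [[[0.10]], [[0.20]]],   # NDVI = 0.33
    [[[0.10]], [[0.50]]],   # NDVI = 0.67, so this layer is selected
])
composite = max_ndvi_composite(month)
```

All bands of the selected acquisition are kept together, so the composite pixel remains a physically consistent spectrum rather than a band-by-band mixture of dates.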

Field Spectroscopy
Because the spectral characteristics of PV, NPV, and bare soils are highly variable in the BTSSR, their spectral properties were thoroughly investigated in 2014 with an Analytical Spectral Devices (ASD) full-range (350-2500 nm) FieldSpec® 4 spectroradiometer with a 25° sensor foreoptic. All measurements were collected within two hours of local solar noon on clear-sky days. Field spectra were collected during two periods. The first measurement was conducted in July 2014, representative of maximum PV existence. In order to acquire additional NPV spectra, a second measurement was conducted in November 2014, representative of maximum NPV existence. Based on the spectral response function of the MODIS sensor, the field spectra were resampled to the MODIS NBAR bands.
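Resampling a field spectrum to sensor bands amounts to a response-weighted average over wavelength. The sketch below illustrates the idea under stated assumptions: the function names are hypothetical, the spectrum is sampled on a uniform 1 nm grid, and, because the actual MODIS spectral response functions are not reproduced here, a boxcar response over the nominal MODIS band limits is swapped in for the real curves.

```python
import numpy as np

# Nominal MODIS land band limits in nm (bands 1 to 7).
MODIS_BAND_LIMITS_NM = [
    (620, 670), (841, 876), (459, 479), (545, 565),
    (1230, 1250), (1628, 1652), (2105, 2155),
]

def boxcar_srf(wavelengths, band_limits):
    """Stand-in response: 1 inside each band, 0 outside (an assumption)."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    return np.array([
        ((wavelengths >= lo) & (wavelengths <= hi)).astype(float)
        for lo, hi in band_limits
    ])

def resample_to_bands(reflectance, srf):
    """Response-weighted mean reflectance per band.

    reflectance: (n_wavelengths,) field spectrum on a uniform grid.
    srf:         (n_bands, n_wavelengths) response weights.
    """
    srf = np.asarray(srf, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    return (srf @ reflectance) / srf.sum(axis=1)

wl = np.arange(350, 2501)                 # 350-2500 nm at 1 nm steps
spectrum = wl / 1000.0                    # synthetic linear "spectrum"
srf = boxcar_srf(wl, MODIS_BAND_LIMITS_NM)
band_values = resample_to_bands(spectrum, srf)
```

Replacing `boxcar_srf` with the tabulated response curves of the sensor would leave `resample_to_bands` unchanged.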

In Situ Fractional Ground Cover Data
In situ fractional ground cover data were collected in late August 2015, the period of maximum green vegetation cover, following the method proposed by Muir et al. (2011) [20]. In natural vegetation communities, three 100-m measuring transects are laid out in a star shape. For vegetation in parallel rows, two 100-m measuring transects are oriented at 45 degrees across the sowing lines. An observation of the ground cover is made at every meter of each transect. Fractional cover was then determined by combining the point records of all transects: 300 points for natural vegetation and 200 points for vegetation in rows. The coordinates of the cross point of the transects were recorded with a global positioning system (Trimble GeoXT 2800-3000). In total, the ground cover of 24 fields with a size of 100 × 100 m was investigated for validating the fractional cover products.
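Turning the point records into fractional cover is a simple proportion: the number of points assigned to a class divided by the total number of points on all transects. A minimal sketch follows; the class labels "PV", "NPV", and "BS" and the function name are illustrative assumptions, and the counts are made-up values rather than field data.

```python
from collections import Counter

def fractional_cover(point_records, classes=("PV", "NPV", "BS")):
    """Fraction of transect points falling in each cover class."""
    counts = Counter(point_records)
    total = len(point_records)
    return {c: counts.get(c, 0) / total for c in classes}

# Hypothetical natural-vegetation field: 3 transects x 100 points.
records = ["PV"] * 150 + ["NPV"] * 90 + ["BS"] * 60
cover = fractional_cover(records)
```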

Linear Spectral Mixture Model (LSMM)
In the LSMM, the reflectance of a pixel is assumed to be a linear combination of the reflectance spectra of the endmembers, weighted by their fractional cover. In its general form, the LSMM can be described as
$$R_i = \sum_{j=1}^{m} f_j W_{i,j}$$
where R_i is the measured reflectance of a mixed pixel in spectral band i, f_j is the sub-pixel cover fraction of the j-th endmember in the pixel, W_{i,j} is the j-th endmember reflectance for spectral band i, and m is the number of endmembers. Endmembers are fundamental physical components that themselves are not mixtures of other components. Considering the coarse resolution of MODIS NBAR data, the average spectra of PV, NPV, and bare soil were used as the endmember spectra (Figure 2).
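As a worked illustration of the mixing model, the pixel reflectance in each band is the fraction-weighted sum of the endmember reflectances, which is a matrix-vector product. The endmember values below are hypothetical placeholders for two bands, not the spectra of Figure 2.

```python
import numpy as np

def mix(W, f):
    """Forward LSMM: R_i = sum_j f_j * W[i, j]."""
    return np.asarray(W, dtype=float) @ np.asarray(f, dtype=float)

# Rows: bands (red, NIR); columns: PV, NPV, bare soil (hypothetical values).
W = np.array([[0.05, 0.20, 0.25],
              [0.45, 0.30, 0.32]])
f = np.array([0.5, 0.3, 0.2])   # cover fractions, summing to one
R = mix(W, f)
# red: 0.5*0.05 + 0.3*0.20 + 0.2*0.25 = 0.135
# NIR: 0.5*0.45 + 0.3*0.30 + 0.2*0.32 = 0.379
```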

Unmixing Technique
A Fully Constrained Least Squares (FCLS) algorithm was applied to solve the LSMM [21], with two important constraints on the fractions f_j: (1) the fraction sum-to-one constraint (ASC), $\sum_{j=1}^{m} f_j = 1$, and (2) the fraction nonnegativity constraint (ANC), $f_j \geq 0$. The ASC was imposed by introducing a new signature matrix that incorporates the constraint into the signature matrix W. For the ANC, an iterative algorithm proposed by Chang and Heinz [22], which introduces a Lagrange multiplier vector, was adopted.
The FCLS algorithms were implemented as IDL (Interactive Data Language) routines, with which all the fractional covers were produced. The fractional covers of PV, NPV, and bare soil (BS) in September 2015 are shown in Figure 3.
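The constrained problem itself can be illustrated with a short NumPy sketch. To be clear about what is swapped in: this is neither the IDL code used for the archive nor the iterative Lagrange-multiplier scheme of [22]. Instead, because only three endmembers are involved, it enumerates every subset of endmembers, solves the sum-to-one least squares problem on each subset exactly, and keeps the best nonnegative solution. The function name and the endmember spectra are hypothetical.

```python
from itertools import combinations
import numpy as np

def fcls_unmix(W, r):
    """Least squares fractions with f >= 0 and sum(f) = 1.

    W: (n_bands, m) endmember matrix, r: (n_bands,) pixel spectrum.
    Exhaustive over endmember subsets, so only suitable for small m.
    """
    W = np.asarray(W, dtype=float)
    r = np.asarray(r, dtype=float)
    m = W.shape[1]
    best_f, best_err = None, np.inf
    for k in range(1, m + 1):
        for subset in combinations(range(m), k):
            cols = list(subset)
            Ws = W[:, cols]
            # KKT system of: min ||Ws f - r||^2  subject to  sum(f) = 1
            kkt = np.zeros((k + 1, k + 1))
            kkt[:k, :k] = 2.0 * Ws.T @ Ws
            kkt[:k, k] = 1.0
            kkt[k, :k] = 1.0
            rhs = np.append(2.0 * Ws.T @ r, 1.0)
            fs = np.linalg.lstsq(kkt, rhs, rcond=None)[0][:k]
            if np.any(fs < -1e-9):
                continue                      # violates nonnegativity
            f = np.zeros(m)
            f[cols] = np.clip(fs, 0.0, None)
            err = float(np.sum((W @ f - r) ** 2))
            if err < best_err:
                best_f, best_err = f, err
    return best_f

# Hypothetical 7-band endmember spectra (columns: PV, NPV, bare soil).
W = np.array([
    [0.04, 0.20, 0.25], [0.45, 0.30, 0.32], [0.03, 0.10, 0.12],
    [0.08, 0.15, 0.20], [0.40, 0.38, 0.40], [0.25, 0.40, 0.45],
    [0.12, 0.30, 0.42],
])
pixel = W @ np.array([0.5, 0.3, 0.2])
fractions = fcls_unmix(W, pixel)      # recovers approximately [0.5, 0.3, 0.2]
```

For a full image the function would be applied pixel by pixel; an iterative solver such as the one in [22] scales better when the number of endmembers grows.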

Accuracy Assessment
To evaluate the accuracy of the PV and NPV fractional cover estimates, two metrics were calculated against the observed data: the root mean square error (RMSE) and the coefficient of determination (R²) of the linear regression,
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - y_i\right)^2}$$
$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)\right]^2}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 \sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where n is the number of fields, x_i is the estimated fractional cover of field i, y_i is the measured fractional cover of field i, x̄ is the average value of the estimated fractional cover, and ȳ is the average value of the measured fractional cover.
The validation results for the fractional cover of PV and NPV are shown in Figure 4. Both fractional covers showed statistically significant correlations with the observed data. The fractional cover of PV was estimated with higher accuracy (R² = 0.6297, RMSE = 0.2443) than that of NPV (R² = 0.3747, RMSE = 0.2568). It is worth noting that the PV fraction was underestimated to some extent, while the NPV fraction was slightly overestimated, which could be caused by invariant endmember effects, scale, and the time difference between the field investigation and the remote sensing data.
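Both metrics are straightforward to compute for a user's own validation data. The sketch below takes R² as the squared correlation between estimated and measured cover, which is the coefficient of determination of a simple linear regression; the function names and the sample values are illustrative assumptions, not the 24 field records.

```python
import numpy as np

def rmse(x, y):
    """Root mean square error between estimated x and measured y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

def r_squared(x, y):
    """Coefficient of determination of the linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) ** 2 / (np.sum(dx ** 2) * np.sum(dy ** 2)))

estimated = [0.2, 0.4, 0.6]   # made-up fractions (0 to 1)
measured = [0.3, 0.5, 0.7]    # constant 0.1 offset from the estimates
error = rmse(estimated, measured)
fit = r_squared(estimated, measured)
```

The example shows why both metrics are reported: a constant bias leaves R² at 1 while the RMSE is 0.1, so a high R² alone does not rule out systematic under- or overestimation.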

User Notes
The monthly fractional covers of PV, NPV, and bare soil are compressed in three independent zip files, one per variable. Each layer is stored in GeoTIFF format, which can be used directly by most GIS and image processing software.
Total vegetation cover, which is a key indicator for land degradation surveillance, can be obtained by summing the fractional covers of PV and NPV. Period summaries of PV and NPV can also be derived in order to match other data.
Inconsistency between the field-measured endmembers and the MODIS NBAR data would lead to some systematic error in the unmixing results. The dataset could be improved through calibration with more field investigation data from the users.
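Once the layers have been read into arrays with a GIS library of the user's choice, the derived products mentioned above reduce to element-wise arithmetic. The following sketch assumes the layers are already NumPy arrays in percent; the function names are hypothetical, and treating values outside the documented 0 to 100 range as missing is an assumption about how invalid pixels are flagged.

```python
import numpy as np

def mask_invalid(layer):
    """Set values outside the valid 0-100 range to NaN (assumed no-data rule)."""
    layer = np.asarray(layer, dtype=float)
    return np.where((layer >= 0) & (layer <= 100), layer, np.nan)

def total_vegetation_cover(pv, npv):
    """Total vegetation cover (%) as the sum of PV and NPV cover."""
    return mask_invalid(pv) + mask_invalid(npv)

def period_mean(layers):
    """Mean cover over a period, e.g. the 12 monthly layers of a year."""
    stacked = np.stack([mask_invalid(l) for l in layers])
    return np.nanmean(stacked, axis=0)

pv_jul = np.array([[40.0, 10.0]])
npv_jul = np.array([[30.0, 20.0]])
pv_aug = np.array([[50.0, 20.0]])

total_jul = total_vegetation_cover(pv_jul, npv_jul)   # [[70., 30.]]
pv_summer = period_mean([pv_jul, pv_aug])             # [[45., 15.]]
```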

Figure 1. Location of study area, land cover obtained from the China Cover 2010.

Figure 4. (a) PV and (b) NPV cover estimation accuracy validation against field samples.

Acknowledgments:
This work was funded by the National Key Research and Development Program (No. 2016YFC0500806) and the National Natural Science Foundation of China (No. 41571421).

Table 1. Data set characteristics of the fractional cover.