Evaluating the Effectiveness of Using Vegetation Indices Based on Red-Edge Reflectance from Sentinel-2 to Estimate Gross Primary Productivity

Gross primary productivity (GPP) is the most important component of terrestrial carbon flux. Red-edge (680–780 nm) reflectance is sensitive to leaf chlorophyll content, which is directly correlated with photosynthesis as the pigment pool, and it has the potential to improve GPP estimation. The European Space Agency (ESA) Sentinel-2A and B satellites provide red-edge bands at 20-m spatial resolution on a five-day revisit period, which can be used for global estimation of GPP. Previous studies focused mostly on improving cropland GPP estimation using red-edge bands. In this study, we firstly evaluated the relationship between eight vegetation indices (VIs) retrieved from Sentinel-2 imagery in association with incident photosynthetic active radiation (PARin) and carbon flux tower GPP (GPPEC) across three forest and two grassland sites in Australia. We derived a time series of five red-edge VIs and three non-red-edge VIs over the CO2 flux tower footprints at 16-day time intervals and compared both temporal and spatial variations. The results showed that the relationship between the red-edge index (CIr, ρ783 ρ705 − 1) multiplied by PARin and GPPEC had the highest correlation (R2 = 0.77, root-mean-square error (RMSE) = 0.81 gC·m−2·day−1) at the two grassland sites. The CIr also showed consistency (rRMSE defined as RMSE/mean GPP, lower than 0.25) across forest and grassland sites. The high spatial resolution of the Sentinel-2 data provided more detailed information to adequately characterize the GPP variance at spatially heterogeneous areas. The high revisit period of Sentinel-2 exhibited temporal variance in GPP at the grassland sites; however, at forest sites, the flux-tower-based GPP variance could not be fully tracked by the limited satellite images. These results suggest that the high-spatial-resolution red-edge index from Sentinel-2 can improve large-scale spatio-temporal GPP assessments.


Introduction
Remote-sensing satellites provide continuous spatio-temporal land surface data, which can be used to monitor vegetation-growing conditions. Gross primary productivity (GPP) is one of the most important components of carbon flux in terrestrial ecosystems [1]. In recent years, many remote-sensing

Field Sites
We selected five CO 2 flux sites in southeast Australia representing considerable variations in region, climate, and species, including three evergreen broadleaf forests (EBF) and two grassland (GRA) sites (Table 1, Figure 1).
The Cumberland Plain (CUM) site is located in a dry sclerophyll forest in the Hawkesbury Valley in central New South Wales. The canopy is dominated by Eucalyptus moluccana and Eucalyptus fibrosa.
The Tumbarumba (TUM) site is located in the Bago State Forest in southeastern New South Wales. The forest is classified as wet sclerophyll, and the dominant species is Eucalyptus delegatensis. The Wombat Forest (WOM) site is located in the Wombat State Forest, Victoria. The site is a secondary regrowth forest that was last harvested in 1980. The dominant tree species are Eucalyptus obliqua (messmate stringybark), Eucalyptus radiata (narrow leaf peppermint), and Eucalyptus rubida (candlebark), with an average canopy height of 25 m. Table 1. Information on research sites. Climate types follow the Köppen classification system [54] (Cfa-warm temperate fully humid with hot summer, Cfb-warm temperate fully humid with warm summer, BSk-arid steppe cold). EBF and GRA in the vegetation type column represent evergreen broadleaf forest and grassland, respectively. ID-identifier; Jan-January; Oct-October; Lat-latitude; Lon-longitude; HLS-Harmonized Landsat and Sentinel-2 project. The Riggs Creek (RIG) site is located in the Goulburn-Broken catchment in northeastern Victoria, with the surrounding area dominated by broadacre farming practices. The vegetation cover is predominantly pasture. The Yanco (YNC) site is located within the Yanco region of New South Wales. The flux tower site in the Yanco area is located in the western plains of the Murrumbidgee catchment and is within a wider research area (60 × 60 km) that supports a network of OzFlux stations.

Tower-Based Carbon Flux Data
All sites contained CO2 flux data and incident photosynthetic active radiation (PARin) measurements during the research period. The carbon flux data at each site during the study period were collected by members of OZflux [60]. The CO2 flux data at these sites were measured by an Li-7500 (Campbell Scientific Inc., MS, USA), which is an open-path infrared CO2/H2O gas analyzer, in half-hour time steps. These flux data were described in Beringer et al. 2011, Cleverly et al. 2016a, 2016b, and Li et al. 2017 [61][62][63][64][65]. Data were downloaded from the Australian Terrestrial Ecosystem Research Network (TERN) website (http://data.ozflux.org.au/portal/pub/listPubCollections.jspx) and then post-processed using the REddyProc tool [66]. The REddyProc tool was previously used to transform half-hourly CO2 flux input data into half-hourly GPP data [67,68]. We used the daytime method, which avoids the CO2 flux transfer uncertainties resulting from potentially problematic nighttime data [69]. This method was successfully used to evaluate the global carbon flux [1] to partition the CO2 flux into GPP. Then, we aggregated the half-hourly GPP and PARin data on a daily scale. The aggregated flux tower-based GPP data are presented as GPPEC in the following sections.

Sentinel-2 Remote-Sensing Products
We applied Sentinel-2 data from the Harmonized Landsat and Sentinel-2 (HLS) project to provide high-frequency multispectral data with moderate-to-high spatial resolution [70]. The HLS dataset provided remote-sensing reflectance with high spatial resolution that was atmospherically corrected, and the data were mapped to the Military Grid Reference System [71]. We used the HLS

Tower-Based Carbon Flux Data
All sites contained CO 2 flux data and incident photosynthetic active radiation (PAR in ) measurements during the research period. The carbon flux data at each site during the study period were collected by members of OZflux [60]. The CO 2 flux data at these sites were measured by an Li-7500 (Campbell Scientific Inc., MS, USA), which is an open-path infrared CO 2 /H 2 O gas analyzer, in half-hour time steps. These flux data were described in Beringer et al. 2011, Cleverly et al. 2016a, 2016b, and Li et al. 2017. Data were downloaded from the Australian Terrestrial Ecosystem Research Network (TERN) website (http://data.ozflux.org.au/portal/pub/listPubCollections.jspx) and then post-processed using the REddyProc tool [66]. The REddyProc tool was previously used to transform half-hourly CO 2 flux input data into half-hourly GPP data [67,68]. We used the daytime method, which avoids the CO 2 flux transfer uncertainties resulting from potentially problematic nighttime data [69]. This method was successfully used to evaluate the global carbon flux [1] to partition the CO 2 flux into GPP. Then, we aggregated the half-hourly GPP and PAR in data on a daily scale. The aggregated flux tower-based GPP data are presented as GPP EC in the following sections.

Sentinel-2 Remote-Sensing Products
We applied Sentinel-2 data from the Harmonized Landsat and Sentinel-2 (HLS) project to provide high-frequency multispectral data with moderate-to-high spatial resolution [70]. The HLS dataset Remote Sens. 2019, 11, 1303 5 of 25 provided remote-sensing reflectance with high spatial resolution that was atmospherically corrected, and the data were mapped to the Military Grid Reference System [71]. We used the HLS land surface reflectance product for Sentinel-2 data, which were resampled to 30-m spatial resolution from the HLS v4 dataset (Figure 1). We evaluated the effect of cloud cover on the satellite image by using the cloud and shadow masks provided in the HLS data.

MODIS GPP Product
This study also used the MOD17A2H GPP product as representative of light-use efficiency (LUE)-based GPP and compared it to the VI-based GPP estimation results. The MOD17A2H GPP product provided spatio-temporally continuous GPP data with 500-m spatial resolution for eight-day averages [2]. We chose the 3 × 3 pixels' region around the flux tower from the MOD17A2H product, and the mean value of these pixels was set as the MODIS-based GPP result.

Methods
The main CO 2 flux footprint region is approximately 500 m around the flux tower [72,73], which is spatially representative of GPP EC . The spatial resolution of the MOD17A2H product is 463 m. To compare the remote-sensing data, including Sentinel-2-based VI and MOD17A2H products, with tower-based CO 2 flux data at a fine scale, we selected the standard spatial window as 450 × 450 m around the flux tower, which was approximately 15 × 15 pixels of the Sentinel-2 data aggregated at 30-m spatial resolution.

Remote-Sensing-Based Indices
Eight VIs were calculated from the Sentinel-2 data, as shown in Table 2. Three indices did not contain red-edge information, including the EVI [74], NDVI [75], and the near-infrared reflectance of terrestrial vegetation (NIRv [76]). Five of the indices, including one or more of the three Sentinel-2 red-edge bands (CIr, CIg, and MTCI) and two normal deviation indices of the red edge (NDRE1, NDRE2 [36,77]), were related to red-edge reflectance (band 5 to band 7 of the MSI sensor).

Evaluation of the Cloud Effect
For the Sentinel-2 data, the percentage of no cloud cover (Pclear) was defined as where N clear is the number of pixels with no cloud cover, N cloudcover is the number of pixels with cloud cover, and N cloudshadow is the number of pixels with cloud shadow masks. Here, we defined the effective images as the data in the spatial image window with P clear values >0.7. Cloud cover affects the diffuse scattering of light in the sky; thus, it changes the light intensity and influences the carbon assimilation rate for photosynthesis [78]. Here, we applied the concept of potential PAR (PAR pot ), which was the maximal incident PAR during the 16-day interval [79]. Therefore, the daily cloud effect at the site was defined as When the clear sky index (CSI) is approximately 1, clouds have little effect on the incident PAR.

Rebuilding the Time Series of Vegetation Indices at the Sites
The VIs were derived from the reflectance data that were affected by cloud cover, resulting in some temporal gaps. Thus, the rebuilding of the time series is important for filling the gaps resulting from missing VIs. We used the following steps to derive the spatio-temporally continuous VIs: (1) The reflectance was selected in each image from the Sentinel reflectance product in the 450 × 450-m spatial window. (2) The reflectance values were removed from the pixels where there were cloud and shadow masks.
(3) The P clear was calculated in the spatial window. (4) The VI was calculated from the reflectance in each pixel where there were no cloud and shadow masks. (5) The maximal VI was chosen with P clear > 0.8 during the standard interval in each pixel (here, we defined it as a 16-day interval from the first day of the year) as the true VI value [80]. (6) A continuous VI time series in each pixel was rebuilt by a Savitzky-Golay filter [81].
The daily VI was interpolated from the rebuilt VI time series within a standard interval. Then, we selected the cloud-free images at the five sites during the peak of the growing season from 1 January 2016 to 1 October 2018, in order to characterize the spatial variance in vegetation in the study area. The standard deviations of CIr and EVI in the spatial window were used to quantify the spatial heterogeneity of the total canopy chlorophyll content.

Estimating GPP by VI and Statistical Analysis
Our hypothesis is that GPP is a function of the VI and PAR in , as shown below.
Because different remote-sensing chlorophyll-related VIs have unique ranges and the environment has various conditions, the relationship for the VI i -PAR in -GPP model was not constant.
In Equation (4), the coefficient values of a i and b i were changed according to the specific VI i . In each iteration, we randomly selected 70% of the observations (calibration set), including 16-day interval data of the mean GPP EC , time-series rebuilt VI i , and mean PAR in , to calibrate the coefficients a i and b i . The remaining 30% (validation set) of the observations were used to validate the relationship between GPP VIi and GPP EC . The correlations of GPP VIi and GPP EC in the validation set were described by the determination coefficient (R 2 ), root-mean-square error (RMSE), and related RMSE (rRMSE, which is defined as the ratio of RMSE and mean observed GPP). We then repeated this iteration 100 times. The mean R 2 , RMSE, and rRMSE values of the 100 iterations in the validation set were used to quantify the temporal relationship between GPP VIi and GPP EC , while the mean a i and b i values in the calibration set were the general coefficients of VI i -PAR in -GPP.
Then, we mapped the GPP distribution based on the spatio-temporally continuous VI i and the relationship of VI i -PAR in -GPP at each site. We simplified the expression for the relationship of VI i -PAR in -GPP into GPP VI ; for example, GPP EVI represents the GPP estimated by EVI and PAR in in the later sections.

Temporal Relationship between GPP VI and GPP EC
Cloud cover affected the number of effective images of the sites, determining whether the gaps in the VI needed to be filled or not. The percentage of cloud-free pixels in the Sentinel-2 tiles of grassland sites was mostly less than 50%. In particular, at YNC, more than 70% of the data had P clear values greater than 30%. However, the EBF sites contained much more cloud cover than did the GRA sites. At the TUM site, 10% of the data had less than 70% cloud cover ( Figure 2). During the 16-day standard interval, the number of effective images was less than two at the grassland sites before the launch of Sentinel-2A ( Figure 3). After the launch of Sentinel-2B, the networking system of Sentinel-2A and Sentinel-2B increased the number of effective images to four during the 16-day interval. The number of effective images at CUM also significantly increased (average from one image per month to two or more per month). During the end of winter to the early summer, the number of effective images after Sentinel-2B launched was two times more than when the images were obtained by Sentinel-2A alone. However, during the end of summer and early autumn at the TUM and WOM sites, there were almost no cloud-free images due to the rainy season in southeast Australia. Thus, although the networking Sentinel-2 system has a five-day revisit cycle, it cannot get low-cloud-cover images during continuously rainy days at the sites with humid summers.
The general relationships between red-edge-and non-red-edge-related VIs derived from Sentinel-2 with GPP EC in the main footprint region of the flux towers at the EBF sites are shown in Table 3. These results show how much of the temporal variation in GPP EC can be captured by GPP VI . Both red-edge VIs and non-red-edge VIs corresponded closely to GPP EC during the study period at the WOM and TUM sites. The WOM site showed the highest correlation between GPP VI and GPP EC across all VIs. The non-red-edge VIs, such as EVI and NIRv, explained more than 90% of the temporal variance in GPP, with an average rRMSE of approximately 11%. The red-edge-based GPP VI , including CIr, CIg, MTCI, NDRE1, and NDRE2, showed R 2 values of 0.87, 0.90, 0.83, 0.89, and 0.89 with GPP EC , respectively. The TUM site also showed similar results, while the highest correlation between GPP VI and GPP EC in the red-edge VIs was found for MTCI (R 2 = 0.75, rRMSE = 0.18). The other red-edge VIs showed R 2 values of 0.73, 0.70, 0.53, and 0.09 for NDRE1, NDRE2, CIr, and CIg with GPP EC , respectively. The non-red-edge-related VIs showed high relevance with GPP EC (average R 2 = 0.73, rRMSE = 0.19). The CUM site showed a different result between the red-edge and non-red-edge VIs with GPP EC . The GPP CIr and GPP CIg corresponded well with GPP EC , with an average R 2 of 0.56. However, other red-edge-related VIs (NDRE1, NDRE2, MTCI) showed low correlation with GPP EC . The non-red-edge VIs, including EVI, NDVI, and NIRv, showed R 2 values of 0.38, 0.43, and 0.38 with GPP EC , respectively. Table 4 shows the temporal relationship between different GPP VI and GPP EC at the grassland sites. At RIG, the red-edge VI-related GPP CIr and GPP CIg had R 2 values of 0.87 and 0.84 with GPP EC , respectively, whereas the non-red-edge VI-related GPP NIRv and GPP EVI also showed high R 2 values of 0.86 and 0.77 with GPP EC , respectively. The results at the YNC site were similar to the results at the RIG site. The GPP CIr correlated closely with GPP EC (average R 2 = 0.69), but other red-edge-related GPP VI corresponded poorly to GPP EC . The GPP EVI and GPP NIRv showed R 2 values of 0.61 and 0.66 with GPP EC , respectively.   Table 3. The algorithm results for 16-day average daytime gross primary productivity (GPP) estimated at evergreen broadleaf forest (EBF) sites, with the determination coefficient (R 2 ), root-mean-square error (RMSE; unit in gC·m −2 ·day −1 ), and relative RMSE (rRMSE) presented. We employed the equation GPP EC = a × VI × PAR + b here, where the vegetation index (VI) is the rebuilt time series, and PAR is the photosynthetic active radiation. The number of samples at each site is presented in parentheses after the site name. The numbers in bold represent the highest R 2 , lowest RMSE, and lowest rRMSE values.  The general relationships between red-edge-and non-red-edge-related VIs derived from Sentinel-2 with GPPEC in the main footprint region of the flux towers at the EBF sites are shown in Table 3. These results show how much of the temporal variation in GPPEC can be captured by GPPVI. Both red-edge VIs and non-red-edge VIs corresponded closely to GPPEC during the study period at the WOM and TUM sites. The WOM site showed the highest correlation between GPPVI and GPPEC across all VIs. The non-red-edge VIs, such as EVI and NIRv, explained more than 90% of the temporal variance in GPP, with an average rRMSE of approximately 11%. The red-edge-based GPPVI, including CIr, CIg, MTCI, NDRE1, and NDRE2, showed R 2 values of 0.87, 0.90, 0.83, 0.89, and 0.89 with GPPEC, respectively. The TUM site also showed similar results, while the highest correlation between GPPVI and GPPEC in the red-edge VIs was found for MTCI (R 2 = 0.75, rRMSE = 0.18). The other red-edge VIs showed R 2 values of 0.73, 0.70, 0.53, and 0.09 for NDRE1, NDRE2, CIr, and CIg with GPPEC, respectively. The non-red-edge-related VIs showed high relevance with GPPEC (average R 2 = 0.73, rRMSE = 0.19). The CUM site showed a different result between the red-edge and non-red-edge VIs with GPPEC. The GPPCIr and GPPCIg corresponded well with GPPEC, with an average R 2 of 0.56. However, other red-edge-related VIs (NDRE1, NDRE2, MTCI) showed low correlation with GPPEC. The non-red-edge VIs, including EVI, NDVI, and NIRv, showed R 2 values of 0.38, 0.43, and 0.38 with GPPEC, respectively. Table 4 shows the temporal relationship between different GPPVI and GPPEC at the grassland sites. At RIG, the red-edge VI-related GPPCIr and GPPCIg had R 2 values of 0.87 and 0.84 with GPPEC, respectively, whereas the non-red-edge VI-related GPPNIRv and GPPEVI also showed high R 2 values of 0.86 and 0.77 with GPPEC, respectively. The results at the YNC site were similar to the results at the RIG site. The GPPCIr correlated closely with GPPEC (average R 2 = 0.69), but other red-edge-related GPPVI corresponded poorly to GPPEC. The GPPEVI and GPPNIRv showed R 2 values of 0.61 and 0.66 with GPPEC, respectively. Both non-red-edge-related GPP EVI and red-edge-related GPP CIr showed robust correspondence with the GPP EC across all sites, with an average R 2 > 0.70 and RMSE < 1.5 gC·m −2 ·day −1 . The red-edge-based GPP CIr coupled with PAR showed high correspondence with GPP at the GRA sites. The highest correlation with GPP EC was observed in CIr × PAR, with an average R 2 greater than 0.75 and rRMSE less than 0.39 ( Table 4). The relationship between GPP CIr and GPP EC had rRMSE values that were 0.15 and 0.05 lower than those for GPP EVI at RIG and YNC, respectively. At some EBF sites, EVI × PAR had low rRMSE values with GPP EC compared with the other red-edge-based VIs. At TUM and WOM, GPP EVI had a lower uncertainty than GPP EC , with an average RMSE of 1.42 gC·m −2 ·day −1 and an rRMSE that was 0.03 less than that for GPP CIr . However, the red-edge-based VIs, including CIr and CIg, coupled with PAR in , had lower RMSE values than the other VI-based models at CUM ( Table 3).
The red-edge-related GPP CIr and GPP CIg and the non-red-edge-related GPP EVI and GPP NIRv corresponded well with GPP EC at all sites. Figure 4 shows the time-series relationship between GPP VI and GPP EC during the study period. These four VIs showed similar trends with GPP EC from November 2015 to November 2016 at RIG. When the GPP EC was high in the summer, the GPP VI showed a robust relationship with GPP EC in these four VIs at YNC. There were fewer effective images obtained at the EBF sites than at the GRA sites; thus, the GPP VI did not track the GPP EC variance. From January 2017 to April 2017, there were little available data for VI retrieval, and all GPP VI had different trends with GPP EC at CUM. The GPP CIg showed limited variance at TUM; thus, it exhibited a high bias with GPP EC during the study period. At WOM, all of the GPP VI corresponded with GPP EC across the different seasons. The Sentinel-2-based red-edge VIs such as CIr exhibited similar estimating accuracy with GPPEC under gap-filled days and non-gap-filled days (Figure 5). At WOM, the rRMSE under a cloudy sky was 0.19, which was similar to the rRMSE under a clear sky (0.17). A similar result was also observed at the YNC site. However, the correlation coefficients a and b were different for cloudy and clear days at WOM and YNC. The seasonal change in CIr in WOM was not significant, and the GPP change was mostly accompanied by a change in PARin (Figure 6). High cloud cover led to a low clear sky index (CSI), which influenced the image quality; thus, the gaps in the VIs in these periods needed to be filled (Figure 4). With the VIs rebuilt on a daily scale, the relationship between GPPEC and GPPCIr was different between clear and cloudy days; thus, more available data are required to investigate the relationship between the time series rebuilt in the red-edge index and GPPEC. The Sentinel-2-based red-edge VIs such as CIr exhibited similar estimating accuracy with GPP EC under gap-filled days and non-gap-filled days (Figure 5). At WOM, the rRMSE under a cloudy sky was 0.19, which was similar to the rRMSE under a clear sky (0.17). A similar result was also observed at the YNC site. However, the correlation coefficients a and b were different for cloudy and clear days at WOM and YNC. The seasonal change in CIr in WOM was not significant, and the GPP change was mostly accompanied by a change in PAR in (Figure 6). High cloud cover led to a low clear sky index (CSI), which influenced the image quality; thus, the gaps in the VIs in these periods needed to be filled (Figure 4). With the VIs rebuilt on a daily scale, the relationship between GPP EC and GPP CIr was different between clear and cloudy days; thus, more available data are required to investigate the relationship between the time series rebuilt in the red-edge index and GPP EC .

Spatial Distribution of GPPCIr
We used 30-m resampled Sentinel-2 data to quantify the spatial distribution of non-red-edge VIs and red-edge VIs (Figure 7). The coefficient of variance (CV) values of the red-edge CI in the EBF sites in January 2018 were 0.56, 0.16, and 0.14 for CUM, TUM, and WOM, respectively. The CVs of the EVIs near each site were 0.40, 0.12, and 0.08 for CUM, TUM, and WOM, respectively. In the forest sites, the CV of the red-edge-based CIr exhibited a higher spatial difference than the non-red-edgebased EVIs. The spatial variance in CIr was higher in the natural forest of TUM than at WOM. The CIr showed 11 value levels; however, the EVI showed concentrated values between −2 standard deviations (STDs) and 2 STDs, whereas the artificial forests, such as those at the WOM site, were more homogeneous than the natural forests and still showed eight major levels of CIr values compared with the five levels in the EVI. The spatial heterogeneity at the CUM site showed large differences in CIr around the center of the image, but it showed a similar range of EVI around the site.
We mapped the range of GPP at five 3 × 3-km sites based on their CIr × PAR-GPP relationship, as shown in Tables 3 and 4. The GPP exhibited higher seasonal variance at the GRA sites than at the EBF sites. Figure 8 shows the annual GPP variance at these sites. The central footprint of CUM was

Spatial Distribution of GPPCIr
We used 30-m resampled Sentinel-2 data to quantify the spatial distribution of non-red-edge VIs and red-edge VIs (Figure 7). The coefficient of variance (CV) values of the red-edge CI in the EBF sites in January 2018 were 0.56, 0.16, and 0.14 for CUM, TUM, and WOM, respectively. The CVs of the EVIs near each site were 0.40, 0.12, and 0.08 for CUM, TUM, and WOM, respectively. In the forest sites, the CV of the red-edge-based CIr exhibited a higher spatial difference than the non-red-edgebased EVIs. The spatial variance in CIr was higher in the natural forest of TUM than at WOM. The CIr showed 11 value levels; however, the EVI showed concentrated values between −2 standard deviations (STDs) and 2 STDs, whereas the artificial forests, such as those at the WOM site, were more homogeneous than the natural forests and still showed eight major levels of CIr values compared with the five levels in the EVI. The spatial heterogeneity at the CUM site showed large differences in CIr around the center of the image, but it showed a similar range of EVI around the site.
We mapped the range of GPP at five 3 × 3-km sites based on their CIr × PAR-GPP relationship, as shown in Tables 3 and 4. The GPP exhibited higher seasonal variance at the GRA sites than at the EBF sites. Figure 8 shows the annual GPP variance at these sites. The central footprint of CUM was   Figure 6. Seasonal change in GPP EC , incident photosynthetic active radiation (PARin), and red-edge chlorophyll index (CIr) during the research period (PAR in MJ·day −1 , GPP in gC·m −2 ·day −1 ).

Spatial Distribution of GPPCIr
We used 30-m resampled Sentinel-2 data to quantify the spatial distribution of non-red-edge VIs and red-edge VIs (Figure 7). The coefficient of variance (CV) values of the red-edge CI in the EBF sites in January 2018 were 0.56, 0.16, and 0.14 for CUM, TUM, and WOM, respectively. The CVs of the EVIs near each site were 0.40, 0.12, and 0.08 for CUM, TUM, and WOM, respectively. In the forest sites, the CV of the red-edge-based CIr exhibited a higher spatial difference than the non-red-edge-based EVIs. The spatial variance in CIr was higher in the natural forest of TUM than at WOM. The CIr showed 11 value levels; however, the EVI showed concentrated values between −2 standard deviations (STDs) and 2 STDs, whereas the artificial forests, such as those at the WOM site, were more homogeneous than the natural forests and still showed eight major levels of CIr values compared with the five levels in the EVI. The spatial heterogeneity at the CUM site showed large differences in CIr around the center of the image, but it showed a similar range of EVI around the site. component showed high intra-annual GPP variance (CV > 0.4). The other two forest sites were mostly covered by the EBF, showing CV ranges of 0.2 to 0.4 and 0 to 0.4 for WOM and TUM, respectively. For the grassland sites, the RIG site showed an annual GPP with a CV range from 0.3 to 0.8, and the YNC showed a CV range from 0.4 to 0.8. The land-cover reference image from Google Earth was used for comparison, as shown in Figure 1, which showed that the annual GPP variance in grassland and croplands was much higher than that in the EBFs.  We mapped the range of GPP at five 3 × 3-km sites based on their CIr × PAR-GPP relationship, as shown in Tables 3 and 4. The GPP exhibited higher seasonal variance at the GRA sites than at the EBF sites. Figure 8 shows the annual GPP variance at these sites. The central footprint of CUM was forest, while the other parts were grassland, croplands, and artificial land cover (Figure 1). The forest component showed low intra-annual variance (CV < 0.2) throughout the year, whereas the grass component showed high intra-annual GPP variance (CV > 0.4). The other two forest sites were mostly covered by the EBF, showing CV ranges of 0.2 to 0.4 and 0 to 0.4 for WOM and TUM, respectively. For the grassland sites, the RIG site showed an annual GPP with a CV range from 0.3 to 0.8, and the YNC showed a CV range from 0.4 to 0.8. The land-cover reference image from Google Earth was used for comparison, as shown in Figure 1, which showed that the annual GPP variance in grassland and croplands was much higher than that in the EBFs. The spatial variance in GPP differed among the different seasons. Figure 9a shows the GPP differences on different dates. The GPP showed high variance at the forest sites (TUM, WOM). During the peak of the growing season (December to January), each site showed higher GPP variance than that in the winter (July). Unlike the standard deviation, which was high at the peak of the growing season, the GPP showed less spatial variance at the TUM and WOM forest sites than at the other sites, with CV values less than 0.15 in all seasons (Figure 9b). CUM, which had two major types of vegetation cover, always showed high spatial variance in GPP (CV = 0.5) in different periods. The spatial variance always showed a CV value of 0.3 at the GRA sites.
Remote Sens. 2019, 11, x FOR PEER REVIEW 16 of 26 Figure 8. Spatial differences in GPP in the footprint region of each site. We set the footprint as a 3 × 3km region. The GPP was mapped by using the rebuilt CIr over a 16-day interval coupled with the PAR coefficient at each site, as shown in Tables 3 and 4. The GPP variance is the CV of the GPP in each pixel (standard deviation (STD)/mean value) throughout the research period. The pixels in gray are the regions without rebuilt VIs or non-vegetation components.
The spatial variance in GPP differed among the different seasons. Figure 9a shows the GPP differences on different dates. The GPP showed high variance at the forest sites (TUM, WOM). During the peak of the growing season (December to January), each site showed higher GPP variance than that in the winter (July). Unlike the standard deviation, which was high at the peak of the growing season, the GPP showed less spatial variance at the TUM and WOM forest sites than at the other sites, with CV values less than 0.15 in all seasons (Figure 9b). CUM, which had two major types of vegetation cover, always showed high spatial variance in GPP (CV = 0.5) in different periods. The spatial variance always showed a CV value of 0.3 at the GRA sites. The CUM site is a heterogeneous site with three major types of vegetation. Each vegetation type showed a different growing condition across seasons. Each vegetation type had different growth trends of GPP ( Figure 10). Thus, the GPP STD at the 3 × 3-km footprint region was always high at the CUM site and it showed a wide boundary of GPP during the study period (Figure 9). The high STD of YNC in 2018 was because, in 2018, most of the vegetation showed a low value near the flux tower except the north portion of the 3 × 3-km footprint region with forest cover. From April 2018 to June 2018, the GPP was greater than 1.75 in the north part of the footprint region, with other areas having   Tables  3 and 4

Comparison of GPP modeling results based on Sentinel-2 data and MODIS products
We compared our GPPVI with an LUE-based GPP product (MOD17A2H) in the footprints near the flux tower sites. Table 3 shows that GPPEVI exhibited a higher correlation with GPPEC than MOD17A2H at the EBF sites. Table 4 shows that the red-edge-based GPPCIr had a lower rRMSE value of 0.07 with GPPEC than that of the MODIS product at the GRA sites. The non-red-edge GPPEVI based on the coefficients at each site had a higher correlation with GPPEC than did the MOD17A2H product. Both GPPCIr and GPPEVI showed a higher correlation than MOD17A2H with GPPEC at the spatially heterogeneous CUM sites. However, at the sites with homogeneous vegetation distribution near the flux tower footprint region such as WOM and YNC (Figure 8), the estimation accuracy between MOD17A2H and GPPEC was similar to that for GPPVI and GPPEC (Tables 3 and 4).
To evaluate the effectiveness for large-scale GPP estimation, we used the same series of coefficients for estimating GPP across different sites ( Figure 11). Figure 11 shows that the Sentinel-2 Figure 10. Temporal trends in GPP spatial distribution at the YNC (a1-a8) and CUM (b1-b8) sites. The GPP estimation of each subfigure is based on the CIr-PARin-GPP relationship shown in Tables 3 and 4.

Comparison of GPP Modeling Results based on Sentinel-2 Data and MODIS Products
We compared our GPP VI with an LUE-based GPP product (MOD17A2H) in the footprints near the flux tower sites. Table 3 shows that GPP EVI exhibited a higher correlation with GPP EC than MOD17A2H at the EBF sites. Table 4 shows that the red-edge-based GPP CIr had a lower rRMSE value of 0.07 with GPP EC than that of the MODIS product at the GRA sites. The non-red-edge GPP EVI based on the coefficients at each site had a higher correlation with GPP EC than did the MOD17A2H product. Both GPP CIr and GPP EVI showed a higher correlation than MOD17A2H with GPP EC at the spatially heterogeneous CUM sites. However, at the sites with homogeneous vegetation distribution near the flux tower footprint region such as WOM and YNC (Figure 8), the estimation accuracy between MOD17A2H and GPP EC was similar to that for GPP VI and GPP EC (Tables 3 and 4).
To evaluate the effectiveness for large-scale GPP estimation, we used the same series of coefficients for estimating GPP across different sites ( Figure 11). Figure 11 shows that the Sentinel-2 red-edge-based GPP CIr corresponded better to GPP EC than GPP EVI and MOD17A2H in grasslands, with the highest R 2 and RMSE values as 0.2 gC·m −2 ·day −1 less than those obtained using other methods. At the EBF sites, MOD17A2H showed the highest R 2 and the lowest RMSE. Unlike the similar coefficient values of a and b at the GRA sites, the key coefficient b in GPP CIr exhibited large differences across the three EBF sites (5.01, 2.88, and −0.13 for TUM, WOM, and CUM, respectively, Table 3). In addition, the key coefficient a in GPP EVI exhibited large differences across the three EBF sites (2.48, 1.85, and 1.15 for TUM, WOM, and CUM, respectively, Table 3), which led to high systemic bias with GPP EC . red-edge-based GPPCIr corresponded better to GPPEC than GPPEVI and MOD17A2H in grasslands, with the highest R 2 and RMSE values as 0.2 gC•m −2 •day −1 less than those obtained using other methods. At the EBF sites, MOD17A2H showed the highest R 2 and the lowest RMSE. Unlike the similar coefficient values of a and b at the GRA sites, the key coefficient b in GPPCIr exhibited large differences across the three EBF sites (5.01, 2.88, and −0.13 for TUM, WOM, and CUM, respectively, Table 3). In addition, the key coefficient a in GPPEVI exhibited large differences across the three EBF sites (2.48, 1.85, and 1.15 for TUM, WOM, and CUM, respectively, Table 3), which led to high systemic bias with GPPEC.

Improvements to GPP modeling with Vegetation Red-Edge Information
Previous research confirmed that total canopy chlorophyll content and incident photosynthetic active radiation (PARin) are key drivers of GPP in many croplands [39,45,46]. Our results compared the relationships between VIs × PARin and GPPEC in grasslands and EBF sites, where the VIs were derived with and without red-edge bands. The results of this study showed that red-edge-based VIs such as CIr and CIg coupled with PAR have a higher correlation with GPPEC than other VIs without red-edge information in GRA sites, which is similar to the results of previous research [48]. The red edge contains information on vegetation conditions and is sensitive to changes in chlorophyll; thus, red-edge-based VIs, such as CIr, can track the seasonal changes in the chlorophyll pigment pool [37,52,82]. The vegetation pigment pool plays an important role in long-or medium-term GPP estimations [36]; thus, red-edge VIs can significantly explain the temporal variance in GPP in cropland and grassland sites [39,52,83].
Furthermore, this study also tested the relationship between CIr × PARin and GPPEC in the forest sites. However, GPPEVI was more representative of GPPEC than GPP derived from the red edge in the EBF sites, which was also shown in previous studies [48]. There was a small rRMSE difference between GPPCIr and GPPEVI. This result indicated that GPPCIr could partly evaluate the variance in GPP in EBF sites. On the one hand, at WOM sites, as in EBF sites, the seasonal change in CIr was not significant ( Figure 6). The total GPP was more limited by the incident PAR but not the CIr-related

Improvements to GPP Modeling with Vegetation Red-Edge Information
Previous research confirmed that total canopy chlorophyll content and incident photosynthetic active radiation (PAR in ) are key drivers of GPP in many croplands [39,45,46]. Our results compared the relationships between VIs × PAR in and GPP EC in grasslands and EBF sites, where the VIs were derived with and without red-edge bands. The results of this study showed that red-edge-based VIs such as CIr and CIg coupled with PAR have a higher correlation with GPP EC than other VIs without red-edge information in GRA sites, which is similar to the results of previous research [48]. The red edge contains information on vegetation conditions and is sensitive to changes in chlorophyll; thus, red-edge-based VIs, such as CIr, can track the seasonal changes in the chlorophyll pigment pool [37,52,82]. The vegetation pigment pool plays an important role in long-or medium-term GPP estimations [36]; thus, red-edge VIs can significantly explain the temporal variance in GPP in cropland and grassland sites [39,52,83].
Furthermore, this study also tested the relationship between CIr × PAR in and GPP EC in the forest sites. However, GPP EVI was more representative of GPP EC than GPP derived from the red edge in the EBF sites, which was also shown in previous studies [48]. There was a small rRMSE difference between GPP CIr and GPP EVI . This result indicated that GPP CIr could partly evaluate the variance in GPP in EBF sites. On the one hand, at WOM sites, as in EBF sites, the seasonal change in CIr was not significant ( Figure 6). The total GPP was more limited by the incident PAR but not the CIr-related chlorophyll content; thus, under clear and cloudy conditions, the major limitation to GPP is only PAR. Research showed that the reflectance chlorophyll/carotenoid index is more sensitive to intra-annual changes in the pigment pool at evergreen sites [38]; thus, the red-edge-based VIs cannot track the seasonal changes in the pigment pool. Conversely, the light-use efficiency at evergreen forest sites is mostly controlled by water and PAR in ; thus, the seasonal changes in the chlorophyll content did not lead to significant stress and photosynthesis decline [84]. In addition, the VI × PAR in -based GPP model represented the medium condition of monthly scale GPP that cannot fully catch the highest and lowest GPP values [23]. The accurate GPP value over a shorter time period should also consider the short-term changes in weather conditions. Thus, the LUE-based models considering short-term weather situations such as MODIS GPP exhibited better performance with GPP EC than other GPP VI [85,86]. For cross-site studies in evergreen forests, a previous study also showed that the MODIS GPP exhibited better performance than the GPP estimated by the red-edge index [48]. The MTCI was based on the MERIS with a different central wavelength for band 8 in Sentinel-2 MSI (740 nm) and MERIS band 10 (753 nm). Additionally, the MTCI explained less than 40% of the carbon flux well in grasslands and evergreen forests [48]. Thus, the MTCI showed a low R 2 and high RMSE in these sites (Tables 3 and 4). To summarize, the GPP estimation by VI in evergreen forests needs to consider vegetation growing stress such as seasonal pigment changes [38,87] and light conditions [88,89].

Advantages of Sentinel-2 High-Spatial-Resolution Data for GPP Modeling
The Sentinel-2 network provided data with a higher revisit period (less than five days) than the Landsat data (16 days), and it also contained a sensor with 20-m spatial resolution (resampled to 30-m spatial resolution in this study). The high-spatial-resolution data provided more detailed information on the physiological variance in vegetation. Figure 7 shows that the CIr value was much higher in forestland than in grassland, which indicated that the canopy chlorophyll content in the forest was higher than that in grassland. The maximal CIr value was found to the north of the tower at the YNC site, which is a tree-covered area with a CIr value that was 1.6 times higher than that in the other grassland area (Figure 7e). Additionally, CIr can precisely describe the physiological conditions and seasonal changes in vegetation; thus, it is suitable for high-spatial-resolution GPP mapping (Figures 8 and 9). The reason for the high CV at grass sites was the trees in the 3 × 3-km range near the central towers of RIG and YNC. On the other hand, the GPP EVI and GPP CIr at CUM, which is spatially heterogeneous, corresponded better to GPP EC than to the MOD17A2H product. Furthermore, the high spatial resolution of GPP CIr also indicated the high spatial variance in seasonal changes in GPP at the CUM site, because grass and forest vegetation have different growing speeds and different GPP conditions. This result indicated that the high-spatial-resolution Sentinel-2 data were suitable for characterizing the intra-annual spatial changes in GPP variance. Conversely, at the sites where there was more homogeneous land-cover type, such as the WOM and YNC sites, the relationships between GPP MOD and GPP EC , GPP CIr , and GPP EC were similar. With a single type of vegetation near the flux tower, the vegetation growing had a similar trend under the same environment and nutrient conditions; thus, the spatial resolution of the VI input did not affect the GPP estimation accuracy.
The networking systems of Sentinel-2A and B have a higher revisit period than the 16-day Landsat data, which enhanced the tracking ability of mid-term or long-term GPP variance. At the YNC grassland site, which is a dry area with few cloudy days, Sentinel-2 provided data with an average revisit period of three days. After the launch of Sentinel-2B, the YNC site had almost no missing data at the 16-day interval (Figures 2 and 3). Because there was greater availability of cloud-free data at this site, the spatio-temporally continuous VIs also exhibited good performance at the eight-day standard interval. Under these circumstances, the correlation between VI × PAR and GPP EC was much higher than that modeled at the 16-day standard interval. At the YNC site, the R 2 between CIr and GPP increased from 0.67 to 0.89, whereas the rRMSE decreased by 11% (Table 4). This result suggested that, with the greater number of effective images obtained by Sentinel-2, GPP VI was more representative of GPP EC because VIs can capture the mid-term changes in the vegetation pigment pool. The launch of Sentinel-2B increased the revisit period of the Sentinel-2 network, providing more effective data during spring and summer for the EBF sites. There were at least two effective images for all seasons except autumn at the EBF sites. However, although Sentinel-2B shortened the revisit period, the data were still less effective at the 16-day standard interval during the autumn. There was only one effective image at the WOM site and two at the TUM site during the autumn of 2018. Although the networking Sentinel-2 system had a five-day revisit period, with more cloudy days during the peak of growing season in some humid climate types, there were limited observations when they had the highest ecosystem GPP value. Thus, it limited the effectiveness of VI values to capture the magnitude of GPP EC .

Outlook for High-Spatial-Resolution Satellite GPP Mapping
Our results demonstrated that the high-spatial-resolution and multi-temporal Sentinel-2 red-edge data corresponded well with GPP EC . The red-edge-based vegetation index CIr coupled with PAR showed less bias with GPP EC than the other VIs at all sites. We successfully mapped the spatio-temporally continuous GPP near the footprint region at each site based on the local relationship of red-edge VIs and GPP EC .
Cloud cover affected the correlation between GPP and VIs ( Figure 5). We selected CIr as an example to evaluate the relationship between GPP and VI. With the VI rebuilt on a daily scale, the correlation between GPP and VI × PAR is similar under clear sky and cloudy days. Although the rRMSE of the daily GPP estimation at WOM was low, the rRMSE of CIr was lower on clear sky days than on cloudy days. The YNC site showed similar results, with an rRMSE reduced by 0.04 on clear days compared with the overall rRMSE. However, the key coefficients of a and b in the relationship of CIr-PAR in -GPP had 20-50% difference under gap-filled CIr-based GPP and non-gap-filled values. On the one hand, cloud cover change affects the amount of PAR in , which leads to bias in GPP estimates. Shaded leaves and sunlit leaves received different amounts of PAR in under different sky conditions, and they had different LUEs in the same tree; thus, the relationship of VI-PAR in -GPP differed under clear sky and cloudy conditions [90]. Therefore, the tower-based spectra measurement under different levels of PAR in could help determine the calibration coefficients a and b for the relationship between VI and GPP [91][92][93][94]. However, because the optical remote-sensing method cannot obtain effective images under cloudy conditions, the representativeness of the rebuilt VI time series depends on the effective VI that appears in the standard interval.
There are two possible ways to improve the performance of GPP VI for capturing the seasonal and shorter temporal trend. One on hand, in order to get more effective satellite-based data during the cloudy conditions, one can apply multi-source remote-sensing data to improve the VI quality for rebuilding the time series VI during the peak of growing season [95]. The amounts of high-spatial-resolution remote-sensing data are increasing, such as Landsat-8 Operational Land Imager (OLI) and Chinese GF Wide-Field Camera (WMV), whereby the combination of these sensors can both improve the spatial resolution and revisit period [96]. This increase in data benefits the GPP mapping ability on a regional and continental scale. One the other hand, the chlorophyll-related VIs (such as CIr and EVI) are more related to the seasonal change of GPP, but not the short-term change [52]. Therefore, further research can apply some red-edge-related VIs for detect the seasonal change of chlorophyll content and couple the weather condition stress and chlorophyll fluorescence to capture short-term GPP variance [97,98].
For regional-or continental-scale mapping, a series of stable values for coefficients a and b is important. Similar coefficients of a and b in the CIr-PAR in -GPP relationship were shown at the YNC and RIG sites. Thus, a cross-site comparison with one set of coefficients for GPP CIr showed high correlation and low uncertainties with GPP EC at the two grassland sites (Figure 11b). This result suggested that GPP CIr could be used to map GPP at a large scale. However, the modeled GPP showed significant systematic bias compared with GPP EC because, at each forest type, the relationship between GPP VI and GPP EC was not a constant value across these sites. These data showed that high uncertainties occurred if one set of coefficients was used to estimate GPP with GPP CIr across the EBF sites (Figure 11a). There were several reasons for this result. The canopy chlorophyll content changes slowly at the EBF sites [99]. Unlike EVI, which changes slowly during the growing season, CIr is highly affected by the canopy chlorophyll content and the amount of available data. Therefore, during the rainy season, the few cloud-free images made the CIr unable to track the monthly changes in vegetation physiology. The LUE-based GPP model corresponds well with short-term GPP variance, but CIr is more related to seasonal changes in the vegetation pigment pool and canopy structure [100]. Thus, the MOD17A2H-based GPP product and GPP EVI are better representatives of GPP EC than GPP CIr .
The biodiversity in a forest is higher than that in GRA sites [101]. Although the canopy showed a similar total canopy chlorophyll content, the vegetation ecosystem exhibited a different response to the environment that produced different amounts of GPP ( Figure 11b in the CUM series). Thus, the estimation of GPP with one series of parameters by GPP VI across multiple sites must consider the differences in plant traits in regions with high biodiversity. The CIr, as a characterization of canopy chlorophyll content, can be added as one part of the photosynthesis process to improve the accuracy of large-scale GPP assessment [37].

Conclusions
This study evaluated how much GPP variation can be explained by spatio-temporally continuous Sentinel-2 satellite-based red-edge data in grassland and evergreen forests. We found that GPP based on CIr × PAR in had the highest correlation with GPP EC across grassland sites, and it also showed low bias with GPP EC at each forest site. The networking system of Sentinel-2A and B enhanced the ability to track the seasonal canopy chlorophyll changes, which improved the temporal representativeness of GPP. The high-spatial-resolution remote-sensing data specifically characterized the spatial GPP variance over a heterogeneous landscape. The GPP CIr and GPP EVI based on the coefficients for each site exhibited better performance with GPP EC than the MOD17A2H product. The coefficients derived for the empirical relationship of CIr × PAR in with GPP EC were largely transferable between the grassland sites. This result indicated that the vegetation red-edge information-based GPP CIr was suitable for mapping spatio-temporally continuous GPP in grasslands at regional and continental scales.
However, as the Sentinel-2 data's revisit cycle is not as high as other medium-spatial-resolution satellite data and the VI-based GPP estimation method is more related to seasonal chlorophyll change, the vegetation red-edge-based empirical relationship is more suitable for capturing the seasonal change of GPP. Further research on satellite-based GPP mapping with high-spatial-resolution data should focus on (1) improving the number of effective data during the high-cloud-cover seasons by using multisource data, (2) calibrating the coefficients with more in situ measurement data under cloudy and clear sky conditions, and (3) investigating whether there is a series of transferable coefficients to estimate GPP under multiple EBF sites.