Evaluating the Performance of Satellite-Derived Vegetation Indices for Estimating Gross Primary Productivity Using FLUXNET Observations across the Globe

: Satellite-derived vegetation indices (VIs) have been widely used to approximate or estimate gross primary productivity (GPP). However, it remains unclear how the VI-GPP relationship varies with indices, biomes, timescales, and the bidirectional reﬂectance distribution function (BRDF) e ﬀ ect. We examined the relationship between VIs and GPP for 121 FLUXNET sites across the globe and assessed how the VI-GPP relationship varied among a variety of biomes at both monthly and annual timescales. We used three widely-used VIs: normalized di ﬀ erence vegetation index (NDVI), enhanced vegetation index (EVI), and 2-band EVI (EVI2) as well as a new VI - NIR V and used surface reﬂectance both with and without BRDF correction from the moderate resolution imaging spectroradiometer (MODIS) to calculate these indices. The resulting traditional (NDVI, EVI, EVI2, and NIR V ) and BRDF-corrected (NDVI BRDF , EVI BRDF , EVI2 BRDF , and NIR V, BRDF ) VIs were used to examine the VI-GPP relationship. At the monthly scale, all VIs were moderate or strong predictors of GPP, and the BRDF correction improved their performance. EVI2 BRDF and NIR V, BRDF had similar performance in capturing the variations in tower GPP as did the MODIS GPP product. The VIs explained lower variance in tower GPP at the annual scale than at the monthly scale. The BRDF-correction of surface reﬂectance did not improve the VI-GPP relationship at the annual scale. The VIs had similar capability in capturing the interannual variability in tower GPP as MODIS GPP. VIs were inﬂuenced by temperature and water stresses and were more sensitive to temperature stress than to water stress. VIs in combination with environmental factors could improve the prediction of GPP than VIs alone. Our ﬁndings can help us better understand how the VI-GPP relationship varies among indices, biomes, and timescales and how the BRDF e ﬀ ect inﬂuences the VI-GPP relationship. annual scale. The R 2 GPP di ﬀ ered the R 2 vegetation


Introduction
Terrestrial gross primary productivity (GPP), the amount of carbon absorbed by terrestrial plants through photosynthesis, is the largest carbon flux between the terrestrial biosphere and the atmosphere. It also drives the terrestrial food chain and is the basis for agricultural and wood production. Quantifying GPP is therefore essential for assessing ecosystem carbon dynamics, climate not corrected with BRDF) [1,2]. For example, the BRDF-corrected NDVI and EVI had slightly stronger correlation with tower GPP than the traditional VIs. Recently, a new index based on BRDF-corrected reflectance, the near-infrared reflectance of vegetation (NIR V ), was proposed [32]. NIR V is the product of the BRDF-corrected NIR reflectance and BRDF-corrected NDVI. NIR V represents the proportion of reflectance attributable to the vegetation from a physical perspective and has been shown to be better related to GPP than NDVI or NIR alone [32].
To our knowledge, no study has compared traditional VIs (i.e., VIs based on surface-reflectance that is not corrected with BRDF) against BRDF-corrected VIs for estimating GPP for a large number of sites across a wide variety of biomes at both seasonal and annual scales. The MODIS ASCII Subsets for BRDF-corrected surface reflectance products (e.g., MCD43A4) were not available until very recently. Prior the release of the Collection 6 MODIS ASCII Subsets in 2017, the MCD43A4 subsets for each EC site had to be extracted from entire MODIS tiles by the users, while extracting MODIS subsets for a large number of sites globally for multiple years is computationally intensive and time consuming. The recent release of the MCD43A4 subsets makes it feasible to extract BRDF-corrected surface reflectance and then to calculate BRDF-corrected VIs for a large number of sites in a timely fashion.
The recent concurrent availability of FLUXNET and MODIS data for a large number of sites over a 15-year period (2000-2014) and the recent availability of the MODIS ASCII subsets for BRDF-corrected surface reflectance products provide an unprecedented opportunity to examine the relationships between satellite-derived VIs (both traditional and BRDF-corrected VIs) and tower GPP across a wide variety of biomes at various timescales. Here we examined the relationships between VIs and tower GPP for a total of 121 EC sites across the globe and assessed how the VI-GPP relationship varied among various VIs and a wide variety of biomes and at both monthly and annual timescales. We used three widely-used VIs: NDVI, EVI, and EVI2 along the new index -NIR V , and used surface reflectance products both with and without BRDF correction to calculate these VIs. Besides these six VIs, the new VI based on BRDF-corrected reflectance -NIR V was also used. To evaluate the usefulness of the VIs for estimating GPP, we also assessed the relationship between the MODIS GPP product and flux tower GPP for these sites. We then investigated the correlations of VIs with temperature and water stresses to reveal how VIs respond to temperature and water availability. Finally, we examined how VIs in combination with environmental factors could better explain the variations in GPP. Our findings can reveal the performance of satellite-derived VIs for approximating GPP and the influence of the BRDF effect and informing future ecosystem functioning, vegetation productivity, and carbon cycle studies based on VIs.

Study Sites and Flux Tower Data
We used flux and meteorological data from the latest FLUXNET synthesis dataset, the FLUXNET2015 database. This database was first released in December 2015 with more site-years of data added in 2016. It provides standardized and high-quality flux and meteorological data for a total of 221 sites across the globe (http://fluxnet.fluxdata.org/data/fluxnet2015-dataset/data-processing/). Over the past decade or so, there have been two major data synthesis activities initiated by the FLUXNET research community, leading to two synthesis datasets: La Thuile and FLUXNET2015. The FLUXNET2015 dataset has up to 15 years of overlap between EC data and MODIS data products, while the La Thuile dataset has only up to 5 years of overlap with the MODIS data record. The new dataset also has better quality than its predecessor. We therefore used the FLXUNET2015 dataset in our study. The flux and meteorological data in the dataset were gap filled, and the NEE measurements were partitioned into GPP and ecosystem respiration using standardized procedures [33].
In our study, we selected 121 Tier-1 sites that are open and free for scientific research (Figure 1). The site code, site name, latitude, longitude, vegetation type, and duration of measurements, and references are provided in Table S1. Although these sites are distributed across the globe with latitudinal distribution and encompass the following ten biomes: evergreen needleleaf forests (ENF), evergreen broadleaf forests (EBF), deciduous broadleaf forests (DBF), mixed forests (MF), closed and open shrublands (COSH), woody savannas (WSA), savannas (SAV), grasslands (GRA), wetlands (WET), and croplands (CRO), there are actually no significant latitudinal impacts on GPP and VIs ( Figure S1). The GPP data (GPP-DT-CUT-REF) partitioned from NEE using the daytime partitioning approach [34] and meteorological data were used in our analysis.  Figure S1). The GPP data (GPP-DT-CUT-REF) partitioned from NEE using the daytime partitioning approach [34] and meteorological data were used in our analysis.  Table S1 in the Supplementary Material. The base map is the MODIS land cover map.

Vegetation Indices (VIs)
We used the following four vegetation indices: NDVI, EVI, EVI2, and NIRV in this study. NDVI is perhaps the most widely used vegetation index. NDVI captures the difference between maximum reflection of radiation in the near-infrared band due to the leaf cellular structure and the maximum absorption of radiation in the red band due to the chlorophyll pigments. NDVI can be expressed as follows: where N and R are the reflectance in the near-infrared (NIR) and red bands, respectively. Although NDVI has been widely used to approximate photosynthetic activity, this index has some limitations. NDVI is sensitive to soil background brightness and is influenced by soil texture and moisture conditions [13].
To overcome the saturation effects of NDVI, weight factors were introduced to the NIR reflectance term in the NDVI formulation to adjust the relative contributions of the red and NIR reflectance. The resulting VI-EVI can be expressed as follows: where B is the reflectance in blue band; G is a gain factor; C1 and C2 are the coefficients of the aerosol resistance term that are used to correct for aerosol affects in the red band with the blue band; L functions as the soil adjustment factor. The coefficients adopted in the EVI algorithm are as follows: G = 2.5, C1 = 6, C2 = 7.5 and L = 1. EVI has been widely used as proxies of vegetation productivity or to estimate GPP [35,36].  Table S1 in the Supplementary Material. The base map is the MODIS land cover map.

Vegetation Indices (VIs)
We used the following four vegetation indices: NDVI, EVI, EVI2, and NIR V in this study. NDVI is perhaps the most widely used vegetation index. NDVI captures the difference between maximum reflection of radiation in the near-infrared band due to the leaf cellular structure and the maximum absorption of radiation in the red band due to the chlorophyll pigments. NDVI can be expressed as follows: where N and R are the reflectance in the near-infrared (NIR) and red bands, respectively. Although NDVI has been widely used to approximate photosynthetic activity, this index has some limitations. NDVI is sensitive to soil background brightness and is influenced by soil texture and moisture conditions [13].
To overcome the saturation effects of NDVI, weight factors were introduced to the NIR reflectance term in the NDVI formulation to adjust the relative contributions of the red and NIR reflectance. The resulting VI-EVI can be expressed as follows: where B is the reflectance in blue band; G is a gain factor; C 1 and C 2 are the coefficients of the aerosol resistance term that are used to correct for aerosol affects in the red band with the blue band; L functions as the soil adjustment factor. The coefficients adopted in the EVI algorithm are as follows: G = 2.5, C 1 = 6, C 2 = 7.5 and L = 1. EVI has been widely used as proxies of vegetation productivity or to estimate GPP [35,36]. The visible bands are highly related with each other, and the blue reflectance can be expressed as a function of the red reflectance [37]. A two-band EVI index, EVI2, was developed assuming the blue-band reflectance is a function of the red-band reflectance (B = c*R) and using the L, C 1 and C 2 values in the EVI formula [22]. With the optimal C 1 , C 2 and G values, EVI2 can be expressed as follows: EVI2 not only gained its heritage from EVI but also improved the linearity with vegetation biophysical parameters [22]. A more recently developed index, NIR V , is the product of the BRDF-corrected reflectance in the near-infrared band and the BRDF-corrected NDVI. For the calculation of NIR V , 0.08 is subtracted from NDVI BRDF to partially reduce the effects of the bare soil on NDVI. NIR V , highlighting the BRDF-corrected NIR reflectance, is said to be less sensitive to background contamination and is well related to GPP [32]. NIR V can be written as: In this study, we used the following four VIs: two classical and most widely used indices -NDVI and EVI, a relatively new and improved index -EVI2, and a more recently developed index -NIR V . We calculated NDVI, EVI, and EVI2 for each of the 121 sites using MODIS surface reflectance both with and without BRDF correction as described below, and the resulting VIs are NDVI, NDVI BRDF , EVI, EVI BRDF , EVI2, and EVI2 BRDF . For comparison purposes, NIR V was also calculated using surface reflectance both with and without BRDF correction, and the resulting two VIs -NIR V and NIR V, BRDF were used in the following analyses.

MODIS Data Products and Calculation of VIs
In this study, we used the MODIS Terra MOD09A1 (Version 6) product from the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) (Global Subsets Tool: MODIS/VIIRS Land Products: https://modis.ornl.gov/cgi-bin/MODIS/global/subset.pl). This product provides surface spectral reflectance of Terra MODIS bands 1-7 corrected for atmospheric conditions (e.g., gases, aerosols, and Rayleigh scattering) with 8-day intervals and 500 m spatial resolution. We also used the MODIS/Terra and Aqua nadir BRDF-corrected reflectance product (MCD43A4; Version 6). The MCD43A4 provides 500 m BRDF-adjusted reflectance data of bands 1-7 from MODIS on board both Terra and Aqua satellites with daily time step. These BRDF-adjusted reflectance values are modeled as if they were acquired from the nadir view using the BRDF. Besides the MODIS surface reflectance products, we also used the MODIS GPP product (MOD17A2H; Version 6) [38] to help evaluate the performance of the VIs for estimating GPP. This product consists of 8-day GPP maps with 500 m resolution estimated using MODIS data, meteorological data, and a light use efficiency algorithm.
For each of the 121 flux towers, surface reflectance values for both MOD09A1 and MCD43A4 products were extracted for the 500-m grid cells within the 1 km × 1 km window surrounding the tower for each time step (8-day for MOD09A1 and daily for MCD43A4) over the period 2000-2014. For each site and each time step, the surface MOD09A1 reflectance values averaged within the window were used to calculate NDVI, EVI, and EVI2. For each site and each time step, the BRDF-adjusted reflectance values (MCD43A4) averaged within the window were also used to calculate NDVI, EVI, EVI2, and NIR V . The VIs based on reflectance without BRDF correction (MOD09A1) were referred to as NDVI, EVI, EVI2, and NIR V , respectively, while the VIs based on BRDF-adjusted reflectance values (MCD43A4) were referred to as NDVI BRDF , EVI BRDF , EVI2 BRDF , and NIR V, BRDF respectively. The BRDF-adjusted reflectance values were also used to calculate NIR V . For each site and each 8-day time step, we also calculated the average of the MODIS GPP values within the 1 km × 1 km window surrounding the tower.

Analysis
We examined the statistical relationships between the eight VIs and GPP using MODIS data and flux tower data for the 121 FLUXNET sites. First, we evaluated the relationship between the eight MODIS-derived VIs and tower GPP at the monthly scale for each of the 121 EC sites encompassing ten biomes. For each site, each VI was averaged within each month and was then correlated with monthly tower GPP. To evaluate the performance of each VI by biome, the R 2 values for the relationship between each VI and tower GPP were also averaged over all the sites within the biome. Second, the EC sites with six or more years of flux data were chosen to analyze the relationship between the VIs and tower GPP at the annual scale. We identified those sites having at least 6 years of flux data from the 121 FLUXNET sites. A total of 48 sites had at least 6 years of data. Third, we examined the relationships between the VIs and tower GPP at the site and biome levels. Fourth, we analyzed the relationship between VIs and two environmental scalars, f Topt and f VPD , representing optimum temperature and vapor pressure deficit (VPD) stresses, respectively, to determine how VIs respond to environmental stresses and how a combination of VIs and environmental factors (f Topt , f VPD and PAR) could explain the variance of GPP. The f VPD scalar used in this study is based on the water scalar in the MODIS GPP algorithm. For temperature stress, we used the temperature scalar (f Topt ) in the vegetation photosynthesis model (VPM) that includes the minimum, maximum and optimal temperature for photosynthetic activity; we did not use the temperature scalar (f Tmin ) in the MODIS GPP algorithm because of the lack of optimal temperature in the scalar. For each site, both f Topt and f VPD were calculated from flux tower meteorological measurements.

Relationships between Vegetation Indices and Tower GPP
We examined the relationship between each VI and tower GPP at the monthly scale and calculated the coefficient of determination (R 2 ) for each of the 121 flux sites. For each VI, the R 2 values were averaged over the 121 sites. On average, NDVI, EVI, EVI2, NIR V , NDVI BRDF , EVI BRDF , EVI2 BRDF , and NIR V, BRDF explained 55%, 48%, 66%, 67%, 56%, 62%, 70%, and 70% of the variance in GPP, respectively. Compared with the traditional VIs (NDVI, EVI, EVI2, and NIR V ), the BRDF-corrected VIs (NDVI BRDF , EVI BRDF , EVI2 BRDF , and NIR V, BRDF ) generally exhibited stronger correlation with tower GPP, indicating that BRDF-corrected VIs had higher performance in estimating GPP than their traditional counterparts at the monthly scale. EVI2 BRDF (average R 2 = 0.70, p < 0.0001) and NIR V, BRDF (average R 2 = 0.70, p < 0.0001) had the strongest correlation with tower GPP. To evaluate the usefulness of the VIs for estimating GPP, we also examined the relationship between MODIS GPP and tower GPP for each flux site. NDVI (55%), EVI (48%), and NDVI BRDF (66%) explained lower variance in tower GPP than did MODIS GPP (70%). By contrast, EVI BRDF , EVI2 BRDF , and NIR V, BRDF explained the same proportion (average R 2 = 0.70, p < 0.0001) of the variance in tower GPP as did MODIS GPP (Figure 2), although the MODIS GPP is driven by incident photosynthetically active radiation (PAR), the fraction of PAR absorbed by vegetation canopies (fPAR), temperature stress, and water stress. We then stratified the 121 flux sites by biome type. For each biome, the R 2 values for the relationship between each VI and tower GPP were averaged over all the sites within the biome. The R 2 values for all the ten biomes and VIs are summarized in Figure 2. The performance of each VI generally varied with biome. On average, deciduous broadleaf forests (R 2 = 0.65-0.86, p < 0.0001), grasslands (R 2 = 0.62-0.72, p < 0.0001), mixed forests (R 2 = 0.57-0.86, p < 0.0001), evergreen needleleaf forests (R 2 = 0.47-0.82, p < 0.0001) and shrublands (R 2 = 0.17-0.77, p < 0.0001) had the strongest correlation, followed by woody savannas (R 2 = 0.31-0.61, p < 0.0001), savannas (R 2 = 0.35-0.63, p < 0.0001), To examine the relationships between VIs and tower GPP at the annual scale, we identified those sites having at least 6 years of flux data from the 121 FLUXNET sites. A total of 48 sites had at least 6 years of data. For each site, we analyzed the relationships between VIs and tower GPP at the annual scale. For each VI, the R 2 values were averaged over all the 48 sites. On average, NDVI, EVI, EVI2, NIRV, NDVIBRDF, EVIBRDF, EVI2BRDF, NIRV, BRDF, and MODIS GPP explained 30%, 29%, 31%, 28%, 25%, 27%, 25%, 30%, and 23% of the variance in tower GPP. This indicates that VIs generally had weaker correlation with tower GPP at the annual scale than at the monthly scale. At the annual scale, the correlation of VIs with tower GPP was comparable to or slightly stronger than MODIS GPP (23%). Compared with the traditional VIs, the BRDF-corrected VIs had no advantage at the annual scale. The relationships between VIs and GPP varied with biome at the annual scale ( Figure 3). On average, the relationship was relatively strong for EBF (average R 2 = 0.42-0.72), COSH (average R 2 = 0.28-0.81), and WSA (average R 2 = 0.48-0.68), moderate for ENF (average R 2 = 0.16-0.34), GRA (average R 2 = 0.29-0.41), WET (average R 2 = 0.12-0.34), and CRO (average R 2 = 0.09-0.54), and relatively weak for DBF (average R 2 = 0.09-0.50) and MF (average R 2 = 0.02-0.32). For each biome, the R 2 value varied with VI, particularly for ENF, DBF, and COSH.
We then examined the relationships between VIs and tower GPP across all sites for each biome ( Figure 4; Table 1). For each site, annual tower GPP and annual mean VIs were separately averaged across all years. Overall, at the site level, tower GPP exhibited moderate to strong relationships with all or most VIs at the annual scale for all biomes except savannas. For savannas, tower GPP was weakly correlated with VIs, particularly NDVI, EVI, and NDVIBRDF. The R 2 for the relationship between each VI and GPP varied with biome. There was no universally robust vegetation index that had strong correlation with tower GPP across all ten biomes. Notably, for all biomes except mixed forests, savannas, and croplands, the performance of NDVI, EVI2, EVIBRDF, EVI2BRDF, and NIRV in approximating GPP was comparable to MODIS GPP; for mixed forests, savannas, and croplands, these VIs was more strongly correlated with tower GPP than was MODIS GPP. In addition, at the annual scale, the BRDF-corrected VIs generally had similar performance in approximating GPP as To examine the relationships between VIs and tower GPP at the annual scale, we identified those sites having at least 6 years of flux data from the 121 FLUXNET sites. A total of 48 sites had at least 6 years of data. For each site, we analyzed the relationships between VIs and tower GPP at the annual scale. For each VI, the R 2 values were averaged over all the 48 sites. On average, NDVI, EVI, EVI2, NIR V , NDVI BRDF , EVI BRDF , EVI2 BRDF , NIR V, BRDF , and MODIS GPP explained 30%, 29%, 31%, 28%, 25%, 27%, 25%, 30%, and 23% of the variance in tower GPP. This indicates that VIs generally had weaker correlation with tower GPP at the annual scale than at the monthly scale. At the annual scale, the correlation of VIs with tower GPP was comparable to or slightly stronger than MODIS GPP (23%). Compared with the traditional VIs, the BRDF-corrected VIs had no advantage at the annual scale. The relationships between VIs and GPP varied with biome at the annual scale ( Figure 3). On average, the relationship was relatively strong for EBF (average R 2 = 0.42-0.72), COSH (average R 2 = 0.28-0.81), and WSA (average R 2 = 0.48-0.68), moderate for ENF (average R 2 = 0.16-0.34), GRA (average R 2 = 0.29-0.41), WET (average R 2 = 0.12-0.34), and CRO (average R 2 = 0.09-0.54), and relatively weak for DBF (average R 2 = 0.09-0.50) and MF (average R 2 = 0.02-0.32). For each biome, the R 2 value varied with VI, particularly for ENF, DBF, and COSH.
We then examined the relationships between VIs and tower GPP across all sites for each biome ( Figure 4; Table 1). For each site, annual tower GPP and annual mean VIs were separately averaged across all years. Overall, at the site level, tower GPP exhibited moderate to strong relationships with all or most VIs at the annual scale for all biomes except savannas. For savannas, tower GPP was weakly correlated with VIs, particularly NDVI, EVI, and NDVI BRDF . The R 2 for the relationship between each VI and GPP varied with biome. There was no universally robust vegetation index that had strong correlation with tower GPP across all ten biomes. Notably, for all biomes except mixed forests, savannas, and croplands, the performance of NDVI, EVI2, EVI BRDF , EVI2 BRDF , and NIR V in approximating GPP was comparable to MODIS GPP; for mixed forests, savannas, and croplands, these VIs was more strongly correlated with tower GPP than was MODIS GPP. In addition, at the annual scale, the BRDF-corrected VIs generally had similar performance in approximating GPP as VIs that were not adjusted with BRDF. For example, the traditional NDVI generally performed as well as NDVI BRDF in estimating GPP.
For each biome, the annual VIs and tower GPP averaged across all years were further averaged across all the sites within the biome, and we then examined the relationships between VIs and tower GPP at the biome level ( Figure 5). All VIs were strongly correlated with tower GPP across the biomes with R 2 ranging from 0.73 to 0.86. The biome averaged EVI had slightly weaker relationship with tower GPP (R 2 = 0.73, p < 0.0001) than did other VIs likely due to the low value of EVI in shrublands. Across the ten biomes, EVI BRDF (R 2 = 0.86, p < 0.0001) was more strongly correlated with tower GPP than was EVI (R 2 = 0.73, p < 0.0001), while NDVI BRDF and EVI2 BRDF had comparable performance in estimating GPP to NDVI and EVI2, respectively. All the VIs except EVI were more strongly correlated with averaged tower GPP across the ten biomes than MODIS GPP. VIs that were not adjusted with BRDF. For example, the traditional NDVI generally performed as well as NDVIBRDF in estimating GPP. For each biome, the annual VIs and tower GPP averaged across all years were further averaged across all the sites within the biome, and we then examined the relationships between VIs and tower GPP at the biome level ( Figure 5). All VIs were strongly correlated with tower GPP across the biomes with R 2 ranging from 0.73 to 0.86. The biome averaged EVI had slightly weaker relationship with tower GPP (R 2 = 0.73, p < 0.0001) than did other VIs likely due to the low value of EVI in shrublands. Across the ten biomes, EVIBRDF (R 2 = 0.86, p < 0.0001) was more strongly correlated with tower GPP than was EVI (R 2 = 0.73, p < 0.0001), while NDVIBRDF and EVI2BRDF had comparable performance in estimating GPP to NDVI and EVI2, respectively. All the VIs except EVI were more strongly correlated with averaged tower GPP across the ten biomes than MODIS GPP.          Figure 4. The asterisks -*, ** and *** stand for the significance level p < 0.05, p < 0.01 and p < 0.001, respectively.

Relationships between Vegetation Indices and Environmental Stresses
To understand how VIs respond to environmental stresses, we examined the relationships between the VIs and two environmental scalars -f Topt and f VPD (representing temperature stress and water stress, respectively) at the monthly scale for each site separately. On average, NDVI, EVI, EVI2, NIR V , NDVI BRDF , EVI BRDF , EVI2 BRDF , and NIR V, BRDF explained 58%, 45%, 61%, 55%, 59%, 53%, 64%, and 62% of the variance in temperature stress across the 121 sites. This indicates that these VIs responded to temperature stress fairly well, and three VIs (EVI2, EVI2 BRDF and NIR V , BRDF ) could capture temperature stress almost as well as the MODIS GPP product (average R 2 = 0.67, p < 0.001).   Figure 4. The asterisks -*, ** and *** stand for the significance level P < 0.05, P < 0.01 and P < 0.001, respectively.

Relationships between Vegetation Indices and Environmental Stresses
To understand how VIs respond to environmental stresses, we examined the relationships between the VIs and two environmental scalars -fTopt and fVPD (representing temperature stress and water stress, respectively) at the monthly scale for each site separately. On average, NDVI, EVI, EVI2, NIRV, NDVIBRDF, EVIBRDF, EVI2BRDF, and NIRV, BRDF explained 58%, 45%, 61%, 55%, 59%, 53%, 64%, and 62% of the variance in temperature stress across the 121 sites. This indicates that these VIs responded to temperature stress fairly well, and three VIs (EVI2, EVI2BRDF and NIRV, BRDF) could capture temperature stress almost as well as the MODIS GPP product (average R 2 = 0.67, p < 0.001). Figure 6 summarizes the R 2 values for the relationships between the VIs and the fTopt scalar across all sites in each biome. In general, the correlation of the VIs to temperature stress was high for ENF (R 2 = 0.   NDVI, EVI, EVI2, NIR V , NDVI BRDF , EVI BRDF , EVI2 BRDF , and NIR V, BRDF explained 29%, 26%, 32%, 32%, 29%, 31%, 34%, and 34% of the variance in water stress at the monthly scale across 121 sites, respectively. This indicates that these satellite-derived VIs were more strongly correlated with temperature stress than with water stress. Figure 7 summarizes the R 2 values for the relationships between the VIs and the water stress scalar across sites for each biome. Compared with the temperature stress scalar (R 2 = 0.45-0.64, p < 0.0001), the water stress scalar had weaker correlation with VIs (R 2 = 0.26-0.34, p < 0.0001). The correlation of the VIs to water stress was relatively higher for COSH (R 2 = 0. 13 NDVI, EVI, EVI2, NIRV, NDVIBRDF, EVIBRDF, EVI2BRDF, and NIRV, BRDF explained 29%, 26%, 32%, 32%, 29%, 31%, 34%, and 34% of the variance in water stress at the monthly scale across 121 sites, respectively. This indicates that these satellite-derived VIs were more strongly correlated with temperature stress than with water stress. Figure 7 summarizes the R 2 values for the relationships between the VIs and the water stress scalar across sites for each biome. Compared with the temperature stress scalar (R 2 = 0.45-0.64, p < 0.0001), the water stress scalar had weaker correlation with VIs (R 2 = 0.26-0.34, p < 0.0001). The correlation of the VIs to water stress was relatively higher for COSH (R 2 = 0.13-0.41, p < 0.0001), WSA (R 2 = 0.33-0.6, p < 0.0001), SAV (R 2 = 0.37-0.52, p < 0.0001), and CRO (R 2 = 0.40-0.49, p < 0.0001) and relatively lower for other biomes: ENF (R 2 = 0.23-0.38, p < 0.0001), MF (R 2 = 0.28-0.44, p < 0.0001) and WET (R 2 = 0.25-0.39, p < 0.0001), EBF (R 2 = 0.18-0.33, p < 0.0001), DBF (R 2 = 0.24-0.29, p < 0.0001) and GRA (R 2 = 0.20-0.23, p < 0.0001). For most biomes, VIs generally captured water stress as well as MODIS GPP. During this drought, fVPD decreased with lack of rain while fTopt did not decrease with high temperature; meanwhile, VIs immediately decreased when the drought and high temperature occurred (Figure 8). At US-Ne1, a cropland site in US, there was much less precipitation from June to July in 2003 than in 2004 but relative more rain from August to September in 2003 than in 2004. US-Ne1 and CH-Oe1 exhibited similar responses in fVPD, fTopt, and VIs to drought ( Figure S2). At DK-Sor, a DBF site in Denmark, air temperature during the growing season was lower in 2007 than in 2014, and at a result, fTopt and VIs during the growing season were also lower in 2007 than in 2014 ( Figure  9). Similarly, at both BE-Vie (a MF site in Germany) ( Figure S3) and DE-Hai (a DBF site in Germany), fTopt and VIs during the growing season decreased due to the relatively low temperature ( Figure S4).

Relationships between GPP and Vegetation Indices in Combination with Environmental Stresses
To further address how each VI in combination with environmental factors could explain the variations in GPP, we also examined the relationships of GPP with VI × fTopt, VI × fTopt × fVPD and VI × fTopt × fVPD × PAR at the monthly timescale for each site separately. On average, the VIs with temperature stress (VIs × fTopt) explained higher variance in GPP (R 2 = 0.64-0.72, P < 0.0001) than VIs with temperature and water stresses (R 2 = 0.60-0.70, P < 0.0001) or VIs alone (R 2 = 0.48-0.70, p < 0.0001); VIs × fTopt × fVPD × PAR explained the highest variance in GPP (R 2 = 0.72-0.76, P < 0.0001) at the monthly timescale for 121 sites. Figure 10 summarizes the R 2 values for the relationships between the GPP and the VIs combined with fTopt, fVPD, and PAR in each biome. In general, VIs × fTopt × fVPD × PAR had the strongest relationship with GPP in ten biomes except EBF and SAV where VIs × fTopt × fVPD had the best performance. This indicates that GPP was mainly driven by VIs and also influenced by the environmental stresses and PAR.
The seasonal cycles of the flux tower GPP, two environmental scalars (fTopt and fVPD), and PAR at the NL-Loo site are shown in Figure 11. fTopt exhibited clear seasonal cycles while fVPD was equal to 1.0 most of the time, indicating that temperature limited GPP and VPD was usually not a limiting factor at this site. We chose NIRV, BRDF to examine how VIs combined with environmental scalars and PAR could explain the variance in GPP because NIRV, BRDF had the best performance for predicting GPP (Figure 11). NIRV, BRDF, fTopt, fVPD, and PAR alone explained 73%, 79%, 23%, and 85% of the variance in GPP, respectively. NIRV, BRDF × fTopt (87%) explained the same portion of the variance in GPP as NIRV, BRDF × fTopt × fVPD (86%). Meanwhile, NIRV, BRDF × fTopt × PAR explained the same variance in GPP (R 2 = 0.93, p < 0.0001) as NIRV, BRDF × fTopt × fVPD × PAR, indicating that VPD was not a dominant controlling factor of photosynthesis and had negligible contribution at this forest ecosystem ( Figure  11).

Relationships between GPP and Vegetation Indices in Combination with Environmental Stresses
To further address how each VI in combination with environmental factors could explain the variations in GPP, we also examined the relationships of GPP with VI × f Topt , VI × f Topt × f VPD and VI × f Topt × f VPD × PAR at the monthly timescale for each site separately. On average, the VIs with temperature stress (VIs × f Topt ) explained higher variance in GPP (R 2 = 0.64-0.72, p < 0.0001) than VIs with temperature and water stresses (R 2 = 0.60-0.70, p < 0.0001) or VIs alone (R 2 = 0.48-0.70, p < 0.0001); VIs × f Topt × f VPD × PAR explained the highest variance in GPP (R 2 = 0.72-0.76, p < 0.0001) at the monthly timescale for 121 sites. Figure 10 summarizes the R 2 values for the relationships between the GPP and the VIs combined with f Topt , f VPD , and PAR in each biome. In general, VIs × f Topt × f VPD × PAR had the strongest relationship with GPP in ten biomes except EBF and SAV where VIs × f Topt × f VPD had the best performance. This indicates that GPP was mainly driven by VIs and also influenced by the environmental stresses and PAR.  Similarly, the seasonal cycles of the flux tower GPP, two environmental scalars (fTopt and fVPD), and PAR at the AU-Das site are shown in Figure 12. Unlike NL-Loo, AU-Das exhibited constant values (1.0) for fTopt and clear seasonal cycles for fVPD. At the savanna site, GPP was not affected by fTopt and was weakly influenced by PAR (R 2 = 0.03, p < 0.5); NIRV, BRDF × fVPD had a slightly stronger relationship with GPP (R 2 = 0.59, p < 0.0001) than NIRV, BRDF (R 2 = 0.54, p < 0.0001) or fVPD (R 2 = 0.45, p The seasonal cycles of the flux tower GPP, two environmental scalars (f Topt and f VPD ), and PAR at the NL-Loo site are shown in Figure 11. f Topt exhibited clear seasonal cycles while f VPD was equal to 1.0 most of the time, indicating that temperature limited GPP and VPD was usually not a limiting factor at this site. We chose NIR V, BRDF to examine how VIs combined with environmental scalars and PAR could explain the variance in GPP because NIR V, BRDF had the best performance for predicting GPP (Figure 11). NIR V, BRDF , f Topt , f VPD , and PAR alone explained 73%, 79%, 23%, and 85% of the variance in GPP, respectively. NIR V, BRDF × f Topt (87%) explained the same portion of the variance in GPP as NIR V, BRDF × f Topt × f VPD (86%). Meanwhile, NIR V, BRDF × f Topt × PAR explained the same variance in GPP (R 2 = 0.93, p < 0.0001) as NIR V, BRDF × f Topt × f VPD × PAR, indicating that VPD was not a dominant controlling factor of photosynthesis and had negligible contribution at this forest ecosystem ( Figure 11). < 0.0001) alone and NIRV, BRDF × fVPD × PAR only explained 3% additional variance in GPP (R 2 = 0.62 p < 0.0001) ( Figure 12). NIRV, BRDF × PAR (R 2 = 0.40, p < 0.0001) had much lower R 2 value than NIRV, BRDF × fVPD × PAR (R 2 = 0.62 p < 0.0001) (Figure 12). This indicates that temperature was not a limiting factor of GPP but VPD is an important controlling factor at the AU-Das site.  Similarly, the seasonal cycles of the flux tower GPP, two environmental scalars (f Topt and f VPD ), and PAR at the AU-Das site are shown in Figure 12. Unlike NL-Loo, AU-Das exhibited constant values (1.0) for f Topt and clear seasonal cycles for f VPD . At the savanna site, GPP was not affected by f Topt and was weakly influenced by PAR (R 2 = 0.03, p < 0.5); NIR V, BRDF × f VPD had a slightly stronger relationship with GPP (R 2 = 0.59, p < 0.0001) than NIR V, BRDF (R 2 = 0.54, p < 0.0001) or f VPD (R 2 = 0.45, p < 0.0001) alone and NIR V, BRDF × f VPD × PAR only explained 3% additional variance in GPP (R 2 = 0.62 p < 0.0001) (Figure 12). NIR V, BRDF × PAR (R 2 = 0.40, p < 0.0001) had much lower R 2 value than NIR V, BRDF × f VPD × PAR (R 2 = 0.62 p < 0.0001) (Figure 12). This indicates that temperature was not a limiting factor of GPP but VPD is an important controlling factor at the AU-Das site.

Discussion
The concurrent availability of the MODIS record and the FLUXNET2015 database up to 15 years (2000-2014) for a large number of sites across the globe allowed us to examine the relationships between the MODIS-derived VIs and tower GPP at both monthly and annual scales and also to compare these relationships across sites and biomes globally. Our results showed that at the monthly scale there was a strong relationship between each VI and tower GPP for ten biomes except evergreen broadleaf forests. Previous studies also showed close relationships between VIs and GPP. For example, there was a moderate to strong relationship between the daily NDVI and daily GPP for cropland [10], deciduous forest [11], and evergreen forest [12]. Moderate to strong relationships between EVI and GPP at the daily scale were found for grassland [39], temperate evergreen broadleaf

Discussion
The concurrent availability of the MODIS record and the FLUXNET2015 database up to 15 years (2000-2014) for a large number of sites across the globe allowed us to examine the relationships between the MODIS-derived VIs and tower GPP at both monthly and annual scales and also to compare these relationships across sites and biomes globally. Our results showed that at the monthly scale there was a strong relationship between each VI and tower GPP for ten biomes except evergreen broadleaf forests. Previous studies also showed close relationships between VIs and GPP. For example, there was a moderate to strong relationship between the daily NDVI and daily GPP for cropland [10], deciduous forest [11], and evergreen forest [12]. Moderate to strong relationships between EVI and GPP at the daily scale were found for grassland [39], temperate evergreen broadleaf forest [40], and wetlands [21]. Previous studies also showed strong correlation between EVI2 and GPP for cropland [10] and evergreen forest [24]. A more recent study showed that the MODIS-derived VIs had strong correlation with tower GPP (R 2 = 0.55-0.65, p < 0.0001) at a number of flux sites across eight biomes [2]. The weak relationship for evergreen broadleaf forests found in our study may be caused by the following reasons. First, evergreen broadleaf forests, mainly distributed in the tropics, are more often covered with clouds. Satellite-based indicators are sensitive to cloud contamination or sun-sensor geometry, and the resulting changes in VIs can confound the real seasonality of forests [2]. Second, optical sensors still lack the capability to detect the full canopy activity of the tropical forests. Third, VIs reflect vegetation greenness and are not able to capture seasonal photosynthetic activity of evergreen trees because of the common decoupling of greenness and photosynthesis [41]. Finally, the EC technique can lead to large uncertainty in GPP estimates for these dense, multi-layered forests [42].
Monthly EVI performed slightly better in approximating GPP for seven biomes except ENF, COSH, and SAV than monthly NDVI. Previous studies also found that daily EVI was more useful for estimating GPP in both forest and crop ecosystems than NDVI [43,44]. EVI was more strongly related to GPP than was NDVI likely because EVI does not saturate as rapidly as NDVI when vegetation greenness increases in dense vegetation. Our results showed that EVI2 was a more robust proxy for GPP than was EVI, which supported and substantially expanded the results of the previous studies validating such EVI use for GPP estimation at the daily scale for forest, cropland, and grassland ecosystems [19,44,45]. At short time scales, when the atmospheric effects are insignificant, the blue band does not contribute much additional signal about the land surface than the red band at the canopy level [22]. The two-band EVI2 has no significant loss of information on the land surface, and EVI2 could be even slightly better than the traditional EVI with three bands. Vegetation indices took advantage of NIR band with high reflective properties and red band with absorbed properties of foliage to gain proxies of surface greenness, which were then related to canopy physiology or GPP [23]. Although EVI2 contains the same information as NDVI, the additional weight on the red reflectance in the denominator allows EVI2 to be less sensitive to soil darkening [23], which explained why the EVI2 had the best performance in estimating GPP at the monthly scale among the traditional vegetation indices.
Our results showed that NDVI BRDF , EVI BRDF , EVI2 BRDF , and NIR V had stronger relationships with tower GPP than did NDVI, EVI, EVI2, and NIR V, BRDF , respectively, at the month timescale, indicating that the correction of surface reflectance based on BRDF can improve the correlation between VIs and GPP. This finding is consistent with the results of a recent study [2] showing that NDVI BRDF , EVI BRDF , and had stronger relationships with tower GPP than did NDVI and EVI at the daily timescale. Changing illumination and viewing geometry (i.e., Solar Zenith Angle, View Zenith Angle, and Relative Azimuth Angle) can change satellite-measured surface reflectance [28] and the resulting VIs [46]. For example, changes in the sun-sensor geometry could alter EVI across the range of view zenith angles used for most land remote sensing [47]. Another previous study showed that NDVI was highly sensitive to solar zenith angle at intermediate LAI values (0.25-2.00) [48]. It should be noted that for MCD43A4 product, the solar zenith angle is at the local noon, meaning that the solar zenith angle is different at different sites. Moreover, BRDF effects are typically higher in the red wavelength than in the NIR wavelength [46]. Since NDVI captures the difference in reflectance between the NIR and red bands, the BRDF correction could likely alter the behavior of NDVI. Unlike NDVI, with the additional weight on the red band in the denominator, EVI and EVI2 with BRDF correction had varying behaviors because of low BRDF effects in the NIR wavelength and higher BRDF effects in the red wavelength [46]. As a result, the BRDF-corrected VIs performed better in estimating GPP than the traditional VIs without BRDF correction. The different performance of the EVI BRDF and EVI2 BRDF resulted from the different formula of the EVI and EVI2. By contrast, the BRDF-corrected VIs had no apparent advantage over the traditional VIs at the annual scale. This is likely because the BRDF effect depends on the sun-object-sensor geometry that changes with season, while the annually averaged sun-object-sensor geometry and BRDF effect do not exhibit interannual variations. Therefore, the correction of BRDF could improve the correlation between BRDF-corrected VIs and tower GPP at the monthly scale but not at the annual scale.
Our results showed that VIs exhibited moderate to strong relationships with tower GPP for EBF, COSH, and WSA and weaker relationships for other biomes at the annual scale. The R 2 values for the relationship between the VIs and GPP differed among sites. We examined how the R 2 values varied with vegetation cover and water stress. For each site, the Vegetation Continuous Fields (VCF) product [49] derived from MODIS was used to extract the percent vegetation cover (the sum of percent tree cover and percent non-tree cover), and VPD from tower data was used a measure of water stress. The strength of the relationship between VIs and GPP (R 2 value) exhibited weak relationships with percent vegetation cover (Table S2) and moderately strong relationships with VPD (Table S3). This indicates that the relationship between VIs and GPP at the annual scale was more dependent on water stress than on vegetation coverage. The performance of the VIs in capturing the interannual variability of tower GPP was generally similar to that of the MODIS GPP product. There is no vegetation index that had strong correlation with tower GPP across all biomes at the annual scale. The correlation between the VIs and tower GPP at the annual scale was generally not as strong as that at the monthly scale for several reasons. First, satellite-derived VIs cannot fully capture the influences of the interannual variability in environmental drivers and biotic factors on GPP [50]. For example, moisture in the root zone, an important ecological factor, can affect annual GPP in crops and seasonally dormant forests, which was not directly observed in the VIs [50]. Second, seasonal changes in the NIR reflectance do not necessarily indicate changes in leaf reflectance, leaf area or vegetation productivity [47]. Third, cloud cover curtails the retrievals of vegetation dynamics and the presence of ephemeral seasonal snow cover can limit the identification of the seasonal change of VIs [51], resulting in the deviation of the VIs at the annual scale. Finally, the interannual variability of GPP was relatively small for some ecosystems such as tropical forests and irrigated croplands [52], and the small ranges in annual VIs and GPP could lead to weaker VI-GPP relationships.
At the site and biome levels, all-around VIs with significant correlation (P < 0.05) to tower GPP only existed in ENF, GRA, WSA, and WET (Table 1) probably due to less seasonal changes of the vegetation canopies in these four ecosystems. NDVI, EVI2, EVI BRDF , EVI2 BRDF , and NIR V, BRDF had a relatively better performance in approximating GPP except for SAV and CRO. These results indicate that there is no vegetation index with a strong relationship (R 2 >0.7) with tower GPP at the annual scale across all the biomes (Figure 3). At the biome level, the performances of the VIs in predicting GPP were comparable to each other (R 2 = 0.84-0.86, p < 0.0001). EVI explained slightly lower variance in GPP (R 2 = 0.73, p < 0.0001) probably due to the low values of EVI in open and closed shrublands. Interestingly, VIs generally had similar performance to the MODIS GPP product in estimating GPP at both monthly and annual timescales. Besides fPAR, three other input variables: PAR, temperature stress, and water stress are explicitly incorporated into the MODIS GPP algorithm. This demonstrates that VIs are generally good proxies of vegetation productivity and can be used to estimate GPP across a wide variety of biomes. Meanwhile, similar to the MODIS GPP product, VIs only explained~20-30% of the variance in GPP at the annual scale. This indicates that caution should be taken for interpreting the results on the interannual variability and long-term trends in VIs. Capturing the interannual variability and long-term trends in GPP remains a challenge to not only satellite-derived VIs but also light use efficiency and process-based ecosystem models [53,54]. Both observational and modeling studies have reported that the increase of the diffuse radiation improved plant LUE [55][56][57]. The surface reflectance and the resulting VIs do not account for the fraction of diffuse radiation, and therefore the effects of diffuse radiation were not considered.
Our results also showed that these VIs responded to temperature stress fairly well. In particular, three VIs: EVI2, EVI2 BRDF , and NIR V, BRDF could capture temperature stress almost as well as the MODIS GPP algorithm that incorporates a temperature scalar. A late spring frost reduced vegetation productivity as approximated by NDVI in a beech dominated Mediterranean mountain forest [58]. Satellite-derived photosynthetic activity approximated by NDVI in the Swiss Alps revealed high-elevation growth enhancement and low-elevation growth suppression in response to the 2003 heat wave [59]. Although VIs were less sensitive to water stress than to temperature stress, VIs generally captured water stress as well as MODIS GPP for most biomes. The VIs respond to dry and wet spells because of their sensitivity to water stress and have been used to examine the responses of vegetation productivity to drought [60,61]. On the other hand, VIs and MODIS GPP explained only a small part of the variance in water stress, indicating that they could not fully capture the effects of water stress on plant productivity.
Our results further indicated that VIs in combination with environmental stresses (temperature stress and water stress) could better explain the variance in GPP. The two environmental stressesf Topt and f VPD limited GPP during low temperature and drought periods, respectively. For example, VPD is an important controlling factor of GPP at AU-Das (a savanna ecosystem in Australia), while temperature is a dominant controlling factor of the photosynthesis at the NL-Loo site (an ENF ecosystem in Netherlands). Drought can reduce stomatal conductance when hydraulic capacity cannot meet the transpirational demand in a drought season, particularly with increasing atmospheric VPD [62]. Low temperature can inhibit photophysical and photochemical processes involved in light absorption and energy transfer and transformation and can also decrease the rates of the enzymatic reactions including C, N, and S [63,64], leading to the reduction of GPP.

Conclusions
Our study made the first comprehensive global analysis of how the traditional vegetation indices (NDVI, EVI, EVI2), BRDF-corrected vegetation indices (NDVI BRDF , EVI BRDF , EVI2 BRDF ), and a newly proposed vegetation index -NIR V with and without BRDF correction (NIR V , NIR V, BRDF ) are related to tower GPP for a total of 121 sites encompassing ten biomes across the globe and at both monthly and annual scales. At the monthly scale, all the eight VIs were moderately or strongly correlated with tower GPP. Compared with the traditional VIs (NDVI, EVI, and EVI2), the BRDF-corrected VIs (NDVI BRDF , EVI BRDF , and EVI2 BRDF ) generally had stronger correlation with tower GPP, indicating that the VIs based on BRDF-correction had an advantage over the traditional VIs. EVI2 BRDF and NIR V, BRDF had the strongest correlation with tower GPP (averaged R 2 = 0.70, p < 0.0001), and their performance was comparable to the MODIS GPP product (averaged R 2 = 0.70, p < 0.0001) based on a light use model. The VIs explained lower variance in tower GPP at the annual scale than at the monthly scale. BRDF-corrected VIs had no advantage over the traditional counterparts at the annual scale. The capability of the VIs in capturing the interannual variability in GPP was also similar to that of the MODIS GPP product. At the site level, VIs generally exhibited moderate to strong correlation with tower GPP at the annual scale for all biomes except savannas. Similar to the MODIS GPP product, all VIs were strong predictors of GPP at the biome level. Our results also showed that monthly VIs were also influenced by the environmental stresses (temperature stress and water stress). As with the MODIS GPP product, monthly VIs were more sensitive to temperature stress than to water stress. Our findings have implications for improving our understanding of the relationships between satellite-derived VIs and tower GPP, the influences of the BRDF effect on the VI-GPP relationship, and the variations of the VI-GPP relationship among indices, biomes, and timescales.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/11/15/1823/s1. Figure S1: Flux tower GPP (GPP EC ) and MODIS-derived VIs against latitude: (a) GPP; (b) VIs. There are no significant relationships between VIs (GPP) and latitude. Figure S2: Daily VIs, temperature, precipitation, temperature stress (f Topt ) and water stress (f VPD ) in croplands site (US-Ne1) in a normal period from August to September in 2003 and a drought period from June to July in 2004. Figure S3: VIs, temperature, precipitation, temperature stress (f Topt ) and water stress (f VPD ) at mixed forest site (BE-Vie) in a normal year (2010) and a low temperature year (2011). Figure S4: Daily VIs, temperature, precipitation, temperature stress (f Topt ) and water stress (f VPD ) in deciduous broadleaf forest site (DE-Hai) in a normal year (2003) and a relative low temperature year (2004). Table S1: List of 121 eddy covariance flux tower sites used in this study. IGBP refers to the International Geosphere-Biosphere Program land cover classification. ENF: evergreen needleleaf forest; EBF: evergreen broadleaf forest; DBF: deciduous broadleaf forest; MF: mixed forest; OCSH: open and closed shrublands; SAV: savannas; WSA: woody savannas; GRA: grassland; CRO: cropland; WET: wetland. Table S2: The correlation of R 2 of the relationships between VIs and tower-based GPP with vegetation coverage at the annual scale across all sites. The asterisks -*, ** and *** stand for the significance level p < 0.05, p < 0.01 and p < 0.001, respectively. Table S3: The correlation of R 2 of of the relationships between VIs and tower-based GPP with VPD at the annual scale across all sites. The asterisks -*, ** and *** stand for the significance level p < 0.05, p < 0.01 and p < 0.001, respectively.