Global Vegetation Photosynthetic Phenology Products Based on MODIS Vegetation Greenness and Temperature: Modeling and Evaluation

: Land surface phenology (LSP) products that are derived from different data sources have different deﬁnitions and biophysical meanings. Discrepancies among these products and their linkages with carbon ﬂuxes across plant functional types and climatic regions remain somewhat unclear. In this study, to differentiate LSP related to gross primary production (GPP) from LSP related to remote sensing data, we deﬁned the former as vegetation photosynthetic phenology (VPP), including the starting and ending days of GPP (SOG and EOG, respectively). Speciﬁcally, we estimated VPP based on a combination of observed VPP from 145 ﬂux-measured GPP sites together with the vegetation index and temperature data from MODIS products using multiple linear regression models. We then compared VPP estimates with MODIS LSP on a global scale. Our results show that the VPP provided better estimates of SOG and EOG than MODIS LSP, with a root mean square error (RMSE) for SOG of 12.7 days and a RMSE for EOG of 10.5 days. The RMSE was approximately three weeks for both SOG and EOG estimates of the non-forest type. Discrepancies between VPP and LSP estimates varied across plant functional types (PFTs) and climatic regions. A high correlation was observed between VPP and LSP estimates for deciduous forest. For most PFTs, using VPP estimates rather than LSP improved the estimation of GPP. This study presents a useful method for modeling global VPP, investigates in detail the discrepancies between VPP and LSP, and provides a more effective global vegetation phenology product for carbon cycle modeling than the existing ones. long-term global VPP products; (2) to compare the discrepancies between VPPest and MODIS-LSP estimates across different PFTs and climatic regions; and (3) to investigate whether VPP is superior to LSP in estimating global vegetation photosynthetic carbon uptake.


Introduction
Land surface phenology (LSP) is defined as the seasonal pattern of variation in vegetated land surfaces [1,2]. Because photosynthetic carbon uptake is closely associated with the length of the growing season, LSP has been widely used to discover the effects of climate change on vegetation biophysical processes, especially related to carbon exchange between the atmosphere and terrestrial ecosystems [3,4]. Various data sources have been applied to extract LSP, including remote sensing, ground camera imagery, and FLUXNET measurements [4][5][6][7][8]. The phenological metrics of vegetation obtained from these different data sources have various definitions and biophysical meanings [9]. However, spatial-and species-specific discrepancies as well as the effectiveness in accurately describing variations in vegetation carbon fluxes among these vegetation phenological metrics remain unclear.
Remote sensing data have been widely used to extract LSP data and analyze their regional-to global-scale changes [3,6]. Time series of vegetation indices include three key regression models for estimating the global VPP of different PFTs and map long-term global VPP products; (2) to compare the discrepancies between VPPest and MODIS-LSP estimates across different PFTs and climatic regions; and (3) to investigate whether VPP is superior to LSP in estimating global vegetation photosynthetic carbon uptake.

Site Description
In this study, we used the FLUXNET2015 dataset and selected the site-year of daily GPP data with over 80% of the quality flag > 0.8. The quality flag indicates the percentage of measured and good-quality gap-filled data [25]. The FLUXNET2015 dataset has been quality-controlled and gap-filled according to strict quality standards [25][26][27]. The vegetation type of the site was classified into sixteen PFTs based on the International Geosphere-Biosphere Program (IGBP) land cover classification scheme [28]. In total, 145 flux sites covering 785 site-years of daily GPP data (GPP_DT_VUT_REF) from 2000 to 2014 were downloaded from the FLUXNET2015 dataset (https://fluxnet.fluxdata.org/) ( Table S1.

Site-Level VPP Observations from FLUXNET Measurements
Based on the method proposed in [3], site-level VPP observations were extracted from daily GPP estimates derived from FLUXNET measurements. First, long-term daily GPP estimates of all years from the given site were smoothed using the Savitzky-Golay filter based on the embedded function in the MATLAB R2013a environment. The Savitzky-Golay filter has been successfully used for reconstructing high-quality normalized difference vegetation indices (NDVIs) and leaf area index time series datasets [29]. The general equation of the filter can be found in [29]. It has been used in numerous

Site-Level VPP Observations from FLUXNET Measurements
Based on the method proposed in [3], site-level VPP observations were extracted from daily GPP estimates derived from FLUXNET measurements. First, long-term daily GPP estimates of all years from the given site were smoothed using the Savitzky-Golay filter based on the embedded function in the MATLAB R2013a environment. The Savitzky-Golay filter has been successfully used for reconstructing high-quality normalized difference vegetation indices (NDVIs) and leaf area index time series datasets [29]. The general equation of the filter can be found in [29]. It has been used in numerous studies to extract vegetation phenology data [16,17,30]. Two parameters for the filter should be determined, including the degree of the polynomial and smoothing window. A suitably large smoothing window can more effectively smoothen the sharp peaks (outliers) caused by steep changes in environmental factors in the time series of daily GPP. Results from previous studies indicate an optimal smoothing window width of 9 (approximately 91 days) for reconstructing a high-quality 10-day NDVI time-series data set based on the Savitzky-Golay filter for 14 kinds of vegetation Remote Sens. 2021, 13, 5080 4 of 24 types [29]. A data analysis also showed that differences in SOG and EOG observations using smoothing windows of 91 and 15 days were 3 and 5 days for DBF sites and 2 and 1 days for ENF sites, respectively. Therefore, based on data analysis and results from previous studies, in our study, the degree of the polynomial was 2 and the width of the smoothing window was 91 days [29,31].
Second, we calculated the amplitude of each year for the given site, equaling the maximum value of the smoothed daily GPP minus the minimum value. Finally, a relative threshold method was repeatedly used to extract the SOG and EOG [10]. Dates extracted using different thresholds represent different phases of the seasonal cycles of GPP [10], and a commonly used threshold equates to 10% of the growing season amplitude to represent the start and end of photosynthesis phenology from GPP [17,32]. Previous studies have shown that LSPs derived from different methods have large discrepancies [33]. In our study, consistent with the threshold used to extract the latest MODIS-LSP product by [34], we set the threshold for extracting SOG and EOG at 15% of the long-term average amplitude for the given site. Thus, the SOG and EOG values for each year for each site were extracted based on the first and last dates, respectively, when the smoothed daily GPP reached a threshold of 15% ( Figure S1). The SOG using a threshold of 15% occurred 5 and 6 days later than when using a threshold of 10%, while the EOG was 7 and 9 days earlier than when using a threshold of 10% for DBF and ENF sites, respectively. The definitions of the seasons for the Southern hemisphere were reversed to be consistent with the seasons for the Northern hemisphere. For example, the spring season is defined as March-April-May in the northern hemisphere and September-October-November in the Southern hemisphere. The seasons for the EVI and LST data sets were also reversed for the Southern hemisphere.

Global VPP Estimates from MODIS EVI and Land Surface Temperature
A combined function of EVI and LST can well explain variation in GPP because of close correlations between EVI and light-use efficiency and between LST and environmental variables, such as radiation and vapor pressure deficit [35,36]. This provides a sound theoretical basis for estimating VPP using a regression model driven by EVI and LST. Multiple linear regression models for different PFTs were developed based on the relationships between site-level VPP observations and MODIS EVI and LST data [24]. The EVI data from the MODIS/Terra Vegetation Indices 16-Day L3 Global 250-m SIN Grid (MOD13Q1) [37] and the LST data from the MODIS/Terra LST/Emissivity 8-day L3 Global 1-km SIN Grid V006 (MOD11A2) [38] were used during the process of model construction. For each site, the EVI and LST data were extracted from a single MODIS pixel covering the FLUXNET site. Low quality values for both EVI and LST data were removed based on quality assurance (QA) layers, such as pixel reliability values from the MOD13Q1 dataset for the EVI data and mandatory QA flags from the MOD11A2 dataset for the LST data. The EVI data were masked when their corresponding pixel reliability was equal to 2 or 3, and the LST data were removed when their corresponding mandatory QA flags were equal to "10". If the percentage of poor values was > 30%, or if a long period of poor values was observed over two sequential time steps (32 days for EVI data and 16 days for LST data) during DOY 65 to 334 (i.e., the period of active plant growth), the site-year of the data was not used. Following this process, both 16-day EVI and 8-day LST values were linearly interpolated into daily values. Then, the same was done with the daily GPP estimates, and both site-level daily EVI and LST values were smoothed using the Savitzky-Golay filter.
A previous study has reported on the strong explanatory ability of the mean and standard deviation of seasonal EVI and LST for SOS and EOS modeling [24]. In addition, we hypothesize that inter-annual variations in SOG and EOG are strongly related to EVI and LST values at five key time points during the year that shape the curves of daily EVI and LST. Therefore, twenty-eight original independent variables were obtained from smoothed daily EVI and LST data for modeling site-level VPP estimates (Table S2). These include the mean, standard deviation, and coefficient of variation of seasonal EVI and LST [24,31]. The EVI and LST values at five key time points, including the start of each season (spring, DOY 60; summer, DOY 152; autumn, DOY 243; winter, DOY 334 for the  Northern hemisphere and spring, DOY 243; summer, DOY 334; autumn, DOY 60; winter,  DOY 152 for the Southern hemisphere) and the maximum value during summer, were used in this study, as they shape the curves of daily EVI and LST [31]. A detailed description of these independent variables is shown in Table S2. We used a stepwise selection method to select the independent variables from the 28 original independent variables using significance levels of 0.05 and 0.1 (default values in SPSS software) as thresholds for adding and removing independent variables, respectively. Thus, we developed PFTspecific multiple linear regression models for estimating the SOG (Appendix A) and EOG (Appendix B) of each PFT.
The PFT-specific multiple linear regression models were then applied to estimate global VPP. The 16-day EVI from the MOD13C1 v006 product and the 8-day LST from the MOD11C2 v006 product at a 0.05 • spatial resolution were used for global-scale VPP estimation. The independent variables were extracted from the smoothed EVI and LST time series. The global PFT map is from the MCD12C1 v006 product based on the IGBP classification schemes. Finally, the independent variables were added to the multiple linear regression models to produce a distribution map of global SOG and EOG estimates for 2001-2017.
Notably, the majority of EVI pixels in high latitudes (>60 • N) lacked quality due to snow cover [39], resulting in large uncertainties in SOG and EOG estimates. A growing number of studies show that vegetation phenology in high-latitude regions is closely related to ice/snow cover [2,[40][41][42][43][44][45][46]. We assume that photosynthetic carbon uptake is zero if the vegetation is covered by ice/snow and the land surface temperature is below 278 K [45]. According to observations by [44], the SOS date occurred later than the last (snowmelt) date of lingering snowpack (LDS), and the EOS date occurred before the first (snowfall) date of lingering snowpack (FDS). Furthermore, a significant relationship was observed between snow timing and vegetation phenology metrics in high-latitude regions [44]. Based on the above analysis, we used the following strategy to correct SOG and EOG estimates for pixels covered by ice/snow in high-latitude regions. When the SOG estimate from the multiple linear regression model occurred earlier than the LDS, or the EOG estimate occurred later than the FDS together with a land surface temperature below 278 K, these estimates were defined as unreasonable. These unreasonable SOG and EOG estimates were then respectively replaced with the corresponding estimates from the relationships between SOG observations and LDS, and between EOS observations and FDS ( Figure  2), derived from the FLUXNET dataset. The LDS and FDS were extracted from the pixel reliability layer of the MOD13C1 v006 product. The LDS is equal to the last date when the pixel reliability is equal to 2 (indicating snow/ice coverage) during a long and consecutive ice/snow cover period between DOY 1 and 182, and the BDS is equal to the first date when the pixel reliability is equal to 2 during a long and consecutive ice/snow cover period between DOY 183 and 365 [44]. The percent of pixels for SOG and EOG estimates based on the dates of lingering snowpack was small, only accounting for 5.0% and 7.4% of the total pixels of the terrestrial area, respectively.
Additionally, the QA values for VPP estimates were constructed according to the following criteria. When the percentage of poor EVI or LST data of a pixel was less than 20% during DOY 65 to DOY 334, the reliability of the VPP estimate of the pixel was defined as high quality, and the corresponding QA value was equal to zero. When the percentage of poor EVI or LST data of a pixel was between 20% and 40% during DOY 65 to DOY 334, the reliability of the VPP estimate of the pixel was defined as medium quality, and the corresponding QA value was equal to one. When the percentage of poor EVI or LST data of a pixel was over 40% during DOY 65 to DOY 334, the reliability of the VPP estimate of the pixel was defined as low quality, and the corresponding QA value was equal to two. When the VPP values were estimated based on the LDS and FDS, the reliability of the VPP estimate of the pixel was defined as low quality, and the corresponding QA values were equal to three ( Figure S2). Additionally, the QA values for VPP estimates were constructed according to the following criteria. When the percentage of poor EVI or LST data of a pixel was less than 20% during DOY 65 to DOY 334, the reliability of the VPP estimate of the pixel was defined as high quality, and the corresponding QA value was equal to zero. When the percentage of poor EVI or LST data of a pixel was between 20% and 40% during DOY 65 to DOY 334, the reliability of the VPP estimate of the pixel was defined as medium quality, and the corresponding QA value was equal to one. When the percentage of poor EVI or LST data of a pixel was over 40% during DOY 65 to DOY 334, the reliability of the VPP estimate of the pixel was defined as low quality, and the corresponding QA value was equal to two. When the VPP values were estimated based on the LDS and FDS, the reliability of the VPP estimate of the pixel was defined as low quality, and the corresponding QA values were equal to three ( Figure S2).

Evaluating Differences between VPP Estimates and MODIS-LSP Products
In order to evaluate differences between VPPest and LSP products, the most commonly used LSP products (MODIS-LSP) were compared with VPPest products. In addition, two global GPP data sets were used to analyze whether the VPPest products have closer relationships to GPP than the LSP products.

Comparison with MODIS-Derived LSP
The MODIS land cover dynamics product (MCD12Q2) is a commonly used global LSP product, which includes data from 2001 to present at a spatial resolution of 500 m [6,47]. The MODIS-LSP from MCD12Q2 Collection 6 data was derived from the two-band EVI (EVI2) time series calculated from the MODIS nadir bidirectional reflectance distribution function-adjusted surface reflectance product (MCD43A4). The algorithm for detecting the MODIS-LSP Collection 6 metrics differs substantially from the previous MODIS-LSP Collection 5 product. The EVI2 time series was first smoothed using a QA/quality check-weighted penalized cubic smoothing spline approach [34]. Then, the MODIS-SOS was extracted as the first date when the smoothed EVI2 time series crossed 15% of the smoothed EVI2 amplitude. Similarly, the MODIS-EOS was extracted as the last date when the smoothed EVI2 time series crossed 15% of the smoothed EVI2 amplitude. More information about the MODIS-LSP Collection 6 product can be found in [34].
Based on [47], seven MODIS tiles (h23v02, h12v03, h12v04, h18v04, h27v05, h20v07, and h20v08) including diverse ecoregions, climatic zones, and vegetation types were selected as the study area ( Figure S3). The corresponding area to each code is shown in

Evaluating Differences between VPP Estimates and MODIS-LSP Products
In order to evaluate differences between VPPest and LSP products, the most commonly used LSP products (MODIS-LSP) were compared with VPPest products. In addition, two global GPP data sets were used to analyze whether the VPPest products have closer relationships to GPP than the LSP products.

Comparison with MODIS-Derived LSP
The MODIS land cover dynamics product (MCD12Q2) is a commonly used global LSP product, which includes data from 2001 to present at a spatial resolution of 500 m [6,47]. The MODIS-LSP from MCD12Q2 Collection 6 data was derived from the two-band EVI (EVI2) time series calculated from the MODIS nadir bidirectional reflectance distribution functionadjusted surface reflectance product (MCD43A4). The algorithm for detecting the MODIS-LSP Collection 6 metrics differs substantially from the previous MODIS-LSP Collection 5 product. The EVI2 time series was first smoothed using a QA/quality check-weighted penalized cubic smoothing spline approach [34]. Then, the MODIS-SOS was extracted as the first date when the smoothed EVI2 time series crossed 15% of the smoothed EVI2 amplitude. Similarly, the MODIS-EOS was extracted as the last date when the smoothed EVI2 time series crossed 15% of the smoothed EVI2 amplitude. More information about the MODIS-LSP Collection 6 product can be found in [34].

Comparing the Relationship of VPP and LSP with GPP Products
Two independent and relatively reliable global GPP products at a spatial resolution of 0.05 • × 0.05 • from 2001 to 2016 were used to evaluate the relationship of VPP and LSP with GPP. The first annual GPP product (hereafter, GPP VPM ) was downloaded from https: //www.nature.com/articles/sdata2017165 accessed on 16 July 2021) The GPP VPM product was estimated based on the vegetation photosynthesis model using input datasets from MODIS and NCEP-reanalysis II [39]. Validated against GPP estimates from 113 FLUXNET sites, the overall accuracy of the 8-day GPP VPM dataset was relatively high, with a coefficient of determination R 2 = 0.74 and a RMSE = 2.08 g C m −2 day −1 . A slight underesti-Remote Sens. 2021, 13, 5080 7 of 24 mation in GPP VPM estimates for evergreen broadleaf forest and ENF was observed. The accuracy of GPP VPM estimates for CRO was improved by separating the treatments for C3/C4 photosynthesis pathways.
The second annual GPP product (hereafter, GLASS-GPP) for 2001-2016 was downloaded from http://www.glass.umd.edu/GPP/AVHRR/ accessed on 16 July 2021) The GLASS-GPP product was estimated based on the latest version of the Eddy Covariance-Light Use Efficiency model, which integrates the impacts of atmospheric CO 2 concentration, radiation components, and atmospheric water vapor pressure [48]. The model explains the spatial variations in the annual GPP with an R 2 of 0.83 and 0.68 for 42 calibration and 43 validation sites from FLUXNET, respectively.

Analytical Methods
The leave-one-out cross-validation approach was used to test the performance of the multiple linear regression models for estimating VPP. Both R 2 and RMSE were used to evaluate the accuracy of VPP estimates from the statistical models relative to VPP observations from FLUXNET measurements. The LSP metrics derived from MODIS datasets at a 500 m spatial resolution were aggregated to a 0.05 • spatial resolution to enable a comparison with VPP estimates from this study. All phenology metrics values at a finer scale (500 m) within the pixel at the coarser scale (0.05 • ) were averaged [2,49]. When the proportion of available values from pixels at a finer scale within a given pixel at a coarser scale was less than 50%, the pixel was not used. Due to different definitions and detection methods between VPP estimates from this study and LSP estimates from the aforementioned data sources, the mean difference (bias) indicating systematic variations, root-mean-square difference (RMSD), and the RMSD after removing systematic bias (RMSDb) were used to evaluate the agreement between VPP from this study and LSP from other data sources. The R 2 value was used to compare the linkages between VPP, LSP, and GPP.

Comparison of Site-Level VPP Estimates with VPP Observations from FLUXNET GPP
The SOG and EOG estimates based on the VPPest statistical models were compared with SOG and EOG observations from daily GPP time series. For forest sites, the SOG estimates had a strong relationship with SOG observations (R 2 = 0.80, p < 0.01), and the RMSE between them was 12.7 days ( Figure 3 and Table 1). Agreement between the EOG estimates and EOG observations was relatively lower (R 2 = 0.70, p < 0.01), but the RMSE between them (10.5 days) was comparable to that of SOG. The R 2 and RMSE values between estimates and observations varied across forest types. The DBF had the highest accuracy, with R 2 = 0.85 and a RMSE of less than a week for SOG, and with R 2 = 0.77 and a RMSE = 7.5 days for EOG, followed by MF and ENF with a RMSE of approximately two weeks for both SOG and EOG.   The R 2 values (0.78 for SOG and 0.64 for EOG) between VPP estimates and observations in non-forest sites were similar to those of forest sites, but the RMSE values (19.4 days for SOG and 22.9 days for EOG) were relatively higher than those of forest sites ( Figure 4 and Table 2). The coefficients of determination between VPP estimates and observations were higher than 0.60 for all non-forest types, but the requirements for RMSE were not satisfied. The RMSE values were between two weeks and three weeks for CRO, GRA, and WET, and over three weeks for shrub sites and SAVA. The RMSE values of EOG estimates for both GRA and WET were relatively low at close to two weeks. The EOG estimates of shrub sites had the lowest R 2 and the largest uncertainty with a RMSE = 39.0 days, followed by SAVA (27.5 days) and CRO (25.3 days). The SOG and EOG of non-forest types were more difficult to estimate than those of forest types. The R 2 values (0.78 for SOG and 0.64 for EOG) between VPP estimates and observations in non-forest sites were similar to those of forest sites, but the RMSE values (19.4 days for SOG and 22.9 days for EOG) were relatively higher than those of forest sites ( Figure 4 and Table 2). The coefficients of determination between VPP estimates and observations were higher than 0.60 for all non-forest types, but the requirements for RMSE were not satisfied. The RMSE values were between two weeks and three weeks for CRO, GRA, and WET, and over three weeks for shrub sites and SAVA. The RMSE values of EOG estimates for both GRA and WET were relatively low at close to two weeks. The EOG estimates of shrub sites had the lowest R 2 and the largest uncertainty with a RMSE = 39.0 days, followed by SAVA (27.5 days) and CRO (25.3 days). The SOG and EOG of non-forest types were more difficult to estimate than those of forest types.
x FOR PEER REVIEW 9 of 26

Spatial Distribution of Global VPP Estimates
We investigated the spatial distribution of the global averaged SOG and EOG estimates for the period 2001-2017 (Figure 5a,b, respectively) as well as changes in SOG and EOG along with latitude ( Figure 5c,d, respectively). Generally, the SOG decreases with a decrease in latitude, ranging from DOY 165 to DOY 80 between 85 • N and 17 • N, while it increases from DOY 80 to DOY 120 between 17 • N and 13 • N due to the arid climate of North Africa. Thereafter, it sharply decreases again from DOY 120 to DOY 65 between 13 • N and the equator. In the Southern hemisphere, the SOG increases from DOY 250 at the equator to DOY 300 at 18 • S, decreases from DOY 300 at 18 • S to DOY 250 at 38 • S, and then fluctuates between DOY 250 and DOY 300. The EOG increases with a decrease in latitude, ranging from DOY 235 at 85 • N to DOY 305 at the equator. The EOG increases from DOY 130 at the equator to DOY 160 at 15 • S, decreases to DOY 120 at 30 • S, and then remains relatively stable from 30 • S to 55 • S in the Southern hemisphere.
In the Northern hemisphere, late SOG occurs at high latitudes in the mid-western United States of America, the Alps in Europe, northeast and western China, central India, and North Africa, with SOG values ranging between DOY 130 and 180. In the Southern hemisphere, late SOG occurs in northern Australia, South Africa, and southern South America, with SOG values ranging between DOY 300 and 330. Early SOG is observed in the southeastern regions of the United States of America, across the Amazon, Western Europe, Central Africa, Southeast China, and Asia. Early EOG, with values ranging from DOY 240 to DOY 280, was found at high latitudes in the mid-western United States of America, Western Europe, North China, and western India. The spatial variation in EOG in the Southern hemisphere was smaller than that in the Northern hemisphere. We found that the spatial patterns of SOG and EOG were strongly correlated with ice/snow cover, altitude, climate, and vegetation types such as CRO.
Remote Sens. 2021, 13, x FOR PEER REVIEW 10 of 26 In the Northern hemisphere, late SOG occurs at high latitudes in the mid-western United States of America, the Alps in Europe, northeast and western China, central India, and North Africa, with SOG values ranging between DOY 130 and 180. In the Southern hemisphere, late SOG occurs in northern Australia, South Africa, and southern South America, with SOG values ranging between DOY 300 and 330. Early SOG is observed in the southeastern regions of the United States of America, across the Amazon, Western Europe, Central Africa, Southeast China, and Asia. Early EOG, with values ranging from DOY 240 to DOY 280, was found at high latitudes in the mid-western United States of America, Western Europe, North China, and western India. The spatial variation in EOG in the Southern hemisphere was smaller than that in the Northern hemisphere. We found that the spatial patterns of SOG and EOG were strongly correlated with ice/snow cover, altitude, climate, and vegetation types such as CRO.

Comparisons of VPP Estimates with MODIS-LSP
Both averaged VPPest and MODIS-LSP within seven tiles ( Figure S4) were compared across different PFTs for the period 2001-2017 (Tables 3 and 4). The SOG had a

Comparisons of VPP Estimates with MODIS-LSP
Both averaged VPPest and MODIS-LSP within seven tiles ( Figure S4) were compared across different PFTs for the period 2001-2017 (Tables 3 and 4). The SOG had a strong relationship with the MODIS-SOS, except for OSH and deciduous needleleaf forest (DNF), with R < 0.5. The relationships between EOG and MODIS-EOS were relatively weaker than those between SOG and MODIS-SOS. Among all PFTs, relationships between EOG and MODIS-EOS for GRA, SVAV, MF, DBF, DNF, and WET were greater than 0.5. However, we found that the absolute bias values were relatively large for GRA, ENF, and WET in spring phenology, and for cropland/natural vegetation (CNV), WET, DNF, and ENF in autumn phenology.     The RMSD value of spring phenology for deciduous forest was the lowest of all the PFTs, with a RMSD < 10 days. For GRA and CRO, RMSD values of spring phenology were relatively large at over 20 days. The remaining PFTs had spring phenology RMSD values that ranged from 10 to 20 days. In autumn phenology, RMSD values were generally higher than in spring phenology for the majority of PFTs. Similar to spring phenology, differences between EOG and MODIS-EOS for deciduous forest were considerably smaller than other PFTs with a RMSD < 12 days. The RMSD values of autumn phenology for CRO and CNV were considerably larger than other PFTs with a RMSD > 50 days. The RMSD values in autumn phenology were 15-20 days for ENF, DNF, MF, CSH, and GRA, and 20-40 days for SAVA, SWA, and urban and built-up regions (URB). The RMSDb values were considerably smaller than the RMSD values for DNF, WET, and CRO in autumn phenology, implying that these differences are primarily associated with systematic variations.
The magnitudes of bias and RMSD between VPP and MODIS-LSP also varied across PFTs and climatic regions. For ENF, average SOG and MODIS-SOS occurred between DOY 110 and DOY 118 in H12V03, DOY 104 and DOY 117 in H12V04, and DOY 60 and DOY 88 days in H18V04, respectively ( Figure 6). We found that the absolute magnitude of bias between SOG and MODIS-SOS increased for ENF, from boreal (a difference of 8 days in H12V03) to temperate in North America and the European Mediterranean region (a difference of 28 days in H18V04), whereas the magnitude of bias between EOG and MODIS-EOS increased for ENF with a decrease in latitude. Notably, spatial variations in bias between SOG and MODIS-SOS for MF were considerable. For MF, average SOG and MODIS-SOS were detected at DOY 142 and DOY 124 in H23V02, DOY 124 and DOY 120 in H12V04, and DOY 5 and DOY 72 in H20V08, respectively. A considerably large bias between SOG and MODIS-SOS occurred for MF in humid tropical regions (H20V08). The average SOG for DBF ranged from DOY 96 to DOY 117 and occurred 4-7 days later than the MODIS-SOS, which ranged from DOY 89 to DOY 113 in temperate and European Mediterranean regions (H12V04, H18V04, and H27V05). However, the average SOG in tropical humid regions (H20V08) occurred on DOY 88, 18 days later than the MODIS-SOS (DOY 70). The magnitude of bias between SOG and MODIS-SOS increased for DBF with a decrease in latitude, implying that the difference between SOG and The average SOG for DBF ranged from DOY 96 to DOY 117 and occurred 4-7 days later than the MODIS-SOS, which ranged from DOY 89 to DOY 113 in temperate and European Mediterranean regions (H12V04, H18V04, and H27V05). However, the average SOG in tropical humid regions (H20V08) occurred on DOY 88, 18 days later than the MODIS-SOS (DOY 70). The magnitude of bias between SOG and MODIS-SOS increased for DBF with a decrease in latitude, implying that the difference between SOG and MODIS-SOS in tropical regions was larger than that in temperate regions.
The spatial variations in bias between SOG and MODIS-SOS, and between EOG and MODIS-EOS, for CRO were also very clear. A large bias between SOG and MODIS-SOS was found for CRO in arid tropical regions (H20V07), while a large bias between EOG and MODIS-EOS was found for CRO in arid tropical and temperate Asian regions. The SOG (DOY 109) was clearly biased and occurred earlier than the MODIS-SOS (DOY 169), while the EOG (DOY 271) also occurred earlier than the MODIS-EOS (DOY 325) in arid tropical regions for CRO. A large discrepancy between SOG and MODIS-SOS was also found for GRA. The SOG (DOY 148) was earlier than the MODIS-SOS (DOY 174), and there was no bias between EOG (DOY 319) and the MODIS-EOS (DOY 319) in arid tropical regions for GRA.
For the majority of PFTs, the bias between VPP and MODIS-LSP was larger in the European Mediterranean region (H18V04 tile) than in temperate and boreal North America and Subarctic Eurasia. Furthermore, our results show large discrepancies between VPP and MODIS-LSP in the European Mediterranean region and in arid and humid tropical regions compared with temperate and boreal North America and Subarctic Eurasia.

Relationship between VPP, MODIS-LSP, and GPP
We further compared the relationships of LOG and LOS estimates to two individual global GPP products (GPP VPM from the VPM model and GLASS-GPP from the EC-LUE model) across all PFTs from 2001 to 2016, based on the data within the seven tiles ( Figure 7 and Figure S5). The results show that LOG estimates were more closely related to annual GPP than LOS for most PFTs. The R-values of ENF, MF, SVAV, and WVA were above 0.8 using LOG estimates, while only those of WVA and GRA were above 0.8 using LOS. Eight of thirteen PFTs had higher R values (ENF, MF, SAVA, WVA, CRO, URB, and CNV) using LOG estimates compared with those using LOS, while the R values for DNF, OSH, and GRA were lower using LOG estimates than those using LOS. The R-value of WET from LOG estimates was lower than that from LOS estimates for GPP VPM , but the R-value of WET from LOG estimates was higher than that from LOS estimates for GLASS-GPP. There was no significant difference in R-values between LOG and LOS estimates for DBF. The R-values of CRO and CNV using LOG estimates were noticeably higher than those using LOS. The relationship between LOS and annual GPP for CRO and CNV was negative and not significant. These results are consistent with those based on the FLUXNET data. The ability to explain annual GPP was improved when LOS was replaced by LOG for all PFTs. The spatial variations in bias between SOG and MODIS-SOS, and between EOG and MODIS-EOS, for CRO were also very clear. A large bias between SOG and MODIS-SOS was found for CRO in arid tropical regions (H20V07), while a large bias between EOG and MODIS-EOS was found for CRO in arid tropical and temperate Asian regions. The SOG (DOY 109) was clearly biased and occurred earlier than the MODIS-SOS (DOY 169), while the EOG (DOY 271) also occurred earlier than the MODIS-EOS (DOY 325) in arid tropical regions for CRO. A large discrepancy between SOG and MODIS-SOS was also found for GRA. The SOG (DOY 148) was earlier than the MODIS-SOS (DOY 174), and there was no bias between EOG (DOY 319) and the MODIS-EOS (DOY 319) in arid tropical regions for GRA.
For the majority of PFTs, the bias between VPP and MODIS-LSP was larger in the European Mediterranean region (H18V04 tile) than in temperate and boreal North America and Subarctic Eurasia. Furthermore, our results show large discrepancies between VPP and MODIS-LSP in the European Mediterranean region and in arid and humid tropical regions compared with temperate and boreal North America and Subarctic Eurasia.

Relationship between VPP, MODIS-LSP, and GPP
We further compared the relationships of LOG and LOS estimates to two individual global GPP products (GPPVPM from the VPM model and GLASS-GPP from the EC-LUE model) across all PFTs from 2001 to 2016, based on the data within the seven tiles ( Figures  7 and S5). The results show that LOG estimates were more closely related to annual GPP than LOS for most PFTs. The R-values of ENF, MF, SVAV, and WVA were above 0.8 using LOG estimates, while only those of WVA and GRA were above 0.8 using LOS. Eight of thirteen PFTs had higher R values (ENF, MF, SAVA, WVA, CRO, URB, and CNV) using LOG estimates compared with those using LOS, while the R values for DNF, OSH, and GRA were lower using LOG estimates than those using LOS. The R-value of WET from LOG estimates was lower than that from LOS estimates for GPPVPM, but the R-value of WET from LOG estimates was higher than that from LOS estimates for GLASS-GPP. There was no significant difference in R-values between LOG and LOS estimates for DBF. The R-values of CRO and CNV using LOG estimates were noticeably higher than those using LOS. The relationship between LOS and annual GPP for CRO and CNV was negative and not significant. These results are consistent with those based on the FLUXNET data. The ability to explain annual GPP was improved when LOS was replaced by LOG for all PFTs.

Accuracy of Site-Level VPP Estimates across PFTs
The estimation accuracy of both SOG and EOG varied across forest types. Consistent with results from previous studies [17] , our results indicate that the VPP for deciduous forest was detected more accurately than that for evergreen forest due to the relatively dramatic seasonal variation in canopy greenness. The accuracy of VPP for MF was medium between DBF and ENF. The results further indicate that the performance of phenology modeling based on vegetation indices largely depends on the variability in canopy greenness. Previous studies showed that the individual use of vegetation indices did not accurately predict LSP of ENF because of stable changes in seasonal canopy greenness [24].
The importance of temperature in improving phenology modeling for ENF has been investigated in a number of studies [7,24]. Temperature is the most important driving factor of photosynthetic activity for ENF and MF because these are primarily distributed in boreal and temperate regions ( Figure S6). Therefore, the accuracy of VPP estimates for MF and ENF was clearly improved by including temperature as a variable in VPP modeling.
Compared with forest types, our results indicate that VPP modeling for non-forest types is more challenging, probably due to the higher heterogeneity, sensitivity to climate change, disturbance, and the human management involved in non-forest types [2,50]. Highly heterogeneous SAVA and OSH include more than two types of plant species, resulting in complicated phenology phases [2]. For example, SAVA consists of mixed grasses and trees, while temporal differences in EOS between these two vegetation types reached three months [2,51]. Different plant species with the same eddy covariance footprint will vary in their contribution to carbon uptake, making it difficult to predict the phenological cycle of an ecosystem as a whole [16,52]. Furthermore, increasing the spatial complexity of ecosystems within a MODIS pixel area reduces the ability of vegetation indices to explain carbon flux and be used for further VPP modeling [16]. The spectral signatures of CRO and GRA are usually combined with soil and vegetation at the beginning and end of the growing season, owing to the low presence of vegetation, while the EVI is less sensitive to changes in soil background [16,53]. Thus, other vegetation

Accuracy of Site-Level VPP Estimates across PFTs
The estimation accuracy of both SOG and EOG varied across forest types. Consistent with results from previous studies [17], our results indicate that the VPP for deciduous forest was detected more accurately than that for evergreen forest due to the relatively dramatic seasonal variation in canopy greenness. The accuracy of VPP for MF was medium between DBF and ENF. The results further indicate that the performance of phenology modeling based on vegetation indices largely depends on the variability in canopy greenness. Previous studies showed that the individual use of vegetation indices did not accurately predict LSP of ENF because of stable changes in seasonal canopy greenness [24].
The importance of temperature in improving phenology modeling for ENF has been investigated in a number of studies [7,24]. Temperature is the most important driving factor of photosynthetic activity for ENF and MF because these are primarily distributed in boreal and temperate regions ( Figure S6). Therefore, the accuracy of VPP estimates for MF and ENF was clearly improved by including temperature as a variable in VPP modeling.
Compared with forest types, our results indicate that VPP modeling for non-forest types is more challenging, probably due to the higher heterogeneity, sensitivity to climate change, disturbance, and the human management involved in non-forest types [2,50]. Highly heterogeneous SAVA and OSH include more than two types of plant species, resulting in complicated phenology phases [2]. For example, SAVA consists of mixed grasses and trees, while temporal differences in EOS between these two vegetation types reached three months [2,51]. Different plant species with the same eddy covariance footprint will vary in their contribution to carbon uptake, making it difficult to predict the phenological cycle of an ecosystem as a whole [16,52]. Furthermore, increasing the spatial complexity of ecosystems within a MODIS pixel area reduces the ability of vegetation indices to explain carbon flux and be used for further VPP modeling [16]. The spectral signatures of CRO and GRA are usually combined with soil and vegetation at the beginning and end of the growing season, owing to the low presence of vegetation, while the EVI is less sensitive to changes in soil background [16,53]. Thus, other vegetation indices with a strong ability to minimize the effect of the soil background may be more suitable for predicting phenology than the EVI for CRO and GRA [16]. Additionally, the variability in phenology of CRO is significantly influenced by human management, and there are large differences in phenology between single and double cropping patterns [54][55][56]. Furthermore, GRA is more sensitive to climate and human disturbance [57,58], which causes multiple factor effects on variations in VPP for GRA. All of these reasons result in high complexity and uncertainty in the modeling of VPP for non-forest types.

Discrepancies between VPP Estimates from the Regression Models and MODIS-LSP
Comparing MODIS-LSP and VPP estimates using the same threshold (15% of the amplitude), we found considerable differences that varied across PFTs. The discrepancies (RMSD) ranged from 10 to 25 days between SOG and MODIS-SOS, and from 11 to 62 days between EOG and MODIS-EOS, depending on PFTs. This is confirmed by previous studies that indicated significant temporal lags between MODIS EVI-derived LSP and VPP [11,59], and between LSP from in situ measured green indices and VPP [11,60].
MODIS-LSP showed good agreement and synchronization with VPP with high correlations and low random differences (RMSDb) for DBF, which is consistent with previous studies [7,11,16,49]. Results from these studies indicate that the photosynthetic activity is closely related to greening and coloring of leaves for deciduous forests [7,11,16]. Leaf emergence is triggered by cumulative temperature at the start of season and leaf senescence is controlled by minimum temperature at the end of the season [61]. Hence, strong linkages and synchronizations among photosynthetic activity, leaf activity, and variations in temperature result in a close relationship between LSP and VPP for deciduous forests.
We found that the bias between MODIS-LSP and VPP varied among PFTs and climatic regions. Spring photosynthetic phenology occurred later than EVI-derived spring phenology, and autumn photosynthetic phenology occurred earlier than EVI-derived autumn phenology for DBF. Previous studies showed that the SOG occurs approximately one week later than the EVI-derived SOS, and EOG occurs approximately two weeks earlier than the EVI-derived EOS for DBF [11,49]. Physiological capacity, such as maturation of chlorophyll and Rubisco enzymes, does not end after leaf expansion [11,62]. The start of photosynthetic capacity in beech trees during spring occurred with an observed lag after leaf expansion [11,16,63,64]. Furthermore, for some deciduous trees leaves in the inner part of the canopy start changing their color earlier than at the top part of canopy [11,65], which results in disagreement between remote sensing observations and photosynthetic capacity. All of these physiological phenomena explain the temporal lags between VPP and MODIS-LSP for deciduous forests.
However, the large negative bias between SOG from the regression models and the MODIS-SOS for ENF indicates that the SOG occurred earlier than the MODIS-SOS, which is consistent with results from [66,67]. Vegetation-indices-derived SOS occurred 50 days late relative to the timing of 5% GPP [66], and 25 days later in high-latitude regions for ENF [67]. For evergreen forests, the positive bias between EOG and MODIS-EOS for ENF indicates that the EOG occurred later than the MODIS-EOS. Photosynthetic phenology in evergreen forests has no correlation with vegetation-indices-derived LSP because the VPP of evergreen forests is more related to climatic factors than canopy greenness and physiological activity is decoupled from changes in greenness [16]. Physiological activity is likely to start early in spring and continue to the end of the season if climatic conditions are still favorable [11]. The greenness of evergreen understory vegetation is barely detectable by optical remote sensing, and the carbon uptake contributed by evergreen understory vegetation will make the SOG occur earlier than the SOS and the EOG occur later than the EOS, which also results in lags and discrepancies between LSP and VPP [11]. The discrepancies between VPP and the MODIS-LSP for MF were more complex than those for ENF and DBF. The temporal lags between VPP and the MODIS-LSP for MF are similar to those of DBF at high latitudes and evergreen forest at low latitudes. When the fraction of snow cover is low during the snowmelt process, greenness can be detected by remote sensing data; however, photosynthesis is weak due to unsuitable temperatures. This may be a reason for the fact that the SOG of MF occurred later than the MODIS-SOS in subarctic regions. Spatial variations in bias and RMSD between VPP and MODIS-LSP were relatively high across the same PFT. The bias between MODIS-LSP and VPP also varied across climatic regions. The bias was larger in tropical regions than in boreal and temperate regions for forest types. Similar findings were also obtained by [66], who showed that the bias between SOG and the MODIS-SOS for ENF in the Mediterranean region with a humid and warm climate was greater than that in the boreal and temperate regions. Joint controls of climatic forcing and canopy greenness result in differences between VPP and MODIS-LSP across climatic regions. Discrepancies between VPP and MODIS-LSP for evergreen and deciduous forests in subarctic, boreal, and temperate regions were relatively low, due to temperature being a major climatic factor influencing both VPP and MODIS-LSP for deciduous forests in these regions [68]. The MODIS-SOS was detected at the time of recovery of pigments in old leaves and the emergence of new leaves for evergreen forest when the temperature was suitable for the start of photosynthesis [66]. Additionally, snow cover has an important influence on spring and autumn phenology [69], and the date of snowmelt may result in a shortening of the temporal lags between SOG and the MODIS-SOS in high-latitude regions [42,66], as leaves emerge directly after snowmelt [45]. In tropical regions, variations in photosynthetic carbon uptake primarily depend on climatic forcing rather than canopy greenness [16,70], and this variation is not reflected by remote sensing data [16,71]. This causes the random difference between VPP and the MODIS-LSP to be relatively high for evergreen forests in tropical regions.
Spatial variations in discrepancies between VPP and MODIS-LSP for non-forest types are relatively smaller but more complex than those for forest types, and primarily occur in CRO and GRA in arid tropical and temperate Asian regions due to the extreme climatic conditions [72] and human land use management [73]. The bias between VPP and LSP in arid tropical regions was larger than those in temperate North America and the European Mediterranean region for CRO and GRA. Water and heat stress result in a decoupling of vegetation greenness and photosynthetic capacity [74], which causes the discrepancy between VPP and MODIS-LSP in arid regions, especially for evergreen plants. High variation in MODIS-SOS and MODIS-EOS for CRO was observed in northern Africa (an arid tropical region), with SOS ranging from June to October and EOS ranging from June to November [56]. The MODIS-LSP is more similar to results from [56] than VPP estimates, implying that there may be large uncertainty in VPP estimates for CRO in arid tropical regions. A large bias between SOG and MODIS-SOS for GRA in arid tropical regions was also observed. The SOG (DOY 144-151) for GRA was more similar to the spring phenology (early June, approximately DOY 151-160) from [56] in northern Africa than the MODIS-SOS (DOY 160-190). Both averaged EOG (DOY 315-324) and MODIS-EOS (DOY 309-330) for GRA were similar to the results (DOY 300-330) from [56] in northern Africa. This comparison shows that the SOG and EOG estimates for GRA in arid tropical regions were relatively reliable.
The averaged SOG (DOY 100) was comparable to the averaged MODIS-SOS (DOY 96) for temperate Asia and both were within the range from DOY 90 to 120 for CRO [54]. The averaged EOG (DOY 279) for CRO occurred later than the averaged MODIS-EOS (DOY 252) for temperate Asia and was closer to the results of [54] (DOY 270 to 300) than that of MODIS-EOS. Agricultural land management is a major reason for the large random differences between MODIS-LSP and VPP for CRO in temperate Asia [54]. In addition, the cropping patterns have a clear difference between temperate Asia and North America [54,75]. The majority of agricultural land in temperate Asia has multiple cropping patterns [54], while the majority of agricultural land in temperate North America and European Mediterranean region has a single cropping pattern [75]. Therefore, the complex phenology for CRO with multiple cropping patterns in temperate Asia results in large random differences between VPP and MODIS-LSP.

Limitations of the Method
Vegetation phenology monitoring based on remote sensing data faces major challenges for non-forest types and for high latitude, arid, and humid tropical regions [47]. In this study, we found that snow cover information, such as snow cover periods from MOD13C1 products, is strongly correlated with spring and autumn phenology. Considering snow cover information as a variable in the model improves the accuracy of spring and autumn phenology estimates in high-latitude regions. However, the statistical models proposed in this study for estimating vegetation phenology may have a number of uncertainties for arid and humid tropical regions because only a few FLUXNET sites located in these regions can be used for model calibration. The method proposed in this study only considers temperature as a major climate factor for VPP estimation. On the other hand, the photosynthetic capability has a weak relationship with temperature, and is primarily regulated by radiation, precipitation, and available soil water content in arid and humid tropical regions [76].
On a global scale, growing seasons for vegetation in different climatic zones are diverse. For example, in the majority of the evergreen forests in humid tropical regions the growing season begins in the preceding year or ends in the following year [56]. Furthermore, some vegetation types in arid regions have multiple and variable growing seasons depending on the variation in temperature in the early months and precipitation in the mid growing season [72]. Considering these specific situations will further improve the model and the method used to decrease uncertainties of VPP estimates in arid and humid tropical regions. The CRO is a special vegetation type and its phenology is largely determined by the cropping pattern of land management. The method proposed in this study has the ability of exacting VPP for CRO with a single cropping pattern but does not allow for the exacting of all start and end dates of growing seasons for CRO with multiple growing seasons. Exacting VPP of CRO with multiple cropping patterns could be the third improvement of the method to increase the accuracy of VPP estimates for CRO.

Conclusions
Our results show that VPP was accurately estimated with a RMSE of approximately two weeks for forest types and three weeks for non-forest types through a combination of MODIS EVI and LST using a series of PFT-specific statistical models. Accurately estimating the VPP for non-forest types was more challenging than for forest types, primarily due to the high heterogeneity and human disturbances in non-forest types. We suggest that methods that are more effective be developed for improving VPP estimation of those PFTs with complicated seasonal variations in photosynthetic carbon uptake, such as multiple growing seasons, as well as SOG and EOG in the preceding year or in the following year. The dates of snow/ice cover used in this study played a useful role, to some extent, in constraining and improving the VPP estimates in high-latitude regions.
The VPP is not identical to LSP, and the bias between VPP and LSP varied across PFTs and climate regions. High agreement between VPP and LSP from different data sources was found for DBF, although systematic differences were present. The spring photosynthetic phenology occurred slightly later than SOS, while the autumn photosynthetic phenology occurred slightly earlier relative to EOS for DBF. The sign of bias between VPP and LSP for ENF was opposite to that of DBF, with SOG occurring earlier than SOS and EOG occurring later than EOS. Generally, for forest types, the bias between VPP and LSP was larger in warm and humid regions than in cold high-latitude and high-altitude regions. Climatic factors and canopy greenness jointly controlled spatial variations in discrepancies between VPP and LSP.
Discrepancies between VPP and LSP for non-forest types were more complicated than those for forest types. High discrepancies between VPP and LSP were found in arid tropical and temperate Asian regions for CRO and GRA due to their sensitivity to extreme climatic conditions and the intensive management of planting and grazing. The method proposed in this study failed to extract the VPP of CRO with multiple cropping patterns and other vegetation types with multiple growing seasons triggered by alternating drought and precipitation periods.
A stronger correlation was found between LOG from VPP estimates and GPP than that between LOS from LSP estimates and GPP for the majority of PFTs, especially for ENF and CRO. The VPP estimates mapped in this study provided a carbon-objective vegetation phenology and optimized the vegetation phenology module of land surface models. This improves the estimates of carbon and water exchange between the atmosphere and terrestrial ecosystem and enhances our understanding of the relationship and feedback between vegetation and climate under current and future climate change.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/rs13245080/s1. Figure Table S2: Detailed descriptions of the twenty-eight variables derived from smoothed MODIS EVI and LST data.
Author Contributions: Conceptualization, X.X. and J.H.; methodology, X.X.; validation, X.X. and Y.T.; data curation, X.X., Y.T., Z.Z. and Y.Q.; writing-original draft preparation, X.X. and Y.T.; writingreview and editing, X.X. and J.H.; funding acquisition, X.X. and J.H. All authors have read and agreed to the published version of the manuscript. Institutional Review Board Statement: Not applicable for studies not involving humans.

Informed Consent Statement:
Not applicable for studies not involving humans or animals.

Data Availability Statement:
The data that support the findings of this study are available from the author upon reasonable request.
CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada, GreenGrass, ICOS, KoFlux, LBA, NECC, OzFlux-TERN, TCOS-Siberia, and USCCC. The ERA-Interim reanalysis data were provided by ECMWF and processed by LSCE. The FLUXNET eddy covariance data processing and harmonization were carried out by the European Fluxes Database Cluster, the AmeriFlux Management Project, and the Fluxdata project of FLUXNET, with the support of the CDIAC and ICOS Ecosystem Thematic Center and the OzFlux, ChinaFlux, and AsiaFlux offices. The authors thank the three anonymous reviewers for their valuable comments that helped to improve the quality of this paper.

Conflicts of Interest:
The authors declare no conflict of interest.