Comparison of Gross Primary Productivity Derived from GIMMS NDVI3g, GIMMS, and MODIS in Southeast Asia

: Gross primary production (GPP) plays an important role in the net ecosystem exchange of CO 2 between the atmosphere and terrestrial ecosystems. It is particularly important to monitor GPP in Southeast Asia because of increasing rates of tropical forest degradation and deforestation in the region in recent decades. The newly available, improved, third generation Normalized Difference Vegetation Index (NDVI3g) from the Global Inventory Modelling and Mapping Studies (GIMMS) group provides a long temporal dataset, from July 1981 to December 2011, for terrestrial carbon cycle and climate response research. However, GIMMS NDVI3g-based GPP estimates are not yet available. We applied the GLOPEM-CEVSA model, which integrates an ecosystem process model and a production efficiency model, to estimate GPP in Southeast Asia based on three independent results of the fraction of photosynthetically active radiation absorbed by vegetation (FPAR) from GIMMS NDVI3g (GPP NDVI3g ), GIMMS NDVI1g (GPP NDVI1g ), and the Moderate Resolution Imaging Spectroradiometer (MODIS) MOD15A2 FPAR product (GPP MOD15 ). The GPP results were validated using ground data from eddy flux towers located in different forest biomes, and comparisons were made among the three GPPs as well as the MOD17A2 GPP products (GPP MOD17 ). Based on validation with flux tower derived GPP estimates the results show that GPP NDVI3g is more accurate than GPP NDVI1g and is comparable in accuracy with GPP MOD15 . In addition, GPP NDVI3g and GPP MOD15 have good spatial-temporal consistency. Our results indicate that GIMMS NDVI3g is an effective dataset for regional GPP simulation in Southeast Asia, capable of accurately tracking the variation and trends in long-term terrestrial ecosystem GPP dynamics.


Introduction
Gross primary production (GPP) is an essential flux of the net ecosystem exchange (NEE) of CO 2 between the atmosphere and terrestrial ecosystems, and contributes to human welfare by regulating ecosystem functions [1][2][3][4]. Estimates of the magnitude and variability of GPP and NEE are fundamental to understanding the biogeochemical dynamics of terrestrial ecosystems [5,6]. Since the 1990s, the eddy covariance method has emerged as an important tool to investigate seasonal phasing and amplitudes of CO 2 fluxes between terrestrial ecosystems and the atmosphere [4,7]. The net carbon fluxes from eddy covariance towers can be partitioned into the components GPP and ecosystem respiration (RE), which can be used to validate products developed by biogeochemical models and remotely sensed vegetation indices [4,8]. However, it is a challenging task to extrapolate GPP and CO 2 flux measurements to the large scale due to significant spatial and temporal heterogeneity of ecosystems across complex landscapes.
Two main methods exist for deriving large-scale estimates of GPP [9][10][11]: (1) the application of process-based, biogeochemical models that scale-up and extrapolate site-specific measurements [12][13][14]; and (2) the application of satellite data based on the concept of light use efficiency [9,[15][16][17][18][19][20][21][22][23][24][25][26][27][28][29]. The accuracy of satellite-derived GPP or NPP estimates have been recently highlighted as a matter of concern due to poor satellite data quality as well as lack of accurate ecosystem representation in satellite-based models [30]. In our previous work, an integrated strategy was presented that combined the remote sensing-based GLObal Production Efficiency Model (GLOPEM) and the ecosystem process-based model, the Carbon Exchange between Vegetation, Soil and Atmosphere model (CEVSA). The integrated GLOPEM-CEVSA model was shown to increase the accuracy of spatial and temporal estimates of terrestrial ecosystem GPP, RE, and NEE by taking into consideration the spatial heterogeneity of ecosystems [9]. The model was also shown to be suitable for estimating the spatial-temporal distribution of net primary production (NPP) for forest and grassland ecosystems [31,32].
A main component of the GLOPEM-CEVSA modeling framework is the satellite-derived fraction of photosynthetically active radiation absorbed by vegetation (FPAR), which can be derived from satellite normalized difference vegetation index (NDVI) data. The Advanced Very High Resolution Radiometer (AVHRR) sensor has the longest record of continuous satellite data, spanning 1981 to the present, and the data have been used to produce various versions of the GIMMS (Global Inventory Modeling and Mapping Studies) NDVI data product. For example, the previous GIMMS NDVI dataset has been widely used by the community [33] (GIMMS NDVI1g, hereafter). The newest version of the global GIMMS NDVI dataset version 3, termed NDVI3g, has been recently released and covers 31 years from July 1981-2011 with data sensed by AVHRR onboard NOAA 7,9,11,14,16,17 and 18 [34,35]. NDVI3g represents an important advance by extending the satellite record to more than 30 years. NDVI3g is a global product with a bi-monthly temporal resolution and ~8-km spatial resolution, and features improved data quality by accounting for biases such as calibration loss, orbital drift, volcanic eruptions, etc. [36]. Further, NDVI3g data has been previously shown to represent real responses of vegetation to climate variability [37]; suggesting that significant potential exists for the NDVI3g dataset to contribute to long-term simulations of GPP.
Our main objective here was to compare the newly available GIMMS NDVI3g dataset to similar existing data products in an effort to establish the potential for GIMMS NDVI3g to advance understanding of the terrestrial carbon cycle for the region of Southeast Asia. First, we simulated GPP using the GLOPEM-CEVSA model, the same meteorological datasets, and various FPAR products from GIMMS NDVI3g, GIMMS NDVI1g, and the Moderate Resolution Imaging Spectroradiometer (MODIS) FPAR product (MOD15A2). Second, we validated the derived GPP estimates using the eddy covariance derived GPP data and conducted a comparison among the three GPP results as well as the MODIS primary production products (MOD17A2). With this study, we aim to set the stage for the first continuous 30-year records of GPP for Southeast Asia, which we hope to use to improve our understanding of the climatic and anthropogenic responses of ecosystems that have occurred in the region over the previous three decades.

GLOPEM-CEVSA
GLOPEM-CEVSA represents a coupling of two independently developed models (GLO-PEM and CEVSA), and describes the processes and mechanisms of an ecosystem based on remotely sensed vegetation indices and the simulation of the physiological processes ( Figure 1) [9,32]. Founded on the concept of production efficiency, the GLO-PEM model uses satellite remote sensing data to estimate GPP and NPP [22,38,39]. The GLO-PEM approach has previously been applied successfully at both the regional and global scale [22,[40][41][42][43]. The CEVSA model is an ecosystem process and mechanism model developed to describe the carbon cycles of terrestrial ecosystems. CEVSA has been applied to simulate net ecosystem productivity (NEP) at the regional and global scale [44][45][46]. The combination of these models, the GLOPEM-CEVSA model, can thus simulate GPP, NPP, and NEP. Here, we mainly focus on the evaluation of the GPP product as it is impacted by FPAR data from different sources. According to production efficiency logic, we estimated GPP by assuming a linear relationship between the incoming photosynthetically active radiation (PAR), the fraction of PAR absorbed by the vegetation (FPAR), the potential conversion efficiency or carbon yield of absorbed energy (ε * ), and the reduction of ε * caused by environmental factors (ζ) (Equation (1)) [9,22,43].
GLO-PEM and the newly developed GLOPEM-CEVSA model differ from other production efficiency models in that these models do not assign a fixed value for potential light use efficiency for each vegetation type. Instead, they are based on quantum yield that is affected by, among other factors, photosynthetic pathway, temperature, and the CO 2 /O 2 specificity ratio [22,43,47], which are given in Appendix. We estimated ζ using critical environmental factors known to impact stomatal conductance [43], according to the below equation: where f(T), f(δ q ) and f(δ θ ), respectively, represent the effects of air temperature (T), specific humidity deficit (δ q ), and soil moisture (δ θ ) on stomatal conductance [43]. The temperature effect, f(T), reaches the maximum (1.0) at the optimum temperature and tapers off for warmer or cooler temperatures [48]: where T is air temperature (°C) and T min , T opt and T max are respectively the minimum, optimum and maximum temperatures (°C).
where δ q is the specific humidity deficit (g kg −1 ), Qw(T) is the saturated specific humidity at the air temperature, and q is the specific humidity of the air. The effect of soil moisture is given as: where δ θ is the soil moisture deficit (mm) in the top 1.0 m of soil. The soil moisture deficit is defined as the difference between saturated water content and actual water content, which is calculated with the algorithms described in [21].

The MODIS Algorithm (MOD17)
As the first regular, near-real-time data sets for repeated monitoring of vegetation primary production on vegetated land at 1-km resolution and an eight-day interval [25,49,50], the MODIS primary production products were applied for comparison with GIMMS NDVI3g based GPP. The MODIS algorithm (MOD17) also uses the concept of the light use efficiency. In the model, the actual light use efficiency (ε) calculation includes two parts: (1) the biome-level maximum light use efficiency estimate and (2) the environmental stress scalars: The potential light use efficiency (ε max ) is optimized in advance with a complex ecosystem model, BIOME-BGC, and is given in a Biome Parameter Look-up Table ( where TMIN and VPD are the daily minimum temperature (°C) and average vapor pressure deficit (Pa), TMIN max and VPD max are the daily minimum temperature and average vapor pressure deficit at which ɛ = ɛ max , and TMIN min and VPD min are the daily minimum temperature and average vapor pressure deficit at which ɛ = 0. These parameters were determined for each land cover type in the BPLUT.
The main data inputs into the MOD17 model include: (1) FPAR and LAI from the MODIS MOD15 LAI/FPAR data products; (2) temperature, incoming solar radiation, and vapor pressure deficit from NCEP/DOE reanalysis II [51]; (3) land cover classification from the MODIS MCD12Q1 data product; and (4) a Biome Parameter Look-up Table (BPLUT) containing values of ɛ max and other biome-specific physiological parameters for different vegetation types.

Main Data for GLOPEM-CEVSA Model Input and Validations
The GLOPEM-CEVSA model inputs include the interpolated meteorological data, the remote sensed FPAR and vegetation classification data, and soil data (e.g., soil texture, soil available water and soil carbon). Other auxiliary data included digital elevation model (DEM) data and vegetation carbon storage data.
The DEM was used in the meteorological data interpolation, and was obtained from Shuttle Radar Topography Mission (SRTM) with an original 90-m spatial resolution and was resampled to 8-km resolution. Soil texture and soil carbon storage were obtained from the Harmonized World Soil Database (HWSD) of the International Institute of Applied System Analysis (IIASA) and the Food and Agriculture Organization (FAO). The soil carbon storage data was used to initialize the soil respiration sub-module, and the soil texture data was used to calculate the parameters of soil water content.

Meteorological Data
The meteorological data needed for GLOPEM-CEVSA included the interpolated air temperature, precipitation, wind speed, air relative humidity, and radiance data. In this study, the air relative humidity was calculated with the dew temperature and air temperature according to Allen [52]. The air temperature, dew temperature and wind speed were the Global Surface summary Of Day Data (GSOD) produced by the National Climatic Data Center (NCDC). The eight-day mean was calculated with the daily observations of the station from 1980-2008 and then interpolated to an eight-day grid with 8-km spatial resolution. The ANUSPLIN package (version 4.2), that is based on thin plate smoothing splines of multi-variates, was applied for interpolation in this study because of its good performance compared to other interpolation methods [53].
The daily precipitation datasets were derived from the Asian Precipitation Highly-Resolved Observational Data Integration Towards Evaluation of Water Resources (APHRODITE's Water Resources) project [54]. We used the dataset in the Monsoon Asia area from 1980-2007 and as precipitation showed spatial discontinuity, we re-projected and resampled from 0.25° to 8-km using the nearest neighbour algorithm. The Surface Radiation Budget (SRB) Release 3.0 daily data for 1984-2007 were obtained from the NASA Langley Research Center Atmospheric Sciences Data Center NASA/GEWEX SRB Project. The SRB dataset is a daily global product with 1° × 1° resolution and it was resampled to 8-km spatial resolution with the bilinear algorithm as radiation data showed good spatial continuity. Then, the photosynthetically active radiation and net radiation were calculated with the all-sky surface downward and upward shortwave flux.

Remotely Sensed Data
(1) GIMMS FPAR FPAR was linearly related with NDVI or the NDVI-derived Simple Ratio (SR, SR = (1 + NDVI)/(1 − NDVI)). Previous study has shown that the NDVI-based FPAR model (FPAR NDVI ) and SR-based FPAR model (FPAR SR ) can overestimate or underestimate FPAR respectively, and an integrated model by combining FPAR NDVI and FPAR SR would provide an improved estimate [55]. We used this integrated strategy to calculate the FPAR, where α was arbitrarily set to 0.5. We calculated the FPAR SR by using this empirical algorithm [35] that is for a horizontally homogeneous, closed canopy composed almost exclusively of green material, where SR max and SR min are the simple ratios corresponding to the 98% and 2% of the NDVI frequency distributions of the i vegetation type. The 98% NDVI is assumed to represent vegetation with the maximum FPAR value, FPAR max , here assumed to be 0.95; the 2% value is assumed to represent no vegetation cover and is assumed to correspond to a FPAR value close to 0, FPAR min , here was assumed to be 0.001 [35].
where the NDVI max and NDVI min were estimated as 0.96 and 0.03 corresponding to the 98% and 2% thresholds of the NDVI frequency distributions of all the vegetated area. In order to match with MODIS data, the GIMMS FPAR was interpolated from the half month time step to the eight-day step by the spline algorithms in Matlab [56].
(2) MODIS FPAR The MOD15A2 FPAR is retrieved from the reflectance of two bands (648 and 858 nm) and its algorithm is based on three-dimensional radiative transfer theory and a back-up method based on relations between the NDVI and LAI/FPAR [57][58][59][60][61]. The FPAR data were downloaded from the Land Processes Distributed Active Archive Center (LP DAAC) [62], then filtered to remove the noise in FPAR time series by using the adaptive Savitzky-Golay (SG) method in the TIMESAT software [62,63]. The results had a 1-km spatial resolution and were resampled to 8-km. We will refer to this processed FPAR dataset as FPAR MOD15 from here on.
(3) Land Cover The MODIS land cover product (MOD12Q1) was applied as the vegetation classification data in this study ( Figure 2). This product provides global land cover maps at 1-km spatial resolution principally using the International Geosphere-Biosphere Programme (IGBP) classification systems [64]. To match with the other datasets used in this study, MOD12Q1 data was resampled from 1-km spatial resolution to 8-km with the nearest neighbour algorithm.

Validation and Comparison of the Derived GPPs
The observations for eight forest sites across Southeast Asia (Table 1 [ [65][66][67][68][69][70][71][72][73][74], Figure 2) were used to assess the performance of the modeled GPP and the remote sensing based FPAR in this study. (1) GPP Validation Data GPP validation was conducted for two levels: (1) seasonal variability using the daily GPP data from the Qianyanzhou (QYZ) site of ChinaFlux and the Palangkaraya (PDF) site of AsiaFlux; (2) annual GPP magnitude with data derived from the literature for eight flux towers ( Table 1).
The eddy covariance observations began in 2002 at the QYZ Forest Station, Jiangxi province, China (26°45′16.0″N, 115°4′15.0″E) [65,75]. It has a typical subtropical monsoon climate and the dominant species are masson pine (Pinus massoniana), slash pine (Pinus elliottii) and Chinese fir (Cunninghamia lanceolata) planted in 1984 and 1985 [65,[75][76][77]. The quality of the measured turbulence flux over this hilly region was evaluated and the fluctuation in CO 2 density caused by temperature and water vapor variations was corrected [66]. Mean diurnal variations and nonlinear regression were used in the gap-filling of the missing observations [65]. RE was estimated from the relationship between the nighttime NEE and temperature, then GPP was estimated from NEE and RE [65,78,79].
The eddy covariance tower PDF is located near Palangkaraya, Indonesia (2°20′42″S, 114°2′11″E, (2) Methods of Validation and Their Comparison The GPP estimations from the three FPAR products (GPP NDVI3g , GPP NDVI1g , and GPP MOD15 ) were validated and evaluated with the GPP from the eddy towers. They were also compared with the MODIS GPP product MOD17A2 (GPP MOD17 ) [25,49]. All three GPP products were simulated using the same climatic data input and different FPAR datasets from the GIMMS NDVI3g (FPAR NDVI3g ), the previous version of GIMMS NDVI1g (FPAR NDVI1g ) and the MODIS FPAR product (FPAR MOD15A2 ). This allowed comprehensive exploration of the reasons for GPP differences.
Linear regression analysis was applied to evaluate the performance of the remote sensed GPP and FPAR with reference to the on-ground observed GPP. Correlation coefficient r, adjusted coefficient of determination R 2 , P-value of a standard T-test statistic p, and sample size were used to define accuracy statistically. We also applied Root-mean-square error (RMSE) and relative RMSE to quantify the errors of the remote sensing based GPPs.

GPP Assessments on Seasonal Scale
According to an eight-day interval GPP analysis shown in Figures 3 and 4, the seasonal changes of the modeled GPPs (GPP NDVI3g , GPP NDVI1g , GPP MOD15 and GPP MOD17 ) were more correlative with the eddy covariance derived GPP (GPP OBS ) in the subtropical forest site (QYZ) than in the tropical forest site (PDF). The adjusted coefficient of determination (Adj. R 2 ) between the modeled GPPs and GPP OBS were all above 0.8 (p < 0.001) in the subtropical forest site (QYZ) ( Figure 3); however, the four Adj.  Figure 4) and GPP MOD17 had the lowest RMSE with GPP OBS less than those of GPP NDVI3g , GPP NDVI1g , GPP MOD15 . However, the seasonal variability of GPP MOD17 was not as consistent with GPP OBS as those in GPP NDVI3g , GPP NDVI1g , and GPP MOD15 .
We also conducted a comparison between the modelled and reference GPPs by using the seasonal and annual sum GPP data (Table 2). At the subtropical forest site (QYZ), GPP NDVI3g , GPP NDVI1g , and GPP MOD15 were underestimated by 21%, 35% and 21% for the whole year, by 8%, 23% and 11% for the period from May to October, and by 51%, 62% and 43% for the period from November to April (Table 2). However, at the tropical forest site (PDF), GPP NDVI3g , GPP NDVI1g and GPP MOD15 overestimated GPP by 36%, 27% and −30% for the whole year, 36%, 26% and −26% for May to October, 36%, 28% and −34% for November to April. These results indicate that the FPAR from GIMMS NDVI3g can capture GPP well in the growing season (May-October) in subtropical forest. MOD17A2 showed elegant performance at the two forest sites with the lower bias of −2.9% and −11.8% for the whole year, −6.1% and −10.1% for May to October, and 3.6% and −13.6% for November to April in the QYZ and PDF sites, respectively ( Table 2). Even with the closer values, the MOD17A2 has an insignificant correlation with GPP OBS (p = 0.815) in the PDF site ( Figure 4).

GPP Assessments on Annual Scale
More annual GPP data estimated from the observations on the eddy towers were collected from the peer-reviewed literature (see Table 1). A comparison of results is shown in Figure 5. GPP MOD15 had the best estimate of GPP (Adj. R 2 0.49) and was significantly correlated with GPP OBS products (p < 0.001) based on all the available validation data. GPP NDVI3g (Adj. R 2 0.35), and GPP MOD17 (Adj. R 2 0.16) showed less significant correlations ( Figure 5, Table 3) while GPP NDVI1g was not significantly correlated with GPP OBS (p = 0.74).
The GPP validation data were diverse, e.g., PDF-DB is a site where the forest was burnt in 1997 and 2002. Also, the fetch of the eddy tower with 4-m height was too small (around 250-1000 m in different directions) [73], such that the GPP simulation at 8-km resolution was unable to capture the fire effects. After excluding the PDF-DB site, the Adj. R 2 increased to 0.85 and 0.51, respectively, for the GPP MOD15 and GPP MOD17 estimates (Table 3).
We excluded more abnormal observation points progressively to achieve a group of best fits (shown as the solid lines in Figure 5) at which the Adj. R 2 reached a maximum value (best-fitting hereafter). The adjusted R 2 for GPP NDVI3g was always higher than that for GPP NDVI1g in each step, which suggested that GIMMS NDVI3g has higher quality than its old version (GIMMS NDVI) in GPP monitoring. The adjusted R 2 for GPP NDVI3g and GPP NDVI1g underwent a dramatic improvement to 0.82 and 0.73 after the best-fitting. The GPP NDVI3g overestimated the regression slope (1.05 in Table 3) by 5% and the GPP NDVI1g underestimated it by 43%. The GPP MOD15 reached the best-fit after removing only one site (PDF-DB) with the Adj. R 2 0.85; while the GPP MOD17 reached the best-fit after removing two sites (PDF-DB and MKL) with the Adj. R 2 0.67. The GIMMS NDVI3g had a better estimate of GPP (slope = 1.05) relative to the MOD15 (slope = 0.87) and MOD17 (slope = 0.45) based on the final best-fit sites. These results indicate GIMMS NDVI3g has a reasonably good capability to simulate GPP. GPP MOD15 had the highest accuracy with a higher coefficient of determination (Adj. R 2 ) and significance (p) in all three situations (all data, excluding PDF-DB, and best fitting).

Spatial and Temporal Comparison among Four GPP Results
A linear regression analysis was applied to investigate the consistency between GPP NDVI3g and the other three results (GPP MOD15 , GPP NDVI1g , GPP MOD17 ) and the scatter plots are shown in Figure 6. The results indicated that the GPP NDVI3g have significant linear correlations (p < 0.0001) with the other three results. Due to the higher data range of the original GIMMS NDVI3g data, GPP NDVI3g overestimated the GPP values relative to the other three results. As GPP MOD15 was the best estimation stated in Section 3.2(2), we used it as the baseline and adjusted the GPP NDVI3g by using the linear regression coefficient. The adjusted GPP NDVI3g was used for the following analysis and comparisons.  Table 3 for three levels: all data, data excluding the PDF-DB site, and best fit data (solid points). The sample numbers are detailed in Table 3. Table 3. Comparison of linear regression results of modeled (GPPMOD) and observed GPP (GPPOBS) with the formula: GPP MOD = GPP OBS × slope + Intercept. GPP was modelled by GLOPEM-CEVSA with FPAR from GIMMS NDVI3g, GIMMS NDVI and MODIS product (MOD15A2) respectively. MOD17A2 is the GPP product of MODIS and was calculated by the MOD17 algorithm applying the FPAR from MOD15A2. The results are given with the value and error of intercept and slope, the statistics of adjusted coefficient of determination (Adj. R 2 ) and significance level (p), and the relative root-mean-square error (RMSE, %). PDF-DB is for data acquired at the swamp forest (DB) site of PDF disturbed by burning.  Due to the data availability, GPP NDVI3g , GPP NDVI1g , GPP MOD15 , and GPP MOD17 had different date ranges for Southeast Asia, but all of them had an overlap period from 2000-2006. We calculated the multi-year average of GPPs in 2000-2006 to compare their spatial patterns. Figure 7 shows that all the four GPP results had similar spatial patterns. The Tibetan Plateau had the lowest GPP values, while the highest GPP values occurred in the tropical forest areas in Malaysia and Indonesia. In India, however, the four datasets had inconsistent patterns, and especially GPP MOD17 had a lower estimate than the other three (Figure 7). This discrepancy could be associated with the lower estimate of MOD17 model in the crop GPP simulation (Table 4), as there is a large area of cropland (mainly double paddy rice cropping) in India.  Table 4. The average forest GPP in Southeast Asia ranged from 1.79 ± 0.03 kilogram of carbon per square meter per year (kgC· m −2 · a −1 ) (GPP NDVI1g ) to 2.20 ± 0.06 kgC· m −2 · a −1 (GPP MOD17 ), while GPP NDVI3g , GPP MOD15 had GPP estimates in the order of 2.12 ± 0.06 kgC· m −2 · a −1 and 2.08 ± 0.09 kgC· m −2 · a −1 . All of the four products had highest GPP values in evergreen broadleaf forest (2.02-2.57 kgC· m −2 · a −1 ) while the lowest GPP values were in evergreen needle-leaf forest (0.84-1.14 kgC· m −2 · a −1 ).
The forest ecosystem had the largest carbon uptake with total GPP of 8.8 petagram of carbon per year (Pg· C· a −1 = 10 15 gram carbon per year). This accounted for 43% of the total vegetation carbon uptake although its area accounts for 30% of the study area. Croplands accounted for 32% of the carbon and 31% of the area, and grassland 21% of the carbon uptake over 24% of the area. Of the forest types, evergreen broadleaf forest had the highest GPP of 6.6 Pg· C· a −1 over 69% of the total forest area, followed by mixed forest with 1.5 Pg· C· a −1 over 29.1% of the area (Table 4). In total, all the vegetation in our studied Southeast Asia had a total carbon uptake of 20 Pg· C· a −1 . Beer et al. [3] reported an observation-based global GPP estimate of about 123 ± 8 Pg· C· a −1 .Therefore, from our estimates, Southeast Asia contributes about 14%~16% of the global carbon flux.

Uncertainty in GPP Modelling
GLOPEM-CEVSA based GPP estimates were calculated with three different FPAR datasets (FPAR NDVI3g , FPAR NDVI1g and FPAR MOD15 ) as well as the same meteorological data (e.g., air temperature, dew point, and wind speed). Validations and comparisons showed that GIMMS NDVI3g is comparable in capabilities to MODIS data for spatial and temporal modelling of GPP.
While using the same MODIS FPAR data, GLOPEM-CEVSA based GPP MOD15 showed better accuracy than that of GPP MOD17 based on the MODIS algorithm, indicating better potential for application of the GLOPEM-CEVSA model for the study region. An important factor contributing to uncertainty in GPP NDVI3g could be that of the GIMMS NDVI3g data. For example, Fensholt et al. [81] found that correlations between GIMMS NDVI3g and MODIS NDVI datasets are highly significant for areas with a distinct phenological cycle (e.g., temperate forest), while the discrepancies between them are large in equatorial areas (e.g., tropical evergreen forest). The temporal and spatial resolutions could also affect the quality of satellite-based FPAR and GPP estimates. GIMMS NDVI3g and MODIS have temporal resolutions of half-month and eight-days, respectively, while their spatial resolutions are 8-km and 1-km, respectively. Consequently, GIMMS NDVI3g most likely neglects some fragmentary landscapes. Since the landscape of Southeast Asia has been seriously fragmented due to its mountainous terrain and rapid deforestation over the course of the study period [82], MODIS likely performs better in areas of high disturbance and/or land cover change. However, we aggregated the MODIS FPAR to match the GIMMS NDVI3g data, which diminished the advantages of MODIS in spatial resolution. Despite a relatively coarse spatial resolution, GIMMS NDVI3g is the longest continuous temporal record of vegetation dynamic and as such represents a critical data source for discovering the responses of vegetation to climate change over the long time span from the 1980s to the present.
The land cover information was another main source of uncertainty that influenced GPP estimates. Retrieved GPP would be substantially affected if land cover were wrongly classified, e.g., misclassification of subtropical evergreen forest as woody savanna would lead to a substantial bias due to differences in parameterization. Therefore, accurate land cover classification is critical for the simulation of GPP at the regional scale. Other factors, e.g., canopy structure, could greatly influence upstream inputs (e.g., LAI retrieval) and finally affect the GPP estimates [25,83].
In this study, GPP NDVI3g was shown to agree well with the on-ground GPP observations. However, GPP NDVI3g overestimated GPP in absolute values ( Figure 6). This discrepancy could be related to the higher data range in the original GIMMS NDVI3g data, which was ultimately transferred to FPAR NDVI3g due to the application of a linear algorithm (Equations (9)-(11)). The newly released GIMMS FPAR3g data could solve this problem, as the GIMMS NDVI3g and MODIS FPAR (MOD15A2) products for the overlapping period 2000-2009 were harmonized using a neural network algorithm [84]. However, in our research we simulated GPP using the independent GIMMS NDVI3g data, in an effort to evaluate the potential of this dataset for GPP modelling.

Uncertainty in FPAR
FPAR is one of the main uncertainty sources that linearly influence GPP [16,50,85]. The accuracy of satellite-based FPAR could be affected by many factors, e.g., land cover data, surface reflectance or NDVI data, and the algorithm [86]. According to Los et al. [55], satellite-derived biophysical parameters are less sensitive to errors in the land cover classification than land cover-derived parameters. This is because they have only second order dependency on land cover type and the land cover type accounts for at most 10% of the variation in FPAR. We estimated NDVI max and NDVI min using the NDVIs located at the 98% and 2% of the NDVI frequency distribution for all vegetated area without considering various land cover types.
While using the same GLOPEM-CEVSA model settings, GPP MOD15 was more accurate than GPP MOD17 based on PSN, which indicated a high potential of the GLOPEM-CEVSA model for this study region. However, the higher accuracy in simulating GPPs implied that the adaptive SG filtered FPAR data could have a better performance than the FPAR data based on the linear interpolation approach using in the MODIS algorithm. In order to both fill unreliable and missing FPAR values and keep reliable values, the MODIS model applied linear interpolation of the nearest reliable values to recover temporal time series of FPAR [25]. This method probably removed some intrinsic seasonal variability and resulted in a lower correlation coefficient despite a lower bias. Conversely, we used the MODIS FPAR data filtered by the adaptive SG method in the TIMESAT software package [62,87] as input into the GLOPEM-CEVSA model, which enabled us to capture more of the seasonal variability [88,89]. We did not use the data quality information in the MODIS FPAR product (MOD15A2) as previous studies showed that the quality flag layer does not provide sufficient information regarding data quality [89], while the quality control information was considered in the MODIS algorithm [25].

Conclusions
Tropical forests play a very important role in global carbon cycling and climate change, especially when undergoing tropical forest degradation and deforestation induced by a growing human population and forest resource requirements [90]. Satellite-based estimates of gross primary production (GPP) are a wall-to-wall modelling of vegetation dynamics dependent on NDVI data. As the longest time series of remote sensing data, GIMMS NDVI3g is a critical data source to track vegetation activity and changes for the past 30 years. Based on the GLOPEM-CEVSA model, three different estimates of GPP for Southeast Asia were simulated including GIMMS NDVI3g (GPP NDVI3g ), GIMMS NDVI1g (GPP NDVI1g ), and MODIS (GPP MOD15 ), and then validated with ground based GPP. In addition, comparisons were made among the three GPPs as well as against the MODIS GPP product (GPP MOD17 ). By validation of the intra-annual variability of GPP at the QYZ and PDF flux tower sites and site-year level at an additional eight sites, we found that GPP NDVI3g has a higher accuracy relative to GPP NDVI1g and was comparable with the GPP MOD15 . However, further validation in other ecosystem types are needed in the future, especially when these products are used for non-forest ecosystems such as grassland and cropland. An independent validation of the intermediate variable FPAR would also help to reduce uncertainty and improve future GPP estimates.
Ultimately, our results test and validate GIMMS NDVI3g data for long-term quantification and monitoring of vegetation productivity, specifically for our study region of Southeast Asia, which notably includes tropical forests under increasing deforestation pressure. Our findings suggest that these data have the potential to critically advance the current understanding of ecosystem responses to global climate change and human activities at the regional and global scale

Potential Light Use Efficiency
The potential, gross ɛ was calculated considering C3 and C4 plant photosynthetic differences and a temperature adjustment proposed by Cao et al. [22,43]: where ɛ C3 and ɛ C4 were potential light use efficiency for C3 and C4 plants and P C4 was the proportion of C4 plants. For C4, ɛ does not depend on temperature and has a value of 2.76 g· MJ −1 . The value of ɛ for C3 plants was calculated from the quantum yield, the CO 2 photocompensation point, internal leaf CO 2 concentration and the CO 2 /O 2 specificity ratio [22,47] A biomass check was used to prevent the C4 pathway being selected in forested regions by setting a threshold biomass < 2 kg m −2 .