Improving Estimation of Gross Primary Production in Dryland Ecosystems by a Model-Data Fusion Approach

Accurate and continuous monitoring of the production of arid ecosystems is of great importance for global and regional carbon cycle estimation. However, the magnitude of carbon sequestration in arid regions and its contribution to the global carbon cycle are poorly understood due to the worldwide paucity of measurements of carbon exchange in arid ecosystems. The Moderate Resolution Imaging Spectroradiometer (MODIS) gross primary productivity (GPP) product provides worldwide high-frequency monitoring of terrestrial GPP. While a large number of studies have validated the MODIS GPP product against ground-based measurements over a range of biome types, few have comprehensively validated the performance of MODIS estimates in arid and semi-arid ecosystems, especially for the newly released Collection 6 GPP products, whose resolution has been improved from 1000 m to 500 m. Thus, this study examined the performance of MODIS-derived GPP by comparing it with eddy covariance (EC)-observed GPP at different timescales for the main ecosystems in arid and semi-arid regions of China. We also improved the estimation of MODIS GPP by using in situ meteorological forcing data and by optimizing biome-specific parameters with a Bayesian approach. Our results revealed that the current MOD17A2H GPP algorithm could, on the whole, capture the broad trends of GPP at the eight-day time scale for most of the investigated sites. However, GPP was underestimated in some ecosystems in the arid region, especially for the irrigated cropland and forest ecosystems (with R² = 0.80, RMSE = 2.66 gC/m²/day and R² = 0.53, RMSE = 2.12 gC/m²/day, respectively). At the eight-day time scale, the slope of the original MOD17A2H GPP relative to the EC-based GPP was only 0.49, which indicates significant underestimation compared with tower-based GPP.
However, after the biome-based parameters of the MODIS GPP algorithm were optimized using in situ meteorological data, the model could explain 91% of the variance in the EC-observed GPP across the sites. Our study revealed that the current MODIS GPP model works well once the maximum light-use efficiency (εmax or LUEmax) and the temperature- and water-constraint parameters are calibrated for the main ecosystems in the arid region. Nevertheless, there are still large uncertainties surrounding GPP modelling in dryland ecosystems, especially for desert ecosystems. Further improvements in GPP simulation in dryland ecosystems are needed in future studies, for example, improvements of remote sensing products and the GPP estimation algorithm, or the implementation of data-driven methods or physiology models.


Introduction
Drylands, including arid and semi-arid ecosystems, cover 30%-45% of the Earth's land surface [1,2], and play an important role in the global carbon cycle and future carbon sequestration [3,4]. Accurate and continuous monitoring of terrestrial ecosystem production in arid and semi-arid regions is of great importance for understanding the role of arid terrestrial ecosystems in the global carbon cycle. However, the worldwide paucity of measurements of carbon exchange in arid ecosystems has hindered a full understanding of the magnitude of carbon sequestration and the accurate prediction of the carbon cycle [5,6].
Terrestrial gross primary production (GPP) is the largest component of the global carbon cycle and is essential for understanding and quantifying the contribution of terrestrial ecosystems to the global carbon cycle [7]. Satellite remote sensing provides continuous and temporally repetitive observation of land surfaces and has advanced tremendously over the past few decades, becoming a useful tool for estimating terrestrial ecosystem production across broad temporal and spatial scales. Production efficiency models (PEMs), developed for predicting global GPP with remote sensing, have been widely used to quantify the spatial and temporal variation of terrestrial ecosystem productivity [8-10]. In the absence of widespread ground observations, remote sensing models are also commonly used to estimate dryland CO2 exchange [4,11]. Previous comparisons between observations and remote sensing models have included only a few dryland sites [12]. Thus, there is a need to understand how well commonly used remote sensing models capture the magnitude and inter-annual variability of measured CO2 exchange [13].
Since 2000, satellite-based GPP estimates have increasingly used data from the Moderate Resolution Imaging Spectroradiometer (MODIS) due to its continuous worldwide availability [8]. The MODIS GPP algorithm (i.e., MOD17) is a type of PEM that provides high-frequency worldwide observations of GPP [14,15]. To date, multiple versions of the MODIS GPP product have been issued [14,16]. Currently, the MOD17 product has been updated to Collection 6 (C6), which has improved the algorithm parameters and forcing data of previous collections [15,17] and has refined the spatial resolution from 1000 m to 500 m. A large number of studies have validated MODIS GPP products against eddy covariance (EC) measurements in individual biomes, such as forests [18,19], shrublands [20], grasslands [21,22], savanna [23], and croplands [24], and across biomes [12,25-27]. However, most of these studies validated previous versions of MODIS GPP products (i.e., Collections 4 and 5). Comprehensive evaluation of the performance of MODIS GPP C6 products in arid regions of China remains limited to date [1].
Previous validations of Collections 4 and 5 of the MODIS GPP products showed no consistent results. MODIS GPP may be underestimated at some sites, such as cropland sites [24], overestimated at some low-productivity sites [25,28], or in good agreement with tower-based GPP [26]. Meanwhile, the MODIS GPP Collection 6 products (i.e., MOD17A2H) also tend to overestimate GPP in alpine meadows of the Tibetan Plateau [22] and underestimate flux-derived GPP at most sites across the globe [27]. However, because observations in arid regions are inadequate compared with other regions, it remains uncertain whether these biases also exist in other arid-region ecosystems for the improved Collection 6 GPP products. Therefore, it is necessary to validate the performance of the latest version of MODIS products in arid regions.
The overall uncertainty of carbon flux modelling includes uncertainty in input variables, model structure, and model parameters [29], which can significantly affect carbon flux estimates at regional scales. Several attempts have been made to address the uncertainties of the PEM algorithm [26,27,30,31]. For the MOD17 products, inaccurate parameterization (for example, of the maximum light-use efficiency, εmax or LUEmax) was found to be one of the most important factors contributing to the bias of MODIS GPP [12,20]. The current MOD17 algorithm uses a constant maximum LUE and constant values of the other parameters for each ecosystem type [18], which does not account for the variability of climate conditions and ecosystems. Previous studies found that the LUE parameter in the MOD17 algorithm was underestimated [26]. Several attempts have been made to calibrate the maximum LUE parameter and to improve the performance of MODIS GPP estimation [24,26,32]. However, most of these studies overlooked the potential impact of uncertainty in other model parameters on the estimation of GPP, e.g., the water-limitation factors, which are important for GPP estimation, especially for ecosystems in the arid region.
The research community has established that adjusting the key parameters of the MODIS GPP algorithm can improve GPP estimation and can compensate for the errors introduced by the model structure [23]. Model-data fusion provides powerful tools for optimizing model parameters and quantifying the influence of uncertainties, and is increasingly used to estimate the parameters of ecological models [33-38]. Model-data fusion approaches include Bayesian and non-Bayesian approaches. Non-Bayesian approaches, such as global optimization algorithms, can efficiently determine the optimal parameter solutions by minimizing (or maximizing) objective functions [36], but cannot quantify uncertainty. In contrast, the Bayesian approach can be employed to update the parameter distributions when new information becomes available [37], and produces reliable estimates of parameter and predictive uncertainty [38]. Some past studies have emphasized the importance of parameter estimation in carbon cycle models [32,39], but they mainly used a single site to constrain the parameters of a given plant functional type (PFT). In addition, few studies have assessed the variability of parameters within a PFT [40]. For MODIS GPP validation, since the PFT parameters in the MOD17 algorithm are obtained from flux towers worldwide, they are not appropriate for specific regions such as the arid regions of China.
Thus, this study aims to examine the performance of the newly released MODIS GPP C6 products and the MOD17 algorithm in predicting GPP in a typical arid region of China. The overall goals of this study are to: (1) evaluate the performance of the MODIS GPP Collection 6 products at eight-day to annual time scales across various ecosystem types in a typical arid region of China; (2) analyze the uncertainty of remote sensing models in simulating GPP in typical arid regions; and (3) quantify the parameter uncertainties in GPP estimation for the main ecosystem types in arid regions of China by using a Bayesian approach to calibrate the maximum LUE and the water- and temperature-limitation factors. This research will contribute to the development and improvement of GPP estimates in arid regions.

In Situ Meteorological Observations and Carbon Flux Data
The flux and meteorological data used in this study come mainly from a flux observation network located in a typical inland river basin, the Heihe River Basin (HRB), in the arid region of Northwest China. The HRB (37.7°-42.7° N, 97.1°-102.0° E), the second largest inland river basin in China, is located in the middle part of the Hexi Corridor and covers an area of approximately 1,432,000 km² [41]. The HRB is a unique region in China and can be viewed as an epitome of the arid region of western China because of its varied landscapes of alpine meadow, wetland-oasis-desert, and natural oasis-desert ecosystems from upstream to downstream [42,43]. We constructed a comprehensive flux observation network across the whole river basin to investigate the complexity of hydrological and ecological processes in the arid region (Figure 1). In this study, we compiled 12 EC flux sites comprising 3 grassland sites, 3 desert grassland sites, 3 cropland sites (including a wetland site), and 3 forest sites, which cover most of the major plant functional types (PFTs) and typical ecosystem types in the arid region of an inland river basin. Figure 2 shows the meteorological observations at all the flux tower sites over the HRB, including precipitation, air temperature (T), and vapor pressure deficit (VPD). A large variability of climate conditions exists within and across the species. The specific locations and related information of the sites are shown in Table 1.
The open-path eddy covariance (OPEC) system was used to measure carbon and water vapor fluxes in the flux observation network. The OPEC system at each site consists of a 3D sonic anemometer (CSAT-3/Gill, Campbell Scientific Instruments Inc., USA/Gill, UK) and an open-path infrared gas analyzer (Li-7500/7500A, Licor Inc., USA). Meteorological variables, including air temperature, rainfall, solar radiation, photosynthetically active radiation (PAR), relative humidity, and soil moisture, were measured simultaneously at each site. VPD was calculated using measured relative humidity and actual vapor pressure. The meteorological data were recorded by automatic weather stations at 10 min intervals, carefully checked for quality, and aggregated to 30 min and daily time scales. The raw 10 Hz EC measurements were processed into half-hourly flux data using the flux processing software EddyPro (http://www.licor.com/env/products/eddy_covariance/software.html) developed by LI-COR Biosciences (Lincoln, NE, USA). The flux data processing steps included spike detection, coordinate rotation, time-lag correction, sonic virtual temperature correction, frequency-response correction, and density correction [44,45]. Then, the flux data were gap-filled using the marginal distribution sampling (MDS) method and partitioned into GPP and ecosystem respiration (Reco) following the flux partitioning algorithms from the REddyProc package [46].
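The VPD calculation mentioned above can be illustrated with a minimal sketch. It assumes relative humidity is given in percent and actual vapor pressure in kPa, and relies only on the definition RH = 100 × ea/es; the function name is hypothetical and the study's own processing code is not described in the text.

```python
def vpd_from_rh_and_ea(rh_percent, ea_kpa):
    """Vapor pressure deficit (kPa) from relative humidity and actual vapor pressure.

    By definition RH = 100 * ea / es, so the saturation vapor pressure is
    es = ea / (RH / 100) and VPD = es - ea.
    """
    if not 0.0 < rh_percent <= 100.0:
        raise ValueError("relative humidity must be in (0, 100] percent")
    es_kpa = ea_kpa / (rh_percent / 100.0)  # saturation vapor pressure
    return es_kpa - ea_kpa
```

At 100% relative humidity the function returns zero, as expected for saturated air.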

MODIS Datasets
The MODIS data used in this study include MODIS GPP data (MOD17A2H products), FPAR data (MOD15A2H products), and surface reflectance data (MOD09A1 products), all from Collection 6 at 500 m spatial resolution, which were downloaded directly from the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) website. FPAR is the fraction of photosynthetically active radiation (400-700 nm) absorbed by green vegetation and is a critical component of the MODIS GPP algorithm. To correct inferior values caused by clouds and aerosols, we reconstructed the MODIS FPAR time series with the Savitzky-Golay filter algorithm [47]. Meanwhile, to validate the performance of the MODIS FPAR data in the study area, we measured the actual FPAR at cropland and desert grassland sites in the HRB using AccuPAR (METER Group, Inc., Pullman, USA) during the 2012 growing season [43], and then compared the observations with the MOD15A2H FPAR data at the corresponding sites. In addition, we used the MODIS surface reflectance data to derive vegetation indices, such as the normalized difference vegetation index (NDVI). For the NTZ site, because the growing season of the crop (i.e., cantaloupe) is short, the MOD15A2H product identified the land cover as desert or low vegetation; we therefore calculated FPAR from NDVI using the empirical formula FPAR = 1.24 × NDVI − 0.168 [48].
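As a hedged illustration of the NDVI-based FPAR substitution described above, the sketch below computes NDVI from red and near-infrared surface reflectance and applies the empirical formula from the text. The function names are hypothetical, and clipping FPAR to the physical range [0, 1] is an assumption that the text does not state.

```python
def ndvi(red, nir):
    """NDVI from red and near-infrared surface reflectance (unitless, 0-1)."""
    if red + nir == 0.0:
        raise ValueError("red + nir reflectance must be non-zero")
    return (nir - red) / (nir + red)


def fpar_from_ndvi(ndvi_value):
    """Empirical FPAR = 1.24 * NDVI - 0.168, as given in the text.

    Clipping to [0, 1] is an assumption added here so that sparse-vegetation
    pixels with low NDVI do not yield negative FPAR.
    """
    fpar = 1.24 * ndvi_value - 0.168
    return min(1.0, max(0.0, fpar))
```

For example, a pixel with NDVI = 0.5 gives FPAR = 0.452, while NDVI values below about 0.135 map to zero under the clipping assumption.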

Description of MOD17A2H Algorithm
The MOD17A2H algorithm is based on the light-use efficiency (LUE) approach [49,50] and provides global GPP estimates at 8-day temporal and 500 m spatial resolution [15]. The MODIS GPP product is calculated from the following equation:
$$\mathrm{GPP} = \varepsilon_{max} \times 0.45 \times SW_{rad} \times \mathrm{FPAR} \times f(VPD) \times f(T_{min})$$
where εmax is the maximum LUE obtained from the Biome-specified Parameters Look Up Table (BPLUT) on the basis of vegetation type. The BPLUT contains values specifying minimum temperature and VPD limits, specific leaf area, and respiration coefficients for the standard land cover classes [48]. SWrad is the shortwave solar radiation, of which 45% is photosynthetically active radiation (PAR); FPAR is the fraction of PAR absorbed by vegetation; and the scale factors f(Tmin) and f(VPD) reduce εmax under unfavorable conditions of low temperature and high VPD. The forcing data, such as SWrad, Tmin, and VPD, in the MOD17A2H GPP product were taken from the Global Modeling and Assimilation Office (GMAO) Reanalysis data. The MODIS GPP algorithm is described in detail in previous literature [12,15,18].
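The GPP calculation described above can be sketched as follows. The sketch assumes that f(Tmin) and f(VPD) are linear ramps between lower and upper limits, which is the usual MOD17 formulation but is not spelled out in the text. The parameter values are illustrative placeholders rather than BPLUT entries or the values calibrated in this study, and all function and key names are hypothetical.

```python
def linear_ramp(x, x_low, x_high):
    """Return 0 at or below x_low, 1 at or above x_high, linear in between."""
    if x <= x_low:
        return 0.0
    if x >= x_high:
        return 1.0
    return (x - x_low) / (x_high - x_low)


# Illustrative values only (assumption): eps_max in gC/MJ, temperatures in
# degrees C, VPD in Pa. These are NOT the BPLUT or calibrated values.
ILLUSTRATIVE_PARAMS = {
    "eps_max": 1.0,
    "tmin_min": -8.0,
    "tmin_max": 12.0,
    "vpd_min": 650.0,
    "vpd_max": 5300.0,
}


def mod17_gpp(sw_rad, fpar, tmin, vpd, params):
    """Daily GPP (gC/m2/day) from the LUE equation.

    sw_rad: incoming shortwave radiation (MJ/m2/day)
    fpar:   fraction of absorbed PAR (0-1)
    tmin:   daily minimum temperature (degrees C)
    vpd:    daytime vapor pressure deficit (Pa)
    """
    f_tmin = linear_ramp(tmin, params["tmin_min"], params["tmin_max"])
    # High VPD is unfavorable, so the VPD scalar decreases from 1 to 0.
    f_vpd = 1.0 - linear_ramp(vpd, params["vpd_min"], params["vpd_max"])
    par = 0.45 * sw_rad  # PAR taken as 45% of shortwave radiation
    return params["eps_max"] * par * fpar * f_tmin * f_vpd
```

With these placeholder values, `mod17_gpp(20.0, 0.5, 12.0, 650.0, ILLUSTRATIVE_PARAMS)` evaluates to about 4.5 gC/m2/day, and the result falls to zero once Tmin drops to the lower temperature limit. The five dictionary entries correspond to the five parameters that a calibration of this model would adjust.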

Parameter Optimization and Uncertainty Analysis
The current MOD17 BPLUT is too general for local and regional applications [20]. The same set of parameters is applied indiscriminately to diverse types of the same ecosystem, introducing large uncertainties into the simulation of GPP in the arid region. To improve the accuracy of GPP estimation in the desert-oasis-alpine ecosystems of the arid region, we calibrated the parameters of the MOD17 model against GPP time series from the flux tower measurement network using a Bayesian model-data fusion approach [33,38]. According to Bayesian theory, the posterior probability density functions (PDFs) of model parameters (θ) given the existing data (D), denoted P(θ|D), can be obtained from prior knowledge of the parameters and information generated by comparison of simulated and observed variables, and can be described as:
$$P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}$$
where P(θ) is the prior PDF of the parameters, P(D) is the probability of the observed GPP, and P(D|θ) is the conditional probability density of the observed GPP given the parameters, also called the likelihood function for parameter θ.
Given a collection of N measurements, the likelihood function (L) can be expressed as:
$$L = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(X_i - \mu_i)^2}{2\sigma^2}\right)$$
where σ represents the standard deviation of the data-model error, X_i represents the ith of N measurements, and μ_i is the model-derived estimate of that measurement.
In our study, we assumed that the parameter priors are uniform, and the posterior PDFs of the model parameters were generated from the prior PDFs P(θ) and the observation data by a Markov chain Monte Carlo (MCMC) sampling technique [33]. Herein, the Metropolis-Hastings algorithm [51,52] was adopted to generate a representative sample of parameter vectors from the posterior distribution. We ran the MCMC chains with 50,000 iterations each and regarded the first 15,000 iterations of each run as the burn-in period. All accepted samples from the runs after the burn-in periods were used to compute the posterior parameter statistics of the models.
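The sampling procedure above can be sketched for a single parameter. This is a minimal illustration rather than the study's implementation: it calibrates only εmax in a toy model GPP = εmax × APAR on synthetic data, treats σ as known, uses a symmetric Gaussian random-walk proposal, and keeps the chain state at every post-burn-in iteration, as is standard for Metropolis sampling. All names, prior bounds, and numeric settings other than the iteration and burn-in counts are assumptions.

```python
import math
import random


def log_likelihood(theta, apar, gpp_obs, sigma):
    """Gaussian log-likelihood of observed GPP for the toy model mu_i = theta * APAR_i."""
    n = len(gpp_obs)
    sse = sum((x - theta * a) ** 2 for x, a in zip(gpp_obs, apar))
    return -n * math.log(math.sqrt(2.0 * math.pi) * sigma) - sse / (2.0 * sigma ** 2)


def metropolis_hastings(log_post, theta0, step, n_iter, burn_in, rng):
    """Random-walk Metropolis sampler; returns the chain after burn-in."""
    theta, lp = theta0, log_post(theta0)
    samples = []
    for i in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            theta, lp = proposal, lp_new
        if i >= burn_in:
            samples.append(theta)
    return samples


def demo(seed=0, n_iter=50_000, burn_in=15_000):
    """Recover a synthetic eps_max = 1.2 under a uniform prior on [0, 3]."""
    rng = random.Random(seed)
    true_eps, sigma = 1.2, 0.3
    apar = [rng.uniform(1.0, 8.0) for _ in range(100)]
    gpp_obs = [true_eps * a + rng.gauss(0.0, sigma) for a in apar]

    def log_post(theta):
        if not 0.0 <= theta <= 3.0:  # uniform prior: zero density outside bounds
            return float("-inf")
        return log_likelihood(theta, apar, gpp_obs, sigma)

    samples = metropolis_hastings(log_post, 1.0, 0.02, n_iter, burn_in, rng)
    return sum(samples) / len(samples)  # posterior mean of eps_max
```

Calling `demo()` returns a posterior mean close to the synthetic true value. Extending the sketch to five parameters would mean proposing a parameter vector and replacing the toy model with the full LUE equation.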

Experiment Configuration and Validation
The original MOD17A2H GPP products used the GMAO Reanalysis data as the driving meteorological dataset and calculated GPP with the biome-based parameter look-up table on a global scale. To validate and improve the performance of the MODIS GPP estimates and quantify the uncertainty of the MODIS GPP simulation algorithm (MOD17 model), we replaced the satellite-derived and meteorological inputs in the MOD17 model and compared the modeled GPP estimates with flux tower observations under the following experiment configurations: (1) We first assessed the original MOD17A2H GPP product at a spatial resolution of 500 m against the tower-based GPP; this result is called GPP_MODIS. (2) We ran the MOD17 algorithm with in situ meteorological data rather than the GMAO Reanalysis dataset to understand the influence of meteorological inputs (i.e., incoming solar radiation, minimum temperature, and vapor pressure deficit) on GPP modelling; this result is called GPP_Insitu. (3) We compared calibration of one parameter only (εmax) with calibration of all parameters of the MOD17 model to examine the sensitivity of GPP estimation to the water- and temperature-limited parameters; these results are called GPP_LUEopt and GPP_Fiveopt, respectively. To understand the effects of parameter uncertainty on GPP simulation, we ran the calibrated MOD17 algorithm with in situ meteorological inputs from the flux tower network: like GPP_Insitu, both GPP_LUEopt and GPP_Fiveopt were calculated using the in situ meteorological data. For GPP_LUEopt, we optimized only εmax with the Bayesian approach and kept the default BPLUT values for the other parameters, whereas for GPP_Fiveopt we optimized all five parameters with the Bayesian approach.

Statistical Analyses and Model Evaluation
We first calculated the daily values of EC-based GPP and then aggregated them to eight-day and annual values for seasonal and yearly GPP validation. To evaluate the performance of the MOD17A2H GPP model, we compared the modeled GPP with the flux tower-estimated GPP at both eight-day and annual time steps. We extracted the eight-day composite MODIS C6 GPP product (MOD17A2H) and the other MODIS products (e.g., MOD15A2H and MOD09A1) from the pixels centered on the flux towers, and compared the MODIS GPP product with the EC-based GPP observations. Model performance (i.e., the difference between simulated and tower-based GPP) was quantified using the coefficient of determination (R²), the root mean squared error (RMSE), and the relative RMSE (rRMSE), with RMSE defined as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{sim,i} - Y_{obs,i}\right)^2}$$
where Ysim and Yobs represent the simulated and observed GPP data, respectively, and n is the total number of samples. All statistical analyses and the presentation of results were performed in MATLAB R2016b (MathWorks, Natick, MA, USA).
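A minimal sketch of the evaluation statistics is given below. It assumes that R² is the squared Pearson correlation of a linear regression of simulated on observed GPP, which is consistent with the regression slopes reported alongside it but is not stated explicitly. The rRMSE is omitted because its normalization cannot be recovered from the text. Function names are hypothetical.

```python
import math


def rmse(sim, obs):
    """Root mean squared error between simulated and observed values."""
    n = len(obs)
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / n)


def linear_fit(sim, obs):
    """Least-squares fit sim = slope * obs + intercept, plus squared Pearson r."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    sxx = sum((o - mean_obs) ** 2 for o in obs)
    syy = sum((s - mean_sim) ** 2 for s in sim)
    sxy = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    slope = sxy / sxx
    intercept = mean_sim - slope * mean_obs
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2
```

A slope well below 1 combined with a high R² indicates a model that tracks the seasonal pattern but systematically underestimates the magnitude.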

Site-Specific Evaluation of MODIS GPP Products and MOD17 Algorithm
The eight-day EC flux tower GPP (GPP_obs) was compared with the MOD17A2H GPP (GPP_MODIS), the GPP simulated with in situ meteorological forcing data (GPP_Insitu), the GPP simulated with an optimized maximum LUE parameter (GPP_LUEopt), and the GPP simulated with all five parameters optimized (GPP_Fiveopt). As illustrated in Figure 3a, the eight-day MOD17A2H GPP product significantly underestimated GPP overall when compared with the EC-observed GPP. The RMSE between the MOD17 product and in situ flux observations across all sites was 1.80 gC/m²/day, R² was 0.71, and the regression slope was 0.49, which means the product explained only 71% of the variance in tower-observed GPP. As shown in Figure 3b, when we used the in situ meteorological data to drive the MOD17 model, the correlation between simulation and observation improved. The model explained 79% of the variance in the observations (slope = 0.43, R² = 0.79), with a large bias close to that of the MOD17A2H product. The modeled GPP was still underestimated relative to the observed GPP, which means that the meteorological forcing data were not the main reason for the underestimation of GPP. By contrast, when we optimized the maximum LUE parameter (Figure 3c), model performance improved significantly for all sites, with R² = 0.86, RMSE = 1.01 gC/m²/day, and rRMSE = 6.99%, and the slope of the regression line was closer to the 1:1 line, which signifies the importance of the LUE parameter in GPP modelling. Furthermore, when we optimized all parameters, the model performed better still. Almost all the points were close to the 1:1 line (Figure 3d), with R² = 0.91, RMSE = 0.81 gC/m²/day, and rRMSE = 5.59%, the best performance among these simulations.
When we accumulated the observed and simulated GPP to a yearly time scale for every site, a significant underestimation by the MOD17A2H GPP product was also evident (Figure 4), similar to the results at the eight-day time scale. On an annual time scale, the simulated GPP showed a generally good agreement with the tower-observed GPP across all sites, with R² = 0.69, RMSE = 347.31 gC/m²/y, and rRMSE = 60.48% (Figure 4a). Using in situ climate forcing data improved the relationship between the modeled GPP and tower-observed GPP (R² = 0.73). However, the modeled GPP was still underestimated compared with the observations. The modeled GPP improved significantly when we optimized the model parameters (Figure 4c,d). The modeled GPP was closer to the observed GPP (almost all points were close to the 1:1 line), and optimizing all five parameters gave better results than optimizing LUEmax only (R² of 0.92 and 0.87, respectively). The corresponding rRMSE values were 19.55% and 23.93%, which signifies the importance of optimizing the temperature- and water-constraint factors in arid regions.

Biome-Specific Evaluation of MODIS GPP Product and MOD17 Algorithm
The MOD17A2H GPP products and the three other GPP estimates based on the MOD17 algorithm were compared with the flux-derived eight-day GPP values for the various biome types (Figure 5). We divided the original grassland class into two types, grasslands and desert grasslands, because of the large differences in species and climate conditions among these sites. As shown in Figure 5, the original MOD17A2H GPP products significantly underestimated GPP in grassland, cropland, and forest ecosystems, but not in the desert ecosystems. The correlation between MOD17A2H GPP products and EC-observed GPP was strongest in grassland ecosystems (R² = 0.82), followed by cropland ecosystems (R² = 0.80) and forest ecosystems (R² = 0.53), and weakest in desert ecosystems (R² = 0.42). In addition, the slope of the linear regression in each scatter plot also reveals the bias between MOD17A2H GPP and tower-observed GPP. The regression slope for the forest ecosystems is far from the 1:1 line, which demonstrates the largest bias between MOD17A2H GPP and the tower-based GPP, followed by the cropland ecosystems, and then the grassland and desert ecosystems. This indicates that large biases existed in most ecosystems in the arid regions of China, especially in the forest and cropland ecosystems. When we used the in situ forcing data, we did not find significant improvement for all biome types, and the simulations of GPP forced with in situ data were still underestimated in most ecosystems. However, when we optimized the parameters of the MOD17 model, the GPP simulation results improved significantly in most ecosystems. The scatter points of modeled GPP and EC-measured GPP were distributed closely around the 1:1 line, indicating that the GPP simulation can be improved by optimizing LUEmax and the other parameters in most ecosystems in the arid region. However, a large bias still existed even after parameter optimization. Parameter optimization had less impact on the GPP simulation of desert ecosystems, indicating that it is difficult to simulate GPP effectively in desert ecosystems with the current MOD17 model.

Site-Specific Evaluation of MODIS GPP Products and MOD17 Algorithms
The flux tower-observed GPP was compared with the original MOD17A2H GPP and with GPP estimated from the MOD17 model using in situ meteorological forcing data (GPP_Insitu), optimized LUE (GPP_LUEopt), and five optimized parameters (GPP_Fiveopt) (Figure 7 and Table 2). Figure 6 illustrates the scatter plots between EC GPP and simulated GPP at the eight-day time scale at all sites. Most of the linear regression slopes in Figure 6 were less than 1.0, which shows that MOD17A2H GPP was clearly underestimated at most sites compared with the flux tower-observed GPP (Figure 6), except at the three desert grassland sites, where MODIS GPP was close to the observed GPP in most cases. However, relatively large biases existed at the desert sites. While MODIS GPP was underestimated at all sites except the desert sites, a good correlation between MOD17A2H GPP and tower-observed GPP was found at grassland and cropland sites (coefficients of determination greater than 0.7), followed by forest and desert ecosystems. After modelling GPP with in situ climate data, the correlation between modeled and observed GPP improved at most sites. However, GPP was still clearly underestimated at most sites, which means the forcing data were not the main reason for the underestimation of GPP. Rather than the forcing data, inappropriate BPLUT parameters were the main source of uncertainty in GPP simulation. After optimization of LUE and the other parameters in the MOD17 model, GPP at most sites improved significantly. Optimizing all five parameters performed better than optimizing the LUE parameter alone. As shown in Table 2 and Figure 6, good GPP simulation performance was observed at DMZ, ARZ, and SDZ (R2 greater than 0.9). However, the MODIS GPP model showed only moderate performance in simulating the GPP of desert ecosystems. Overall, the current MODIS GPP model correctly simulated the dynamics of GPP at most sites in the arid region. After parameter optimization, the coefficients of determination improved clearly, and the RMSE at most sites was less than 1 gC/m2/day.

Table 2. A summary of the performances of the MOD17 algorithm (GPP_MODIS), the in situ meteorological data-forced GPP (GPP_Insitu), the LUEmax parameter-optimized GPP (GPP_LUE), and the five-parameter-optimized GPP (GPP_Fiveopt). GPP_LUE and GPP_Fiveopt were estimated from the in situ climate data. In the GPP_Insitu and GPP_LUE algorithms, the default MOD17 values of the model parameters were used for the original land cover types, and optimal parameter values were used for the optimization approach.

Table 2 columns: GPP_MODIS, GPP_Insitu, GPP_LUE, GPP_Fiveopt.

Uncertainty of Satellite Data in MODIS GPP Simulation over Ecosystems in the Arid Region

Impacts of the Accuracy of the Land Cover Classification on MODIS GPP Simulation
One of the first MODIS products used in the MOD17 algorithm is the Land Cover Product, MOD12Q1. The importance of this product cannot be overstated, as the MOD17 algorithm relies heavily on land cover type through use of the BPLUT [15]. Based on the locations of the flux tower sites, we obtained the land cover type of each site from the MCD12Q1 results and compared them with the actual land cover types. We found that MCD12Q1 misclassified most of the sites downstream of the HRB (Figure 7), which is an artificial oasis ecotone with sparse vegetation in the extremely arid region of China. For example, the NTZ site is a cropland ecosystem, but it was classified as grassland in the MCD12Q1 land cover data (Figure 7). In addition, MCD12Q1 misclassified the forest sites HHL, SDQ, and HYL as grassland.

Impacts of Uncertainty of FPAR Data on MODIS GPP Simulation
Figure 8 compares the FPAR observed with AccuPAR against the MOD15A2H FPAR data at the corresponding sites in the HRB. MODIS FPAR overestimated the observations during the growing season at the desert grassland sites and at low FPAR values at the cropland sites, and underestimated them during some stages with high FPAR values at the cropland sites. Overestimated FPAR inflates APAR and thus leads to overestimated GPP; conversely, underestimated FPAR leads to underestimated GPP. A good correlation between MODIS FPAR and observed FPAR occurred at the cropland sites, while no significant relationship was found at the desert grassland sites. This shows that the accuracy of the current MOD15A2H FPAR data in the arid region needs to be further improved, and that FPAR could be an important source of uncertainty in the estimation of GPP in desert ecosystems.
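The way an FPAR error propagates to GPP can be seen from the light-use-efficiency form of the MOD17 algorithm, in which GPP is the product of εmax, a Tmin scalar, a VPD scalar, and APAR. The sketch below assumes the standard MOD17 formulation (PAR taken as 0.45 of incident shortwave radiation, linear ramp scalars); the function names and all numeric inputs are hypothetical illustration values, not BPLUT entries or site data.

```python
import numpy as np

def ramp(x, x_low, x_high):
    """Linear ramp: 0 at or below x_low, 1 at or above x_high."""
    x = np.asarray(x, dtype=float)
    return np.clip((x - x_low) / (x_high - x_low), 0.0, 1.0)

def mod17_gpp(sw_rad, fpar, tmin, vpd,
              eps_max, tmin_min, tmin_max, vpd_min, vpd_max):
    """GPP (gC/m2/day) from the MOD17-style light-use-efficiency model.

    sw_rad: incident shortwave radiation (MJ/m2/day)
    fpar:   fraction of absorbed PAR (0-1)
    tmin:   daily minimum air temperature (deg C)
    vpd:    daytime vapor pressure deficit (Pa)
    eps_max is in gC/MJ APAR, as in Table 3.
    """
    par = 0.45 * sw_rad                              # PAR share of shortwave
    apar = fpar * par                                # absorbed PAR
    t_scalar = ramp(tmin, tmin_min, tmin_max)        # cold-temperature limit
    vpd_scalar = 1.0 - ramp(vpd, vpd_min, vpd_max)   # water-stress limit
    return eps_max * t_scalar * vpd_scalar * apar

# Hypothetical parameter set, for illustration only.
params = dict(eps_max=1.0, tmin_min=-8.0, tmin_max=12.0,
              vpd_min=650.0, vpd_max=5000.0)

gpp_obs_fpar = mod17_gpp(sw_rad=20.0, fpar=0.5, tmin=15.0, vpd=500.0, **params)
gpp_modis_fpar = mod17_gpp(sw_rad=20.0, fpar=0.6, tmin=15.0, vpd=500.0, **params)
fpar_error_ratio = gpp_modis_fpar / gpp_obs_fpar
```

Because GPP is linear in FPAR, a 20% FPAR overestimate in this toy case produces a 20% GPP overestimate when all other inputs are held fixed.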

Uncertainty and Variability of Biophysical Parameters for Diverse Ecosystems in Arid Regions
The performance after calibrating all five parameters of the MOD17 model was better than after calibrating only the parameter εmax, which indicates the important role of temperature and water-constrained factors in the estimation of GPP in the arid region. We thus calibrated all the parameters of the MOD17 algorithm (Table 3). Our study illustrated that biophysical parameters vary not only across different ecosystems but also within the same ecosystem type, as shown by the different biophysical parameters of the grassland and desert grassland ecosystems. The current version of the MOD17 BPLUT does not distinguish between these two types; they share the same grassland BPLUT parameters. However, the climate conditions and species of these two ecosystems differ in the study region. Likewise, C3 and C4 croplands follow different photosynthetic pathways and differ considerably in their biophysical properties, yet they also share the same parameters in the current version of the MOD17 BPLUT.
Table 3. Prior distribution (initial value and range) and posterior distribution (mean value and 95% confidence interval) of the parameters of the MOD17 model for all sites. For the parameters εmax (gC/MJ APAR), Tmin_min (°C), Tmin_max (°C), VPDmin (Pa), and VPDmax (Pa), we set the original values of the MOD17 BPLUT as the initial values (in bold font).

The value of εmax is biome specific and represents the maximum LUE of the corresponding vegetation in the process of photosynthesis. For a given biome type, the value of εmax is constant and assigned by the MOD17 BPLUT. Although the newly released version of the BPLUT has corrected and updated the εmax values, they were still significantly underestimated for the main ecosystems in arid regions (Table 3). This mis-estimation further reduced the accuracy of the GPP estimates.
Large variations in the temperature and water-constrained stress factors also existed because of the diversity of climate conditions in different parts of the HRB (Table 3). For example, the climate is cold and humid in the upstream HRB, so the temperature stress factor has a great impact on GPP estimation in the grassland ecosystems there. However, as Table 3 reveals, the original MOD17 BPLUT overestimated the parameters of the minimum temperature stress factors and the VPDmax values. In comparison, the climate is extremely arid in the downstream HRB, where the original MOD17 BPLUT underestimated the parameters of the maximum temperature stress factors and the VPDmax values.

In addition, the Bayesian approach estimates the posterior distribution of the model parameters, which makes it a useful tool for reducing model uncertainty. Using the Bayesian approach, the uncertainty of the model parameters was reduced significantly for some sites (e.g., the NTZ site) located in the extremely arid region (Figure 9).
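The idea of narrowing a prior range into a posterior 95% interval can be illustrated with a small random-walk Metropolis sampler for a single parameter. This is only a sketch under stated assumptions: the sampler, the uniform prior bounds, the Gaussian error model, the synthetic data, and the definition of relative uncertainty reduction (one minus the ratio of posterior to prior 95% interval widths) are all hypothetical choices made for illustration, and the sampler actually used in the study is not described in this section.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: GPP = eps_true * x + noise, where x stands for the product
# APAR * Tmin scalar * VPD scalar (hypothetical values, MJ/m2/day).
x = rng.uniform(1.0, 8.0, size=200)
eps_true, sigma = 1.5, 0.5
gpp_obs = eps_true * x + rng.normal(0.0, sigma, size=x.size)

# Assumed uniform prior range for eps_max (gC/MJ APAR).
prior_low, prior_high = 0.1, 3.0

def log_posterior(eps):
    """Uniform prior plus Gaussian likelihood, up to an additive constant."""
    if not (prior_low <= eps <= prior_high):
        return -np.inf
    resid = gpp_obs - eps * x
    return -0.5 * np.sum((resid / sigma) ** 2)

def metropolis(n_steps=5000, start=0.6, step=0.02):
    """Random-walk Metropolis sampler for a single parameter."""
    chain = np.empty(n_steps)
    current, current_lp = start, log_posterior(start)
    for i in range(n_steps):
        proposal = current + rng.normal(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain

chain = metropolis()[1000:]          # discard burn-in samples
post_mean = chain.mean()
ci_low, ci_high = np.percentile(chain, [2.5, 97.5])

# Central 95% interval of a uniform prior spans 95% of its range.
prior_ci_width = 0.95 * (prior_high - prior_low)
reduction = 1.0 - (ci_high - ci_low) / prior_ci_width
```

With informative data the posterior interval is much narrower than the prior range, so `reduction` approaches 1; a parameter that the data barely constrain would keep a value near 0.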

Evaluations of the MOD17A2H Products over Diverse Ecosystems in the Arid Region
The MODIS Collection 6 GPP products improved the spatial resolution of GPP estimation, which makes the estimated GPP more comparable with the flux footprint in the heterogeneous landscapes of the desert-oasis-alpine ecosystems in the arid region. The MOD17A2H products also updated the meteorological forcing data, FPAR data, and land cover data, and offer the finer spatial resolution of 500 m. However, compared with the flux tower-based GPP data, the MOD17A2H GPP products still show limitations in simulating the magnitude and spatio-temporal variation of GPP in the desert-oasis-alpine ecosystems in the arid region. The slope of the linear regression in Figures 3a and 4a was only 0.49, which reveals significant underestimation of GPP in the study region. At the site level (Figure 6), the slopes at most sites in the study region were also less than 1.0, except at some desert ecosystem sites. This shows that the MOD17 product underestimated GPP at most high-productivity sites of cropland, grassland, and forest ecosystems in the arid region, but overestimated GPP at some low-productivity sites of desert ecosystems compared with tower-based GPP, consistent with the results of Reference [12] and Reference [27].

Uncertainty of Input Data in MODIS GPP Estimation in Diverse Ecosystems in the Arid Region
The accuracy of GPP estimation depends strongly on the precision of all input data of the MOD17A2H GPP algorithm. Uncertainties in the GPP products therefore arise mainly from the climate drivers, parameter variability, and land cover classification [20]. The MOD17A2H GPP algorithm involves three meteorological variables (PAR, Tmin, and VPD) as well as FPAR and land cover classification data, all of which could be main sources of error in the GPP estimates. The MOD17A2H products use GMAO Reanalysis data for direct meteorological inputs, an hourly time-step data set with about a half-degree spatial resolution (0.5 degree latitude by 0.67 degree longitude) generated by the Goddard Earth Observing System Model, Version 5 (GEOS-5) data assimilation system [15]. In this study, we replaced the GMAO dataset with in situ meteorological data and reran the MOD17 algorithm with default parameters for comparison (GPP_Insitu). Using the in situ forcing data improved the relationship between modeled GPP and tower-observed GPP compared with the original MOD17A2H products at both eight-day and annual timescales (Figures 3 and 4); the coefficients of determination (R2) were slightly higher than those of the original MOD17 products (R2 rising from 0.71 to 0.79 at the eight-day step and from 0.69 to 0.73 at the annual step). However, large biases still exist between GPP_Insitu and GPP_tower. Using in situ meteorological data did not clearly improve the GPP estimation performance; on the contrary, some sites were even less accurate than when calculated with the GMAO datasets (ARZ, DSL, DMZ, and SDZ in Table 2), which is similar to other results from validation of the MOD17 GPP products [21,27,32]. This is because missing values in the original MOD17A2H GPP products shortened the evaluation period, thus reducing the model errors of GPP_MODIS. The other implication of the results is that an improvement in meteorological data did not have a significant effect on the MODIS GPP estimation, which means the meteorological data are not the main source of uncertainty in GPP simulation in the arid region.
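When tower measurements replace the reanalysis forcing, VPD usually has to be derived from measured air temperature and relative humidity. The sketch below assumes a Tetens-type saturation vapor pressure formula and hypothetical function names; the study does not state in this section how its in situ VPD was obtained, so this shows one common route rather than the authors' procedure.

```python
import numpy as np

def saturation_vapor_pressure(temp_c):
    """Saturation vapor pressure over water (Pa), Tetens-type formula."""
    t = np.asarray(temp_c, dtype=float)
    return 610.8 * np.exp(17.27 * t / (t + 237.3))

def vapor_pressure_deficit(temp_c, rh_percent):
    """VPD (Pa) from air temperature (deg C) and relative humidity (%)."""
    es = saturation_vapor_pressure(temp_c)
    rh = np.asarray(rh_percent, dtype=float)
    return es * (1.0 - rh / 100.0)

# Hypothetical daytime tower readings over one eight-day period.
air_temp = np.array([25.0, 27.0, 30.0, 22.0, 24.0, 28.0, 26.0, 23.0])
rel_hum = np.array([40.0, 35.0, 20.0, 55.0, 45.0, 30.0, 38.0, 50.0])
vpd_daily = vapor_pressure_deficit(air_temp, rel_hum)
vpd_eight_day = vpd_daily.mean()   # eight-day mean, in Pa as in Table 3
```

The resulting VPD is in Pa, the same unit as the VPDmin and VPDmax parameters, so it can be passed directly to a VPD stress scalar.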
An accurate land cover classification map is vital to MOD17 GPP simulation [18]. The land cover class directly determines the value of the maximum light use efficiency and the other MOD17 BPLUT parameters, so misclassification propagates into inaccuracies in the GPP calculation [20]. We validated the MOD12Q1 vegetation maps against our site observations and found that the MODIS data misclassified almost all forest and cropland sites in the downstream HRB (Figure 7). A previous study suggested that the accuracy of the MOD12Q1 vegetation maps is within 65-80%, and that most inaccuracies occur between similar classes [55]. Since large desert-natural oasis ecosystems are distributed in the downstream HRB and most of the vegetation cover is less than 30%, the 500 m unit of the MODIS land cover classification is coarse enough to pose a risk of misclassification. Mixed pixels, composed of varied ecosystem types, may occur in the sparsely vegetated region, making it difficult to describe the biophysical parameters properly. Such incorrect classification of land cover types will therefore lead to inaccurate GPP calculation.
In addition, FPAR is an important physiological input variable in the MOD17 model, which directly modulates the essential energy input to photosynthetic processes [8,9]. In our study, we compared the MOD15A2H FPAR products with observations in the study area and found that the MODIS FPAR products in the HRB significantly overestimated FPAR in ecosystems with low productivity (such as the desert ecosystems) and underestimated it in ecosystems with high productivity (such as the crop ecosystems) (Figure 8). This greatly affects the energy redistribution in photosynthetic systems, and thus the GPP estimates in arid regions. Previous research revealed that FPAR often produces misleading signals in GPP estimation due to contamination by atmospheric characteristics [19]. The overestimation of FPAR is caused by sparse vegetation cover and by the large desert background, which affects the vegetation signal detected in arid regions. To improve FPAR estimation in the arid region, improved FPAR retrieval products that use multi-angle vegetation index information could be adopted in the future [56].

Uncertainty and Variability of Biophysical Parameters in Modelling GPP over Diverse Ecosystems in Arid Regions
To estimate GPP across varied ecosystems worldwide, the MOD17 algorithm represents biophysical variability with parameters drawn from look-up tables, which include five biome-specific physiological parameters: εmax, Tmin_min, Tmin_max, VPDmin, and VPDmax [56]. In this study, we optimized the maximum LUE, one of the most important parameters for GPP estimation, against the tower-based GPP. A clear improvement in GPP simulation can be seen in Figures 3c and 4c, which show good agreement between the observed GPP and the GPP simulated with the optimized parameter (GPP_LUE). At the eight-day timescale, R2 increased from 0.71 to 0.86 and rRMSE decreased from 12.45% to 6.99%. Similar improvements were seen at the annual timescale, with R2 increasing from 0.69 to 0.87 and rRMSE decreasing from 60.48% to 23.97%. Table 2 shows that the agreement between the GPP_LUE model and the flux GPP observations was better than that obtained by using only the ground forcing data; with the optimized εmax parameter, the coefficients of determination increased significantly and the RMSE decreased, which reveals the greater improvement achieved by LUE optimization.
The finding that calibrating the maximum LUE parameter improves the performance of MODIS GPP estimation is in accordance with other studies [24,27,32]. In addition, we investigated the potential impacts of uncertainty in the other model parameters (e.g., the VPD-limited factors) on GPP monitoring, which has been overlooked by other researchers. Water stress is one of the most important limiting factors controlling terrestrial primary production, especially in arid regions. Previous studies showed that the MOD17 products underestimate water stress, and thus overestimate GPP, in some extremely arid, water-limited regions [57]. In our study, we found that optimizing the VPD-limited factors can further improve the performance of the GPP estimates. At the eight-day time step, the overall R2 increased from 0.86 when only the LUE parameter was optimized to 0.91 when all parameters were optimized, and rRMSE decreased from 6.99% to 5.59% (Figure 3). These improvements were concentrated at some sites in the extremely arid region (e.g., SDQ, HHL, and HYL), which reveals the important role of the water-constrained parameters in GPP simulation in arid regions.
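Because GPP in the MOD17 formulation is linear in εmax, the effect of calibrating εmax alone can be illustrated with a closed-form least-squares fit through the origin. This is a simplified stand-in for the Bayesian optimization used in the study, not a reproduction of it; the function name `fit_eps_max` and all input values are hypothetical.

```python
import numpy as np

def fit_eps_max(gpp_tower, apar, t_scalar, vpd_scalar):
    """Least-squares eps_max for GPP = eps_max * APAR * f(Tmin) * f(VPD)."""
    # Lump everything that multiplies eps_max into a single predictor.
    x = (np.asarray(apar, dtype=float)
         * np.asarray(t_scalar, dtype=float)
         * np.asarray(vpd_scalar, dtype=float))
    g = np.asarray(gpp_tower, dtype=float)
    ok = ~(np.isnan(x) | np.isnan(g))       # skip periods with missing data
    # Regression through the origin: eps = sum(g * x) / sum(x ** 2).
    return np.sum(g[ok] * x[ok]) / np.sum(x[ok] ** 2)

# Hypothetical eight-day inputs; tower GPP is generated with eps_max = 1.2.
apar = np.array([2.0, 4.0, 6.0, 8.0])           # MJ/m2/day
t_scalar = np.array([0.5, 1.0, 1.0, 0.8])       # Tmin stress scalar (0-1)
vpd_scalar = np.array([1.0, 0.9, 0.7, 0.6])     # VPD stress scalar (0-1)
gpp_tower = 1.2 * apar * t_scalar * vpd_scalar  # gC/m2/day

eps_hat = fit_eps_max(gpp_tower, apar, t_scalar, vpd_scalar)
```

The fit depends on the stress scalars, so if Tmin_min, Tmin_max, VPDmin, or VPDmax are inappropriate for a site, an εmax-only calibration absorbs their error; this is one way to read the further gain reported above when all five parameters are optimized.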

Uncertainty of GPP Modelling of the Desert Ecosystems and Its Implications for GPP Simulation in Arid Regions
The current MOD17 model can effectively simulate the GPP of the main ecosystems in the arid region; however, it remains difficult to simulate GPP accurately in the desert ecosystems.
Model analyses indicate the importance of arid regions in the global carbon cycle, but the models suffer from a lack of data in water-limited regions [3,4]. The large errors of GPP simulation in desert ecosystems are caused by the uncertainty of remote sensing vegetation products in regions with highly heterogeneous landscapes and low vegetation cover. Moreover, the uncertainty of flux tower observations in desert ecosystems makes it difficult to estimate a relatively 'true' value of GPP [58]. Several directions can be explored to improve GPP estimation in arid regions. For example, improving the MODIS FPAR and land cover classification products in arid regions using data-driven approaches [59] and improving model structures [60] could be good choices for improving GPP simulation in arid regions.
Meanwhile, the biome-specific look-up table (BPLUT) values are constant for a given biome at any time. Since the current BPLUT of MOD17 cannot accurately define the parameters for all ecosystems [19,61], especially the diverse and complex ecosystems in arid regions, further research is needed to update the BPLUT of the model. In addition to updating εmax, the water- and temperature-limited parameters are also of great importance in GPP estimation, especially for ecosystems in arid regions. With the development of the eddy covariance technique, there are currently more than 900 EC flux sites in the world [62]. The availability of this large number of flux datasets provides an opportunity to retrieve the biome-specific parameters of each vegetation type more reasonably, which may improve the accuracy of current GPP simulation in the arid region.

Conclusions
This study validated and optimized the performance of MODIS-derived GPP against EC-observed GPP at seasonal and annual time scales for the main arid ecosystems, relying on flux networks constructed in arid and semi-arid ecosystems in China. Our study revealed that the current MODIS GPP products significantly underestimated GPP compared with the tower-observed GPP for most types of ecosystems in the arid region of China, especially the irrigated cropland and forest ecosystems, due to uncertainty in the meteorological data and model parameters. Using ground-based meteorological data and updated land use data can improve GPP estimation. In addition to the light-use efficiency parameters, the temperature-limited stress factors and the VPD-limited factors also need to be recalibrated for ecosystems in arid regions. With proper model parameters obtained through a Bayesian approach, the GPP model can be greatly improved. However, it is difficult to estimate GPP accurately in desert ecosystems because of the uncertainty of remote sensing vegetation products in arid regions. Hence, improvements in modelling GPP in desert ecosystems are needed in future studies. Moreover, this study implies that the current MODIS-derived GPP product requires further improvement to provide accurate monitoring of terrestrial ecosystem productivity in arid regions worldwide.

Figure 1. Locations of the flux observation sites over Heihe River Basin (HRB).

Figure 2. Plots of monthly accumulated precipitation, monthly averaged air temperature (T) and monthly averaged vapor pressure deficit (VPD) over HRB.

Figure 3. Comparisons of eight-day gross primary productivity (GPP) of MOD17A2H products and GPP simulations by MOD17 model with the flux tower GPP for all sites. Eight-day GPP scatter plots of the EC-observed GPP and (a) the original MOD17A2H products; (b) in situ meteorology forcing data; (c) only LUE optimized results; and (d) all parameters optimized results. The units of RMSE and rRMSE are gC/m2/day and %, respectively.

Figure 4. Comparisons of annual GPP of MOD17A2H products and GPP simulations by MOD17 model with the flux tower GPP for all sites: (a) Original MOD17A2H products; (b) in situ meteorology forcing data; (c) only LUE optimized results; and (d) all parameters optimized results. The units of RMSE and rRMSE are gC/m2/y and %, respectively.

Figure 5. Comparison between eight-day GPP of MOD17A2H products and GPP simulations by MOD17 model with the flux tower GPP for the major ecosystems including: (a) Grassland; (b) cropland; (c) desert steppe; and (d) forest. The unit of RMSE is gC/m2/day. The blue points represent original MOD17A2H products; green points represent in situ meteorology forcing data; pink points represent only LUE optimized results; and red points represent all parameters optimized results.

Figure 6. Time series of eight-day MODIS GPP and GPP simulations derived from the MOD17 model with the tower-estimated GPP. The full name for each site is listed in Table 1. The blue points represent original MOD17A2H products; green points represent in situ meteorology forcing data; pink points represent only maximum LUE optimized results; and red points represent all parameters optimized results.

Figure 7. Misclassification of land cover in the MOD12Q1 products at the downstream HRB, which classified the land cover of forest (i.e., SDQ, HHL, HYL) and cropland (NTZ) as the grassland type in MOD12Q1 data. (a) SDQ, (b) HHL, (c) HYL, and (d) NTZ.

Figure 8. Validation of MODIS FPAR compared with FPAR measured by AccuPAR. The left plot is the validation at DMZ and SDZ; the right plot is the validation at GBZ and HZZ.

Figure 9. Relative reduction of parameter uncertainty (95% confidence interval) from prior to posterior distribution. The green bar and blue bar represent the reduction of uncertainty in model parameters for the MOD17A2H model at the NTZ and DMZ sites.

Table 1. Characteristics of the flux observation network sites used in this study. MAT (°C) represents mean annual air temperature, MAP (mm) represents mean annual accumulated precipitation, and PET (mm) represents mean annual potential evapotranspiration.
