Simulation of Forest Carbon Fluxes Using Model Incorporation and Data Assimilation

This study improved simulation of forest carbon fluxes in the Changbai Mountains with a process-based model (Biome-BGC) using incorporation and data assimilation. Firstly, the original remote sensing-based MODIS MOD_17 GPP (MOD_17) model was optimized using refined input data and biome-specific parameters. The key ecophysiological parameters of the Biome-BGC model were determined through the Extended Fourier Amplitude Sensitivity Test (EFAST) sensitivity analysis. Then the optimized MOD_17 model was used to calibrate the Biome-BGC model by adjusting the sensitive ecophysiological parameters. Once the best match was found for the 10 selected forest plots for the 8-day GPP estimates from the optimized MOD_17 and from the Biome-BGC, the values of sensitive ecophysiological parameters were determined. The calibrated Biome-BGC model agreed better with the eddy covariance (EC) measurements (R2 = 0.87, RMSE = 1.583 gC ̈m ́2 ̈d ́1) than the original model did (R2 = 0.72, RMSE = 2.419 gC ̈m ́2 ̈d ́1). To provide a best estimate of the true state of the model, the Ensemble Kalman Filter (EnKF) was used to assimilate five years (of eight-day periods between 2003 and 2007) of Global LAnd Surface Satellite (GLASS) LAI products into the calibrated Biome-BGC model. The results indicated that LAI simulated through the assimilated Biome-BGC agreed well with GLASS LAI. GPP performances obtained from the assimilated Biome-BGC were further improved and verified by EC measurements at the Changbai Mountains forest flux site (R2 = 0.92, RMSE = 1.261 gC ̈m ́2 ̈d ́1).


Introduction
Terrestrial ecosystems have acted as a substantial carbon dioxide (CO 2 ) sink and sequestered about 30% of CO 2 emissions over the past decade [1].Of the terrestrial ecosystems, forest land is the main contributor to CO 2 emissions and removals [2].Mitigation through forest may be achieved by either reducing net carbon stock losses or increasing the long-term average carbon stocks [3].Carbon fluxes represent the exchange of CO 2 among forests and the atmosphere, and thus ascertain whether or not forests function as a significant carbon sink for atmospheric CO 2 .Quantifying forest carbon fluxes and understanding forest carbon flux dynamics are both important to better understand interactions between atmosphere, forest, and soil, as well as their impact on climate variation.
Observation and simulation have been the main techniques for quantifying carbon fluxes.The Eddy Covariance (EC) technique is considered the standard tool: measuring carbon, water and energy fluxes between ecosystem and atmosphere directly [4].This method has been widely used for the observation of CO 2 exchange within the FLUXNET [5,6].The EC measurements are temporally continuous (e.g., half-hourly), and the technique is only considered reliable for a limited area of a few square kilometers [7].
An alternative approach was developed for large-scale simulating and forecasting carbon fluxes through models.Simulation of carbon dynamics still experiences major uncertainty, because of the large variation between different models [8] and their often uncertain parameterization [9].A remote sensing-based model (i.e., MODIS MOD_17 GPP (MOD_17)) is exploited to acquire spatially distributed simulation, but relies on the applicability and explicability of remote sensing data, and cannot depict the intrinsic characterizations of forest processes (e.g., evapotranspiration, photosynthesis, respiration, allocation).The process-based model Biome-BGC can explain the detailed forest processes or the response of these processes to climate variation temporally on a continuous basis.However, the simulations of the model rely on abundant ground information, input data, ecophysiological parameters, and the accurate calibration of the parameters.Some of the parameters may be hard to acquire in the field and the number of samples provided may be limited [10].Therefore, the incorporation of MOD_17 with the Biome-BGC model was chosen in this study in order to calibrate the Biome-BGC model and satisfy the requirement of spatial and temporal continuity of the carbon fluxes.
Errors and uncertainties, which form an inherent problem in models, are mainly caused by input variables, model structure, and model parameters [11,12].Data assimilation provides an effective way to integrate information from observations and a model system, taking into account errors both in the observations and the models to realize a best estimate of the true state of the model system [9,13].Sequential data assimilation is designed to update the model system in a sequential manner by separately weighting the observational and modeling errors.The Ensemble Kalman filter (EnKF), an extended Kalman filter, has been a popular data fusion algorithm and its formulation is mainly been contributed to by Evensen [14,15].This method is Monte Carlo-based and uses recursive data processing.It tracks the model error statistics using an ensemble of model state variables.
Both field measurements and remote sensing observations have been successfully assimilated into a process-based model, for either updating the relevant variables in the model, or adjusting the initialization and parameterization of the model [16,17].Wang et al. [18] employed a strategy of assimilating observed soil moisture into the LPJ-DGVM model with an EnKF, and thus improved the performance of carbon and water flux predictions.Mo et al. [19] optimized the parameters of the BEPS model through assimilating the flux data into the model using the EnKF, and the results indicated that this strategy could faithfully retrieve the seasonal the inter-annual variations in parameters.When remote sensing data are used to update the state of simulated variables, the data error is assumed to be acceptable for being propagated within the simulated system.Migliavacca et al. [20] developed a ProsailH-BGC model and assimilated flux data and MODIS NDVI into the model, resulting in good accuracy of daily and annual carbon flux predictions.Demarty et al. [21] assimilated MODIS LAI into a process-based model, improving the performance of carbon flux predictions.
Nevertheless, a series of issues occurred in existing reports.Firstly, current model calibration or parameter optimization neglected the requirement of spatial and temporal variation in the ecophysiological parameters of the process-based model.Secondly, quantifying the uncertainties in process-based models and remote sensing data remained difficult.The uncertainty caused by models included systematic errors and parameter errors [17].The uncertainty in the remote sensing data stemmed from sensor error, cloud and aerosol contaminants, and inversion algorithms [17].
The technique used in this study offers a strategy of simulating forest carbon fluxes with temporal and spatial continuity and making full use of time series remote sensing data, the optimized MOD_17 model, and the Biome-BGC model.The results were checked using EC measurements at the forest flux tower, and global calibration of the model may be used on a local or regional scale for long-term simulation in future studies.Specifically, the original MOD_17 model was optimized first, using refined input data, specific parameters, and Global LAnd Surface Satellite (GLASS) f PAR at a Changbai Mountains forest site dominated by broadleaf Korean pine forest, during 2003 to 2007.Considering the complicated ecophysiological parameters of Biome-BGC, a sensitivity analysis was conducted using the Extend Fourier Amplitude Sensitivity Test (EFAST).The Biome-BGC model was then calibrated by adjusting the most sensitive ecophysiological parameters to fit the simulated eight-day GPP of the selected 10 broadleaf Korean pine forest plots representing different meteorological conditions (i.e., temperature, precipitation) to those obtained from the optimized MOD_17 model.The best agreement confirmed the high accuracy obtained by regenerating the values of sensitive ecophysiological parameters.During the simulation, errors emanating from the model, the input data and the observations were corrected by assimilating GLASS LAI into the calibrated Biome-BGC using the EnKF.Finally, the simulated carbon fluxes were validated against EC measurements.

Changbai Mountains Forest Flux Site
The studied forest flux site (42 ˝24'N, 128 ˝28'E) is located in the Changbai Mountains of Jilin province, and is one of the northernmost forest flux sites in China.The climate in the Changbai Mountains is temperate and continental and is influenced by the monsoon.The annual average precipitation is approximately 713 mm, and precipitation occurs mainly during the summer.The annual average temperature is 3.6 ˝C, and the difference in temperature between summer and winter is large.The elevation at the flux site is 738 m and the main forest type covering the site is temperate broadleaf Korean pine forest, mainly including Pinus koraiensis, Tilia amurensis, and Fraxinus mandshurica [22].

Meteorological Data and EC Measurements
A measurement tower 62 m high was set up with seven levels of routine meteorological profile measurement systems and an open path eddy covariance measurement system.Meteorological data were continuously measured at the forest flux tower from 2003 to 2007.These data included air temperature and relative humidity (Model HMP45C, Campbell Scientific Inc., Logan, UT, USA), precipitation (Model 52203, Rm Young, Traverse City, MI, USA), wind speed and direction.Photosynthesis active radiation (PAR) was measured with a quantum sensor (Model LI190SB, LICOR Inc., Lincoln, NE, USA).Other meteorological data, including vapor pressure deficit (VPD), incoming shortwave radiation (Srad) and day-length from sunrise to sunset were calculated with the MTCLIM 43 model based on the measured daily maximum and minimum temperature and precipitation [23,24].For large scale simulations, DAYMET was used to lengthen the period of recording, and to interpolate and extrapolate from daily meteorological data to grid estimates over large regions [25].DAYMET data contained daily maximum and minimum temperature, precipitation, VPD, Srad, and day-length and were calculated on a 1-km grid over the Changbai Mountains.
The open path eddy covariance system is composed of a three-dimensional sonic anemometer (CAST3, Campbell, KY, USA) and a fast responding open path infrared gas analyzer (LI-7500, LI-COR Inc., Lincoln, NE, USA).The collection frequency for raw flux data was 10 Hz, and for climate data 0.5 Hz.The 30-min averaged values of every variable were calculated.Also a series of preprocessing was conducted including outlier removal, coordinate rotation, time lag analysis, frequency response calibration, and WPL correction.As the half-hour net CO 2 exchange was calculated using EdiRe software, the net ecosystem exchange (NEE) could be obtained.To estimate the night-time net CO 2 exchange, the net CO 2 exchange was regressed with the air or soil temperature using an exponential function.The built model was then used to calculate the ecosystem respiration (ER).Next, NEE and ER were summed to estimate the ecosystem gross primary productivity (GPP).
Five-year (from 2003 to 2007) carbon flux data and meteorological data were thus collected at the forest flux site.Additionally, the latitude, topography, soil texture, and other background information were collected at the forest site.

Remote Sensing Data
Five years of eight-day GLASS LAI and f PAR products, provided by the Center for Global Change Data Processing and Analysis of Beijing University, were used as input data to the remote sensing-based model (http://glass-product.bnu.edu.cn/).The LAI products were obtained from time-series remote sensing data using general regression neural networks (GRNNs).The GLASS f PAR products were then generated from the GLASS LAI products, which were without missing values and spatially complete [26].To compare the GPP of the original and the optimized MOD_17 model, MOD 15A2 and MOD17A2 products were downloaded from NASA LAADS (http://ladsweb.nascom.nasa.gov).The MOD15A2 products included eight-day LAI and f PAR maximum value composite (MVC) products, derived directly from MODIS Reflectance and ancillary data (e.g., land cover type, background).The MOD17A2 simulated eight-day composites of the GPP and annual NPP.
All the GLASS and MODIS products have a resolution of 1 km.The standard preprocessing procedure for the production of these original data includes georeferencing of the original images using the nearest neighbour algorithm.
The elevation of the forest flux site was obtained from ASTER GDEM.Ensuing, slope and aspect were extracted.The soil map was provided by the Data Center for Resources and Environmental Sciences of the Chinese Academy of Sciences.

Methodology
The methodology contains four steps.Firstly, the original MOD_17 model was optimized using meteorological data, calibrated maximum light use efficiency (LUE max ) and GLASS f PAR at the forest flux site.Secondly, a sensitivity analysis of the Biome-BGC model was conducted using EFAST, and the most sensitive parameters were retained for the next calibration.Thirdly, the sensitive parameters were calibrated for optimal incorporation into the model.Finally, the eight-day GLASS LAI products were assimilated into Biome-BGC model using the EnKF.The improved performance of the GPP is compared with EC measurements.Figure 1 represents the overall methodology in this study, and details are given in the subsequent sections.

Remote Sensing Data
Five years of eight-day GLASS LAI and fPAR products, provided by the Center for Global Change Data Processing and Analysis of Beijing University, were used as input data to the remote sensing-based model (http://glass-product.bnu.edu.cn/).The LAI products were obtained from timeseries remote sensing data using general regression neural networks (GRNNs).The GLASS fPAR products were then generated from the GLASS LAI products, which were without missing values and spatially complete [26].To compare the GPP of the original and the optimized MOD_17 model, MOD 15A2 and MOD17A2 products were downloaded from NASA LAADS (http://ladsweb.nascom.nasa.gov).The MOD15A2 products included eight-day LAI and fPAR maximum value composite (MVC) products, derived directly from MODIS Reflectance and ancillary data (e.g., land cover type, background).The MOD17A2 simulated eight-day composites of the GPP and annual NPP.
All the GLASS and MODIS products have a resolution of 1 km.The standard preprocessing procedure for the production of these original data includes georeferencing of the original images using the nearest neighbour algorithm.
The elevation of the forest flux site was obtained from ASTER GDEM.Ensuing, slope and aspect were extracted.The soil map was provided by the Data Center for Resources and Environmental Sciences of the Chinese Academy of Sciences.

Methodology
The methodology contains four steps.Firstly, the original MOD_17 model was optimized using meteorological data, calibrated maximum light use efficiency (LUEmax) and GLASS fPAR at the forest flux site.Secondly, a sensitivity analysis of the Biome-BGC model was conducted using EFAST, and the most sensitive parameters were retained for the next calibration.Thirdly, the sensitive parameters were calibrated for optimal incorporation into the model.Finally, the eight-day GLASS LAI products were assimilated into Biome-BGC model using the EnKF.The improved performance of the GPP is compared with EC measurements.Figure 1 represents the overall methodology in this study, and details are given in the subsequent sections.

The MOD_17 Model
The MOD_17 model is a light-use-efficiency model based on remote sensing [27] and is driven by LUE max , a scalar of the eight-day vapor-pressed deficit (VPD), a scalar of the eight-day minimum air temperature (T min ), and absorbed photosynthetically active radiation (APAR, MJ day-1).Therefore, GPP is calculated as follows: GPP " LUE max ˆfpT min q ˆfpVPDq ˆAPAR, where LUE max is the potential maximum LUE without environmental stress [28]; and APAR is derived from PAR multiplied by f PAR.
The MOD_17 model requires forcing data from the following three sources: (1) biome specific parameters such as LUE max ; (2) meteorological data (i.e., PAR, T min , and VPD); and (3) f PAR.In the past, MODIS GPP products have been generated using data from MODIS surface reflectance with information regarding vegetation phenology, the canopy absorbance of f PAR, and climate data from the NASA Data Assimilation Office (DAO) climate model.Specifically, for estimating GPP, biome specific parameters (LUE max ) are assigned based on an eight-class MODIS land cover classification product (with a 1 km resolution) and the associated Biome Parameters Look Up Table (BPLUT) [29][30][31][32].Using scalars (0-1) for T min and VPD, GPP is then calculated [33].
MODIS GPP products have been evaluated for different ecosystems in various studies [34][35][36][37][38][39].Uncertainties in the original products included those from biome-specific parameters, input data, and vegetation maps [40].Specifically, coarse-scale meteorological data (e.g., T min , VPD) were required for DAO climate data [41]; and errors in radiometry data could lead to the miscalculation of f PAR.

The Biome-BGC Model
Biome-BGC is a process-based model that can depict forest processes (photosynthesis, respiration, carbon, nitrogen, and hydrological cycle).It is developed from the FOREST-BGC model and relies on the Farquhar photosynthesis routine to calculate GPP for the illuminated and shaded foliage [42,43].The autotrophic respiration was separated into the maintenance respiration and growth respiration.Specifically, the former is calculated as a function of the nitrogen content and temperature of the tissues, and the latter is computed as a function of the amount of carbon allocated to the different pools [44].The leaf onset and offset day is also described in the phenological module of Biome-BGC.
Biome-BGC firstly operates through a spin-up run to find a quasi-equilibrium condition with the local environment.It requires three types of driving data: (1) daily meteorological data (maximum and minimum temperature, precipitation, VPD, solar radiation); (2) site information (latitude, topography, and soil texture); (3) ecophysiological parameters, including the ratio of carbon to nitrogen of leaf, fine root and coarse root, fraction of leaf N in Rubisco, and maximum stomatal conductance.
The most recent version (ver.4.2) of Biome-BGC was used in this study, including the complete biome types (e.g., evergreen needle leaf forest, evergreen broadleaf forest, shrub, grass).However, the settings for different types must be modified to adapt to the local environment.

Extend Fourier Amplitude Sensitivity Test (EFAST)
A sensitivity analysis of the Biome-BGC model was necessary to quantify the effects of multiple ecophysiological parameters on the model outputs.The forest flux site was mainly dominated by broadleaf Korean pine, so ecophysiological parameters of broadleaf Korean pine were used in a weighted average of the needle leaf forest (40%-50%) and broadleaf forest (50%-60%) based on the component percentage according to the forestry investigation.
EFAST was applied to analyze the sensitivity of the carbon flux to ecophysiological parameters in this study.This method was widely used to estimate the expected value and variance of the output and the contribution of the input parameters to uncertainty and sensitivity of the outputs [45].The output Y of a model simulator can be calculated as follows: where X i (i = 1,2, . . .,n) represents each input parameter, which has a range of variation to show its uncertainty.The total variance of the model output (V Y ) is: where V i is the first order conditional variance of Y given that the input X i has a fixed value x i .V ij is the second order conditional variance when X i = x i and X ij = x ij .V ijk and V 1,2, . . .,n are the higher order variance of interaction among multiple variables.
The first order sensitivity index, S i , is defined as, S i = V i /V Y .This quantifies the effect of varying X i alone, but averaged over variation in other input parameters.The value of S i is between 0 and 1.The second and higher order sensitivity indices are: S ij = V ij /V Y and S ijk = V ijk /V Y .The total order sensitivity index (S t ) was introduced as the sum of all the above indices, including the contributions of individual parameters and the interactions between a specific parameter and other parameters.

Sensitivity Analysis on Biome-BGC Model
Based on the EFAST (SimLab 2.2), 5000 pseudo values were generated for each parameter according to the probability distribution function (PDF).We used the contributions of the 25 crucial ecophysiological parameters to the five-year average GPP to measure the sensitivity of the model to parameters.
The uncertainty information of the 25 crucial parameters for broadleaf Korean pine, characterized by PDF, is listed in Table 1.Defining the PDF for each crucial parameter was the first step for EFAST, and the variability of the parameters was mainly obtained from White et al. [10].Vapor pressure deficit: complete conductance reduction Pa U(2000, 6000) a U(min, max): uniform distribution, N (mean, standard deviation): normal distribution.

Model Incorporation
Incorporation of the optimized MOD_17 and Biome-BGC model was used to make full use of space-continuity of the remote sensing-based model and the time-continuity characteristics of the process-based model and calibrate the multi-parameters of the process-based model.
The extremely heterogeneous forest landscape (e.g., distribution, age, height) and environmental conditions (e.g., location, weather, terrain) of the study site made it necessary to globally calibrate the parameters (e.g., ecophysiological parameters) of the Biome-BGC model [10].Site parameters, such as vegetation type, soil properties, and terrain information, were derived from vegetation and soil type maps, and digital elevation models (DEMs).The ecophysiological parameters for the dominant tree species and the surrounding atmospheric parameters of the study area were difficult to determine due to a lack of prior knowledge.
For this study, the incorporation of the optimized MOD_17 and Biome-BGC model was performed by calibrating sensitive parameters of the Biome-BGC model.Eight-day GPP estimates over five years (2003)(2004)(2005)(2006)(2007) during the growing seasons were computed for the 10 selected plots by running the optimized MOD_17.The estimates were then used as reference data for calibrating the Biome-BGC model configurations.As a result, a total of 1150 (10 plots ˆ23 layers/plot/year ˆ5 years) GPPs from the Biome-BGC model were plotted against the reference values obtained from the optimized MOD_17 model.When a best fit was found, the optimal configuration for the Biome-BGC model was established.

The Ensemble Kalman Filter Scheme
Data assimilation was designed to create the best analysis of the state of the system and find the optimal model performance, most consistent with observations.EnKF is a sequential data assimilation method and a Monte Carlo based variant of the Kalman filter [15,17], able to integrate multi-source observations sequentially in time.The standard analysis of EnKF can be expressed as: A a " A `Pe H T pHP e H T `Re q ´1pD ´HAq, where A a and A are the analysis matrix and forecast matrix of state variable ensembles, respectively.P e and R e represent the covariance matrixes of state variables and observation variables, respectively.H is the observation operator, and D is the observation vector.D-HA denotes the innovation vectors.In this study, LAI is considered the state variable that is directly assimilated into Biome-BGC.The model state and external observational data share the same variable.Consequently, the observation operator is a unit matrix, as only the remote sensing data are involved.
In Biome-BGC, LAI is a key state variable reflecting vegetation growth and development, and is calculated as follows: LAI " C leaf ˆSLA, where C leaf (kg¨m ´2¨C) is the carbon state variable and SLA is the special leaf area (m 2 ¨kg ´1¨C).Leaf carbon content in current time (C leaf¨t ) is summed by the previous leaf carbon content (C leaf¨t´1 ) and the change in leaf carbon content (∆C leaf¨t ).Once the LAI observations were assimilated into the model, LAI would convert into the leaf carbon content through Equation ( 5), with the leaf carbon being crucial for photosynthetic assimilation and allocation.

Optimization of MOD_17 Model
The original MODIS GPP products were validated against the EC measurements at the forest flux site.The results showed that the original GPPs (GPP_Default in Figure 2a,b) contained significant underestimation compared to the EC measurements, especially in winter and summer (Figure 2b, R 2 = 0.65, RMSE = 26.510gC¨m ´2¨(8d) ´1).Similar underestimation of a forest flux site was also reported by Wang et al. [46].To analyze the impact surrounding the input of parameters collected from three sources (meteorological, biome-specific, and f PAR parameters) on model behavior, three optimized simulations were designed to improve the performance of the GPP.The first experiment (GPP_MOD1) was conducted using DAYMET data, including Tmin, VPD, and PAR, as well as the default LUEmax and MODIS fPAR.As the blue line shows in Figure 2a, the simulated GPPs improved significantly, especially during growing seasons, even though some underestimation still persisted.Better results were obtained in the GPP_MOD1 experiment regarding R 2 and RMSE (Figure 2b, R 2 = 0.87, RMSE = 16.835gC•m −2 •(8d) −1 ).A further optimization (GPP_MOD2) was performed using the DAYMET data, as well as calibrated LUEmax (calculated from the measured GPP and APAR during growing seasons), and MODIS fAPR.The calibrated value of LUEmax, 1.658 gC•MJ −1 APAR, was much greater than the default value obtained from the BPLUT (1.116 gC•MJ −1 APAR).As shown in Figure 2b, the good accordance obtained in the GPP_MOD2 experiment was supported by the R 2 and RMSE (R 2 = 0.89, RMSE = 12.941 gC•m −2 •(8d) −1 ).
To mitigate the influence of noise (e.g., clouds, aerosols, water vapor) in the original MODIS fPAR products on GPP values, the GLASS fPAR products were introduced into the GPP_MOD3 simulation.The original MODIS fPAR products contained some serious noise and exhibited dramatic fluctuations for the forest flux site.In contrast, the temporal profiles for the GLASS fPAR performed The first experiment (GPP_MOD1) was conducted using DAYMET data, including Tmin, VPD, and PAR, as well as the default LUE max and MODIS f PAR.As the blue line shows in Figure 2a, the simulated GPPs improved significantly, especially during growing seasons, even though some underestimation still persisted.Better results were obtained in the GPP_MOD1 experiment regarding R 2 and RMSE (Figure 2b, R 2 = 0.87, RMSE = 16.835gC¨m ´2¨(8d) ´1).A further optimization (GPP_MOD2) was performed using the DAYMET data, as well as calibrated LUE max (calculated from the measured GPP and APAR during growing seasons), and MODIS f APR.The calibrated value of LUE max , 1.658 gC¨MJ ´1 APAR, was much greater than the default value obtained from the BPLUT (1.116 gC¨MJ ´1 APAR).As shown in Figure 2b, the good accordance obtained in the GPP_MOD2 experiment was supported by the R 2 and RMSE (R 2 = 0.89, RMSE = 12.941 gC¨m ´2¨(8d) ´1).
To mitigate the influence of noise (e.g., clouds, aerosols, water vapor) in the original MODIS f PAR products on GPP values, the GLASS f PAR products were introduced into the GPP_MOD3 simulation.The original MODIS f PAR products contained some serious noise and exhibited dramatic fluctuations for the forest flux site.In contrast, the temporal profiles for the GLASS f PAR performed relatively smoothly during 2003-2007.Generally, GPP simulations from the GPP_MOD3 yielded the highest agreement with EC measurements (Figure 2b, R 2 = 0.93, RMSE = 8.736 gC¨m ´2¨(8d) ´1), therefore, the simulations of the selected 10 plots were conducted by means of GPP_MOD3 as next calibration of the Biome-BGC model.

Sensitivity Analysis of Biome-BGC
The results from the EFAST analysis indicated that the contributions by the total-sensitivity index were higher than those by the first-order-sensitivity index, which means that the effects of interactions among the 25 parameters on the model output were much more significant than the effects of the individual parameters.Specifically, the annual mean GPP was most sensitive to Fraction of leaf N in Rubisco (FLNR), which is greatest for both the first order sensitivity index and the total order sensitivity index.Followed by FLNR, the total contributions of canopy average specific leaf area (SLA), maximum stomatal conductance (G max ), and the canopy water interception coefficient (W int ) were 0.35, 0.34, and 0.27, respectively.Together they formed the most important ecophysiological parameters to the model predictions.Some parameters exhibited only small (St < 0.1) or negligible (St < 0.05 or St « 0) main and interaction effects on the annual mean GPP.
The ecophysiological parameters played an important role in the model simulation, and the sensitive parameters listed in Figure 3 were the most important contributors to the annual mean GPP.Therefore, the sensitivity analysis presented an efficient way of reducing the complexity of calibrating the Biome-BGC model by only considering the crucial parameters obtained from the sensitivity analysis.

Sensitivity Analysis of Biome-BGC
The results from the EFAST analysis indicated that the contributions by the total-sensitivity index were higher than those by the first-order-sensitivity index, which means that the effects of interactions among the 25 parameters on the model output were much more significant than the effects of the individual parameters.Specifically, the annual mean GPP was most sensitive to Fraction of leaf N in Rubisco (FLNR), which is greatest for both the first order sensitivity index and the total order sensitivity index.Followed by FLNR, the total contributions of canopy average specific leaf area (SLA), maximum stomatal conductance (Gmax), and the canopy water interception coefficient (Wint) were 0.35, 0.34, and 0.27, respectively.Together they formed the most important ecophysiological parameters to the model predictions.Some parameters exhibited only small (St < 0.1) or negligible (St < 0.05 or St ≈ 0) main and interaction effects on the annual mean GPP.
The ecophysiological parameters played an important role in the model simulation, and the sensitive parameters listed in Figure 3 were the most important contributors to the annual mean GPP.Therefore, the sensitivity analysis presented an efficient way of reducing the complexity of calibrating the Biome-BGC model by only considering the crucial parameters obtained from the sensitivity analysis.

Calibration of Biome-BGC Model
A key determinant for the model to apply in a specific landscape or region is the parameterization, especially for the complicated process-based model.Here a preliminary calibration of the Biome-BGC model was conducted by incorporating the optimized MOD_17 model.Calibration was performed by adjusting the eight-day GPP outputs for the growing seasons from the Biome-BGC model in order to fit those obtained from the optimized MOD_17 model for the 10 selected forest plots.Sensitive (St > 0.1) parameters were calibrated accordingly once the best agreement was determined.During calibration, the sensitive parameters varied by 10% in both directions from the mean values reported by White et al. [10].A scatter plot (Figure 4) indicates optimal agreement between growing season GPP values obtained from the optimized MOD_17 model and those from

Calibration of Biome-BGC Model
A key determinant for the model to apply in a specific landscape or region is the parameterization, especially for the complicated process-based model.Here a preliminary calibration of the Biome-BGC model was conducted by incorporating the optimized MOD_17 model.Calibration was performed by adjusting the eight-day GPP outputs for the growing seasons from the Biome-BGC model in order to fit those obtained from the optimized MOD_17 model for the 10 selected forest plots.Sensitive (S t > 0.1) parameters were calibrated accordingly once the best agreement was determined.During calibration, the sensitive parameters varied by 10% in both directions from the mean values reported by White et al. [10].A scatter plot (Figure 4) indicates optimal agreement between growing season GPP values obtained from the optimized MOD_17 model and those from the calibrated Biome-BGC model (R 2 = 0.84, RMSE = 10.185gC¨m ´2¨(8d) ´1).

Carbon Fluxes from the Calibrated and Assimilated Model
Five years of GLASS LAI products were assimilated into the calibrated Biome-BGC model using the EnKF with an assimilation window of eight days.When the size of the ensemble was larger than 100, the R 2 and RMSE between predicted GPP and EC measurements reached approximately stable values.The errors (from the model and GLASS LAI) were determined through LAI sites all over the world reported by Xiao et al. [47].The error of GLASS LAI products ranged from −0.008 to 0.12.Then the variance (0.032) was calculated and used in the EnKF algorithm.The model error was estimated at the same time, and the value ranged from −0.32 to 0.44, with a variance of 0.616 in EnKF.
Figure 5 shows the comparison between GLASS LAI and simulated LAI, depicted as: LAI_sim (LAI obtained from the calibrated Biome-BGC without data assimilation) and LAI_DA (LAI simulated by the assimilated Biome-BGC model using the EnKF).It is noticeable that LAI_sim generally underestimated LAI compared with the observations, especially in growing seasons.LAI_sim increased rapidly at the beginning of the growing season and the values were higher than the remote sensing observations at the end of the growing season.The LAI_DA agreed well with the GLASS LAI, except for a few fluctuations in winter.

Carbon Fluxes from the Calibrated and Assimilated Model
Five years of GLASS LAI products were assimilated into the calibrated Biome-BGC model using the EnKF with an assimilation window of eight days.When the size of the ensemble was larger than 100, the R 2 and RMSE between predicted GPP and EC measurements reached approximately stable values.The errors (from the model and GLASS LAI) were determined through LAI sites all over the world reported by Xiao et al. [47].The error of GLASS LAI products ranged from ´0.008 to 0.12.Then the variance (0.032) was calculated and used in the EnKF algorithm.The model error was estimated at the same time, and the value ranged from ´0.32 to 0.44, with a variance of 0.616 in EnKF.
Figure 5 shows the comparison between GLASS LAI and simulated LAI, depicted as: LAI_sim (LAI obtained from the calibrated Biome-BGC without data assimilation) and LAI_DA (LAI simulated by the assimilated Biome-BGC model using the EnKF).It is noticeable that LAI_sim generally underestimated LAI compared with the observations, especially in growing seasons.LAI_sim increased rapidly at the beginning of the growing season and the values were higher than the remote sensing observations at the end of the growing season.The LAI_DA agreed well with the GLASS LAI, except for a few fluctuations in winter.
The measured GPP profile from 2003 to 2007 is shown in Figure 6a, together with those obtained from the default (GPP_Default), calibrated (GPP_Calibrated) and assimilated Biome-BGC model (GPP_DA).The original model was incapable of simulating GPP behavior, especially in summer and winter (Figure 6b, R 2 = 0.72, RMSE = 2.419 gC¨m ´2¨d ´1).After calibration, the underestimation during summer had significantly diminished by adjusting the key parameters (FLNR, SLA, G max , etc.), with R 2 = 0.87, and RMSE = 1.583 gC¨m ´2¨d ´1 (Figure 6c).However, some fluctuations still remained, especially in spring.Therefore the time series GLASS LAI was assimilated into Biome-BGC.The LAI was calculated from leaf carbon in Biome-BGC, and the assimilation directly affected the photosynthesis, thus affecting the performance of GPP. Figure 6d shows that GPPs from the assimilated Biome-BGC agreed well with the EC measurements.
the variance (0.032) was calculated and used in the EnKF algorithm.The model error was estimated at the same time, and the value ranged from −0.32 to 0.44, with a variance of 0.616 in EnKF.
Figure 5 shows the comparison between GLASS LAI and simulated LAI, depicted as: LAI_sim (LAI obtained from the calibrated Biome-BGC without data assimilation) and LAI_DA (LAI simulated by the assimilated Biome-BGC model using the EnKF).It is noticeable that LAI_sim generally underestimated LAI compared with the observations, especially in growing seasons.LAI_sim increased rapidly at the beginning of the growing season and the values were higher than the remote sensing observations at the end of the growing season.The LAI_DA agreed well with the GLASS LAI, except for a few fluctuations in winter.The measured GPP profile from 2003 to 2007 is shown in Figure 6a, together with those obtained from the default (GPP_Default), calibrated (GPP_Calibrated) and assimilated Biome-BGC model (GPP_DA).The original model was incapable of simulating GPP behavior, especially in summer and winter (Figure 6b, R 2 = 0.72, RMSE = 2.419 gC•m −2 •d −1 ).After calibration, the underestimation during summer had significantly diminished by adjusting the key parameters (FLNR, SLA, Gmax, etc.), with R 2 = 0.87, and RMSE = 1.583 gC•m −2 •d −1 (Figure 6c).However, some fluctuations still remained, especially in spring.Therefore the time series GLASS LAI was assimilated into Biome-BGC.The LAI was calculated from leaf carbon in Biome-BGC, and the assimilation directly affected the photosynthesis, thus affecting the performance of GPP. Figure 6d shows that GPPs from the assimilated Biome-BGC agreed well with the EC measurements.

Discussion
GPP estimates obtained from the optimized MOD_17 model performed well in this study, and thus provided the possibility of incorporating the remote sensing-based model with the processbased model (Biome-BGC) to acquire the carbon fluxes with temporal and spatial continuity.Furthermore, this strategy provided the option of circumventing the problem of scarce field data describing the various forest processes in specific biomes.Previous investigations have applied this method in a Mediterranean forest ecosystem [44,48,49], but the errors caused by the uncertainties in the remote sensing-based model would there be propagated to the next calibration and the predictions of carbon fluxes; the calibration of the parameters mainly focused on the parameters related to summer drought without considering exhaustive sensitivity analysis of other crucial parameters or a global calibration.Additionally, many studies concentrating on the calibration of Biome-BGC using field measurements (e.g., dendrochronological data, eddy covariance measurements) were carried out without considering temporal continuity or spatial heterogeneity based on reliable remote sensing data [50,51].
The results of sensitivity analysis showed that the total sensitivity indices of the crucial parameters were the main contributors to the annual mean GPP.This means that one parameter may

Discussion
GPP estimates obtained from the optimized MOD_17 model performed well in this study, and thus provided the possibility of incorporating the remote sensing-based model with the process-based model (Biome-BGC) to acquire the carbon fluxes with temporal and spatial continuity.Furthermore, this strategy provided the option of circumventing the problem of scarce field data describing the various forest processes in specific biomes.Previous investigations have applied this method in a Mediterranean forest ecosystem [44,48,49], but the errors caused by the uncertainties in the remote sensing-based model would there be propagated to the next calibration and the predictions of carbon fluxes; the calibration of the parameters mainly focused on the parameters related to summer drought without considering exhaustive sensitivity analysis of other crucial parameters or a global calibration.Additionally, many studies concentrating on the calibration of Biome-BGC using field measurements (e.g., dendrochronological data, eddy covariance measurements) were carried out without considering temporal continuity or spatial heterogeneity based on reliable remote sensing data [50,51].
The results of sensitivity analysis showed that the total sensitivity indices of the crucial parameters were the main contributors to the annual mean GPP.This means that one parameter may enlarge the effects of other parameters [52].The interaction effects of FLNR showed the highest influence on the annual mean GPP.FLNR controls the carboxylation capacity (V cmax ), which is the first step in the fixing of CO 2 by the ribulose-1, 5-biphosphate protein [53].V cmax is a state variable in Biome-BGC with high spatial variability, so it is difficult to acquire [54].The SLA also made contributions to the annual mean GPP by affecting the LAI.G max controlled the leaf conductance and was crucial for regulating water loss and carbon assimilation.The W int coefficient determined the amount of precipitation intercepted by the canopy, which in turn controlled the amount of precipitation infiltrating [50].With the interactions between the sensitive parameters being so extensive, the correlation between the input parameters should be further considered in a comprehensive calibration of the Biome-BGC.EnKF sequential data assimilation is an efficient way to improve the simulated carbon fluxes, capable of tracing the seasonal estimates of model outputs.As a key indicator of vegetation growth, the time series GLASS LAI was assimilated into the Biome-BGC model.These products of higher quality than the MODIS LAI were validated by Xiao et al. [47] at sites worldwide.The estimates of errors from GLASS LAI and model structure were important to the EnKF algorithm as the errors determined the extent to which the model predictions could be corrected.Considering the drawbacks of Biome-BGC in describing the water cycle [55], it is necessary to assimilate a soil moisture parameter into the model to improve the performance of evapotranspiration and soil moisture.The assimilation of long-term remote sensing observations (e.g., LAI, soil moisture) into process-based models could provide information on forest disturbance, an avenue that needs to be explored in further work.
The strategy explored in this study could be applied in long-term simulations of forest carbon fluxes over large regions and even facilitate analyzing the response of carbon fluxes to climatic variation.The reliability of the strategy has recently been proved in our study conducting long-term (between 2000 and 2012) validation of regional net primary productivity using dendrochronological data over the Qilian Mountains (R 2 = 0.73, RMSE = 24.46gC¨m ´2¨a ´1) [56].However, annual estimates of forest carbon fluxes still have limitations, some of which could be improved by better data or better understanding of carbon fluxes using optimized methods.Long-term carbon storage will be influenced by different kinds of forest disturbances (e.g., afforestation, deforestation, tree mortality, wildfire).Therefore, research on how carbon fluxes relate to forest disturbances, land management, and soil conditions should be considered in long-term simulations.The five-yearly national forest inventories conducted in China could provide additional information on natural and human disturbance.

Conclusions
The optimization of the MOD_17 model using refined input data improved the forest GPP behavior compared with applying the original MODIS model driven by default parameters.The optimized model was then incorporated in the Biome-BGC model exploiting the powers of both models for the simulation of forest carbon fluxes.The incorporation is also beneficial for the calibration of ecophysiological parameters and can make Biome-BGC more resistant to the impact of environment variability.The key ecophysiological parameters were determined with a sensitivity analysis on Biome-BGC using EFAST, a global analysis method that could provide the first order sensitivity index and the total order sensitivity index of the parameters to model predictions.The results demonstrated that FLNR, SLA, and FRC:LC were the parameters most sensitive for the model output.
A data assimilation method using EnKF has been designed to improve the simulation of carbon fluxes and reduce the errors based on time series remote sensing observations (GLASS LAI).This method takes errors in observations, input data and model structure into account to ensure that the output behavior is consistent with the observations.Eight-day GLASS LAI products were assimilated into the calibrated Biome-BGC model from 2003 to 2007 at a Changbai Mountains forest flux site to track the model over time.It was found that GPP behavior was smoothed by the EnKF, especially in spring, with the performance generally improving significantly (R 2 = 0.92, RMSE = 1.261 gC¨m ´2¨d ´1).
With the model calibration and data assimilation, GPP behavior showed good agreement with EC measurements at the forest flux site, and it is possible to simulate other forest processes (e.g., evapotranspiration, respiration) based on this strategy as well, both on a local and a regional scale.

Figure 1 .
Figure 1.The overall flowchart of this study.

Figure 1 .
Figure 1.The overall flowchart of this study.
, biome-specific, and fPAR parameters) on model behavior, three optimized simulations were designed to improve the performance of the GPP.

Figure 3 .
Figure 3.The first order and total order sensitivity indexes for annual mean GPP.

Figure 3 .
Figure 3.The first order and total order sensitivity indexes for annual mean GPP.

Figure 4 .
Figure 4.A comparison of eight-day GPP values for the growing season obtained using the optimized MOD_17 model and those derived from the calibrated Biome-BGC model.

Figure 4 .
Figure 4.A comparison of eight-day GPP values for the growing season obtained using the optimized MOD_17 model and those derived from the calibrated Biome-BGC model.

Figure 5 .
Figure 5.Comparison of simulated LAI and GLASS LAI.Figure 5. Comparison of simulated LAI and GLASS LAI.

Figure 5 .
Figure 5.Comparison of simulated LAI and GLASS LAI.Figure 5. Comparison of simulated LAI and GLASS LAI.

Figure 6 .
Figure 6.Results of default, calibrated, and assimilated GPP: (a) Performances of GPPs from the EC measurement, default, calibrated and assimilated Biome-BGC; (b) comparison and validation of GPP values from the EC measurement and default Biome-BGC; (c) comparison and validation of GPP values from the EC measurement and calibrated Biome-BGC; (d) comparison and validation of GPP values from the EC measurement and assimilated Biome-BGC.

Figure 6 .
Figure 6.Results of default, calibrated, and assimilated GPP: Performances of GPPs from the EC measurement, default, calibrated and assimilated Biome-BGC; (b) comparison and validation of GPP values from the EC measurement and default Biome-BGC; (c) comparison and validation of GPP values from the EC measurement and calibrated Biome-BGC; (d) comparison and validation of GPP values from the EC measurement and assimilated Biome-BGC.