Simultaneous Assimilation of Remotely Sensed Soil Moisture and FAPAR for Improving Terrestrial Carbon Fluxes at Multiple Sites Using CCDAS

The carbon cycle of the terrestrial biosphere plays a vital role in controlling the global carbon balance and, consequently, climate change. Reliably modeled CO2 fluxes between the terrestrial biosphere and the atmosphere are necessary in projections of policy strategies aiming at constraining carbon emissions and of future climate change. In this study, SMOS (Soil Moisture and Ocean Salinity) L3 soil moisture and JRC-TIP FAPAR (Joint Research Centre—Two-stream Inversion Package Fraction of Absorbed Photosynthetically Active Radiation) data with respective original resolutions at 10 sites were used to constrain the process-based terrestrial biosphere model, BETHY (Biosphere, Energy Transfer and Hydrology), using the carbon cycle data assimilation system (CCDAS). We find that simultaneous assimilation of these two datasets jointly at all 10 sites yields a set of model parameters that achieve the best model performance in terms of independent observations of carbon fluxes as well as soil moisture. Assimilation in a single-site mode or using only a single dataset tends to over-adjust related parameters and deteriorates the model performance of a number of processes. The optimized parameter set derived from multi-site assimilation with soil moisture and FAPAR also improves, when applied at global scale simulations, the model-data fit against atmospheric CO2. This study demonstrates the potential of satellite-derived soil moisture and FAPAR when assimilated simultaneously in a model of the terrestrial carbon cycle to constrain terrestrial carbon fluxes. It furthermore shows that assimilation of soil moisture data helps to identity structural problems in the underlying model, i.e., missing management processes at sites covered by crops and grasslands.


Introduction
The terrestrial biosphere plays a vital role in the global carbon, water, and energy cycles [1,2].Exchanges of carbon between the land surface and the atmosphere directly influence atmospheric carbon dioxide concentrations.The interactions of these three cycles result in a climate-carbon cycle feedback, which is a major source of uncertainty in climate change projections [2].Large uncertainties exist in estimates of the terrestrial carbon balance both in observations and in model simulations [2][3][4][5].Therefore, a better understanding of the response of the terrestrial biosphere to climate variability is both necessary and fundamental for improving projections of future climate change.
Several land data assimilation systems (LDASs) have been developed recently to assimilate various observations into land surface models, such as the carbon cycle data assimilation system (CCDAS) [6], the global land data assimilation system (GLDAS) [7], the coupled land vegetation LDAS (CLVLDAS) [8], and the LDAS-Monde [9].These LDASs either optimize state variables or process parameters of the land surface model.CCDAS is an ideal tool to infer carbon exchanges both within the terrestrial biosphere and with the atmosphere by combining observations (remotely sensed or in-situ data) with process information as represented in models [6,10,11] to constrain model process parameters.A terrestrial biosphere model, in our case, the Biosphere Energy Transfer HYdrology (BETHY) model [12], parameterizes relevant processes in the terrestrial biosphere, from water and energy balance to photosynthesis.The actual values of the parameters used in these process-based models are often badly constrained by a limited set of observations taken on small scales with limited plant types and soil textures as well as climates.The typical spatial and temporal heterogeneity of various ecosystems can cause large uncertainties in modeling the global response of the terrestrial biosphere [2,13].Also, the many nonlinear processes implemented in terrestrial biosphere models result in more complicated relationships between chosen parameters and modeled carbon fluxes at different temporal and spatial scales [3,14].
Assimilating observations into terrestrial biosphere models has been shown to be a powerful and efficient tool in constraining model parameters [10,11,[15][16][17][18][19].One of the observational datasets most used within the CCDAS, either as the unique observational constrain or jointly with others, is the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) [11,[15][16][17][18]20,21].Work has been done to assimilate FAPAR into the BETHY model, which was modified by an improved phenology scheme coupled to the model's two-flux representation of the radiative transfer in the canopy-soil system [16].They found that most of the uncertainty reduction could be seen in parameters related to leaf phenology.The simultaneous assimilation of FAPAR with other datasets has also been shown to be promising.In a study by [22], FAPAR together with atmospheric CO 2 concentrations were assimilated into BETHY, and the model's hydrology could be significantly constrained in a global simulation.Another work was done by assimilating latent heat flux and FAPAR for a woodland site in Botswana [18].The study demonstrated that the model parameter controlling the maximum water uptake was efficiently constrained, but its optimal value was sensitive to the choice of the observational data used in the assimilation procedure.This leads to the probable conclusion that different observational constraints are connected to different processes in the terrestrial biosphere model.Recently, a comparison of assimilations with two FAPAR products using the ORCHIDEE model (Organising Carbon and Hydrology In Dynamic Ecosystems) [23] in an assimilation system was performed, with a simplified observation operator that employs the Lambert-Beer law to simulate FAPAR as a function of LAI (Leaf Area Index) [20].They found that different FAPAR products could result in different model performances and in reducing uncertainties of different parameters.The combination with other datasets (e.g., eddy covariance fluxes) would improve the fit to independent observations.Remotely sensed soil moisture has been widely used for assimilation in hydrological models [24][25][26], but also for assimilation in the context of updating gross carbon fluxes on a regional scale [27,28].Recently, the assimilation work has shown that remotely sensed soil moisture is able to provide, as a complementary dataset, an additional constraint on hydrological as well as carbon cycle processes in a terrestrial biosphere model.Soil moisture, especially plant-available soil moisture, plays a key role in controlling photosynthesis and transpiration in plants and therefore regulates CO 2 uptake through leaf stomata.Moreover, drought, freeze/thaw and, again, surface soil moisture control plant carbon cycling [29].Nowadays, surface soil moisture can be observed by microwave sensors and is therefore independent of cloud cover.Such data sets have the potential to improve the simulation of hydrological processes in terrestrial biosphere models, thus resulting in improved terrestrial carbon cycle simulations.
Soil moisture from the European Space Agency's Soil Moisture and Ocean Salinity (SMOS) mission provides such a high-quality dataset for assimilation in hydrological and terrestrial biosphere models [19,[30][31][32].Assimilation of SMOS L3 soil moisture data together with atmospheric CO 2 concentration data into the BETHY model to constrain the terrestrial carbon cycle on a global scale was done [19].Assimilating SMOS soil moisture data into BETHY improved several aspects of the hydrological and carbon cycle [19]: (1) The agreement between both modeled and independently observed surface soil moisture (such as in-situ measurements of soil moisture from the International Soil Moisture Network (ISMN) [33]) improved significantly, and (2) the agreement between modeled and observed CO 2 concentrations improved.A combined assimilation of FAPAR and soil moisture has been proposed by many studies as a potential improvement of the modeled carbon cycle [16,18,19].
In this study, we investigate the potential of such a simultaneous assimilation of remotely sensed surface soil moisture and FAPAR data to constrain the terrestrial carbon cycle both in single-site and in multi-site setups.The questions addressed are: What are the advantages of combining soil moisture and FAPAR data in constraining parameters of BETHY in a multiple sites approach?What do the posterior parameter values obtained by the assimilation experiments tell us about the mechanistic processes represented in the model?We also address the question whether an optimal parameter set obtained from site scale assimilation experiments improves global scale simulations.

Study Site Descriptions
For this study, we chose 10 sites from the FLUXNET network as provided in the FLUXNET2015 data set (www.fluxdata.org,last accessed 6 July 2018).They were representative of different climate regimes, land cover types (in the model represented by Plant Functional Types (PFTs)), and soil textures, as depicted in Table 1.Detailed descriptions of all 10 sites can be found on the FLUXNET website (www.fluxdata.org).Site names used here are consistent with the naming and site codes from FLUXNET.The 10 sites contained 7 of the 13 PFTs used in BETHY, 4 of the 6 soil textures, and very different climate zones, such as a tropical site in northern Australia or a polar, tundra dominated site in Finland.Detailed PFT information is given in Table 2.At these sites, FLUXNET measurements were available from 2010 to 2014, corresponding to the SMOS L3 soil moisture data period used in this study.scale.BETHY has 13 plant functional types (PFTs) to globally describe vegetation, as shown in Table A1.
Each grid cell can be characterized by up to 3 PFTs, with a fractional coverage assigned for each PFT.Most PFTs use the same set of basic equations in modeling carbon cycle processes with different parameter values assigned.However, C3 and C4 photosynthesis and the growing season onset and termination (phenology) of each PFT are described by PFT-specific equations [16].BETHY is forced by daily meteorological data, namely maximum and minimum air temperatures, precipitation, and incoming shortwave radiation.Here, we used meteorological data derived from the CRUNCEP V7 dataset (https://www.earthsystemgrid.org,last accessed 6 July 2018) as input data for running the model because the meteorological data measured at the sites are incomplete over the simulation period of 2010-2015.The original CRUNCEP data with a 6-hourly temporal resolution was converted to daily values.Meteorological data for each study site was extracted from the 0.5 • × 0.5 • gridded dataset.These data were then corrected by measured meteorological data from site meteorological stations to fill existing gaps in the site meteorology.The correction was done by using a linear regression method for air temperature and radiation.Precipitation was corrected by adjusting CRUNCEP gridded data to the total precipitation over the simulation period at the corresponding site.The model's hydrology was initialized by a spin-up run of the model using climatological forcing from 2009, which was sufficient for initializing the model as shown to be reasonable by [19].
CCDAS uses a probabilistic inversion method as described by [36].A more detailed description of the optimization strategy can be found in [22].In short, the system uses Gaussian Probability Density Functions (PDFs) to describe the state of the control vector, x (containing the model's process parameters), and the observations, d.The uncertainty covariance matrices of the model parameters and the observations are represented by C x and C d , respectively.The model, M(x), includes appropriate observation operators and maps the parameters to observables.The assimilation was done by minimizing the following cost function: The summation in Equation ( 1) is over the different observations when assimilating the data simultaneously, in our case, soil moisture and FAPAR.The cost function, J total , consists three parts, the parameter part, J p (third term in Equation ( 1)), the soil moisture part, J sm (first term in Equation ( 1)), and the FAPAR part, J fapar (second term in Equation ( 1)).In practice, the minimization of J total (= J p + J sm + J fapar ) is performed iteratively by a gradient-based algorithm.The search direction is determined via the gradient of J with respect to the control vector, which, in this study, was derived from a finite difference approximation: Within this approach, it is crucial to estimate a reasonable value for ε [37].After a series of tests, we chose ε =1.0 × 10 −2 .Figure 1 shows a flow chart of the calculation of the different terms in the cost function in CCDAS.

BETHY Soil Hydrology Scheme-Modified VIC 1L
To simulate surface soil moisture within BETHY in a way that can be compared to remotely sensed surface soil moisture, a modified version of the Variable Infiltration Capacity 1-layer (VIC-1 L) soil model by [38] was implemented in BETHY, for more details see [19].The VIC model has the distinct property of representing within-grid cell spatial variability of infiltration capacity of the soil in a statistical fashion, something that is particularly suitable for simulations at the scale of the SMOS footprint of several hundred km 2 [39].Therefore, a thin layer was added from which all soil evaporation occurs.The VIC-1 L evapotranspiration was also slightly modified, because in its original form, vegetation effects were not accounted for.This was needed to be consistent with transpiration and canopy throughfall of BETHY, and changed the base flow equation to prevent the soil from draining below field capacity.This new scheme is thus a 2-layer hydrology model with a surface and root-zone layer and the same 1-day time step as BETHY's original bucket model.
Soil moisture at saturation and at wilting point were computed from soil texture, total soil depth, and -for the root-zone layer-rooting depth.The values for rooting depth have been adapted from [12], taking account of the higher rooting depth of tropical trees and assigning a slightly higher rooting depth for temperate trees and shrubs and a lower one for grasses and tundra.A global soil depth and texture map and parameterization of soil porosity (saturation water content) and water potential was adapted from [40]: where θ is the soil water content, n is the soil porosity, ψ is the soil water potential, ψs is the soil water potential at saturation, and b is the slope parameter of the retention curve.Details as well as validation of the modified VIC-1 L model are given in [19].For the simulations described in this study, we used a physical depth of the surface soil layer of 40 mm to be comparable to the SMOS observations.sn and sb are scaling parameters for n and b included in the optimization.Additionally, soil hydrological parameters were set as fixed values derived from the VIC_3L model [41].

Parameters
In this study, we optimized a total of 100 parameters (Table S2) belonging to BETHY.Some of the parameters were global, some differentiated by PFT, and some by soil texture class.Parameters 72 to 100 were added to the version of CCDAS after implementation of a new phenology scheme [16].For a detailed description of the parameters, 1 to 71, as well as the sensitivity we refer to [16].The remaining parameters refer to stomatal control (72-74), energy balance (75), or soil water balance (76-Figure 1. Flow chart of the calculation of the total cost function, J total (= J p + J sm + J fapar ) for simultaneous assimilation of soil moisture and FAPAR data as the sum of contributions from the parameter part (J p ) and for the two data terms, J sm and J fapar .Ovals denote data and rectangular boxes denote processes (i.e., code modules).

BETHY Soil Hydrology Scheme-Modified VIC 1L
To simulate surface soil moisture within BETHY in a way that can be compared to remotely sensed surface soil moisture, a modified version of the Variable Infiltration Capacity 1-layer (VIC-1 L) soil model by [38] was implemented in BETHY, for more details see [19].The VIC model has the distinct property of representing within-grid cell spatial variability of infiltration capacity of the soil in a statistical fashion, something that is particularly suitable for simulations at the scale of the SMOS footprint of several hundred km 2 [39].Therefore, a thin layer was added from which all soil evaporation occurs.The VIC-1 L evapotranspiration was also slightly modified, because in its original form, vegetation effects were not accounted for.This was needed to be consistent with transpiration and canopy throughfall of BETHY, and changed the base flow equation to prevent the soil from draining below field capacity.This new scheme is thus a 2-layer hydrology model with a surface and root-zone layer and the same 1-day time step as BETHY's original bucket model.
Soil moisture at saturation and at wilting point were computed from soil texture, total soil depth, and -for the root-zone layer-rooting depth.The values for rooting depth have been adapted from [12], taking account of the higher rooting depth of tropical trees and assigning a slightly higher rooting depth for temperate trees and shrubs and a lower one for grasses and tundra.A global soil depth and texture map and parameterization of soil porosity (saturation water content) and water potential was adapted from [40]: where θ is the soil water content, n is the soil porosity, ψ is the soil water potential, ψ s is the soil water potential at saturation, and b is the slope parameter of the retention curve.Details as well as validation of the modified VIC-1 L model are given in [19].For the simulations described in this study, we used a physical depth of the surface soil layer of 40 mm to be comparable to the SMOS observations.s n and s b are scaling parameters for n and b included in the optimization.Additionally, soil hydrological parameters were set as fixed values derived from the VIC_3L model [41].

Parameters
In this study, we optimized a total of 100 parameters (Table A2) belonging to BETHY.Some of the parameters were global, some differentiated by PFT, and some by soil texture class.Parameters 72 to 100 were added to the version of CCDAS after implementation of a new phenology scheme [16].For a detailed description of the parameters, 1 to 71, as well as the sensitivity we refer to [16].
The remaining parameters refer to stomatal control (72-74), energy balance (75), or soil water balance (76-100), and are described in detail in [19].Rooting depth was differentiated by PFT.The soil hydrology parameters, i.e., the scaling factors for soil porosity and the slope parameter of the retention curve, were differentiated by soil texture class.

SMOS Dataset
We used the reprocessed SMOS L3 (ver.300) daily soil moisture product (L-band, 1.4 GHz) from CATDS (www.catds.fr,last accessed 6 July 2018) for the assimilation.The SMOS L3 soil moisture product represents soil moisture at a spatial resolution of ~25 km (EASE grid) within a thin surface layer (4 cm).The data were filtered considering radio frequency interference (RFI) according to [19].Similar filtering was applied at grid points with a high probability of snow, frozen soil, water body, strong topography, and urban area.
We used the extended triple collocation (ETC) [42] analysis to calculate the uncertainty estimate for the SMOS L3 data.The other two datasets entered into the ETC calculation were the surface soil moisture computed by BETHY based on prior parameter values, and the Advanced SCATterometer (ASCAT) soil moisture observations (H109 Metop ASCAT DR2016 SSM time series 12.5 km sampling (see http://hsaf.meteoam.it)derived from an active microwave radar, C-band, 5.255GHz [43]).Then, the calculated uncertainty from ETC (model uncertainty) and uncertainty from data retrieval (data uncertainty) were formulated into the final SMOS soil moisture uncertainty, by: where σ SMOS is the SMOS soil moisture uncertainty used in the assimilation, and σ data and σ model are data and model uncertainty, respectively.Additionally, we corrected the remotely sensed SMOS soil moisture to obviate the need for accurate soil texture data, and to retain as much useful information from the SMOS measurements as possible [19].This was done by using the mean and standard deviation of the simulated surface soil moisture to correct the SMOS soil moisture: where s denotes the SMOS soil moisture (converted to mm), m s is the long-term mean and σ s its standard deviation, and S is the corrected SMOS soil moisture that is matched to the modeled surface soil moisture (in mm).m w and σ w are the long-term mean and standard deviation of the modeled soil moisture derived from the prior model parameter set.

TIP-FAPAR Dataset
In the assimilation experiments, we used FAPAR data (16-day resolution) derived from the Joint Research Center with a Two-stream Inversion Package (JRC-TIP) [44].The JRC-TIP FAPAR product was derived by the JRC-TIP algorithm from MODIS 0.01-degree albedo data [45].A quality assurance was performed at two steps in the FAPAR retrieval algorithm.First, the MODIS broadband albedo product comes with a quality flag and low-quality input albedos were discarded before the JRC-TIP was applied.Second, dubious retrieval results were detected and if possible, improved in an iterative procedure [46].Uncertainties for the FAPAR data were derived from the MODIS input albedo uncertainty by error propagation [47].For more detailed information on the uncertainty propagation, we refer to the related papers [48].
We added an extra uncertainty of 0.05, taking into account very sparse observations (every 16-days).The total uncertainty of the FAPAR data was calculated as: where σ data is the uncertainty of the product.σ FAPAR is the FAPAR uncertainty used in the assimilation.
A similar bias correction for the soil moisture observations (Equation ( 5)) was done for the JRC-TIP FAPAR product.

FLUXNET Dataset
Estimates of daily net ecosystem exchange (NEE) were provided by FLUXNET2015 (www.fluxdata.org)for 8 of the 10 study sites, and for the two remaining sites by the European fluxes database (site ES-LMa; see www.europe-fluxdata.eu,last accessed 6 July 2018) and by the AmeriFlux data base (site US-Kfs; see https://ameriflux.lbl.gov, last accessed 6 July 2018), respectively.The NEE data covers the time period of interest of this study (2010-2014).In-situ daily soil moisture data covering 2010-2014 period from two sites, AU-DaS and US-Kfs, were also used for validation of the soil moisture simulations.

SIO CO 2 Concentrations
We used the same global setup as [19] in the 8 • × 10 • spatial resolution for validating our posterior net carbon flux results against atmospheric CO 2 exemplarity at two stations.The simulation period spanned from 2009 to 2015, with 2009 as an additional spin-up year.Monthly CO 2 concentrations were taken from the Scripps Institution of Oceanography (SIO) network (http://scrippsco2.ucsd.edu/,last accessed 6 July 2018).We used 2 stations (Table A3), for which both SIO and the Carbon Cycle Cooperative Global Air Sampling Network of the National Oceanic and Atmospheric Administration's (NOAA) Earth System Research Laboratory (ESRL) (ftp://aftp.cmdl.noaa.gov/data/trace_gases/co2/flask/surface/, last accessed 6 July 2018) CO 2 observations were available.As in [49], the uncertainties of the CO 2 observations were calculated based on the maximum of two values: The first one is the absolute difference between the SIO and the ESRL monthly mean values, and the second one is the standard deviation from a spline fit according to [50].

Assimilation Experiments
We performed three groups of assimilation experiments for both single-site and multi-site cases.In the first experiment (denoted as 'sm'), we assimilated only SMOS L3 soil moisture data, in the second experiment (denoted as 'fapar'), we assimilated only JRC-TIP FAPAR data, and in the third experiment (denoted as 'sm+fapar'), we assimilated both SMOS L3 soil moisture and JRC-TIP FAPAR data.As mentioned, for two of the selected sites, DK-Sor and FI-Sod, no soil moisture data were available for assimilation due to the filtering of the SMOS L3 data set with high RFI.Therefore, only JRC-TIP FAPAR data were assimilated at these two sites.This resulted in 29 assimilation experiments (8 for the single-site 'sm', 10 for the single-site 'fapar', 8 for the single-site 'sm+fapar', 3 for the multi-site assimilation).
We note here the different spatial resolutions of the remote sensing products used in the assimilation (~25 km for the SMOS soil moisture and ~1 km for the JRC-TIP FAPAR as mentioned in the previous section) and the even smaller footprint (~few hundred meters) of the eddy covariance FLUXNET data used for validating our results.We also note the difference in the temporal resolutions of the SMOS soil moisture product (daily) and the JRC-TIP FAPAR product (16 daily).Hence, the number of soil moisture data points was higher than the FAPAR data points (6887 versus 1218 in the 'sm+fapar' multi-site case).However, our objective was to use all the observational information at the temporal and spatial scales available and not to achieve an equal balance between observational data streams in terms of numbers of observations.In any case, the weight of an individual observation in the cost function calculation was a consequence of the specified uncertainty for this observation.In that sense, it was difficult to judge the constraint of an observational data streams on the model by the number of data points alone.The temporal resolution of a dataset determines the temporal scales that can be resolved by the data.The daily SMOS L3 product will be able to capture this temporal variability and constrain surface soil moisture and the associated hydrological processes on this time scale.However, phenological quantities (such as LAI) show less high temporal variability than soil moisture, therefore, the 16 daily FAPAR observations were of sufficient temporal resolution to constrain the phenology part of the model.

Optimization Performance
Table 2 shows a summary of the optimization results for the different experiments.For all three experiments, cost function values were largely reduced, with some exceptions in the 'fapar' experiment, where the initial cost functions were already very small (AU-DaS, 0% reduction in the cost function value, FR-Pue 1.38% reduction, and US-Wkg 7.74% reduction).In general, for all experiments, the optimization was successful, with gradients close to zero.In the 'sm+fapar' experiments, the cost function values were reduced by 16.37% to 69.28% in the case of single-site assimilation, and it was reduced by 49.49% in the case of multi-site assimilation.

Surface Soil Moisture and FAPAR
We calculated the cost function, J, over the simulation period both for the prior and the various posterior results to have a quantitative measure for the improvements obtained by the different assimilation experiments.We used the cost function as a performance metric for the assimilation because the cost function also considers the observation uncertainty (Equation (1)).Therefore, it was more suitable for quantifying the model performance with respect to the different variables.
Table 3 summarizes the cost function values for soil moisture (denoted as J sm hereafter).Generally, the assimilation of the satellite datasets improved the model performance with respect to soil moisture.However, improvements differed between sites and depended sensitively on the assimilated dataset.As shown in Table 3, assimilation of soil moisture alone or simultaneously with FAPAR achieved the best performance in simulating soil moisture as indicated by the smallest J sm values.Assimilating FAPAR only ('fapar') resulted in a larger J sm , and thus inferior results for the simulated soil moisture than the other experiments for both single-site and multi-site cases.In addition, we validated the posterior surface soil moisture values against in-situ soil moisture observations at eight sites from the FLUXNET2015 dataset and calculated the relative changes in R 2 (determination coefficient) and RMSE (Root Mean Squared Error), shown in Table 4.For seven of the sites, either the R 2 was improved (positive relative change values), or RMSE was improved (negative relative change values), or both R 2 and RMSE were improved (four sites).As shown in Figure 2, at the site, US-Wkg, the RMSE for soil moisture reduced from 2.51 mm in the prior case to 1.92 mm in the posterior case.For the site, FI-Sod, located at high latitudes, with very limited observations available (427 observations during 2010-2015), both R 2 and RMSE were not improved in the posterior case compared to the prior.The simultaneous assimilation of SMOS soil moisture and JRC-TIP FAPAR in the multi-site case was shown to improve not only carbon fluxes, but also, as expected, the soil moisture content in the surface layer.Similarly, Jfapar values are shown in Table 5.In the 'sm' experiment, Jfapar values did not decrease much or even increase in comparison to the prior simulation.The smallest Jfapar values among the three experiments were achieved for the single-site assimilation, except for sites, FR-Pue, NL-Loo, and US-Wkg.For these sites, the cost function values for parameters (Jp) were much smaller, which resulted in smaller total cost function values (Jtotal) in the single-site optimizations.In the 'fapar' case, single-site assimilation achieved smaller Jfapar than multi-site assimilation.For the 'sm+fapar (multi)' case, the combination of SMOS soil moisture data with JRC-TIP FAPAR data resulted in a large cost function value for the parameters (Jp is 855.25).However, improvements in FAPAR in the 'sm+fapar(multi)' case were visible at most sites.For example, at site US-Kfs (Figure 3), when soil moisture and FAPAR were simultaneously assimilated at multiple sites, the FAPAR was constrained much better than the prior model simulation in matching remotely sensed FAPAR.Similarly, J fapar values are shown in Table 5.In the 'sm' experiment, J fapar values did not decrease much or even increase in comparison to the prior simulation.The smallest J fapar values among the three experiments were achieved for the single-site assimilation, except for sites, FR-Pue, NL-Loo, and US-Wkg.For these sites, the cost function values for parameters (J p ) were much smaller, which resulted in smaller total cost function values (J total ) in the single-site optimizations.In the 'fapar' case, single-site assimilation achieved smaller J fapar than multi-site assimilation.For the 'sm+fapar (multi)' case, the combination of SMOS soil moisture data with JRC-TIP FAPAR data resulted in a large cost function value for the parameters (J p is 855.25).However, improvements in FAPAR in the 'sm+fapar(multi)' case were visible at most sites.For example, at site US-Kfs (Figure 3), when soil moisture and FAPAR were simultaneously assimilated at multiple sites, the FAPAR was constrained much better than the prior model simulation in matching remotely sensed FAPAR.Three parameters, however, showed a large spread in the different assimilation experiments: ξ (rate for initial linear leaf growth), f C i , C 3 (the relation of maximum leaf internal CO 2 to ambient atmospheric CO 2 concentration for C3 plants), and emis0 (emissivity of the atmosphere).Large changes of these parameters were associated with the simultaneous assimilation experiments, that is 'sm+fapar'.
The largest changes in the initial leaf growth rate, ξ, were computed at the sites, CA-Gro and US-Wkg, in the 'smos+fapar' experiment.Sites dominated by C3 grass plant types (e.g., CN-Sw2 and US-Kfs) showed the largest changes for f C i , C 3 .

Posterior Parameters
In Figure 5, relative parameter changes, that is parameter changes from prior to posterior divided by their prior uncertainty, are shown.Some parameters related to phenology, water, and energy balance processes showed large changes (see parameters no.57-75, and no.89-97).This indicates that the observations of soil moisture and FAPAR have a major impact on the assimilation procedure and strongly constrain these processes in the model.
separately plotted in (b) due to its different range.Cross symbols denote the mean over all experiments and, error bars correspond to ±1 standard deviation.
Three parameters, however, showed a large spread in the different assimilation experiments:  (rate for initial linear leaf growth),  ,  (the relation of maximum leaf internal CO2 to ambient atmospheric CO2 concentration for C3 plants), and 0 (emissivity of the atmosphere).Large changes of these parameters were associated with the simultaneous assimilation experiments, that is 'sm+fapar'.The largest changes in the initial leaf growth rate, , were computed at the sites, CA-Gro and US-Wkg, in the 'smos+fapar' experiment.Sites dominated by C3 grass plant types (e.g., CN-Sw2 and US-Kfs) showed the largest changes for  ,  .

Posterior Parameters
In Figure 5, relative parameter changes, that is parameter changes from prior to posterior divided by their prior uncertainty, are shown.Some parameters related to phenology, water, and energy balance processes showed large changes (see parameters no.57-75, and no.89-97).This indicates that the observations of soil moisture and FAPAR have a major impact on the assimilation procedure and strongly constrain these processes in the model.S2.
A wide range of posterior values was computed for 0 (Figure 5, Parameter no.75).This parameter regulates the emissivity of the atmosphere and therefore has a strong impact on the surface water and energy fluxes through the radiation balance at the surface.At sites dominated by C3/C4 grasses (e.g., AU-DaS, ES-LMa, US-Kfs and US-Wkg), 'sm+fapar' experiments resulted in a remarkable range for 0 across the respective sites.C3/C4 grasses had smaller root depths (in the model the prior values amount to 0.5 m) in comparison to trees (prior values of 1.5-3.0m).Therefore, the growth conditions of grass in the model, but also in nature, depend strongly on soil moisture in the surface layer as well as on radiation.Moreover, surface conditions also differed from site to site due to roughness and surface cover variations, which potentially led to the large range of 0 values.A2.
A wide range of posterior values was computed for emis0 (Figure 5, Parameter no.75).This parameter regulates the emissivity of the atmosphere and therefore has a strong impact on the surface water and energy fluxes through the radiation balance at the surface.At sites dominated by C3/C4 grasses (e.g., AU-DaS, ES-LMa, US-Kfs and US-Wkg), 'sm+fapar' experiments resulted in a remarkable range for emis0 across the respective sites.C3/C4 grasses had smaller root depths (in the model the prior values amount to 0.5 m) in comparison to trees (prior values of 1.5-3.0m).Therefore, the growth conditions of grass in the model, but also in nature, depend strongly on soil moisture in the surface layer as well as on radiation.Moreover, surface conditions also differed from site to site due to roughness and surface cover variations, which potentially led to the large range of emis0 values.
Figure 6 shows the range of values for the two parameters that were spatially differentiated by soil texture.The parameters were related to the soil water retention curve, namely s b , scalar for water retention curve shape parameter, and s n , scalar for soil porosity.In Figure 6, the optimized values span a wide range across almost all soil textures.This indicates that there are large uncertainties in soil hydrology processes.In the BETHY model, soil hydrology was modeled by a bucket model with an overlying surface soil moisture layer [19].The soil water retention curve cannot be robustly constrained when soil moisture or FAPAR was assimilated as demonstrated by the large range of posterior values for the respective parameters.
Figure 6 shows the range of values for the two parameters that were spatially differentiated by soil texture.The parameters were related to the soil water retention curve, namely  , scalar for water retention curve shape parameter, and  , scalar for soil porosity.In Figure 6, the optimized values span a wide range across almost all soil textures.This indicates that there are large uncertainties in soil hydrology processes.In the BETHY model, soil hydrology was modeled by a bucket model with an overlying surface soil moisture layer [19].The soil water retention curve cannot be robustly constrained when soil moisture or FAPAR was assimilated as demonstrated by the large range of posterior values for the respective parameters.When comparing the 'sm' experiment and the 'fapar' experiment, a sort of a "trade-off" between soil moisture and FAPAR becomes evident in the assimilation.The 'sm' assimilations resulted in relatively large changes in the parameter, emis0, that is the emissivity of the atmosphere, and the parameter, s , which controls the shape of the water retention curve (see [19]).To the contrary, both parameters hardly changed when assimilating FAPAR only ('fapar').For 'fapar' experiments, parameters related to phenology experienced the largest changes relative to their prior values, e.g., T (no.62-63), which is the spatial range (essentially the with-in grid cell variability) of the phenology temperature trigger, k (no.67-68), which denotes the inverse of leaf longevity, and τ (no.69-71), which parameterizes the length of dry spells before leaf shedding.

Soil Moisture and FAPAR Controls on Soil Hydrology
Surface soil moisture and FAPAR data control the soil moisture process in different ways.FAPAR controls the photosynthetic processes and phenology as represented by the model.Both influence plant water uptake indirectly under water-limited conditions.In other assimilation studies, [15] and [18], the authors also noticed that the inclusion of FAPAR in the assimilation resulted in significant adjustments in the maximum plant-available soil moisture.The impact of FAPAR on the model's soil hydrology is not fully understood and obviously rather indirect compared to the assimilation of soil moisture data.Besides of FAPAR's role in controlling the model's phenology, the assimilation of FAPAR also constrains soil hydrology related parameters [15].
Vice versa, the assimilation of soil moisture observations constrains also the model's phenology and photosynthesis by influencing the water uptake and surface energy balance, as was When comparing the 'sm' experiment and the 'fapar' experiment, a sort of a "trade-off" between soil moisture and FAPAR becomes evident in the assimilation.The 'sm' assimilations resulted in relatively large changes in the parameter, emis0, that is the emissivity of the atmosphere, and the parameter, s b , which controls the shape of the water retention curve (see [19]).To the contrary, both parameters hardly changed when assimilating FAPAR only ('fapar').For 'fapar' experiments, parameters related to phenology experienced the largest changes relative to their prior values, e.g., T r (no.62-63), which is the spatial range (essentially the with-in grid cell variability) of the phenology temperature trigger, k L (no. 67-68), which denotes the inverse of leaf longevity, and τ W (no. 69-71), which parameterizes the length of dry spells before leaf shedding.

Soil Moisture and FAPAR Controls on Soil Hydrology
Surface soil moisture and FAPAR data control the soil moisture process in different ways.FAPAR controls the photosynthetic processes and phenology as represented by the model.Both influence plant water uptake indirectly under water-limited conditions.In other assimilation studies, [15] and [18], the authors also noticed that the inclusion of FAPAR in the assimilation resulted in significant adjustments in the maximum plant-available soil moisture.The impact of FAPAR on the model's soil hydrology is not fully understood and obviously rather indirect compared to the assimilation of soil moisture data.Besides of FAPAR's role in controlling the model's phenology, the assimilation of FAPAR also constrains soil hydrology related parameters [15].
Vice versa, the assimilation of soil moisture observations constrains also the model's phenology and photosynthesis by influencing the water uptake and surface energy balance, as was demonstrated by [19].In fact, assimilation of surface soil moisture corrects the process parameters in the model's hydrological scheme.Because surface and root zone soil moisture share common parameters, such as soil porosity (which depends on the soil texture in a given gridcell), to calculate soil water properties, the assimilation of surface soil moisture also constrains (through these parameters) plant-available soil moisture.
When assimilating soil moisture only ('sm'), single-site optimizations achieved better soil moisture results for most sites than multi-site assimilation (Table 3).For the FR-Pue site, the single-site optimization resulted in a larger J sm than the single-site 'sm+fapar' optimization did.This was probably due to the relatively small number of SMOS soil moisture data points at FR-Pue (31) for the period from 2010 to 2015, which moreover showed nearly no temporal variability.An optimized parameter set for this site using these few and relatively constant observations was obviously difficult to compute in a single-site approach.Therefore, in the 'sm+fapar' case, the FAPAR observations provided a stronger constraint through common parameters on the soil moisture simulation than the actual observations at FR-Pue.We also noticed that the smaller J sm at 'sm+fapar' for single-site optimization is compensated with a larger J p .This indicates that due to limited soil moisture observations, model parameters were over-adjusted to achieve a better performance.
Better results for soil moisture were obtained when soil moisture and FAPAR were simultaneously assimilated ('sm+fapar'), with slight differences between the single-site and multi-site cases.This was due to specific site characteristics, which were not easily detectable in the multi-site assimilation and the processes embedded in the model.For example, the sites, AU-DaS, CN-Sw2, FR-Pue, and US-Kfs, are dominated by either grassland or cropland (or both).Water transport processes, however, were difficult to simulate at these sites by our optimization process since they are strongly affected by human management processes, such as irrigation, harvest, or grazing, which are not represented in the model.Another complicated case for our assimilation approach can be seen at the site, CA-Gro, which is located in North America and dominated by a cold climate.The soils at this site are subject to freezing/thawing conditions and snow cover for some periods of time.The associated cold region's soil hydrology is complicated and not fully represented in the model.
When soil moisture and FAPAR were simultaneously assimilated ('sm+fapar'), model parameters related to soil hydrology processes were constrained more than the ones related to phenology processes.There are two reasons for the stronger impact on the model's soil hydrology: (1) The assimilated soil moisture data control directly the surface water and energy balance, which then directly affect plant phenology changes.On the other hand, the FAPAR data have a direct control on the plant phenology, which then only indirectly influences the hydrological cycle.(2) The different temporal resolutions of soil water versus FAPAR data.The FAPAR data have a 16-day temporal resolution, while SMOS L3 data have a daily resolution; hence there were many more SMOS L3 data than FAPAR.In a simultaneous assimilation of both data sets, this favours a stronger constraint on soil moisture than FAPAR.

Model Representation of In-situ Carbon Flux Observations
In a next step, we performed an independent analysis of the simulation results.In Figure 7, the Net Ecosystem Productivity (NEP) and Gross Primary Productivity (GPP) from site CA-Gro as simulated by the different assimilation experiments are compared with FLUXNET NEP and GPP data.For the whole 10 sites, results are shown in Figure A1.When comparing the eddy covariance measurements against our simulation results, we needed to keep in mind the difference in footprint size of the flux towers (typically a few hundred m 2 ) to the spatial resolution of the products used in the assimilation (~600 km 2 for the SMOS soil moisture and ~1 km 2 for the JRC-TIP FAPAR).Therefore, we did not expect a perfect fit against the observations even with the posterior model.In fact, there were large differences of the various optimized NEP values.As shown in Figure 7, the assimilation of SMOS and FAPAR at multi-site achieved the best performance for NEP and GPP for site CA-Gro.For most of the sites, the optimized NEP and GPP captured the seasonality of observations very well, except for two sites, AU-DaS and site US-Wkg (Figures A1 and A2).As above, a major influence of land management processes not represented in the model seems a probable explanation for this deviation.Both sites are dominated by C4 grasses (C4Gr) and evergreen shrubs (EvShr) and are characterized by an arid/semi-arid climate.Both sites are strongly influenced by human activities, such as farming and grazing.For the other sites, modelled NEP and GPP matched the fluxes for the non-growing season fluxes much better than for the growing season.For example, at the site CN-Sw2 (Figure S1c), the peak value of modeled NEP was much higher than the observations.For this site, only measurements for the year, 2011, were available.This site experienced a severe drought in 2011, causing a significant decrease in observed NEP during the growing season.Only the prior and soil moisture assimilation experiment in the single-site case detected the changes in July (Figure S1c).A similar decrease in observed NEP was detected at the site, US-Wkg (Figure S1j) between June and July.The discrepancy between modeled and measured NEP indicates again that the model was not able to capture the reduction in NEP in dry seasons.
Table 6 shows the relative changes of RMSE (%) to prior simulation for NEP (RMSENEP) using the eddy covariance measurements at the sites and the simulated NEP from the various experiments.Generally, there was a reduction of the RMSENEP values of up to 43% when compared to the prior simulation.However, especially when only one dataset was assimilated in a single-site experiment, the agreement with the NEP observations was occasionally degraded.In general, better results were achieved when soil moisture and FAPAR were simultaneously assimilated in the multi-site For the other sites, modelled NEP and GPP matched the fluxes for the non-growing season fluxes much better than for the growing season.For example, at the site CN-Sw2 (Figure A1c), the peak value of modeled NEP was much higher than the observations.For this site, only measurements for the year, 2011, were available.This site experienced a severe drought in 2011, causing a significant decrease in observed NEP during the growing season.Only the prior and soil moisture assimilation experiment in the single-site case detected the changes in July (Figure A1c).A similar decrease in observed NEP was detected at the site, US-Wkg (Figure A1j) between June and July.The discrepancy between modeled and measured NEP indicates again that the model was not able to capture the reduction in NEP in dry seasons.
Table 6 shows the relative changes of RMSE (%) to prior simulation for NEP (RMSE NEP ) using the eddy covariance measurements at the sites and the simulated NEP from the various experiments.Generally, there was a reduction of the RMSE NEP values of up to 43% when compared to the prior simulation.However, especially when only one dataset was assimilated in a single-site experiment, the agreement with the NEP observations was occasionally degraded.In general, better results were achieved when soil moisture and FAPAR were simultaneously assimilated in the multi-site assimilation case.Similar results were found for gross primary productivity (GPP), as shown in Figure A2 and Table A3.Recently, assimilation experiments were conducted at a site-scale using latent heat and FAPAR observations by [18].They noticed that the combination of two different datasets could lead to a higher uncertainty reduction in model parameters and in other model variables of interest than in single data set cases.In another study, single-site and multi-site assimilation experiments were compared with the CCDAS [51].They also demonstrated a comparable model performance for multi-site assimilation to single-site assimilation.Also, the multi-site assimilation was found to result in a lower cost function than the single site optimizations [52].We notice in Table 2 that for single-site and single dataset assimilations, our optimizations could not achieve as low final gradient values as in the multi-site, multi-dataset optimizations.

Model Representation of Atmospheric CO 2
We used atmospheric CO 2 concentration observations to corroborate our assimilation results.We used the posterior values from the multi-site, multi-dataset optimization in a global model set-up forced by the global CRUNCEP meteorological driving data, i.e., the terrestrial model simulated global carbon fluxes.The simulated net CO 2 exchange fluxes were then used as input together with a set of background CO 2 fluxes (ocean-atmosphere exchange fluxes, fossil fuel emissions, and land use change emissions) for the global tracer transport model, TM2, to simulate atmospheric CO 2 concentrations for the years, 2010 to 2015.This was essentially the model set-up as described by [19], but here it was only used in a forward modelling sense.Table 7 summarizes the various performance metrics (R 2 and RMSE for the CO 2 time series, and mean error (ME) in the annual growth rates calculated as the method for calculating the annual growth rate in ESRL, https://www.esrl.noaa.gov/gmd/ccgg/mbl/crvfit/crvfit.html) for the two stations, Cape Kumukahi (KUM) and South Pole (SPO, see Table A4 for details on the stations).Obviously, the optimized parameter set improved simulated CO 2 concentrations (Figure A3), as shown by the reduction in the average RMSE values in CO 2 concentrations over the whole time period: At station KUM, RMSE for CO 2 was reduced by 10.4%, and at station SPO, it was reduced by 20.6%.Similarly, the ME in the annual growth rates was improved by 5.5% for KUM and 11.3% for SPO, respectively.This demonstrates that the optimized parameters from multi-site assimilation using SMOS soil moisture and JRC-TIP FAPAR observation can provide a more reasonable parameter set for global scale model experiments.We note, however, that the R 2 values slightly worsened in the 'sm+fapar (multi)' experiment compared to the 'prior', by about 6% at the station KUM.This was mainly due to a reduction of the amplitude in the seasonal cycle in atmospheric CO 2 for the optimized case since, in contrast to [19], we did not assimilate atmospheric CO 2 observations in this study.

Conclusions
We performed data assimilation experiments at multiple and single sites with two sources of satellite datasets, namely SMOS L3 soil moisture and JRC-TIP FAPAR.These data sets were assimilated in the global terrestrial biosphere model, BETHY, over a time period from 2009 to 2015.The sites for the multiple and single site approach were 10 stations from the FLUXNET network.Model output was also analyzed and compared to independent NEP, GPP, and in-situ soil moisture observations at different sites.This is the first report on experiments that SMOS L3 soil moisture data were used in combination with remotely sensed FAPAR data within site-scale assimilation experiments in order to constrain model parameters.
In all cases, the cost function was reduced significantly, whether a single data set was assimilated or both simultaneously.However, the simultaneous assimilation of both data sets reduced the cost functions for the 10 considered FLUXNET sites to a larger degree (between 16.37% to 69.28%) than the experiments assimilating only one dataset.Parameters related to phenology, soil water, and energy balance processes showed the largest relative changes.
Generally, assimilation of FAPAR observations mainly changed parameters related to phenological processes in the BETHY model, and thus mainly improved simulated FAPAR.SMOS soil moisture, to the contrary, constrained efficiently the soil water and energy balance as computed by the model as indicated by larger changes in the related parameters, such as the emissivity of the atmosphere (emis0).In general, the combination of SMOS L3 soil moisture and FAPAR in a multi-site assimilation performed best in fitting both simulated soil moisture and FAPAR.Considering the range of optimal parameters computed in the different experiments, we conclude that a multi-site assimilation with various sources of well-processed observational data streams achieved the best overall results.
Soil moisture has shown to be important in controlling ecosystem production.Adding more detailed processes related to soil water and heat transfer in soil profiles to the model might therefore be key in improving the model response to extremes, such as drought, flood, or frost.
Posterior parameters were also analyzed by comparing the results from the different assimilation experiments.For single sites and single data set assimilation, some global and PFT dependent parameters showed a large range of posterior values.This was probably due to problems with sparse observational data, specific processes associated to a cold/arid climate, and more complicated phenology processes in certain PFT (e.g., crops and grasses) as imposed by management practices.In the assimilation experiments, the emissivity of the atmosphere (emis0) covered a wide range of posterior values, in particular considering that emis0 is a global parameter.Additional independent data on the radiative balance at the surface, such as outgoing longwave radiation, would help to constrain this parameter.
Comparison of simulated NEP with FLUXNET NEP observations as well as simulated soil moisture with FLUXNET in-situ soil moisture observations gave further evidence for the robustness of an assimilation approach using both SMOS and FAPAR to constrain the carbon cycle simulation in the terrestrial biosphere model.When comparing simulation results against the in-situ observations, one needs to keep in mind the differences in the spatial representation of the assimilated products with the in-situ observations as mentioned above (see Section 4.3).Nevertheless, the fit against the NEP observations was improved at six of the 10 sites.For soil moisture, we see a reduction of the RMSE error against the in-situ observations of 51.5% and 23.4% for two sites, respectively.Our results from the multi-dataset assimilation (simultaneous assimilation of soil moisture and FAPAR) are consistent with previous studies recommending a combination of these two datasets in constraining parameters in terrestrial biosphere models [18,19].Also, the evaluation of the model performance on a global scale against atmospheric CO 2 indicated that optimized parameters result in a better fit as demonstrated by, for instance, a reduction of the RMSE of 10.4% against data from Cape Kumukahi and 20.6% against data from South Pole.However, the deterioration of the seasonal cycle amplitude, as mentioned in Section 4.4 is a sign for still unresolved sub-spaces in the total parameter space, i.e., parts of the biosphere model cannot be constrained by soil moisture and FAPAR data alone.By adding atmospheric CO 2 observations in the assimilation, we expect to constrain also these unresolved sub-spaces, at least partly.In general, assimilating further independent and complementary observations will help to constrain such unresolved sub-spaces in the total parameter space.
We demonstrated here that a careful definition of prior parameters and their uncertainties is of importance for an efficient optimization of these parameters.The analysis of the assimilation results, however, revealed that for some sites, the model was not significantly improved.This was especially true for sites covered by grassland and crops, i.e., land cover types that are influenced by management processes, which are not included in the model.Besides a more detailed representation of water and energy balance processes, special attention should be paid to include more sites marked by an even wider spread in climate zones, climate variability, and ecosystem types.Such a wider spread and the addition of additional observations might help to constrain still 'unobserved' (i.e., unobserved by the assimilated soil moisture and FAPAR data) processes in the model as mentioned above.Using remote sensing data has the advantage of providing global data products for the assimilation, which will be further explored in future studies.6.The black line denotes simulation with prior parameter values, the blue line denotes multi-site assimilation result with soil moisture and FAPAR, the green line denotes multi-site assimilation result with soil moisture, the red line denotes multi-site assimilation result with FAPAR, the cyan line denotes single-site assimilation result with soil moisture and FAPAR from each site grid, the magenta line denotes single-site assimilation  6.The black line denotes simulation with prior parameter values, the blue line denotes multi-site assimilation result with soil moisture and FAPAR, the green line denotes multi-site assimilation result with soil moisture, the red line denotes multi-site assimilation result with FAPAR, the cyan line denotes single-site assimilation result with soil moisture and FAPAR from each site grid, the magenta line denotes single-site assimilation result with soil moisture from each site grid, and the dark yellow line denotes single-site assimilation results with FAPAR from each site grid.
result with soil moisture from each site grid, and the dark yellow line denotes single-site assimilation results with FAPAR from each site grid.

Figure S2.
Multi-year mean of simulated daily gross primary productivity (GPP) and observed data (dot) from FLUXNET.(a-j) corresponds to 10 sites listed in Table S3.The black line denotes simulation with prior parameter values, the blue line denotes multi-site assimilation result with soil moisture and FAPAR, the green line denotes multi-site assimilation result with soil moisture, the red line denotes multi-site assimilation result with FAPAR, the cyan line denotes single-site assimilation result with soil moisture and FAPAR from each site grid, the magenta line denotes single-site assimilation result with soil moisture from each site grid, and the dark yellow line denotes single-site assimilation results with FAPAR from each site grid.A3.The black line denotes simulation prior parameter values, the blue line denotes multi-site assimilation result with soil moisture and FAPAR, the green line denotes multi-site assimilation result with soil moisture, the red line denotes multi-site assimilation result with FAPAR, the cyan line denotes single-site assimilation result with soil moisture and FAPAR from each site grid, the magenta line denotes single-site assimilation result with soil moisture from each site grid, and the dark yellow line denotes single-site assimilation results with FAPAR from each site grid.

Figure 1 .
Figure 1.Flow chart of the calculation of the total cost function, Jtotal (= Jp + Jsm + Jfapar) for simultaneous assimilation of soil moisture and FAPAR data as the sum of contributions from the parameter part (Jp) and for the two data terms, Jsm and Jfapar.Ovals denote data and rectangular boxes denote processes (i.e., code modules).

Figure 2 .
Figure 2.Comparison of simulated surface soil moisture (0-4 cm) at site US-Wkg against in-situ soil moisture observations.The blue dot-dash line denotes the simulation with the prior parameters, the red line denotes multi-site assimilation results with both soil moisture and FAPAR used for assimilation ('sm+fapar (multi)'), and the black dot denotes in-situ observations from the FLUXNET sites at the top layer.

Figure 2 .
Figure 2.Comparison of simulated surface soil moisture (0-4 cm) at site US-Wkg against in-situ soil moisture observations.The blue dot-dash line denotes the simulation with the prior parameters, the red line denotes multi-site assimilation results with both soil moisture and FAPAR used for assimilation ('sm+fapar (multi)'), and the black dot denotes in-situ observations from the FLUXNET sites at the top layer.

Figure 3 .
Figure 3.Comparison of simulated FAPAR at site US-Kfs against JRC-TIP FAPAR.The blue dot-dash line denotes the simulation with the prior parameters, the red line denotes multi-site assimilation results with both soil moisture and FAPAR used for assimilation ('sm+fapar (multi)'), and grey dots error bars denote JRC-TIP FAPAR with uncertainty.

3. 3 .
Posterior Parameters BETHY has 23 global parameters.In Figure 4, changes of these global parameters (aggregated as mean and standard deviation across the different experiments) are shown.Most of the global parameters did not change much compared to the prior.

Figure 3 .
Figure 3.Comparison of simulated FAPAR at site US-Kfs against JRC-TIP FAPAR.The blue dot-dash line denotes the simulation with the prior parameters, the red line denotes multi-site assimilation results with both soil moisture and FAPAR used for assimilation ('sm+fapar (multi)'), and grey dots error bars denote JRC-TIP FAPAR with uncertainty.

3. 3 .
Posterior Parameters BETHY has 23 global parameters.In Figure 4, changes of these global parameters (aggregated as mean and standard deviation across the different experiments) are shown.Most of the global parameters did not change much compared to the prior.

Figure 3 .
Figure 3.Comparison of simulated FAPAR at site US-Kfs against JRC-TIP FAPAR.The blue dot-dash line denotes the simulation with the prior parameters, the red line denotes multi-site assimilation results with both soil moisture and FAPAR used for assimilation ('sm+fapar (multi)'), and grey dots error bars denote JRC-TIP FAPAR with uncertainty.

3. 3 .
Posterior Parameters BETHY has 23 global parameters.In Figure 4, changes of these global parameters (aggregated as mean and standard deviation across the different experiments) are shown.Most of the global parameters did not change much compared to the prior.

Figure 4 .
Figure 4. Posterior (i.e., optimized) values for global parameters in different assimilation experiments (bars).Values were normalized by prior uncertainty for representation in one figure (a); Λ ∼ is separately plotted in (b) due to its different range.Cross symbols denote the mean over all experiments and, error bars correspond to ±1 standard deviation.

Figure 5 .
Figure 5. Parameter value changes relative to prior for all assimilation experiments (29 in total), with a cross symbol indicating the mean of changes, box for standard deviation, and whisker for max and min values, parameter numbers correspond to the numbers listed in TableS2.

Figure 5 .
Figure 5. Parameter value changes relative to prior for all assimilation experiments (29 in total), with a cross symbol indicating the mean of changes, box for standard deviation, and whisker for max and min values, parameter numbers correspond to the numbers listed in TableA2.

Figure 6 .
Figure 6.Posterior (i.e., optimized) values for soil texture dependent parameters related to soil water processes, (a)  for different soil textures; (b)  for different soil textures.Normalization and symbols as in Figure 4.

Figure 6 .
Figure 6.Posterior (i.e., optimized) values for soil texture dependent parameters related to soil water processes, (a) s b for different soil textures; (b) s n for different soil textures.Normalization and symbols as in Figure 4.
Remote Sens. 2018, 11, x FOR PEER REVIEW 15 of 28 deviation.Both sites are dominated by C4 grasses (C4Gr) and evergreen shrubs (EvShr) and are characterized by an arid/semi-arid climate.Both sites are strongly influenced by human activities, such as farming and grazing.

Figure 7 .
Figure 7. Simulated (a) net ecosystem productivity (NEP); (b) gross primary productivity (GPP) compared against NEP and GPP data derived from FLUXNET (black dots) at site CA-Gro.The blue line denotes the simulation with the prior parameters, and the red line denotes multi-site assimilation results with both soil moisture and FAPAR used for assimilation ('sm+fapar (multi)').

Figure 7 .
Figure 7. Simulated (a) net ecosystem productivity (NEP); (b) gross primary productivity (GPP) compared against NEP and GPP data derived from FLUXNET (black dots) at site CA-Gro.The blue line denotes the simulation with the prior parameters, and the red line denotes multi-site assimilation results with both soil moisture and FAPAR used for assimilation ('sm+fapar (multi)').

Figure S1 .
Figure S1.Multi-year mean of simulated daily net ecosystem productivity (NEP) and observed data (dot) from FLUXNET.(a-j) corresponds to 10 sites listed in Table6.The black line denotes simulation with prior parameter values, the blue line denotes multi-site assimilation result with soil moisture and FAPAR, the green line denotes multi-site assimilation result with soil moisture, the red line denotes multi-site assimilation result with FAPAR, the cyan line denotes single-site assimilation result with soil moisture and FAPAR from each site grid, the magenta line denotes single-site assimilation

Figure A1 .
Figure A1.Multi-year mean of simulated daily net ecosystem productivity (NEP) and observed data (dot) from FLUXNET.(a-j) corresponds to 10 sites listed in Table6.The black line denotes simulation with prior parameter values, the blue line denotes multi-site assimilation result with soil moisture and FAPAR, the green line denotes multi-site assimilation result with soil moisture, the red line denotes multi-site assimilation result with FAPAR, the cyan line denotes single-site assimilation result with soil moisture and FAPAR from each site grid, the magenta line denotes single-site assimilation result with soil moisture from each site grid, and the dark yellow line denotes single-site assimilation results with FAPAR from each site grid.

Figure A2 .
Figure A2.Multi-year mean of simulated daily gross primary productivity (GPP) and observed data (dot) from FLUXNET.(a-j) corresponds to 10 sites listed in TableA3.The black line denotes simulation prior parameter values, the blue line denotes multi-site assimilation result with soil moisture and FAPAR, the green line denotes multi-site assimilation result with soil moisture, the red line denotes multi-site assimilation result with FAPAR, the cyan line denotes single-site assimilation result with soil moisture and FAPAR from each site grid, the magenta line denotes single-site assimilation result with soil moisture from each site grid, and the dark yellow line denotes single-site assimilation results with FAPAR from each site grid.

Figure S3 .
Figure S3.Comparison of observed CO2 concentrations from two sites from the Scripps Institution of Oceanography network with simulations from different experiments, (a) for site KUM; (b) for site SPO.'Prior' represents simulations with prior parameter values, 'sm+fapar (multi)' represents simulations with optimized parameter values from multi-site assimilation using SMOS soil moisture and FAPAR.

Figure A3 .
Figure A3.Comparison of observed CO 2 concentrations from two sites from the Scripps Institution of Oceanography network with simulations from different experiments, (a) for site KUM; (b) for site SPO.'Prior' represents simulations with prior parameter values, 'sm+fapar (multi)' represents simulations with optimized parameter values from multi-site assimilation using SMOS soil moisture and FAPAR.

Table 2 .
Summary of optimization results on cost function and gradient reduction for different assimilation experiments, 'sm' denotes soil moisture dataset, 'fapar' denotes FAPAR dataset, 'sm+fapar' denotes simultaneous assimilation soil moisture and FAPAR.

Table 3 .
Cost function values for soil moisture (J sm ) and parameters (J p ) for prior and different assimilation experiments, 'sm' with soil moisture only, 'fapar' with FAPAR only, 'sm+fapar' with soil moisture and FAPAR data combined, 'multi' denotes multi-site assimilation, and 'single' denotes single-site assimilation.

Table 4 .
R 2 and RMSE relative changes (%) to prior simulation for site soil moisture calculated against in-situ observations when SMOS soil moisture and FAPAR were assimilated simultaneously in multi-site case, denoted as sm+fapar (multi).

Table 5 .
Cost function values for FAPAR (J fapar ) and parameters (J p ) for prior and different assimilation experiments: 'Sm' with soil moisture only, 'fapar' with FAPAR only, 'sm+fapar' with soil moisture and FAPAR data combined, 'multi' denotes multi-site assimilation, and 'single' denotes single-site assimilation.

Table 6 .
RMSE relative changes (%, positive values mean RMSE was degraded) for net ecosystem productivity (NEP) in comparison with prior simulations, 'sm' with soil moisture solely, 'fapar' with FAPAR solely, 'sm+fapar' with combination of soil moisture and FAPAR data, 'multi' denotes multi-site assimilation, and 'single' denotes single-site assimilation.

Table 7 .
Summary of performance metrics for CO 2 concentrations and CO 2 annual growth rates at stations, KUM and SPO.