Estimation of FAPAR over Croplands Using MISR Data and the Earth Observation Land Data Assimilation System ( EO-LDAS )

The Fraction of Absorbed Photosynthetically-Active Radiation (FAPAR) is an important parameter in climate and carbon cycle studies. In this paper, we use the Earth Observation Land Data Assimilation System (EO-LDAS) framework to retrieve FAPAR from observations of directional surface reflectance measurements from the Multi-angle Imaging SpectroRadiometer(MISR) instrument. The procedure works by interpreting the reflectance data via the semi-discrete Radiative Transfer (RT) model, supported by a prior parameter distribution and a dynamic regularisation model and resulting in an inference of land surface parameters, such as effective Leaf Area Index (LAI), leaf chlorophyll concentration and fraction of senescent leaves, with full uncertainty quantification. The method is demonstrated over three agricultural FLUXNET sites, and the EO-LDAS results are compared with eight years of in situ measurements of FAPAR and LAI, resulting in a total of 24 site years. We additionally compare three other widely-used EO FAPAR products, namely the MEdium Resolution Imaging Spectrometer (MERIS) Full Resolution, the MISR High Resolution (HR) Joint Research Centre Two-stream Inversion Package (JRC-TIP) and MODIS MCD15 FAPAR products. The EO-LDAS MISR FAPAR retrievals show a high correlation with the ground measurements (r2 > 0.8), as well as the lowest average RMSE (0.14), in line with the MODIS product. As the EO-LDAS solution is effectively interpolated, if only measurements that are coincident with MISR observations are considered, the correlation increases (r2 > 0.85); the RMSE is lower by 4–5%; and the bias is 2% and 7%. The EO-LDAS MISR LAI estimates show a strong correlation with ground-based LAI (average r2 = 0.76), but an underestimate of LAI for optically-thick canopies due to saturation (average RMSE = 2.23). These results suggest that the EO-LDAS approach is successful in retrieving both FAPAR and other land surface parameters. A large part of this success is based on the use of a dynamic regularisation model that counteracts the poor temporal sampling from the MISR instrument.


Introduction
The Fraction of Absorbed Photosynthetically-Active Radiation (FAPAR) is recognised as an essential climate variable, and it plays an important role in biosphere and climate modelling [1].FAPAR is defined as incident solar radiation in the range 400-700 nm that is absorbed by the photosynthetic tissue of canopy [2] and, thus, is an important control on the photosynthetic activity of vegetation.FAPAR has been widely used for monitoring drought, biodiversity, land degradation, phenology, CO 2 emission studies and Dynamic Global Vegetation Models (DGVM) [3][4][5][6][7][8].Although we consider FAPAR to be a land surface parameter (e.g., only related to the land), the amount of direct and diffuse radiation affects its value [9,10].
In order to infer the state of the land surface, the inversion of physically-based models that describe the interaction of incoming radiation with the soil-leaf-canopy medium, typically based on radiative transfer (RT) theory, are generally used [11,12].The main benefits of using physically-based RT models is their ability to cope with different sensor properties (angular and spectral sampling characteristics, etc.) and that they are more generic than empirical approaches, as they incorporate basic physical laws (e.g., energy conservation) that are universally applicable, and should result in a more robust interpretation of the measurements.The retrieval of land surface parameters using RT models is complicated by the problem being 'ill-posed' [13].A well-posed problem is one that has a solution; the solution is unique and changes continuously with changing input.A problem that does not hold these conditions is an ill-posed problem [14].In the context of EO, it often means that an infinite number of land surface parametrisations results in equally likely predictions of the observations.Here, we can see "inputs" as the inputs to RT model, i.e., state variables (LAI, chlorophyll, leaf water content, etc.).One of the possible solutions for improving the situation is using a priori knowledge, such as physically-realistic parameter distributions and/or constraints on parameter smoothness (e.g., in time, space) [15,16].In practice, ill-posedness means that retrieved parameters have very large uncertainties.Prior information restricts the possible space of potential solutions.This strategy is deployed by the Joint Research Centre Two-Stream Inversion Package (JRC-TIP) product [17].Other sources of uncertainty in the retrievals arise from sparse observations, e.g., due to cloudiness or orbital and sensor design characteristics.These problems call for a credible and traceable uncertainty quantification framework that allows users to understand shortcomings in the inverted data.
A final comment on FAPAR products is that its magnitude is closely related to other biophysical properties, such as Leaf Area Index (LAI), leaf optical properties, single scattering albedo, etc.These parameters are often derived independently from the same original datasets in RT model inversion schemes, making a number of assumptions on, e.g., canopy structure, leaf optical properties, etc., that might result in inconsistencies between derived products' datasets.
In order to meet the requirements described above, this paper explores the use of the Earth Observation Land Data Assimilation System (EO-LDAS), a general purpose Data Assimilation (DA) framework, to invert a time series of surface directional reflectance observations from the Moderate-resolution Imaging Spectroradiometer (MODIS) and Multi-angle Imaging SpectroRadiometer (MISR) sensors to infer land surface parameters.We use these parameters (and associated uncertainties) to provide a consistent estimation of FAPAR.The results are compared with other products available over the same sites.Two recent papers provide an approachable and non-specialist overview of this area [18,19].
EO-LDAS is a system that allows interpreting spectral observations to provide an optimal quantitative estimate of the Earth surface state.It permits the combination of observations from different sensors despite differences in spatial and spectral resolution and acquisition frequencies.EO-LDAS is based on variational DA and uses physically-based RT modes to map from state (LAI, leaf and soil optical properties, for example) to observation space (in this case, surface directional reflectance).EO-LDAS essentially allows a flexible description of both the 'fit to the observations' using RT models and the prior information, either as parameter distributions, or temporal, or spatial regularisation constraints.
Previous EO-LDAS results have been validated using synthetic Sentinel-2 data [16,20].In [21], emulators (fast surrogate approximations to computationally-expensive physical RT models) are demonstrated within the EO-LDAS framework, showing that the addition of a simple regularisation dynamic model results in improved retrievals in a synthetic example that combines observations from Sentinel-2/MSI, Sentinel-3/SLSTR and Proba-V observations.In all of these studies, the authors found that adding temporal regularisation as an additional prior constraint resulted in a significant reduction of uncertainty in the estimates of the inferred land surface parameters.Here, we analyse the results of EO-LDAS temporal regularization with MISR observations by comparing them against ground-based FAPAR estimated over an agricultural test site [22] and against MEdium Resolution Imaging Spectrometer (MERIS) FAPAR at 300 m, MISR High Resolution (HR) JRC-TIP and MODIS MOD15 [23] products for 2001-2008.Multi-angular remote sensing of the land surface can help to reveal structural properties of the vegetation and thus to improve characterisation of the vegetation cover [24].Three widely-used instruments provide multi-angular information at the global scale: Polarization and Directionality of Earth Reflectance (POLDER) [25], Sea and Land Surface Temperature Radiometer (SLSTR) [26] and MISR [27].MISR has nine cameras pointed to directions from −70 • -70 • , four spectral bands from blue to near-infrared and spatial resolution at 275 m for the nadir camera and red band and at 1.1 km for others.A number of studies has demonstrated that exploiting multi-angular information of MISR can improve retrieval of LAI and FAPAR [28][29][30][31].One physically-based approach for deriving FAPAR from 275 m is the JRC-TIP approach that uses MISR resolution data [32,33] to invert a radiative transfer model.This package is based on a two-stream model [34] and uses prior information to constrain the RT model inversion.In addition, this application provides information about theoretical uncertainties for both output state parameters and output fluxes.JRC-TIP output was tested against independent parameter estimates over a range of different areas [17,35].
The next sections describe the test site, EO data and FAPAR retrieval algorithms with their respective definition and assumptions.Following this, we present a short summary of the theoretical basis of EO-LDAS and FAPAR estimation.We then present the retrieved values from EO-LDAS and proceed to compare them with ground measurements.We also show and discuss comparisons with other products and discuss these comparisons.We finally draw some conclusions.

Test Site Description and Ground-Based Data Collection
The study area is comprised of the three agricultural FLUXNET sites, US-Ne1 (41.165  [22,36].Fields US-Ne1 and US-Ne2 are irrigated, and US-Ne3 is rainfed.The growing season is approximately from May-October. In situ measurements of FAPAR were carried out from 2001-2008 between June and September/October with an interval of 2-6 days.These measurements were performed with a Li-Cor quantum sensor (LI-COR Inc., Lincoln, NE, USA) by detecting incoming photosynthetically active radiation (PAR inc ), PAR reflected by the canopy and soil (PAR out ), PAR transmitted through the canopy (PAR transm ) and PAR reflected by the soil (PAR soil ) (Equation ( 1)) [36].Hourly values of radiation measured throughout a day were integrated in order to get daily values when PAR inc was >1 µmol m −2 s −1 [22].A detailed description of the measurements protocols can be found in [22,36].In summary, the authors obtain the absorption PAR for only green elements, i.e., FAPAR green , with the following correction for which LAI green and LAI total have been measured through a destructive determination technique: It should be noted that this method does not take into account senescent leaves, which can lead to some underestimation of FAPAR at the end of the growing season.
One important technical issue concerns the different spatial scales of the ground and satellite-derived estimates, as well as the geo-location differences between in situ and MISR observations.Another problem is the way that in situ green FAPAR has been calculated (see Equation ( 2)): the common assumption of a linear correlation between LAI and FAPAR will inevitably introduce some errors.

MISR Observations
The EO-LDAS inversions use the MISR full resolution surface reflectance as inputs, i.e., at 275 m, using seven cameras acquiring in the nadir direction (camera: An), and (60  afterwards (cameras: Ca, Ba, Aa) and forward (cameras: Cf, Bf, Af).The acquired data have been pre-processed with the sharpening method of [33].MISR has four spectral bands with central wavelengths at 446 nm, 558 nm, 672 nm and 867 nm [37].The test area is acquired by MISR on the paths P27, P28 and P29.In every acquisition, the pixels that are closest to the positions of the flux towers were selected as being representative of the associated fields.This means that over the entire time period, the ground location of the selected pixels may change slightly.In order to maintain the native MISR pixel resolution, we did not re-sample pixels.Spatial standard deviation of surface Bidirectional Reflectance Factors (BRFs) in the 3 × 3 pixel area for eight years does not exceed 0.04, except for four dates (not shown here).
The surface BRF uncertainties were assumed Gaussian, with a zero mean and with a standard deviation of 0.05 units of reflectance for all bands.The uncertainty between bands was assumed uncorrelated.

The JRC-TIP MISR Product
The JRC-TIP approach generates surface biophysical products using broadband surface albedo in visible and near-infrared domains by the minimisation of a cost function J( x), derived from the log posterior distribution: where d is the MISR white sky albedo in visible and near-infrared broadbands; x t is the state vector, which is comprised of the inputs of the two stream model H t ( x t ) [35].The state parameters are: the effective Leaf Area Index (LAI); the background albedos (BA) r g (λ); the Single Scattering Albedo of leaf (SSA): and the backward/forward scattering efficiency r l (λ)/t l (λ) in each broadband.The ratio r l (λ)/t l (λ) < 1 (r l (λ)/t l (λ) > 1), then forward (backward) scattering is predominant; x t p is the state vector of prior information; C o is the covariance matrix of observations, which is responsible for the uncertainty in the observations; C p is the covariance of prior information.An efficient minimisation of Equation ( 3) is achieved by exploiting the adjoint code (the adjoint provides an efficient estimate of the gradient of a function and can be obtained using automatic differentiation tools [38]) of the two stream model [35], which allows the use of gradient descent methods.These methods exploit the availability of the gradient of the cost function provided by the adjoint to propose the search direction of the optimisation.Equation (3) can be seen as the sum of two terms: J obs ( x), which is responsible for the observations, and J prior ( x), which is the prior term: The surface reflectance values used in EO-LDAS and the surface albedo values used in the JRC-TIP correspond to the same pixel with a resolution of 275 m and have been derived with the same pre-processing chain used in the sharpening method and subsequent atmospheric corrections [33].The JRC-TIP assumes that the Leaf Angle Distribution (LAD) is spherical and that the a priori leaf spectra values correspond either to 'polychrome' (standard) or 'green' leaf.The polychrome leaf is vaguer about the leaf single scattering albedo, whereas the green leaf has a very tight prior distribution of single scattering albedo that is broadly consistent with a healthy green leaf [17].We will include this latter assumption, as it is consistent with choices in the MERIS algorithm [39,40].The JRC-TIP FAPAR corresponds to the post-processing absorbed fluxes in the visible broadband, i.e., in the PAR domain.However, we note that for the JRC-TIP, the absorption corresponds to the white-sky value, i.e., absorption under diffuse radiation.

The JRC MERIS FAPAR Product
The MERIS FAPAR Full Resolution products come from the operational ESA products.They do not yet contain associated uncertainties or/and updated cloud masking, which is planned for the fourth reprocessing [41,42].In this work, we use a version with uncertainties.The design of the MERIS FAPAR retrieval is based on a two-step procedure, where the spectral radiances measured in the red and near-infrared bands are first rectified in order to ensure their decontamination from atmospheric and angular effects.The outputs are then converted into FAPAR by using a function that has been defined by fitting it to pairings of input rectified reflectances and output FAPAR [40].This retrieval method assumes that the leaves are alive and photosynthesising, hence the name green FAPAR.We assume that ω l (λ) and the leaf single scattering albedo are fixed to a value representing standard leaves using standard biochemical leaf properties [41,42].In this study, we average MERIS values over 3 × 3 pixels around the central pixel in order to minimise the impact of remaining clouds and cloud shadow contamination [41,42].

The MODIS FAPAR Product
Finally, we also use the MODIS FAPAR product (Product MCD15A2H, Collection 6 [23]).The MODIS FAPAR product is produced at 500-m spatial resolution, which is a coarser resolution than the MISR (275 m) or MERIS (300 m) products, which can potentially lead to inconsistencies due to the spatial heterogeneity.The MODIS FAPAR product uses a look-up table approach to invert the reflectances acquired by the TERRA and AQUA sensors over a particular time window in the red and near-infrared bands [28,43].A radiative transfer model is used to populate the Lookup Table (LUT), with some assumptions on the optical properties of the leaves and soil used to constrain the inversion.

The EO-LDAS Approach
The EO-LDAS scheme [16] is a generic land Data Assimilation (DA) system, which uses a set of observational operators together with prior information and a process model to provide an inference on the state of the land surface that is a consistent interpretation of the observations, prior information and dynamic model.The inference on the different land surface parameters is quantified as a full-probability density function (pdf), which encodes the uncertainty in the state.EO-LDAS is implemented as a variational system, where the dynamic model is implemented as a weak constraint [44].In this work, we have used the eoldas_ng Python implementation available from [45].
A priori information reduces the volume of the solution space [15].The inclusion of a priori information derives directly from a Bayesian understanding of the inverse problem.In this sense, the a priori information is a probability density function (pdf) that describes the expected distribution of the state.Typically, normal or uniform distribution (the latter just indicating parameter boundaries) have been used [16,46].In this contribution, the prior distribution is Gaussian as required by the variational framework used and is made fairly uninformative (e.g., with a large variance) in order to test how the system generalises.
A particularly useful form of a priori information exploits the often smooth nature of the temporal or spatial evolution of the land surface state.These so-called "regularisation" methods [47][48][49][50][51][52][53][54] assume temporal and/or spatial correlation as part of the prior distribution, resulting in a much reduced uncertainty [16,21].In a similar vein, there are DA methods that exploit predictions of the land surface state from a dynamic vegetation model (typically a function of LAI, FAPAR) [55].A main disadvantage in the dynamic model approach is the lack of suitable models of the temporal and/or spatial evolution of many of the variables that have a direct control on the observations (e.g., equivalent leaf water or leaf chlorophyll content).Regularisation in this sense is the application of a zero-order model on the evolution of the land surface parameters [16,21,47].
The fundamental task of EO-LDAS is to infer the land surface state by minimising a cost function made up of three terms: where J obs ( x) is the observational constraint (or the fit to the data component); J prior ( x) is the prior constraint, which includes the departure of the state from its prior normal distribution, and J model ( x) is the dynamic model constraint, which penalises trajectories of the land surface that depart from those given by a dynamic model.The elements of x are shown in Table 2.In all three cases, the statistics are assumed to be Gaussian.Equation ( 6) is in effect the logarithm of the posterior of the inverse problem EO-LDAS tries to solve.Under the assumption of Gaussian statistics and weak non-linearities, the minimum of J( x) coincides with the Maximum A Posteriori (MAP) of the inferred state, and the uncertainty can be calculated as the inverse of the Hessian matrix at the MAP point [16,56].As in [16,21], we use the parameter transformations in [57] (shown in Table 2) to provide a quasi-linearisation of the model.The posterior uncertainties are determined as the main diagonal of the covariance matrix, which is an inversion of the Hessian matrix.i.e., uncertainties are estimated as level of steepness of curvature of decision space.The more data we have, the more chances to have a single deep minimum and correspondingly lower uncertainties.If we have less data or data that cannot be described by a model, the minimum becomes flatter, and the local minimum appears.The inverse of the Hessian provides a good approximation of covariance matrix for not too non-linear models [56].

Fit to Observations
The term responsible for fitting to the observations is J obs , in effect, the log-likelihood: where H( x) corresponds to an observation operator and C o is the covariance matrix describing the uncertainty in the observations.The observational operator H( x) is implemented as the coupling of the semi-discrete canopy RT model of [58], the spectral leaf optical properties RT model (PROSPECT) of [59] and an adapted version of the spectral soil model of Price [60].Note that this H( x) term is equivalent to the first term in the right-hand side of Equation ( 3), with the only difference here being that we operate with reflectance measurements, and the JRC-TIP uses broadband albedos.The observation operator is also consequently different.
The observational operator consists of three main terms: where R 0 corresponds to the BRF due to zero order scattering or in other words absence of scattering, only two transmissions through the canopy and reflection by soil background; R 1 and R m are the BRFs due to first order and multiple scattering, respectively.In EO-LDAS the soil background reflectance is implemented as: where φ 1 and φ 2 are Price's basis functions [60], weighted by two scalars, to be inferred.This model assumes a Lambertian soil.In order to infer the values of the Price functions, we have taken MODIS observations daily surface reflectance observations between 2009 and 2015 and selected observations between Days of Year 1 and 60, where NDVI was less than 0.25, assuming that these two conditions would result in observed bare soil.We fitted regularised linear kernel models [47] to calculate nadir illumination, nadir viewing reflectances and fitted the first two Price spectral basis functions over the seven MODIS bands using a standard least squares approach.The resulting values (see Table 2) were then used as a description of the soil throughout.We assume independence of observations for different bands and cameras.However, due to the atmospheric correction and sharpening procedures, the independence assumption between cameras and bands might not strictly hold.The magnitude of these potential correlations is unknown, so we ignore them here, i.e., C o is a diagonal matrix with per band variances on the main diagonal.

The Prior
The prior constraint is written as: where C p is the covariance matrix which describes uncertainty of the prior state; x p is the vector of prior means, again as it is for the JRC-TIP (Equation ( 3)).Equation (10) states that in EO-LDAS, the prior distribution must always be Gaussian.
In the case of LAI, we propose a simple model of temporal development based on a double logistic function [61], which can be appropriate for vegetation exhibiting a clear (single, annual) phenology in the Northern Hemisphere: where wP and mP are the expected minimum background and maximum values of LAI throughout the year; S is the spring 'green up' date (increasing point of inflection); A is the autumn 'senescence' or 'brown down' date (decreasing point of inflection); mS (mA) is related to the rate of increase (decrease) at the point of inflection S. Equation ( 11) provides an estimate of the mean LAI, and for the associated standard deviation, we have assumed a Gaussian temporal distribution: where b is the background value.SD is the LAI standard deviation associated with G, the Gaussian normal distribution: where µ is the mean of G and σ the standard deviation of G.We have used the same functional shape for LAI, leaf chlorophyll content and leaf equivalent water thickness.The values used in these functions are summarized in Table 1.These values come from fitting the double logistic function against AVHRR LAI over 20 years [62], and we thus obtain mP for LAI and S, A, mS and mA for LAI, chlorophyll and senescence.wP and mP are chosen as reasonable upper/lower bounds.Having this trajectory of standard deviation assumes that the model in Equation ( 11) is suitable for winter (low uncertainty (but also low, near-zero absolute values of vegetation cover), but the model is chosen to be uninformative in summer (very high uncertainty) (Figure 2).This is due to the fact that there are few observations during winter due to clouds, snow, etc.
We use this model to provide a prior mean and variance for LAI, leaf chlorophyll content and senescence.The remaining components of the state vector are assumed known and set to the values indicated in Table 2.The purpose of the model is to approximately predict seasonal development in the case where there are no satellite data for a long period, which is typically the case at the start and end of the year.Due to the large variance introduced by Equation ( 12) over the vegetation period, the prior only has an influence at the beginning and end of year, being otherwise uninformative when the vegetation is active.2.

Temporal Regularisation
The dynamic model provides a prediction of the temporal evolution of the land surface state.We make the simplest possible assumption, i.e., that over time, the land surface parameter does not change.However, clearly, this model has an error.In other words, we assume that the difference between the state between two consecutive time steps is Gaussian, with mean zero and a particular variance.The rationale behind this very simple model is that solutions where a large high frequency component is present are penalised, resulting in a smooth temporal evolution of parameters, as expected for the parameters of interest here and in the scales that are being considered.This constraint can be written as: where γ is the regularization parameter, which represents in this case the inverse of the model error variance and controls the smoothness of retrievals.D is the differential operator of the first order.Note that this constraint is also prior information (as stated above, it imposes a Gaussian distribution on first differences of parameters).We applied temporal regularisation to all state parameters listed as "dynamic" in the Table 2.The uncertainty associated with each parameter (i.e., γ in Equation ( 14)) was estimated by cross-validation.Each parameter was given a different value of γ, and the optimal value of γ changed from year to year, reflecting the different number of observations available every year.In this study, the Leaf Angle Distribution (LAD) is prescribed as a spherical distribution.Some testing of other distributions suggest that this was not a major influence, resulting in differences in retrieved mean a posteriori FAPAR by 3-5%.
To sum up, in this work, we have solved for LAI, leaf chlorophyll content and senescent material assuming these parameters evolve with time and using the regularisation assumption described above.All other parameters were assumed known and prescribed to values given in Table 2.The other prescribed spectral parameters have little or no effect in the spectral range of the considered MISR observations [63].The inferences on the three state components were done every five days.

Gaussian Process Emulators
A limitation of variational DA methods is that they require multiple evaluations of the cost function and its associated gradient within a gradient descent minimisation scheme.The use of advanced RT models in such schemes is therefore potentially computationally costly, further compounded by the fact that the gradient needs to be evaluated either numerically by finite differences [52], or using an adjoint of the RT model, which still takes significant time to evaluate.To overcome these limitations, we propose the use of Gaussian Process (GP) emulators [21,64,65].An emulator provides a prediction of the RT model output with respect to the input parameters.To do this, the emulator is trained with a limited set of RT model input/output pairs to be able to produce this mapping.GPs are fast, cope well with non-linear RT models and produce an estimation of uncertainty of the model output prediction that can be included in the DA scheme [21] (although in this case the uncertainty is very small compared to the observational error, so it is ignored).Additionally, GP emulators can be used to provide an estimate of the emulated model gradient.The emulation of various leaf, canopy and atmospheric RT models (semi-discrete, PROSAILand 6S) is demonstrated in [21], showing speed-up by a factor of 40,000 or more.In this contribution, we have used the Python implementation of the emulators provided in [66].

FAPAR
After the inference of land state parameters and associated uncertainties, the energy balance function of the semi-discrete model can be used for computing FAPAR.Note that this provides the absorption under direct illumination and requires as inputs the estimated LAI, height of canopy and leaf diameter values.In addition, the spectral parameters, i.e., soil albedo (rg), leaf reflectance (rl) and leaf transmittance (tl) from the PROSPECT model and the background albedo from the Price soil model, have to be converted into the Photosynthetically Active Radiation (PAR) region.The sun zenith angle has been computed for 12:00 local times over the three sites using [67].Figure 3 shows that the temporal trajectories of LAI, FAPAR, leaf chlorophyll content and the fraction of senescent leaves all broadly follow an annual pattern, with peaks in the summer months (around July-September).For LAI, the growing season is characterised by high uncertainties, a consequence of the saturation of reflectance for high LAI values and the small contribution of the prior term in this period (characterised by a large uncertainty).The temporal trajectories of the leaf pigments (chlorophyll and senescence) are characterised by large uncertainties, but the posterior mean shows a clear annual cycle, with chlorophyll leading senescence, as expected.This is a significant observation, as the temporal dynamics are not fixed by the prior term (the same functional form is used for all three parameters).The value of the proportion of senescent material does not go above 0.1, whereas at the end of the growing cycle one would expect there to be no senescent leaves.Large uncertainties in the parameters are due to complex interactions in how the parameters interact (with LAI and optical properties compensating each other).It is important to note that the dynamic model used in the eoldas_ng inversion results in temporal continuous inferences, even though no observations might be available on a particular date.The results from the other two fields (Figures 4 and 5) show a very similar behaviour, with both sites showing clear seasonalities for all retrieved parameters.

Results
The parameters retrieved and shown in Figure 3 were then combined with the parameters in Table 2 and used to run the semi-discrete model and predict FAPAR.The MISR time series of EO-LDAS derived FAPAR are compared to ground-based measurements over US-Ne1, US-Ne2 and US-Ne3 sites (see Figure 6).The retrieved FAPAR tends to track the ground observations, with most of the ground observations being within the uncertainty bounds during the peak vegetation period.The inferences tend to overestimate the start and end of the growing season, and to slightly underestimate at peak LAI.The reason for the underestimate at the start of the growing season is that there is a paucity of observations at this time, and the retrieval is governed by the dynamic model interpolating between the observation-rich high LAI period and the prior-driven period with no vegetation at the beginning of the year.Towards the end of the growing season, the dynamic model sometimes fails to track the fast changes in FAPAR, again due to poor observation availability.Uncertainties are much larger when observations are not available and dynamic model restores the data.This is especially noticeable when temporal gaps are larger than one month.For example at the beginning and end of year 2002 (Ne-1), beginning of year 2003 (Ne-1), end of year 2004 (Ne-3), etc.In Figure 7 we show the retrieved FAPAR, uncertainties as well as ground measurements for US-Ne1 for 2002 only.It is clear that there are no observations available between the beginning of May and mid-July, and so the regularisation results in an overestimation over that period.Note, however, that the paucity of observations results in a noticeable increase in uncertainty, which results in the ground measurements actually being within the 95% credible interval.
Figure 8 shows linear correlations between the retrieved FAPAR and the ground measurements for the three fields and all days for which ground measurements are available (top row) and also only for the dates where there are ground observations and coincident MISR overpasses (bottom row).It is clear from the top row in Figure 8 that the retrievals for low FAPAR are overestimated, and that this is caused by the interpolation provided by the dynamic model.When only points with observations are considered, the correlation increases, and the bias, slope and intercept all decrease, suggesting that the inversion works well where observations are available and the quality of the inferences drops as one moves away from the observations, with the dynamic model being too simple to track the changes in the rates of the process (as shown in Figure 7.In summary, we see that the correlation between retrieved and in situ FAPAR is high (r 2 > 0.8 for all dates, increasing to r 2 > 0.85 for retrievals coincidental with ground measurements) and the average RMSE is 0.14 (only at days of satellite acquisitions).The slope of the retrieved FAPAR is consistent with an overestimation of FAPAR at the start and end of the growing season, and a slight underestimate in summer.When MISR observations are coincident with ground measurements, the slope becomes closer to the 1:1 line and the bias in the linear model tends to vanish.
In Figure 9, we show the results of comparing ground-measured and retrieved LAI.The comparisons show that there is a strong correlation between retrieved and in situ LAI, but an important underestimation.One has to recall that any retrieval LAI from space or ground-based data correspond to an effective value which depends on the RT used during the retrieval or protocol in the case of in situ [68].In general, retrieved effective LAI is close to the in situ measurements when LAI is low (≤2).After that, the retrieved LAI is lower than the in situ measurements.As the canopy becomes optically thicker, the sensitivity of the MISR observations decreases.This is expected as the value is close to the theoretical limit of retrieval of LAI as described in [69].This is accompanied by an increase in uncertainty in the retrieved LAI estimate (see Figure 3).The summary statistics show that there is ample room for improvement: the correlations are 0.86, 0.80 and 0.68 (for US-Ne1, US-Ne2 and US-Ne3, respectively), slopes are 1.79, 1.86 and 1.55 (same order), and intercepts are −0.26,−0.36 and 0.28.RMSE values are 1.92, 2.03 and 2.16.
Although we have stated that the impact of the prior term should only be to tighten the retrieval in the period with no vegetation and few observations, this has not been demonstrated.In Figure 10, we show the comparison of retrieved FAPAR for the days of coincident satellite overpasses, where the prior is just set to a constant mean and uninformative (i.e., large) variance.The results are virtually the same as those shown in Figure 8, which demonstrates that the effect of the prior is minimal in the retrievals, as expected.

The JRC-TIP Results
The results of FAPAR from the JRC-TIP product is shown in Figure 11.We note that this product is only able to provide an estimate of FAPAR when satellite observations are available.Furthermore, we note that for a number of years, the JRC-TIP FAPAR results in a more peaky growing season, which often peaks earlier than the ground measurements.
Figure 12 shows the results from the JRC-TIP product assuming green leaves (in effect, a tighter prior on the leaf single scattering albedo).The results show a clear underestimate of FAPAR compared to the in situ measurements.We have also carried out a comparison between ground LAI and retrieved LAI, but we know that the retrieved LAI is effective [34,70].The RT model used in this retrieval has indeed been developed for climate modelling assimilation of space land products; therefore it is based on a two-stream theory and retrieval values from surface broadband albedo.In [34] is recalled the need for a correction of a structure factor at the pixel resolution associated with the heterogeneous nature of the canopy volume.In general, the correlation coefficient is low (0.21-0.64), and there is a large bias (>1.3), as well as a scatter of around 2. The large bias is due to the JRC-TIP saturating at around three, as explained in [17]

The JRC MERIS Product
Figure 13 shows a comparison of the JRC MERIS FAPAR product and the in situ measurements.In general, the MERIS product provides an accurate description of the FAPAR annual trajectory, although with some underestimation of FAPAR at the peak of the growing season, as well as some overestimates of the very low FAPAR values at the beginning and end of the growing season.

The MODIS FAPAR/LAI Product
Figure 14 shows FAPAR from the standard MODIS product MCD15A2H Collection 6.The results are generally good, the fine temporal sampling of using the two MODIS sensors resulting in a good coverage of the annual period.The estimates, however, tend to undershoot the peak FAPAR value consistently, and to overestimate the leading and trailing edges of the growing season.
The MODIS LAI product has a similar behaviour to the EO-LDAS LAI retrievals: with reasonably high correlations 0.84, 0.71 and 0.60 for US-Ne1, US-Ne2 and US-Ne3, respectively, but, again, resulting in an underestimate, particularly when the LAI is high.

Discussion
Comparisons of FAPAR retrievals for all products with ground-based data are shown in Figure 15 and Tables 3-5.The EO-LDAS results have the highest linear correlation among all the compared products, as well as the lowest root mean square error.There is a positive slope, in line with the comments made in the last section about underestimates of FAPAR during the start and end of the vegetation period.If only dates where MISR observations are available are taken into account (remember that of all these products, EO-LDAS is the only one that will produce estimates for the entire time series), then the correlation increases, the slope approaches unity, and the bias disappears.These are remarkable results since MISR has far fewer observations per year (around 18) than MODIS (around 200) or MERIS (around 50), and yet the retrieved EO-LDAS MISR FAPAR performance is better than the other products (Table 6).The small number of observations and the very different temporal sampling of the MISR data affect the values of the regularisation parameter γ retrieved by cross validation.In years with sparse observations, this can affect the impact of the dynamic model on the retrieved parameter trajectories.For example for US Ne-1, in 2001 (Figure 6), the regularisation parameter was found to be 3.6, in contrast to regularisation values around two orders of magnitude higher.This means that the character of the solution is likely to be less smooth and have higher associated uncertainties [48].
The poorest results in this comparison are with the JRC-TIP with green leaves, followed closely by the JRC-TIP with polychromatic leaves.The MERIS and MODIS products are comparable, with MERIS showing a slight negative bias.In general, all the tested products struggle to reach the highest in situ measured FAPAR, with a significant underestimate of FAPAR when in situ FAPAR is over 0.8.For EO-LDAS, this limitation can be traced back to the inability of the system to reach high values of LAI, a problem which is also expected with the JRC-TIP, where LAI is basically unconstrained whereas the optical properties of the leaves and soil are heavily constrained by priors.The other common striking feature of the comparisons is how the retrievals result in an overestimation of FAPAR when the in situ FAPAR is low (less than 0.3).This is a common feature of all products, but it is perhaps more exaggerated with the MODIS observations, which produce an overestimation of FAPAR where in situ values are below 0.6.In the case of the EO-LDAS retrievals, the effect is quite strong when all the in situ observations are considered, but the effect is attenuated when only the dates that have coincident observations are considered.By referring back to Figure 6, we explain this behaviour by pointing out that the low FAPAR values are usually in periods of very sparse MISR data (this is common to all years/sites), so that the effect is that of the system interpolating between two widely separated observations.An important feature of the EO-LDAS approach is the use of a dynamic model for the retrievals.As the uncertainty in the dynamic model (γ in Equation ( 14)) is constant, the trajectory of the state will tend to change not as rapidly as the observations on the ground, and this can lead to solutions that are too smooth in periods where the dynamics of the state are very fast.In our EO-LDAS results, the regularisation results in an increase in uncertainty, that although the MAP estimate overshoots the ground measurements, most of the measurements are within the uncertainty boundaries (except for a couple of cases of the earliest and latest stage of the growing season).Having more observations over those highly dynamic periods would alleviate this problem.In observation-sparse periods (e.g., 2001 for LAI in US-Ne1, Figure 3), observations produce a discontinuity in parameter trajectories, which is noticeable in the MAP solution, but small in terms of the total uncertainty.This artefact could be caused by locally defective convergence of the optimisation.In the case of abrupt changes, different techniques would need to be used, such as those presented in [71].
A further observation is that in the case at hand, we are using surface directional reflectance.The non-vegetative period is characterised by a rough soil surface with crop residue.The simple soil model is unable to properly model this period, for example being unable to, model BRDF effects present in the observations.The only freedom allowed by the model in this period is to modify canopy parameters to account for this, which is what causes the departure from zero of LAI in Figure 3 and the oscillations of C ab in the non-vegetative period.We can demonstrate this by running the following experiment: by fixing the leaf optical properties to sensible values and setting the LAI to be the (temporally interpolated) in situ LAI, we can then predict the observations.We can see the predictions and observations in the red and near-infra-red bands for 2002 and the US-Ne1 site in Figure 16.It is clear that the model fits the growing season reasonably well, but struggles with the bare soil period, indicating that here the RT model has problems replicating the data.Extending the RT model to have proper treatment of BRDF (for example, adding a Walthall [72] or Hapke [73,74] soil model) would alleviate this problem.Another approach would be to consider in the uncertainty budget in Equation ( 7) the model uncertainty.Under the assumption that the model uncertainty is (i) independent of the measurement uncertainty and (ii) normally distributed, we have that the observational constraint is given by: a case that is readily implemented in eoldas_ng, but where the model uncertainty encoded in C m might not be straightforward to assess.The effect of the prior introduced in this study goes somewhat to soften this problem, but leads to a dampening the contribution of J obs ( x) to the minimisation.It is instructive to compare retrievals from the JRC-TIP product and EO-LDAS.Both approaches share the philosophy of calculating FAPAR by running a RT model with a parametrisation of the land surface state derived from inverting observations.In Figure 17, it is clear that there is a strong correlation between the LAI value retrieved from both approaches (taking into account that the retrieved effective LAI), but it is also apparent that the Single Scattering Albedo (SSA) estimates from both approaches is anti-correlated.In Figure 18, we can see that the JRC-TIP estimation for the soil background albedo in the visible changes throughout the year, whereas in the EO-LDAS case we have just fixed a particular soil spectral model.LAI (middle panel in Figure 18) shows some agreement, with EO-LDAS providing a more realistic trajectory due to the interpolation.The uncertainties from both products have similar trends with low values when LAI is low and high values in summer.The temporal evolution of the SSA in interesting: the JRC-TIP solution barely shifts from the prior, whereas the EO-LDAS version has a clear seasonality as a consequence of leaf chlorophyll content and fraction of senescent leaves changing throughout the growing period.In this case, the use of spectrally-resolved observations results in a richer interpretation of the EO data, rather than working from an spectrally integrated broad band, as is the case with the inputs to the JRC-TIP product.The relatively good performance of the MERIS FAPAR algorithm is probably caused by its simplicity.Rather than targeting inference of a full set of land surface parameters, an equation is used to map from top of atmosphere reflectance to FAPAR, making strong assumptions on leaf single scattering albedo and soil background.Going through the EO-LDAS or JRC-TIP approaches needs the inference of a larger set of parameters, which necessarily results in larger uncertainties.However, a limitation of the MERIS algorithm is that it only produces an estimation where observations are present, and would need to be extended to cover different sensors simultaneously to produce a consistent FAPAR value.The method also does not estimate the underlying land surface parameters (e.g., LAI) which, if required for a particular application, would need to be obtained from a different product, potentially introducing inconsistencies e.g., due to different choices in the underlying RT model used for inversion.
The results from comparing the products that provide LAI estimates with ground LAI measurements show a common trend: all products underestimate LAI substantially.For the JRC-TIP, we note that this is expected as the LAI is effective [34], so this is not an entirely fair comparison.For the EO-LDAS MISR and MODIS products, we see an important underestimate of LAI when the ground value is high and effective LAI reach the saturation limits..The different spatial scales of the satellite and ground measurements can result in very different LAI values [70,75,76].It is interesting to note that the uncertainties in the retrieval of LAI with EO-LDAS MISR show a strong asymmetry (e.g., the uncertainty region above the posterior mean is much larger than the uncertainty region below the posterior mean) for high LAI (see Figures 3 and 4), which is a statement of the limited sensitivity of the observations to high LAI.

Concluding Remarks
The main focus of this study is the estimation of daily FAPAR values by using EO-LDAS on MISR data.The evaluation of results against JRC-TIP, MERIS FR, MODIS MCD15 and in situ measured 'green' FAPAR was carried out for single pixels covering three agricultural FLUXNET sites over eight years.We have compared results from a number of different approaches and sensors: the EO-LDAS and JRC-TIP approaches both use MISR data (surface directional reflectance for the former and broadbands bi-hemispherical reflectance for the latter).These two methods rely on the inversion of an RT model, auxiliated by prior parameter distributions, and in the case of EO-LDAS, a dynamic regularisation model that allows the inference even at times where no satellite observations are present.The JRC MERIS FAPAR product uses MERIS top of the atmosphere reflectance and a polynomial that maps these measurements to FAPAR, and the MODIS product uses a Lookup Table (LUT) that maps red and near-infrared surface reflectance to FAPAR.It is important to point out that the difference in temporal sampling between the three instruments: MISR has a much lower number of observations than MERIS, and MERIS has a much lower number of observations than MODIS.
We have compared the different products with in situ ground measurements, and these indicate that FAPAR is retrieved with reasonable accuracy for three products: EO-LDAS (r 2 > 0.8, RMSE < 0.18 in units of FAPAR), MERIS (r 2 > 0.6, RMSE < 0.2) and MODIS (r 2 > 0.8, RMSE < 0.16).The JRC-TIP results show a poor performance, with r 2 values between 0.3 and 0.6 and RMSE in excess of 0.26.If only dates with MISR overpasses are considered for EO-LDAS, the estimates from EO-LDAS improve to an r 2 between 0.85 and 0.94, an RMSE < 0.14 and with a bias that is between 2 and 7%.
All products have problems tracking the high FAPAR peak of the growing season, resulting in all products underestimating FAPAR at this point by 10-20%.Additionally, all products tend to overestimate FAPAR for the flanks of the growing season.
Three products (EO-LDAS, JRC-TIP and MODIS) also retrieve LAI (or effective LAI).The results all indicate that high LAI is underestimated.We propose that two processes are having an effect here: first, as the canopy becomes optically thicker, underestimation of LAI is expected [17], and secondly, the comparison of coarse resolution observations with point measurements introduces the effects of sub-pixel landscape heterogeneity [75,76].In the literature [77,78], the use of empirical methods that have been trained with ground observations of the same area limits the generality of the methods for global applications.Additionally, no simultaneous inferences on FAPAR are presented in either of these two references.For the three coarse-resolution products that we considered in this study, comparisons with in situ LAI result in (r 2 > 0.7, RMSE < 2.2) for EO-LDAS, (r 2 > 0.6, RMSE < 2.4) for MODIS and r 2 between 0.4 and 0.6, RMSE < 2.3 for the JRC-TIP (but again, note that LAI for the JRC-TIP is effective).The RMSE between all products and the ground observations is of the order of two units of LAI.In this study, we have used point measurements of LAI as a comparison.The recommended best practice is to use these measurements to provide a spatially explicit map, i.e., the minimum size of a validation site has to be compatible with resolution of satellite data [79,80].
Additionally, the MISR EO-LDAS approach also estimates the daily evolution of leaf chlorophyll content and fraction of senescent leaves.Both of these parameters show a credible temporal trajectory (although they have not been compared with any in situ measurements), with a clear arch for both when LAI is high, but showing leaf chlorophyll dropping off earlier than senescence and senescence growing later than leaf chlorophyll.This is a very encouraging result, as no timing information on these two parameters as provided in the priors pdf used, and only observations from four spectral bands in the visible-NIR region were available.
The MISR EO-LDAS approach is characterised by large uncertainties in both land surface parameters (LAI, C ab , C brown ), as well as fluxes (FAPAR).This is inevitable due to the poor temporal sampling.We note that high uncertainty in LAI during periods of high LAI is to be expected, and we note that this uncertainty (see Figure 3) is asymmetric: high above the MAP estimate and low below, encoding the inability of the RT model of interpreting the reflectance as a clear high value of LAI, but quite certain that it is larger than e.g., two.This is in marked contrast to the uncertainties in LAI for the JRC-TIP product (see Figure 18).The advantage of EO-LDAS is obtained by using the quasi-linearising transformations in Table 2.
In this paper, we deliberately only compare EO-derived estimates of FAPAR from approaches that rely on the inversion of RT models (and hence, ought to show some level of consistency and physical accuracy).However, we also note that there is a long history of exploiting the relationship between empirical vegetation indices (such as NDVI) and FAPAR.Recent work by [22] has used ground spectral measurements, as well as MODIS 250-m NDVI data to revisit these relationships for the sites that are considered in this study.In [22], different regressions relating NDVI to FAPAR are shown to produce a very large spread of possible values of FAPAR.The authors also derive a set of equations that can be applied to the site, but note that they need to be split by crop type (maize and soya) and by development (vegetative and reproductive) stage.The resulting correlations are similar to those reported in the present study, whereas other non-specific formulations of the mapping between NDVI and FAPAR show a large scatter, indicating that significant effort is required for local calibration to provide useful results.In [22], additional understanding of the crop type and phenology are required for optimal performance.Using the EO-LDAS approach, we have only used very generic priors detailing vegetation structure and leaf optical properties, although of course, more informative priors could be used, if they were available.As part of the process, estimates of widely-used land surface parameters have also been retrieved, along with well-quantified uncertainty.This is a particular feature of the EO-LDAS approach, which provides an important benefit over other methods.In particular, LAI has been retrieved with an error that is in the range of other typical global LAI products for this site, and more generally [81].The EO-LDAS approach also exploits temporal regularisation to provide a consistently gap-filled estimate of parameters even when no observations are available.The use of radiative transfer models within this regularisation framework allows for simple inclusion of observations from other sensors, as these are interpreted and assimilated in terms of common descriptions of the land surface.These are important advantages of the EO-LDAS approach presented here, and they have been proven to be useful: having a much poorer temporal sampling regime, results from the EO-LDAS MISR approach are in line with the JRC MERIS and MODIS FAPAR products.LAI retrievals are also in line with the other products, indicating that the approach based on interpretation of the observations using physical models succeeds in providing consistent estimates of FAPAR and other land surface parameters.
An additional important consideration is that the EO-LDAS retrieved parameters are self-consistent, i.e., the same physical (RT) model assumptions are made to retrieve e.g., both LAI and FAPAR.This is important if such parameter estimates are then used to drive models [8].Inconsistencies across suites of parameters can result in hard-to-trace model deviations and uncertainties in model predictions.Parameter consistency is hard, if not impossible, to ensure when using locally-calibrated empirical relationships where different parts of the state vector (e.g., LAI, leaf chlorophyll concentration or FAPAR) are derived separately.Further, the incorporation of priors makes explicit the assumption that both the chosen physical model and sets of priors are adequate for the task at hand.The use of Bayesian approaches with physical models necessitates clear statements about observational uncertainty [82,83].These ought to be provided as the standard with EO-derived surface reflectance products, but rarely are in practice due to the limitations of retrieval processes.Finally, although in this study we have ignored the uncertainty associated with using an RT model, this can be readily introduced in the EO-LDAS formalism, if information on the properties of this error were available.Unacknowledged model error can result in biases in the solution.
We demonstrate that at least for the particular sites shown here, EO-LDAS is able to estimate absorbed fluxes, other biophysical parameters and associated uncertainties relying on multi-angular surface reflectance observations.The EO-LDAS scheme also allows for a simple combination of other available observations, thus opening the door to multi-sensor estimates of fluxes and/or biophysical parameters.The estimation of uncertainties, as well as the retrieval of a complete set of ground biophysical parameters is also an important tool in providing data that allows us to learn more about the land surface.

Figure 1 .
Figure 1.Map of the test site (Landsat).Red crosses are the points of ground measurements.

Figure 2 .
Figure 2. Prior information for LAI (transformed units), leaf chlorophyll content (transformed units) and proportion of senescent material.The shaded region represents the uncertainty range in each case.Parameter transformations are shown in Table2.

Figures 3 -
Figures 3-5 show the evolution of LAI, leaf chlorophyll content, proportion of senescent material and FAPAR retrieved by EO-LDAS over the US-Ne1, US-Ne2 and US-Ne3 sites, respectively, between 2001 and 2008.

Figure 6 .
Figure 6.Comparison of FAPAR MISR time series between field measurements (yellow dots) and EO-LDAS predictions (red lines) over US-Ne1, US-Ne2 and US-Ne3 sites.Green lines indicate dates of MISR surface available data.Light and dark shaded areas correspond to uncertainties 95% and 75% credible interval, respectively.

Figure 7 .
Figure 7.Comparison of FAPAR MISR time series between field measurements (yellow dots) and EO-LDAS predictions (red lines) over the US-Ne1 site for year 2002.Green lines indicate dates of MISR surface available data.Light and dark shaded areas correspond to uncertainties 95% and 75% credible interval, respectively.

Figure 8 .
Figure 8. Scatter-plots of all available data pairs of in situ measured and MISR EO-LDAS derived FAPAR and field FAPAR for the investigated time period from 2001-2008.(Top panels) all available EO-LDAS data points; (bottom panels) only at days of satellite acquisitions.Blue lines were derived from least square regression, while 1:1 lines are depicted in grey.

Figure 15 .
Figure 15.Relationship between field measurements and JRC-TIP FAPAR for standard leaf (top panels), JRC-TIP Green FAPAR (second row of panels), 3 × 3 MERIS FR (third row) and MODIS MCD15A2H (bottom panels), the period between 2001 and 2008.Red lines were derived from least square regression, while 1:1 lines are depicted in grey.

Figure 16 .
Figure 16.Comparison of reflectance from MISR observations (An camera, US Ne1 site, 2002) and reflectance obtained from a forward run of the model with LAI fixed to the ground reference data.

Figure 17 .Figure 18 .
Figure 17.MISR: Comparison between EO-LDAS and JRC-TIP Leaf Area Index (LAI; upper panels) and leaf Single Scattering Albedo (SSA; lower panels) in the case of polychrome leaf assumption for the JRC-TIP.Red lines were derived from least square regression.

Table 1 .
Summary of the double logistic model and Gaussian distribution parameters.

Table 2 .
Summary of the state parameters.
Scatter-plots of data pairs of in situ measured and MISR EO-LDAS derived FAPAR without dynamical prior for the investigated time period from 2001 to 2008.Only data from days of satellite acquisitions are shown.Red lines were derived from least square regression, while 1:1 lines are depicted in grey.
(Top panels) all available EO-LDAS data points; (bottom panels) only at days of satellite acquisitions.Blue lines were derived from least square regression, while 1:1 lines are depicted in grey. .

Table 4 .
Summary statistics derived from the comparison of FAPAR field measurements (2001-2008) collected on the US Ne-2 site with all corresponding EO-LDAS derived values (EO-LDAS all), EO-LDAS derived values at days of satellite acquisitions (EO-LDAS Obs.) and values obtained from JRC-TIP, JRC-TIP green, JRC MERIS and MODIS MCD15A2H products.

Table 5 .
Summary statistics derived from the comparison of FAPAR field measurements (2001-2008) collected on the US Ne-3 site with all corresponding EO-LDAS derived values (EO-LDAS all), EO-LDAS derived values at days of satellite acquisitions (EO-LDAS Obs.), and values obtained from JRC-TIP, JRC-TIP green, JRC MERIS and MODIS MCD15A2H products.

Table 6 .
Average number of observations over all available years.