Sensitivity of Evapotranspiration Components in Remote Sensing-Based Models

Accurately estimating evapotranspiration (ET) at large spatial scales is essential to our understanding of land-atmosphere coupling and the surface balance of water and energy. Comparisons between remote sensing-based ET models are difficult due to diversity in model formulation, parametrization and data requirements. The constituent components of ET have been shown to deviate substantially among models as well as between models and field estimates. This study analyses the sensitivity of three global ET remote sensing models in an attempt to isolate the error associated with forcing uncertainty and reveal the underlying variables driving the model components. We examine the transpiration, soil evaporation, interception and total ET estimates of the Penman-Monteith model from the Moderate Resolution Imaging Spectroradiometer (PM-MOD), the Priestley-Taylor Jet Propulsion Laboratory model (PT-JPL) and the Global Land Evaporation Amsterdam Model (GLEAM) at 42 sites where ET components have been measured using field techniques. We analyse the sensitivity of the models based on the uncertainty of the input variables and as a function of the raw value of the variables themselves. We find that, at 10% added uncertainty levels, the total ET estimates from PT-JPL, PM-MOD and GLEAM are most sensitive to Normalized Difference Vegetation Index (NDVI) (%RMSD = 100.0), relative humidity (%RMSD = 122.3) and net radiation (%RMSD = 7.49), respectively. Consistently, systemic bias introduced by forcing uncertainty in the component estimates is mitigated when components are aggregated to a total ET estimate. These results suggest that slight changes to forcing may result in outsized variation in ET partitioning and relatively smaller changes to the total ET estimates. Our results help to explain why model estimates of total ET perform relatively well despite large inter-model divergence in the individual ET component estimates.


Introduction
Evapotranspiration (ET) has a critical influence on the cycling of water and energy and represents a crucial relationship for feedbacks and interactions between those cycles [1,2]. Future changes to regional and global climate are expected to significantly alter both the supply (precipitation, snow and groundwater) and demand (ET) side of the hydrological cycle [3,4]. These hydrological impacts

Models and Data
Three remote sensing-based evapotranspiration models were selected for this study: the Priestley-Taylor Jet Propulsion Lab (PT-JPL) model [28], the Penman-Monteith MODerate Resolution Imaging Spectroradiometer (PM-MOD) model [29] and the Global Land Evaporation Amsterdam Model v3.1 (GLEAM) [30,31]. These models are chosen because they have previously been compared against eddy flux ET measurements [25,32,33], as well as ET partitioning estimates from isotopes, sap flux and other methods [18]. Each model is widely used to estimate evapotranspiration and outputs the individual components of ET: transpiration, soil evaporation and interception evaporation. Both PT-JPL and GLEAM rely on the Priestley-Taylor equation to estimate a potential ET value [34] and then constrain that potential value to an actual estimate using multiplicative evaporative stress factors based on ancillary data. A detailed description of PT-JPL may be found in Fisher et al. (2008) [28], while GLEAM is described in   [31] and Martens et al. (2017) [30]. PM-MOD on the other hand uses a Penman-Monteith estimate of potential ET based on canopy and aerodynamic resistance parameterizations and then similarly constrains the potential value to an actual estimate. PM-MOD is described in depth by Mu et al. (2011) [29].
We used the WACMOS-ET forcing database (http://wacmoset.estellus.eu/) to run the three models, which is based on both remote sensing-based and atmospheric reanalysis data [17,25]. The WACMOS database also contains vegetation retrievals (Leaf Area Index (LAI), fraction of absorbed photosynthetically active radiation (f (APAR)) necessary to force the PT-JPL and PM-MOD models. However, these vegetation retrievals are inconsistent with the MODIS LAI and f (APAR) retrievals used to develop PT-JPL and PM_MOD and require rescaling before they can be used to force the models.
Here, we substitute the WACMOS-ET vegetation products with MODIS vegetation products originally used in the development of PT-JPL and PM-MOD (NASA LP DAAC, [35,36]). By using the MODIS vegetation products, we limit the possible ET errors associated with the adaptation and rescaling required to use the WACMOS-ET vegetation products in these models [25]. The WACMOS forcing dataset is presented at a common 0.25 • longitudinal × 0.25 • latitudinal grid, at 3-hourly temporal scale. However, GLEAM no longer provides sub-daily estimates of ET, while PT-JPL remains sensitive to maximum daily temperature and minimum daily humidity estimates. As such, we run GLEAM at a daily resolution using the aggregated forcing dataset and run both PT-JPL and PM-MOD at 3-hourly scale.
Remote Sens. 2018, 10, 1601 4 of 28 We force each model at 42 pixels over three years (2003)(2004)(2005), rather than the global dataset. These locations correspond to locations where previous studies have estimated ET components using field methods and have previously been used to compare modelled output against field based ET component estimates [18]. Despite the range of spatial support and uncertainty related to each field method, we assume that, in the aggregate, the field estimates offer a good means by which to evaluate the partitioning estimates of the remote sensing-based models. While limiting the dataset to these specific locations is not globally representative and neglects the model sensitivity in regions not represented in the data, it allows us to directly relate the results of this study to the results of Talsma et al. (2018) [18]. We can then determine if model propagation of forcing uncertainty found in our Monte Carlo analysis is consistent with error found between modelled and field estimates. Table 1 reports the latitude and longitude of each location used as well as the study reference, location, methodology, land cover type, precipitation and aridity. Many of the locations are within the mid-latitudes and may not reflect the sensitivity of the models in the tropics or sub-tropics. However, the dataset is not intended to be representative of any specific region and instead only presents a limited range of bio-regions and climates where field measurements have been made.
While some variables within the forcing dataset are shared among all three models, each model also requires unique variables. Table 2 shows the dynamic variables examined in this study that force each model and lists the original source product. In addition to the dynamic forcing, both PM-MOD and GLEAM require additional spatially-varying static variables. PM-MOD relies on the MODIS IGBP landcover type product to parameterize canopy resistances, while GLEAM uses static variables to describe soil properties (Global Soil Task Group, 2014) and vegetation fractions at the sub-pixel level (MOD44B) and adopts lightning frequency data to force the interception loss model. GLEAM also utilizes snow water equivalent and soil moisture data as dynamic variables. However, due to the length of the study period the soil moisture data assimilation component of GLEAM is not used and due to the sparse number of study locations the snow water equivalent was not analysed. Only the variables listed in Table 2 are analysed in this study. Table 1. List of 42 locations and references for studies using field methods to estimate ET components (transpiration, soil evaporation, interception). Aridity and annual precipitation are also listed. The majority of the sites are located in the mid-latitude Mediterranean and United States and are not globally representative. We use these locations to compare the results from our Monte Carlo analysis against the calculated difference between modelled and field estimates of ET components.

Monte Carlo Sensitivity Analysis
To assess each model's relative sensitivity to the forcing, we use a Monte Carlo style simulation [77,78]. This approach involves perturbing each variable by some random uncertainty drawn from a predetermined probability distribution and analysing how the final model output changes in reference to the perturbation. We can then compare the model sensitivity across variables, within variable ranges and across models. As such, the sensitivity of the models to the dynamic forcing ( Table 2) is examined. GLEAM, and, to a lesser extent, PM-MOD and PT-JPL use static parameters and variables that vary in space but do not vary in time. While these variables are important to the partitioning within the models, this study does not consider the sensitivity of the models to these constants.
Several of the forcing products in Table 2 are derived from re-analysis products that are the result of model simulations which incorporate observations from balloon soundings, satellite data and in situ measurements in a data assimilation system. Furthermore, the original products have been pre-processed further to create a common spatial and temporal scale. Therefore, data quality will change in time and space [73] and an accurate estimate of the product uncertainty is difficult to determine. To account for potential changes in forcing uncertainty, different Monte Carlo experiments will be performed with different perturbation strengths in order to analyse how the model outputs respond to various uncertainty ranges.
For unbounded forcing we assume a Gaussian error distribution to describe the errors, while we use a truncated-normal distribution to describe the errors of the bounded variables. Bounded variables included relative humidity (RH, 0-1), leaf area index (LAI, 0-∞ m 2 /m 2 ), normalized difference vegetation index (NDVI, −1.0-1.0), fraction of absorbed photosynthetically active radiation (f APAR, 0-1), precipitation (P, 0-∞ mm/day), Vegetation Optical Depth (VOD, 0-∞) and shortwave incoming radiation (Ku, 0-∞). Unbounded variables included net radiation (Rn) and temperature (Ta). For each time step, location and variable, we create a theoretical distribution from which we randomly sample. In all cases, the expected mean value comes from the value of raw forcing variable and the standard deviation of the distribution is a predetermined percentage of the raw value at eight stepwise intervals (0.1%, 1%, 5%, 10%, 15%, 20%, 30% and 40%). At an uncertainty level of 40%, many of the variables have collapsed to distributions in which more uncertainty would not significantly change the distribution of forcing data and model components have broken down to unrealistic flux estimates. Furthermore, the boundary effects of bounded variables could skew model sensitivity at large uncertainty.
If we assume that a large number of sources contribute to the uncertainty and noise of a remotely sensed variable, then by the central limit theorem we can approximate the uncertainty of that variable using a Gaussian distribution. The Gaussian distribution allows us to perturb the data while conserving the original unperturbed value as the mean of the distribution. Thus, any bias found in the distribution of the sensitivity results will reflect non-linear formulations within the model. While this distribution may not reflect the reality of the uncertainty contained within the data, it allows us to observe how the Remote Sens. 2018, 10, 1601 8 of 28 model formulation might propagate error in a way that would introduce bias to the final estimates. Thus, we assume that the forcing does not contain any systematic bias in its error.
The truncated-normal distribution is a combination of a mean conserving distribution with a mean changing shift, so that bounded variables may not preserve the original mean [79]. The truncated-normal distribution is advantageous over a beta or gamma distribution because we can specify large uncertainties without the distribution collapsing toward probability masses located only at one or both bounds. If a variable is bounded on one side or if the original mean is closer to one of the bounds, then the expected mean of the resulting distribution will shift away from that bound. The resulting magnitude of the shift is a function of the difference between the bound and the original mean and the level of added uncertainty. The resulting shift is therefore more likely to influence the results found here if the added uncertainty is high or if the original variable is closer to one of the bounds.

Error and Bias Assessment
As previously stated, we forced the three models for three years (2003)(2004)(2005) at 42 locations where model output has previously been evaluated against field-based estimates of ET components [18]. Random error was added to each forcing variable based on the described probability distributions and each data point was sampled 100 times. This process was repeated at each assumed uncertainty level and for each variable field. In this study, we cannot determine the exact uncertainty for each forcing variable or the correct combination of forcing variable uncertainty. We perturb only one variable at a time, as this allows us to disentangle the effect of the individual forcing variables and evaluate them in isolation.
After the model simulation, we then calculate the relative error for each model and evaporation component (RE): where is the perturbed estimate and is the unperturbed estimate of the ET component. We then pool the relative error at each time, location and random sample and analyse the results as a distribution of the relative error values. We analyse the resulting distribution using two objective functions: percent root mean squared deviation (%RMSD, Equation (2)) and percent mean bias deviation (%MBD, Equation (3)). Both functions are calculated at an added uncertainty (σ) of 10%

Error and Bias Assessment
As previously stated, we forced the three models for three years (2003)(2004)(2005) at 42 locations where model output has previously been evaluated against field-based estimates of ET components [18]. Random error was added to each forcing variable based on the described probability distributions and each data point was sampled 100 times. This process was repeated at each assumed uncertainty level and for each variable field. In this study, we cannot determine the exact uncertainty for each forcing variable or the correct combination of forcing variable uncertainty. We perturb only one variable at a time, as this allows us to disentangle the effect of the individual forcing variables and evaluate them in isolation.
After the model simulation, we then calculate the relative error for each model and evaporation component (RE): where y i is the perturbed estimate and y 0 is the unperturbed estimate of the ET component. We then pool the relative error at each time, location and random sample and analyse the results as a distribution of the relative error values. We analyse the resulting distribution using two objective functions: percent root mean squared deviation (%RMSD, Equation (2)) and percent mean bias deviation (%MBD, Equation (3)). Both functions are calculated at an added uncertainty (σ) of 10% and are normalized by the mean total ET estimate for the model across all standard outputs (i.e., no added uncertainty) and expressed as a percentage of total ET: And where . y i,x represents the estimated value after perturbation, y i,x is the original estimate, N is the total number of estimates, x is the specific location and i the specific time and ET is the model-specific mean ET over all locations and time. 10% was chosen because it represents significant added uncertainty while avoiding the boundary effects of larger uncertainties. Generally, a greater interquartile range signifies that the model is more sensitive to that forcing variable. The deviation of the median from a relative error of 0 signifies asymmetry in the model response to forcing uncertainty due to non-linearity in the model formulation. and are normalized by the mean total ET estimate for the model across all standard outputs (i.e., no added uncertainty) and expressed as a percentage of total ET:

Model Sensitivity Distributions
where , represents the estimated value after perturbation, , is the original estimate, N is the total number of estimates, is the specific location and the specific time and is the model-specific mean ET over all locations and time. 10% was chosen because it represents significant added uncertainty while avoiding the boundary effects of larger uncertainties.

Model Sensitivity Distributions
Figures 4-6 show the sensitivity of the model output to each of the forcing variables for different uncertainty levels. The plots show the interquartile range of the distribution of relative error between perturbed and unperturbed forcing for each uncertainty level, as well as the median of the distribution. Generally, a greater interquartile range signifies that the model is more sensitive to that forcing variable. The deviation of the median from a relative error of 0 signifies asymmetry in the model response to forcing uncertainty due to non-linearity in the model formulation.  (1)) is used to measure the difference in model output before  (1)) is used to measure the difference in model output before and after perturbing the forcing variable. The shaded region represents the area contained within the 25th and 75th percentile of resulting error distribution while the bold line represents the median. and after perturbing the forcing variable. The shaded region represents the area contained within the 25th and 75th percentile of resulting error distribution while the bold line represents the median.  While the interquartile range and median give us an idea of the distributions of error at across increasing amounts of uncertainty, they do not represent the entire distribution. Figures 7-9 show the probability distributions of the relative error for each forcing variable and each modelled ET component at an added uncertainty of σ = 10%. Table 3 shows the root mean squared deviations (%RMSD, Equation (2)) and mean biased deviation (%MBD, Equation (3)) for each of the variables and each modelled ET component at this uncertainty level. The values in Table 3 are expressed as a percentage of the mean ET estimate across all data points. While the interquartile range and median give us an idea of the distributions of error at across increasing amounts of uncertainty, they do not represent the entire distribution. Figures 7-9 show the probability distributions of the relative error for each forcing variable and each modelled ET component at an added uncertainty of σ = 10%. Table 3 shows the root mean squared deviations (%RMSD, Equation (2)) and mean biased deviation (%MBD, Equation (3)) for each of the variables and each modelled ET component at this uncertainty level. The values in Table 3 are expressed as a percentage of the mean ET estimate across all data points.        PT-JPL takes four forcing variables: net radiation, NDVI, temperature and RH. Each of the modelled ET components varies linearly with net radiation. Figure 4 shows that the relative error also increases linearly with increasing uncertainty in the net radiation term as expected due to the linear formulation of radiation within each model. Due to this linear relationship, there is little to no bias in the model output associated with uncertainty in net radiation evidenced by a MBD of 1% for all ET components. Similarly, the sensitivity of the model is unbiased to uncertainties in temperature, yet this sensitivity is much lower. For RH and NDVI, however, the model shows strong bias in relative error due to forcing uncertainty. The transpiration component of PT-JPL shows a strong negative bias (MBD = −51%) due to uncertainty in NDVI, while showing a positive bias (MBD = 42%) due to uncertainty in RH. The soil evaporation and interception components show a strong negative bias (MBD = −26%, −59%, respectively) due to uncertainty in RH. Interestingly, the contrasting biases in the ET components due to uncertainty in the RH term only result in a slight negative bias (MBD = −13%) in the total ET term and a much smaller RMSD (33%) compared to the components RMSD, which typically ranges between 61 and 151%. The asymmetry in PT-JPL's response to changes in NDVI and RH is clearly shown in Figure 7, as are the relative sensitivities to the input variables. Strong negative bias in the transpiration component due to NDVI and in the soil evaporation component due to RH are evident. Radiation shows a symmetric distribution with moderate sensitivity across all components, while temperature shows a similar symmetry with relatively subdued sensitivity.
PM-MOD has five temporally variable forcings: net radiation, RH, temperature, f APAR and LAI. Each of the ET components estimated by PM-MOD is exceedingly sensitive to RH and has negative bias associated with RH. Figure 5 shows that as uncertainty increases, the median of the RH and f APAR terms increases its deviation from 0. The total ET estimate is similarly sensitive to RH (RMSD = 122%), while the bias becomes somewhat mitigated when the components are aggregated (MBD = −3%). Figure 8 shows that large masses of probabilities associated with RH aggregate in the transpiration and interception terms at a relative error of −1, signifying that a change to RH caused the new model estimate to collapse to 0 flux. The soil evaporation estimate appears to be particularly sensitive to changes in the forcing variables, as perturbations to RH (RMSD = 174%), temperature (RMSD = 132%), net radiation (RMSD = 50%) and f APAR (RMSD = 84%) all exhibit their largest MBD and RMSD in the soil evaporation term.
GLEAM has seven temporally variable forcings: net radiation, temperature, precipitation, vegetation optical depth (VOD), shortwave incoming radiation (Ku), surface soil moisture and snow water equivalent. Only the first five are evaluated here. GLEAM, among the three models, shows the least sensitivity to its forcing, as evidenced by the low RMSD and MBD. Again, the daily resolution of the model likely plays a factor, as daily estimates of ET must be less variable than sub-daily estimates. Each of the variables show a slight negative bias in the transpiration estimate and precipitation shows a negative bias in the total ET estimate. Figure 9 shows a curious distribution of error associated with precipitation in the soil evaporation term with a large positive tail. The interquartile range for several variables looks quite skewed in Figure 6, including VOD, Ku and precipitation in many cases.
While the skew appears quite severe, the median remains near zero and the MBD remains quite small. Figure 9 shows that the vast majority of perturbations of VOD, Ku and precipitation result in little to no model response. Precipitation is the only dynamic variable influencing the GLEAM interception estimate while soil evaporation, transpiration and the overall ET estimates appear primarily influenced by net radiation in our set of study sites (RMSD = 8.2, 8.2 and 7.6% respectively).
PM-MOD and PT-JPL share many similarities both in their formulation and in the results found here. Both models show large biases and uncertainty in their components associated with RH. However, in both models these biases do not manifest to the same extent in the total ET estimates. A similar pattern is also evident for both models in their vegetation forcing. Both NDVI in PT-JPL and f APAR in PM-MOD show large uncertainties with some bias in their component terms, while the total ET term exhibits far less sensitivity. Temperature and radiation also show similar patterns in sensitivity across components, although PT-JPL generally exhibits a lower %RMSD. Both models show an increased sensitivity to temperature in the soil evaporation term and both have similar ranges of RMSD associated with net radiation. Figure 10 shows the %MBD from the Monte Carlo analysis at an added variable uncertainty of σ = 10% compared with the %MBD between modelled and field estimates of ET components [18]. The forcing data used in both analyses are identical and we can thus directly compare the results of the field comparison study with the Monte Carlo results presented here. The comparison between the two studies allows us to determine if uncertainty in the forcing of each model could be responsible for model deviations from observed field data. While we cannot be certain that the added uncertainty level of each forcing variable is correct, we can determine if the sign (positive/negative) of the %MBD from each analysis is consistent. The field estimates are subject to their own methodological errors and uncertainties and tend to be much smaller in spatial scale compared to the remote sensing estimates. However, we assume that aggregating these studies across many methods and study sites offers valuable information about the partitioning of the ET flux.

Model-Field Sensitivity Inter-Comparison
for model deviations from observed field data. While we cannot be certain that the added uncertainty level of each forcing variable is correct, we can determine if the sign (positive/negative) of the %MBD from each analysis is consistent. The field estimates are subject to their own methodological errors and uncertainties and tend to be much smaller in spatial scale compared to the remote sensing estimates. However, we assume that aggregating these studies across many methods and study sites offers valuable information about the partitioning of the ET flux.  [18]. %MSB is shown for transpiration (Eveg), soil evaporation (Esoil), interception (Eint) and total evapotranspiration (ET). The grey shaded regions depict inconsistency in the sign of model and field-based uncertainty estimates. Figure 10. The %MBD from the Monte Carlo analysis with σ = 10% added uncertainty is compared to %MBD calculated from comparisons between modelled estimates and field-based estimates from Talsma et al. (2018) [18]. %MSB is shown for transpiration (E veg ), soil evaporation (E soil ), interception (E int ) and total evapotranspiration (ET). The grey shaded regions depict inconsistency in the sign of model and field-based uncertainty estimates.
The Monte Carlo analysis of PM-MOD shows large negative bias due to the propagation of uncertainty in RH for soil evaporation and interception. However, Figure 10 shows that this negative bias is inconsistent with the positive bias found between the modelled soil evaporation and interception estimates with field estimates. Previous research shows that PM-MOD tends to overestimate these components [18], while added variable uncertainty would result in an underestimation. Thus, uncertainty in RH can be eliminated as the primary driver for the deviation between these modelled ET components and field estimates. The direction of the bias due to RH uncertainty is consistent with field comparisons in the transpiration term. This suggests that forcing uncertainty could indeed be responsible for the observed bias in this term.
Similarly, Figure 10 shows that much of the bias found in PT-JPL due to the propagation of forcing uncertainty is inconsistent with the observed bias between modelled and field-based estimates, particularly for RH. However, bias due to the propagation of added uncertainty in NDVI in transpiration, soil evaporation and total ET are consistent with the field bias. RH also shows consistent bias estimates within the PT-JPL total ET estimate. GLEAM shows very little bias due to the propagation of forcing uncertainty. Thus, forcing uncertainty is likely not responsible for the model deviation with field estimates shown in Figure 10. uncertainty is consistent with field comparisons in the transpiration term. This suggests that forcing uncertainty could indeed be responsible for the observed bias in this term.

Model Sensitivity across Forcing Conditions
Similarly, Figure 10 shows that much of the bias found in PT-JPL due to the propagation of forcing uncertainty is inconsistent with the observed bias between modelled and field-based estimates, particularly for RH. However, bias due to the propagation of added uncertainty in NDVI in transpiration, soil evaporation and total ET are consistent with the field bias. RH also shows consistent bias estimates within the PT-JPL total ET estimate. GLEAM shows very little bias due to the propagation of forcing uncertainty. Thus, forcing uncertainty is likely not responsible for the model deviation with field estimates shown in Figure 10.       PT-JPL shows a steady distribution with changing net radiation and temperature, while both the median and interquartile range of the error distribution changes with RH and NDVI. However, Remote Sens. 2018, 10, 1601 20 of 28 the changes to uncertainty in the estimated ET components again counteract each other and, as such, limit the uncertainty in the total ET estimate at each of the ranges in forcing values. Both PM-MOD and PT-JPL exhibit the same increasing uncertainty with increasing RH in their transpiration component and again show similar increasing uncertainty in soil evaporation with increasing vegetation indices. PM-MOD, shown in Figure 13, appears to be more sensitive to changes at extreme temperatures than to changes at more moderate temperatures. Interestingly, PM-MOD soil evaporation appears to be more sensitive to lower LAI values but also more sensitive to higher values of f APAR. GLEAM, shown in Figure 13, is overwhelmingly more sensitive to changes in low precipitation events than larger events. Net radiation and temperature show relatively unbiased and consistent distributions in error as those variables change, while VOD and Ku show somewhat more asymmetric distributions with an unbiased median.

Discussion
The results of this study are able to offer insights into how the uncertainty of temporally variable forcing influences the model estimates. Of particular interest are the differences in how the forcing affects the component estimates as compared to the total ET estimate and how uncertainty in the forcing combined with non-linear model formulations can introduce a bias. Also of interest is how this uncertainty in the forcing might manifest when comparing model estimates against field based data.

Model Sensitivity
Despite some similarities in model formulation, each model exhibits its greatest sensitivity to a different forcing variable. GLEAM interception is exclusively sensitive to precipitation, while the remainder of the GLEAM components are largely sensitive to net radiation, soil evaporation especially. These results are nonetheless dependent on the study locations adopted and previous studies have shown a larger sensitivity of GLEAM estimates on the precipitation and surface soil moisture (not considered here) forcing data in the case of drier regions [80]. While PT-JPL and PM-MOD are also sensitive to net radiation, they are far more sensitive in each of their component estimates to RH and vegetation forcing. The PT-JPL ET estimate appears to be very sensitive to changes in NDVI (RMSD = 1.01) likely due to the roll of NDVI in parameterizing vegetation characteristics. Similarly, the PM-MOD ET estimate is most sensitive to changes in RH (RMSD = 1.22) largely driven by the soil evaporation term. It should also be stressed that these results are representative of only the study sites used and the global sensitivity of each model could change with a more globally representative set of study sites.
The soil evaporation term of PT-JPL and PM-MOD show very little agreement and large variance when compared with field estimates [18]. Here, we find soil evaporation for those models to be generally quite sensitive to uncertainty in the variables of RH, temperature, NDVI and f APAR. These errors are distributed across impacts from the partitioning of net radiation between the canopy and soil surface using NDVI or f APAR and water availability limitations inferred from RH and temperature. Furthermore, the sensitivity of the soil evaporation estimate appears to vary depending on the value of the variable itself. These factors could explain the large variance in their agreement with the field studies. The use of proxies to estimate soil moisture by PM-MOD and PT-JPL likely introduces some uncertainty to the soil evaporation estimate and could explain the outsized role of soil evaporation in the global partitioning of ET for those models [17]. GLEAM uses soil moisture data assimilation to incorporate direct measurements of surface soil moisture. However, the data assimilation is not incorporated in this study.
Given the high sensitivity of soil evaporation to forcing uncertainty in each of the models, this component could be responsible for a large portion of the partitioning divergence between models. Improvement of the soil evaporation term, therefore, offers a potential area in which to focus efforts in order to improve model partitioning. Similarly, a study of North American Land Data Assimilation System (NLDAS) ET models shows that uncertainty in soil evaporation contributes significantly to uncertainty in the partitioning of the ET flux [81]. Those results also demonstrate that the ability of the model to estimate transpiration is largely responsible for its ability to estimate the total flux. Despite the inability to estimate soil evaporation or correctly partition the flux, the total ET estimate remained reasonably accurate. Furthermore, counteracting biases in the ET components from PT-JPL and PM-MOD show a decreased sensitivity of the total ET estimate to forcing as compared to the component estimates.
GLEAM, however, appears to be much more reliant on the underlying formulation of the model and model constants than on its forcing variables. The RMSD and MBD exhibited by GLEAM is roughly an order of magnitude smaller than that of PM-MOD and PT-JPL. Given the higher complexity of GLEAM compared to the other two models, it might be expected that GLEAM would be more sensitive to model formulations and constants, more constrained with respect to any single variable and rather robust to changes in forcing uncertainty. The daily resolution of GLEAM also means that estimates are likely less variable than sub-daily estimates and thus less sensitive to changes in forcing. This complexity also makes interpreting the biases of the error distributions in the model more difficult. Precipitation, Ku and VOD exhibit very asymmetric distributions in both Figures 6 and 9. GLEAM shows very few truly linear relationships, as the model relies on water balances, storage capacities and stress factors to account for the water available to various evaporative pathways. Here it results in somewhat skewed distributions in estimates but the various parameterizations within the model can likely be adjusted to accommodate those errors. The distribution of precipitation is also characterized by a large frequency of very small events, which may be prone to biases introduced by the truncated-normal distribution. When perturbed by an increasing level of uncertainty, the distribution collapses so that these small events become less frequent. Given the greater sensitivity to these smaller events shown in Figure 13, the distribution of error then becomes skewed for the various components.
Furthermore, models may be more sensitive to forcing errors at various extremes of that variable. The GLEAM model's sensitivity to low precipitation events opposed to larger events is due to the fact that large precipitation events wet the soil over the threshold in which evaporation becomes fully energy-driven. The results also show that perturbing precipitation shifts the partitioning of GLEAM from transpiration to soil evaporation at our study sites. Both PM-MOD and PT-JPL transpiration components are more sensitive at high RH values than at lower values. This is due to the fwet term, which exhibits a much higher slope as a function of humidity at high RH values and due to the negative bias introduced by the truncated-normal distribution. PM-MOD ET also shows increasing uncertainty at increasing values of both LAI and f APAR. In case of f APAR, the relationship appears to be driven by the soil evaporation term, which is substantially more uncertain at high f APAR values. PT-JPL exhibits a similar trend in soil evaporation but in the NDVI term.
Where trends exist in the uncertainty of the model with changing ambient conditions informs where these model estimates may be vulnerable to large uncertainties. The partitioning of both PM-MOD and PT-JPL seems particularly sensitive to extremes in vegetation and RH forcing, although RH could be influenced by the distributions used in this analysis. Again, the trends in component uncertainty tend to mitigate one another when aggregated in the total ET estimate of PT-JPL. However, PM-MOD exhibits larger uncertainty in its total ET estimate at elevated values of radiation, vegetation indices and temperature. This is somewhat worrisome considering that the largest fluxes coincide with large values of these variables and represent areas where inaccurate estimates of ET would likely impair global estimates.

Non-Linearity and Model Bias
The %MBD due to uncertainty in RH in both PT-JPL and PM-MOD is expected due to the non-linear dependence of the ET components to RH within these models. In both models, the model parameter fwet-calculated as a function of RH raised to the forth power-is used to approximate surface wetness. It is important to consider that the relationship between humidity and a given ET component is very likely to be non-linear, making non-linear formulations, as in PT-JPL and PM-MOD, appropriate [28,82]. However, large forcing uncertainties in non-linear models can introduce large biases to the model estimate and thus might debilitate the usefulness of the model. Figure 4 shows how increasing uncertainty in the RH introduces growing bias in the component estimates of PT-JPL and PM-MOD. The point at which this bias becomes detrimental is determined by the uncertainty of the forcing variable. It should be noted that bias could also be introduced to RH by the use of the truncated-normal distribution, although only near the upper bound (Figures 11 and 12). The effects of the truncated-normal on RH are thus difficult to disentangle, as we expect both large bias due to the non-linear formulation and bias due to the effect of the truncated-normal distribution.
Despite the similar use of RH in PT-JPL and PM-MOD, both models do have different error distributions. PM-MOD filters RH values less than 0.7 and masks them with a value of 0, while PT-JPL does not. This filter is likely responsible for the large probability mass in the interception component shown in Figure 8 at a relative error of −1 where added uncertainty has RH values greater than 0.7 to become filtered. Both models are also dependent on vapour pressure deficit (VPD), a function of RH and temperature. PM-MOD is dependent on VPD as part of the Penman-Monteith equation and both models incorporate VPD as part of their soil evaporation estimates. Notice that both PT-JPL and PM-MOD soil evaporation estimates are very sensitive to both RH and temperature and PM-MOD is generally more sensitive than PT-JPL to both RH and temperature, likely due to the employment of VPD.
Interestingly, in many of the instances where bias is exhibited in the modelled components due to the propagation of forcing uncertainty, the biases counteract each other in the aggregate total ET estimate. This is especially true for the terms used to partition the available water or radiation between ET components (RH, NDVI, f APAR). This could help to explain some of the divergences in partitioning between models as well as the negative bias previously found in the PT-JPL and PM-MOD transpiration estimates when compared to field estimates of transpiration [18]. The negative bias related to NDVI in the PM-MOD transpiration component could also help to explain the relatively small contribution that transpiration makes in the global ET partitioning of PM-MOD [17]. This also implies that uncertainty in the estimated ET components is likely larger than in the total ET estimates. Our results suggest that, for PT-JPL and PM-MOD, slight changes in forcing variables have the potential to vastly change the partitioning of ET while resulting in only modest changes to the overall ET estimate.

Model-Field Sensitivity Inter-Comparison
While large bias due to variable uncertainty is found in the components of both PM-MOD and PT-JPL, the direction of these biases is often inconsistent with comparisons between modelled and field estimates as shown in Figure 10. Where these biases are inconsistent, we can conclude that forcing uncertainty for that particular variable is not responsible for the deviation with field estimates. However, bias in PT-JPL soil evaporation and transpiration due to uncertainty in NDVI does show error consistent with field comparisons. This suggests that the uncertainty in NDVI may be responsible in contributing significant error to the partitioning of ET within PT-JPL. Given that this study is statistical and we assume the level of added uncertainty, only the direction of the bias and not the magnitude of the bias is important in interpreting Figure 10. Different combinations of forcing uncertainty could potentially lead to results similar to those found in Talsma et al., 2018 [18] and Figure 10 should be used only to identify where variables uncertainty may or may not be contributing to observed deviations. The field estimate themselves also contain errors in their methodology but very few techniques exist to directly measure ET components and the field data represents the best available means to validate the remote sensing-based component estimates.
The strengths of this study lie in its exploration of the sensitivity of the model components apart from total ET, as very few studies have focused on the sensitivity of transpiration, soil evaporation and interception within these models. The study is also able to use a theoretical Monte Carlo framework to show how the combination of forcing uncertainty and non-linear model formulation may introduce bias into the model estimate. We can then assess how this statistical approach may be reflected in the practical use of the models by comparing the Monte Carlo results against the results of Talsma et al. (2018) [18] and the global partitioning of the models in Miralles et al. (2016) [17]. While the study is able to explore the role of forcing uncertainty and sensitivity in these models, we are still unable to identify the definitive source of deviation with the field data. The study also relies on several assumptions about the distribution and magnitude of the forcing uncertainty. It is therefore important to focus on how this theoretical distribution of uncertainty is distorted through its propagation and the sensitivity of the variables relative to each other.

Conclusions
The demand for research and data concerning ET and ET components will continue to fuel efforts into various remote sensing-based models [2,23]. However, changes in climate and its related feedbacks will require models to show correct sensitivity to many variables [9,20]. Furthermore, the rapid expansion in remote sensing products, data and techniques [24,83,84] presents challenges in understanding and quantifying how errors and uncertainty within the forcing data will propagate to model estimates [26,27]. The sensitivity analysis in this study allows us to determine what variables drive the individual component estimates and helps to explain why the models achieve accurate ET while employing vastly different ET partitioning strategies. The results of the study present a means to assess the strategies of each model and to compare those strategies against our theoretical understanding of ET and its components. A key finding of this study is the elevated bias found in the component estimates of PM-MOD and PT-JPL related to RH (MBD = −59-41.9%) and NDVI (MBD = −51.4-5.1%) and how these biases are largely balanced when aggregated in the total ET estimate. We are also able to determine if these biases are consistent with model deviations from field estimates (RH for PM-MOD components, PT-JPL interception, PT-JPL soil evaporation) or inconsistent (NDVI for PM-MOD components and ET; RH for PT-JPL transpiration).

•
Both the components and the total ET estimate of PM-MOD are primarily sensitive to RH (RMSD = 122.3-174.2%). • GLEAM is comparatively less sensitive to changes in its forcing variables than PM-MOD and PT-JPL. The higher complexity and daily resolution of GLEAM makes it more constrained but potentially more sensitive to the model parameterizations. Note that the soil moisture data assimilation of GLEAM was not evaluated here. • Both PT-JPL and PM-MOD soil evaporation show large sensitivity to forcing inputs, creating greater overall uncertainty in the soil evaporation estimate for our study sites. • Non-linear formulations of RH and vegetation indices in PT-JPL and PM-MOD often result in large biases (|MBD| > 50%) in component estimates due to variable uncertainty. However, bias in the component estimates often balance each other and limit the bias and uncertainty of the total ET estimate (|MBD| ≤ 20%). Changes to forcing could cause large changes to model partitioning with comparatively smaller changes to the overall ET estimate. • Bias in PT-JPL due to uncertainties in NDVI are consistent with errors found when comparing model estimates to field estimates. This suggests that uncertainty in NDVI may be introducing significant error to the partitioning of PT-JPL.