Evaluation and Aggregation Properties of Thermal Infra-Red-Based Evapotranspiration Algorithms from 100 m to the km Scale over a Semi-Arid Irrigated Agricultural Area

: Evapotranspiration (ET) estimates are particularly needed for monitoring the available water of arid lands. Remote sensing data offer the ideal spatial and temporal coverage needed by irrigation water management institutions to deal with increasing pressure on available water. Low spatial resolution (LR) products present strong advantages. They cover larger zones and are acquired more frequently than high spatial resolution (HR) products. Current sensors such as Moderate-Resolution Imaging Spectroradiometer (MODIS) offer a long record history. However, validation of ET products at LR remains a difﬁcult task. In this context, the objective of this study is to evaluate scaling properties of ET ﬂuxes obtained at high and low resolution by two commonly used Energy Balance models, the Surface Energy Balance System (SEBS) and the Two-Source Energy Balance model (TSEB). Both are forced by local meteorological observations and remote sensing data in Visible, Near Infra-Red and Thermal Infra-Red spectral domains. Remotely sensed data stem from Advanced Spaceborne Thermal Emission and Reﬂection Radiometer (ASTER) and MODIS sensors, respectively, resampled at 100 m and 1000 m resolutions. The study zone is a square area of 4 by 4 km 2 located in a semi-arid irrigated agricultural zone in the northwest of Mexico. Wheat is the dominant crop, followed by maize and vegetables. The HR ASTER dataset includes seven dates between the 30 December 2007 and 13 May 2008 and the LR MODIS products were retrieved for the same overpasses. ET retrievals from HR ASTER products provided reference ET maps at LR once linearly aggregated at the km scale. The quality of this retrieval was assessed using eddy covariance data at seven locations within the 4 by 4 km 2 square. To investigate the impact of input aggregation, we ﬁrst compared to the reference dataset all ﬂuxes obtained by running TSEB and SEBS models using ASTER reﬂectances and radiances previously aggregated at the km scale. Second, we compared to the same reference dataset all ﬂuxes obtained with SEBS and TSEB models using MODIS data. LR ﬂuxes obtained by both models driven by aggregated ASTER input data compared well with the reference simulations and illustrated the relatively good accuracy achieved using aggregated inputs (relative bias of about 3.5% for SEBS and decreased to less than 1% for TSEB). Results also showed that MODIS ET estimates compared well with the reference simulation (relative bias was down to about 2% for SEBS and 3% for TSEB). Discrepancies were mainly related to fraction cover mapping for TSEB and to surface roughness length mapping for SEBS. This was consistent with the sensitivity analysis of those parameters previously published. To improve accuracy from LR estimates obtained using the 1 km surface temperature product provided by MODIS, we tested three statistical and one deterministic aggregation rules for the most sensible input parameter, the surface roughness length. The harmonic and geometric averages appeared to be the most accurate.


Introduction
Monitoring water uptakes from groundwater reservoirs is important for water resources managers as well as water allocation and water rights regulation authorities. This is especially true in semi-arid lands where many aquifers face an average depletion of 0.5 to 1.5 m/year due to unsustainable pumping for irrigation [1]. EvapoTranspiration (ET) is the largest water loss of most agricultural areas. In many cases, it represents the only loss of water on the long term if one assumes that irrigation excess mostly contributes to deep drainage and that the corresponding water flux eventually reaches the groundwater table and participates to recharge. Groundwater uptake can therefore be scaled to total cumulative ET on an annual basis at least. Up to now, regional scale evaluation of ET relies on distributed information obtained either through hydrological modeling, Remote Sensing (RS) or a combination of both [2].
In order to estimate ET from the sole RS and meteorological data, methods based on Thermal Infra-Red (TIR) information are therefore increasingly used in the context of irrigation monitoring [3,4]. Due to its tight coupling with surface energy balance and water stress, surface temperature information provides a way to estimate ET independently from any knowledge on the various components of the water balance.
Two kinds of methods are implemented to retrieve water stress and actual ET rates from a combination of TIR as well as Visible (VIS) and Near Infra-Red (NIR) RS data: those generally referred to as "contextual" methods, and those known as "residual" or "single pixel" methods [5].
"Contextual" methods estimate the water status of a given pixel by scaling its value between non-evaporating and fully-evaporating conditions according to maximum and minimum surface temperature values observed on the image. It assumes that cold and hot pixels correspond to wet and dry conditions, respectively. Since those extremes depend as well on the surface conditions, particularly the amount of vegetation, extremes are derived for each class of either the Normalized Differential Vegetation Index (NDVI) [6], albedo [7] or both [8]. These methods have operational applications for irrigated lands with large fields and high-resolution (HR) TIR data (hectometric resolution data from sensors on board LANDSAT platforms or from the ASTER sensor). Their applicability drops in the context of dryland agriculture and low-resolution (LR) data (e.g., kilometric resolution data from MODIS sensors) [9][10][11]. Those methods are therefore more adapted to HR TIR data, which are available at a weekly interval only, and usually at lower frequencies due to cloud conditions. In order to use the routinely available LR data acquired daily over the globe with a revisit consistent with typical timescales of the hydrological cycle (droughts of few days to several weeks), data sharpening techniques [12][13][14][15][16][17] can be used to infer surface temperature at a higher spatial resolution, but they all rely on an a priori link between vegetation cover and surface water status, whose validity can be challenged.
"Residual" or "single pixel" methods, on the other hand, derive an evaporation rate in actual conditions for each pixel independently from the others. Latent heat is computed as the residual term of the surface energy balance equation. The various terms of that energy balance (net radiation, soil heat flux, sensible heat flux) are derived for a single source representing the soil vegetation composite (single source models, e.g., SEBS [18,19]) or for two sources with two separate energy an average reflectance value obtained at low resolution. On the other hand, HR data in the VIS and NIR wavelengths are routinely available and lead to more robust methods to estimate land use and crop cover types as well as their corresponding biophysical parameters such as crop height. There is therefore a need to apply scaling procedures to some surface variables such as crop height or minimum surface resistance, which cannot be easily derived directly from existing LR RS data.
Moreover, since HR and LR products are not available at the same frequency, producing continuous ET though temporal interpolation will likely result in combining products at both scales. To do so, one must ensure the consistency in scaling products from HR to LR. Those products should at least be unbiased, i.e., aggregated HR fluxes to LR should match the LR ET products generated from LR data on the same date. Scaling properties are frequently studied by using LR RS data generated from HR data to get rid of discrepancies in the input variables from different sensors on different platforms. To analyse scaling properties and the subsequent biases, radiances are linearly aggregated to generate surface temperatures and reflectances at LR. McCabe and Wood [36] for one date only and, more recently Ershadi, McCabe et al. [37], for several dates within a season, have used the single-source model SEBS to assess those biases, and found scale discrepancies on instantaneous LE retrieval of up to 15%. They attributed this difference to a change in the roughness length parameters of the land surface due to the aggregation.
Similar work, within a theoretical frame (aggregation between contrasted patches rather than real RS data), were carried out by Kustas and Norman [38] using the dual source model TSEB. They concluded that, for those very contrasted conditions, there was a bias up to 50 W/m −2 on the instantaneous retrieval between the aggregated latent heat fluxes and the latent heat flux computed from surface average parameters at the km scale. On the other hand, the same authors have performed similar work on real data for a mixture of riparian trees and stressed shrubs [39] and a realistic contribution (fraction cover f c ) of both patches to the LR pixel (with a contribution of the most extreme conditions of less than 20%). They concluded that divergence was significantly less than 50 W/m 2 , which is the typical precision of most Energy Balance (EB) models [40] for retrieving instantaneous latent heat flux. They also found that simple averaging of displacement height and an averaging of roughness length based on a lognormal to the power −2 reduced the bias down to less than 10 W/m 2 . Both studies pointed out the need to use proper efficient scaling relationships in parameter estimation at LR. There has been abundant literature on the subject from back in the 1990s, mostly based on theoretical works deriving deterministic [41] or statistical [42] relationships between local and regional parameters for land-atmosphere exchange modelling. For roughness length, Wassenaar et al. [43] conclude that a geometric averaging of the roughness length is performing best; this is also the scaling proposed by Taylor [44].
Previous studies with SEBS [45,46] agree on the important role of roughness parameters in the estimation of ET. They provide at least three main findings: (1) that, for SEBS, the main uncertainty is linked to roughness estimation [47]; (2) that the original formulation of the roughness length for heat transfer tends to overestimate ET with the original use of the soil component of the kB −1 parameter [48]; and (3) that the limited accuracy of height estimate (and subsequently the roughness length for momentum transfer) from RS data is rarely taken into account, with few empirical relationships that tend to diverge for high NDVI values [49]. The latter becomes more significant when LR RS data are used to derive surface roughness lengths. However, none of those aggregation rules (apart from [43]) have been evaluated with actual data acquired at LR on very heterogeneous land surfaces.
TSEB appears to provide good ET estimates and to be less sensitive to roughness length parameters [22]. One expects flux conservation between HR and LR. The relative contribution of the soil and the vegetation temperatures through the vegetation fraction cover parameter can be the issue at LR and the separated soil and vegetation fluxes conservation would be affected [50].
Up to now, most studies on scaling properties have focused on one of the most common models only (SEBS, TSEB or SEBAL). The widely used SEBS and TSEB "single pixel" models have not been compared on the same landscape [22,51]. Furthermore, intercomparison has been carried out usually for a limited amount of HR images [46,52,53]. In addition, few studies were dedicated to the comparison of separate satellite sensors.
In this paper, we address two objectives: • To investigate ET flux scaling properties from HR to LR using data from the same sensor (i.e., ASTER), as well as data stemming from different sensors onboard the same platform (i.e., ASTER, MODIS).

•
To develop and evaluate new and existing scaling relationships based on easily obtainable RS quantities to relate local HR and LR roughness lengths.
We use two of the most common 'single-pixel' models, TSEB and SEBS, to estimate ET fluxes at the satellite overpass time and aggregate the values to a daily time scale using the algorithm presented in Delogu et al. [54]. The results are analyzed for seven dates throughout an entire growing season in a semi-arid irrigated area in northwest Mexico, for which both models have been evaluated using EC measurements in [22].

Study Area
The study area is located in the Yaqui region (27.4 • N, 19.9 • W), Sonora, in the northwest Mexico. The region covers 225,000 ha between the Cortez Sea (northeast) and the Sierra Madre mountains (southwest), and is the largest agricultural area in the country (Figure 1). More than half of the fields are cultivated with winter wheat, the dominant crop, and the rest with a mix of broccoli, beans, chili, potatoes, peas, safflower, orange trees and maize. When considering the type of crops and the spatial patterns, the study area is representative of the region and supports the idea that the results presented here can be extended to the whole agricultural region. The climate is considered as semi-arid with an annual potential ET of 2233 mm (mean value between years 1971 and 2000) and a low annual precipitation rate of 290 mm essentially spread between June and September. Only 42.8 mm of rain falls between January and June. About 90% of the water use concern irrigation, the water being withdrawn from the Alvaro Obregon Reservoir with a capacity of 3 km 3 . In this hydrological context, an accurate method to estimate water losses by ET is essential for managing the water resources at the regional scale. Between December 2007 and May 2008 an international field measurement campaign was set up on a 4 × 4 km 2 zone located in the South of Obregon city and an important HR and LR RS dataset covering the same period and area was collected.
Remote Sens. 2017, 9, 1178 5 of 27 In this paper, we address two objectives: • To investigate ET flux scaling properties from HR to LR using data from the same sensor (i.e., ASTER), as well as data stemming from different sensors onboard the same platform (i.e., ASTER, MODIS).

•
To develop and evaluate new and existing scaling relationships based on easily obtainable RS quantities to relate local HR and LR roughness lengths.
We use two of the most common 'single-pixel' models, TSEB and SEBS, to estimate ET fluxes at the satellite overpass time and aggregate the values to a daily time scale using the algorithm presented in Delogu et al. [54]. The results are analyzed for seven dates throughout an entire growing season in a semi-arid irrigated area in northwest Mexico, for which both models have been evaluated using EC measurements in [22].

Study Area
The study area is located in the Yaqui region (27.4°N, 19.9°W), Sonora, in the northwest Mexico. The region covers 225,000 ha between the Cortez Sea (northeast) and the Sierra Madre mountains (southwest), and is the largest agricultural area in the country (Figure 1). More than half of the fields are cultivated with winter wheat, the dominant crop, and the rest with a mix of broccoli, beans, chili, potatoes, peas, safflower, orange trees and maize. When considering the type of crops and the spatial patterns, the study area is representative of the region and supports the idea that the results presented here can be extended to the whole agricultural region. The climate is considered as semi-arid with an annual potential ET of 2233 mm (mean value between years 1971 and 2000) and a low annual precipitation rate of 290 mm essentially spread between June and September. Only 42.8 mm of rain falls between January and June. About 90% of the water use concern irrigation, the water being withdrawn from the Alvaro Obregon Reservoir with a capacity of 3 km 3 . In this hydrological context, an accurate method to estimate water losses by ET is essential for managing the water resources at the regional scale. Between December 2007 and May 2008 an international field measurement campaign was set up on a 4 × 4 km 2 zone located in the South of Obregon city and an important HR and LR RS dataset covering the same period and area was collected.

In Situ Measurements
The meteorological dataset was acquired at a height of 10 m from an automatic weather station located at the center of the 4 by 4 km 2 square (Figure 1c). Half hourly global solar and atmospheric incoming radiations, air temperature and relative humidity as well as wind speed were recorded. Eddy covariance data was acquired at 2 to 3 m at seven sites located on different crop plots to represent the diversity of crop types and phenological stages as well as contrasted simultaneous soil water conditions related to irrigation patterns (Figure 2). At each site, net radiation, soil heat flux (at 0.05 m depth), surface temperature and soil moisture (at 0.05 and 0.3 m depth) were measured each 10 s before being averaged over 30 min periods. The latent and the sensible heat fluxes were acquired at a frequency of 10 Hz, processed using the FLUXNET guidelines [55] and converted to 30 min flux average. The devices used for all the automatic measurements at the different EC and meteorological stations as well as the fluxes quality analysis were described in Chirouze et al. [22].

In Situ Measurements
The meteorological dataset was acquired at a height of 10 m from an automatic weather station located at the center of the 4 by 4 km 2 square (Figure 1c). Half hourly global solar and atmospheric incoming radiations, air temperature and relative humidity as well as wind speed were recorded. Eddy covariance data was acquired at 2 to 3 m at seven sites located on different crop plots to represent the diversity of crop types and phenological stages as well as contrasted simultaneous soil water conditions related to irrigation patterns (Figure 2). At each site, net radiation, soil heat flux (at 0.05 m depth), surface temperature and soil moisture (at 0.05 and 0.3 m depth) were measured each 10 s before being averaged over 30 min periods. The latent and the sensible heat fluxes were acquired at a frequency of 10 Hz, processed using the FLUXNET guidelines [55] and converted to 30 min flux average. The devices used for all the automatic measurements at the different EC and meteorological stations as well as the fluxes quality analysis were described in Chirouze et al. [22].

Remote Sensing Data
The ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) sensor on board the Terra satellite provided a dataset made of four products and eight dates ( (Table 1). All products are atmospherically corrected and include LST retrieval includes emissivity correction. The local overpass time was between 10:30 a.m. and 11:00 a.m. and the swath of each product was always the same (60 by 60 km 2 ).
The MODIS (MODerate resolution Imaging Spectroradiometer) sensor also onboard the Terra satellite offered a daily global coverage at LR. The MODIS dataset was a blend of four products at 1 km resolution directly usable in both energy balance models (Table 1). They were extracted for the same dates as the ASTER dataset. ASTER and MODIS overpass times are approximately coincident since they both are onboard the same satellite platform.

Remote Sensing Data
The ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) sensor on board the Terra satellite provided a dataset made of four products and eight dates ( (Table 1). All products are atmospherically corrected and include LST retrieval includes emissivity correction. The local overpass time was between 10:30 a.m. and 11:00 a.m. and the swath of each product was always the same (60 by 60 km 2 ).
The MODIS (MODerate resolution Imaging Spectroradiometer) sensor also onboard the Terra satellite offered a daily global coverage at LR. The MODIS dataset was a blend of four products at 1 km resolution directly usable in both energy balance models (Table 1). They were extracted for the same dates as the ASTER dataset. ASTER and MODIS overpass times are approximately coincident since they both are onboard the same satellite platform.

Remote Sensing Data Preprocessing
The quality information contained in each product was taken into account and led to the removal of the last acquisition date reducing the ASTER dataset to seven dates. Concerning MODIS, all synthesis products (NDVI, albedo) were linearly interpolated between the available dates.
In order to provide easier comparison, ASTER images and MODIS images were resampled at 100 m and 1 km resolution, respectively [16]. All ASTER products were resampled using a bicubic interpolation and reprojected to the UTM12 system on a~100 m resolution (100.49 m × 101.46 m) grid that fitted the study area. MODIS products were resampled at 1 km resolution on a grid based on the largest zone covered by the ASTER images using a bicubic method and reprojected in the UTM12 projection system. This projected the data on a squared regular grid instead of the irregular parallelepipoidal pixel shape of the original sinusoidal projection of the MODIS products. The grid defined by the sinusoidal projection of the MODIS products is centered on the intersection between the Equator and the Greenwich meridian where the sinusoidal grid produces square-like pixels, whereas our study area is located far from both latitudinal and longitudinal references. This resulted in a native resolution of about 900 by 1800 m 2 . Therefore, in what follows, the aggregation performance for MODIS will be assessed at a more representative 2000 m spatial resolution instead of the usual 1 km resolution, in agreement with previous work on surface temperature disaggregation using the same dataset and on the same area [16].

Available Energy
In both TSEB and SEBS, the available energy is considered as the difference between the total net radiation (R n ) and the ground heat flux (G). The net radiation is estimated with the same equation [22]: where R sw and R lw are, respectively, the shortwave and longwave surface incoming radiation, α the albedo, ε the emissivity and T surf the radiative temperature of the surface. R sw is taken from the meteorological station and R lw = 1.24(e a /T a )σ T a 4 , where e a and T a are the actual vapour pressure and the temperature of the air measured by the meteorological station. The ground heat flux (G) is estimated according to the SEBAL model formulation [4], which proved to be the more accurate in this context [22]:

SEBS Model
The SEBS model [19] derives the latent heat (LE) flux as the residual term of the energy balance SEBS computes turbulent exchanges between the source at the aerodynamic level and the reference level z, where atmospheric forcing data is available, usually just a few meters above the crop height. Heat transfers are based on the Monin-Obukhov Similarity Theory (MOST) in the Atmospheric Boundary Layer (ABL). Bulk ABL similarity functions proposed by Brutsaert 1999 [56] describe the wind and temperature profiles in the turbulent environment (Equations (3) and (4) (see [56,57]): where r a is the aerodynamic resistance to heat transfer at the surface-atmosphere interface, u is the wind velocity at level z, , z 0h is the roughness length for heat transfer and z 0m the roughness height for momentum transfer (z 0m ≈ h c × 0.123). T a and T aero are the temperature of the air at reference and aerodynamic levels, respectively, ρ is the density of the air and c p is the specific heat capacity of air.
Ψ m and Ψ h are the stability correction functions for momentum and sensible heat transfer and L is the Monin-Obukhov length.
The main originality of SEBS lies in the derivation of an equation that relates the roughness length for heat transfer to the roughness length for momentum transfer according to the soil and vegetation individual drag coefficients: where f c is the vegetation fraction cover (in a pixel), and f s the bare soil fraction cover with f c + f s = 1. A 1 describes the vegetation aerodynamic properties, kB s −1 the bare soil properties, and A 2 represents the interactions between vegetation and bare soil. All these terms are estimated as in Su et al. [19]. The second originality of SEBS is that the model provides bounding relationships to constrain the latent heat flux between two hypothetical extreme surface wetness conditions (dry and wet, which correspond to non-evaporative and potential conditions respectively): where LE dry and LE wet are the latent heat fluxes in dry and wet conditions, e s the saturation vapor pressure temperature T a , γ the psychrometric constant and ∆ the slope of the saturation vapor pressure at temperature T a . The relative evaporation (Λ r ) is then computed to estimate LE as LE = Λ r LE wet : which corresponds to the residual calculation of LE bounded by the dry and wet conditions' values. This potentially induces thresholds, which can affect the nonlinearity of the scaling properties between HR and LR (a threshold locally encountered at HR is possibly not present for average conditions at BR).

TSEB Model
TSEB [20] solves two different energy budgets for the soil and the vegetation components. The net radiation flux, estimated as described in Equation (1), is partitioned according to the vegetation and bare soil cover fractions following: H c and LE c are the sensible and latent heat fluxes at the vegetation/atmosphere interface and H s and LE s the same heat fluxes at the bare soil/atmosphere interface. T c and T s are, respectively, the vegetation and bare soil temperature, r s [21] is an additional resistance to describe the resistance to heat transfer in the ABL at the bare soil/atmosphere interface, and r a the atmospheric resistance to heat transfer at the surface-atmosphere interface is expressed according to MOST.
TSEB computes the soil and the vegetation temperatures from their respective energy balance equation as well as the link between the total radiative surface temperature and the component temperatures: TSEB first assumes that the vegetation is unstressed and transpires at a potential rate defined by the Priestley-Taylor formulation: where f g is the relative fraction of the vegetation that is green. Since LAI is estimated through the NDVI, which is an indicator of the green vegetation development, and f g is here fixed to 1. The canopy energy budget (Equation (11)) provides a first estimate of T c by introducing LE c in potential conditions (Equation (14)) in Equation (11). When T c is known, T s can be computed from Equation (13) and then H s from its formulation detailed in Equation (12). Finally, LE s is calculated as the residual term of the soil energy balance (Equation (10)). If the resulting LE s is positive, the solution is reached and the hypothesis of an unstressed vegetation is considered as valid even if neither the soil or the vegetation evaporate at a potential rate. If the resulting LE s is negative, the unstressed vegetation assumption is challenged, so LE c value is decreased until LE s is equal to 0. Then, new H s and T s values are computed from Equations (10) and (12) and new T c , H c and LE c values from Equation (13), Equation (12) and Equation (11), respectively. If LE c is positive, a new solution is reached. In addition, if LE c is negative, the vegetation part is also considered as fully stressed and dry and LE c is set to 0. H c can then be computed from Equation (11) and a new T c value from Equation (12). T s can then be estimated according to Equation (13) and new H s and G values from Equation (12) and Equation (10), respectively.

Daily ET Fluxes Generation
The latent heat fluxes computed by SEBS and TSEB are obtained at the time of satellite overpass. In order to compute daily evapotranspiration ET d , we used an extrapolation algorithm based on an empirical diurnal shape of the evaporative fraction during a day in clear sky conditions (see [54]).
If one obtains an instantaneous value of the ET flux and owns a daytime dataset that allows computing a cumulative available energy, the corresponding daily ET can be estimated as: where ET d is the daily cumulative ET value, ET i the instantaneous evapotranspiration flux, which corresponds to the latent heat flux ET i = LE λ (with λ being the latent heat of vaporisation), EF = ET i AE i is the evaporative fraction (considered as constant during daytime) defined as the ratio between the instantaneous ET value and the instantaneous available energy value, and AE d is the daily cumulative available energy computed as the daily cumulative net Radiation flux (R nd ), considering that the cumulative soil heat flux at the daily scale can be neglected. R nd is estimated assuming that the R nd /R ni ratio has a sinusoidal evolution along the year as stated in Gomez et al. [58]: where R ni is the instantaneous net radiation flux, JD is the day of the year and a 1 , a 2 and a 3 are obtained for each hour of the day using mean local measurement to calibrate the relationship (not shown).

General Design of the Experiments
In order to evaluate the impact of spatial resolution on flux estimates by the two models, SEBS and TSEB, we designed several numerical experiments based on the MODIS and the ASTER data. Figure 3 describes the flowchart of the four flux processing chains with each Energy Balance (EB) model (aggregated ASTER, LR ASTER, MODIS). The sole difference between those chains is the type and scale of the RS input data. The way one obtains those inputs is described in Sections 4.2-4.4 at and eventually the way one aggregates those inputs in the case of SEBS (Section 4.5).

•
In a preliminary step, HR maps of ET were computed with both models from ASTER products (dataset named 'HR-ASTER' including spectral surface reflectances, spectral surface emissivities, radiative surface temperature and surface fluxes). ET maps were evaluated against the eddy correlation measurements performed in seven crop fields. These maps were aggregated at the kilometric resolution to be used in the following steps as a reference dataset to evaluate ET maps obtained at low resolution. This aggregation was done considering that surface fluxes can be averaged using a simple arithmetic mean. The reference dataset at low resolution was named 'agg-ASTER'.

•
In the second step, LR maps of ET were produced from the high-resolution ASTER products aggregated at the kilometric resolution (equivalent to MODIS resolution). LR RS products were used for all inputs of both models. This dataset was named 'LR-ASTER'. ET maps at LR were evaluated against 'agg-ASTER' ET maps. Since both input datasets at LR and HR came from the same sensor, the biases between the two estimations of ET were only related to how the nonlinear relationships in the model translates the variability of inputs at HR into an average LR outputs that can be significantly different than the one generated using LR inputs.

•
In a third step, LR maps of ET were produced from the MODIS products (surface temperature, emissivities, albedo, NDVI) at 2 km resolution, as it would be done in a standard application of SEBS and TSEB using MODIS data. These dataset (named 'MODIS') was evaluated against 'agg-ASTER' ET maps. In this case, differences between ET maps were related to a combination of the differences in products, input parameter derivations and heterogeneity/nonlinearity issues.

•
In a fourth step, we analysed the possibility to derive SEBS and TSEB input parameters at low resolution by aggregating parameters estimated from high-resolution data. This scenario considered the possible use of high-resolution images in the solar domain from Earth observation satellites for deriving model inputs at the kilometric resolution. Decametric information are now increasingly available, in particular thanks to Sentinel2 satellites. We expected that this scenario would provide a deeper analysis of the potential source of biases in deriving ET and to develop more adequate aggregation rules for the relevant inputs. shown).

General Design of the Experiments
In order to evaluate the impact of spatial resolution on flux estimates by the two models, SEBS and TSEB, we designed several numerical experiments based on the MODIS and the ASTER data. Figure 3 describes the flowchart of the four flux processing chains with each Energy Balance (EB) model (aggregated ASTER, LR ASTER, MODIS). The sole difference between those chains is the type and scale of the RS input data. The way one obtains those inputs is described in Sections 4.2-4.4 at and eventually the way one aggregates those inputs in the case of SEBS (Section 4.5).   Several aggregation rules were tested by comparing the three LR ET maps generated by each model ('MODIS', 'LR-ASTER' with input parameters derived from LR-inputs and 'LR-ASTER' with effective values of the input parameters, derived from HR data) with the reference 'agg-ASTER' ET map for the same model.
The aggregation rules tested with ASTER input data were also evaluated with the MODIS inputs' dataset. It means that aggregated ("effective") surface roughness lengths values derived from ASTER data were used with the other MODIS inputs to produce ET maps. The results of this comparison will be useful to many MODIS data users, which, in general, benefit from a HR VIS/NIR dataset extracted from Landsat, for example. It can lead them to derive effective LR surface roughness lengths from HR data in order to improve the parameterization of the EB models at LR and then the ET output fluxes.

Estimating Surface Parameters at HR from ASTER Products
Reflectances for each band were combined to estimate the land surface properties (Figure 3) required by both models. Albedo (α) was computed according to Liang [59]: where ρ i referred to the spectral reflectance from band i. Broadband emissivity (ε) was computed from spectral emissivities ε i following Ogawa et al. [60]): NDVI was calculated from the NIR (ρ 3N ) and red-VIS (ρ 2 ) bands as: LAI and fraction cover (f c ) were estimated from NDVI values following [61]: with NDVI ∞ = 0.94 and NDVI 0 = 0.14, and: To ensure that all distributed parameters were derived from the sole remotely sensed information, crop heights (h c ) were also estimated from NDVI values. Knowing the repartition of the different land use classes on the 4 × 4 km 2 square, it was clear that the majority of the area was covered by cereals. Hence, for each pixel (i), crop height was estimated as a linear regression between its temporal maximum and minimum NDVI values (respectively, NDVI max (i) and NDVI min (i)) and realistic maximum and minimum cereal crop heights' values. The period covered by the ASTER dataset contained dates from the whole winter cereal growing period from seeding to harvesting. Therefore, the minimum height was that of a ploughed bare soil of 0.05 m equivalent height and a realistic maximum was set to 1.3 m, which led to: All ASTER data obtained from the original ASTER products constituted the hereafter called 'HR-ASTER' input dataset (~100 m resolution). Vegetation height translates directly into roughness length z om using the simple rule of the thumb: z om (i) = 0.123 × h c (i).

Estimating Surface Parameters at LR from ASTER Product
To compute the fluxes at LR using the Energy Balance (EB) models and low resolution ASTER data, ASTER products were linearly aggregated to~1000 m resolution, so that they approached the original MODIS product resolution (Figure 3). Special care was taken for the aggregation of surface temperature: radiative fluxes were linearly aggregated (εσT su 4 , with σ = 5.67 × 10 −8 Wm −2 K −1 the Stefan-Boltzmann constant) before estimating the surface temperatures at LR by inverting the Stefan-Boltzman equation.
In order to account for geolocation errors between the two sensors, ASTER NDVI and surface temperature data at low resolution were compared to the equivalent MODIS products by seeking the maximum correlation considering different spatial shifts from −1000 m to +1000 m. A maximum of correlation was reached for a unique spatial shift, on each direction, for each date and each product. This spatial shift was applied to the product when the maximum of correlation was obtained with coefficient exceeding 0.8. For dates with lower correlation coefficients, an average of the spatial shift obtained on the other dates was applied to each direction and each product.
The spatial shifts were applied before resampling. Final surface temperature products were then computed using resampled emissivity and radiances and led to the ASTER input dataset (Figure 3).
Eventually, input parameters for the two models (albedo, emissivity, LAI, fraction cover, vegetation height) were derived using the same procedures as used at HR. This new dataset was called 'LR-ASTER'.

Estimating Surface Parameters at LR from MODIS Product
Because MODIS LAI presented abnormally large values when compared to ASTER, it was estimated from the MODIS NDVI product using the same relationships as for the ASTER datasets.
This also applied to fractional cover f c . Albedo (blue sky value) was estimated from black-sky and white-sky MODIS values according to [62,63]: where α BlackSky (directional hemispherical reflectance) was the direct component and was a function of the solar zenith angle and α WhiteSky (bihemispherical reflectance) was the diffuse component. Black-sky and white-sky albedos corresponded to the extreme cases of purely direct or diffuse illumination. S is the fraction of diffuse solar radiation, which varied according to the atmospheric content in aerosol and water vapor and the solar zenith angle. S was fixed to 0.15 for all dates. The use of a constant value for S can introduce discrepancies in albedo and then in the available energy fluxes estimates. It could be estimated from MODIS aerosol product (MOD04-L2), but, in our area in dry season, it was usually low and had only a small impact on blue-sky albedo values. Broadband emissivity was computed from the emissivity products in band 31 and band 32 (ε b31 and ε b32 ) with the relationship detailed in Liang [64]: Crop height values at MODIS scale were computed similarly as for ASTER values using MODIS NDVI product (Equation (24)).

Aggregation Rules for the Input Parameters
The possibility to monitor ET with a daily time step relies on the use of LR data in the thermal infrared such as those provided by MODIS sensors (with a spatial resolution in the order of 1 km). Other required information can be derived at the same resolution from MODIS data, but other sensors with a higher spatial resolution, in particular on board of Earth Observation satellites such as Sentinel-2, Landsat 8, FORMOSAT or others, can be very useful to introduce the representation of the spatial variability within the MODIS pixels. This offers the possibility of deriving input parameters at LR from their estimation at HR. When combining several platforms or using both Sentinel-2 satellites, time revisit can be down to just a few days.
The spatial aggregation of input parameters may follow different rules depending on the linearity of the equations. Concerning surface energy balance, we expected that roughness length was a significant parameter in this issue, as its behaviour is nonlinear either in the calculation of the turbulent fluxes or in its estimation from remote sensing data [43,47,65]. We also know from the literature review in Section 1 that SEBS is known for its sensitivity to the parameterization of surface roughness lengths. It was therefore important to assess the impact of aggregation rules for this parameter. In the case of other parameters such as albedo, fraction cover or emissivities, an almost linear behaviour was expected.
First, three simple aggregation methods were used to generate LR roughness length from HR data: simple linear average, geometric mean (Equation (25)) and harmonic mean (Equation (26)) [66]: where α i is the relative pixel area of each individual roughness (or HR roughness). All three aggregation rules were implemented to derive effective input parameters at LR.
We also wanted to test more deterministic methods such as those proposed by [43,47], which were based on the inversion of the momentum flux equations in the boundary layer at LR over heterogeneous areas. Here, we considered the inversion of the sensible heat flux, which made it possible to estimate effective parameters by considering the aggregation of the aerodynamic conductance. If one assumes that surface temperatures were close to air temperature T a , the aerodynamic resistance in neutral conditions is a good approximation of the aerodynamic conditions with true stability corrections. In these conditions, the sensible heat flux expression reduces to: where g a0 is the aerodynamic conductance to heat transfer at the surface-atmosphere interface in neutral conditions (inverse of the aerodynamic resistance r a0 in the same conditions). If one assumed that, in those conditions, surface temperatures were similar at HR and LR, and our purpose was to find effective roughness values reducing as much as possible the difference between the HR and LR aerodynamic conductances g a0 . The solutions were reached numerically by finding the effective (LR) surface roughness lengths that solves Equation (30): or: where g a0i is the aerodynamic conductance of HR pixel i, g a0eff the effective value at LR, and kB −1 eff the effective kB −1 , which allows deriving the effective roughness z 0meff at LR (Equation (5)). Once computed, the effective surface roughness lengths were used as input into the models at low resolution together with the 'LR-ASTER' input dataset. The new ET maps were compared to the 'agg-ASTER' ET maps. For each model, the fluxes conservation at LR was quantified by looking at the difference between 'LR-ASTER' and 'agg-ASTER' ET maps when using either the LR roughness estimated at LR or effective parameter values produced from HR data using a simple linear averaging as well as Equations (25), (26) or (30).

Preliminary Step: Evaluation of Flux Estimations at High Resolution from ASTER Data
In a previous study [22], ET maps were obtained on the same area using SEBS and TSEB driven by a combination of the ASTER temperature product and the Vis-NIR Formosat satellite reflectances, as well as in situ crop height measurements. The resulting fluxes were evaluated against the eddy correlation (EC) measurements over seven crop fields whose size is large enough to respect fetch requirements for all dates. This gave an indication of the expected performances for considering the HR fluxes maps as reliable: absolute relative biases were about 18% for SEBS and 23% for TSEB.
In the present study, for a better consistency, ASTER products were used to derive all of the input parameters of SEBS and TSEB, including crop height. The parameters of the f c (NDVI) and LAI(NDVI) relationships (Equations (20) and (21), respectively) were drawn empirically from the in situ measurements performed in [61]. A comparison to the same EC measurements showed an absolute relative bias lower than 1% for SEBS and about 20% for TSEB (see Figure 4). We thus considered the HR ASTER fluxes as realistic. Moreover, we checked that the available energy fluxes compared well to the measured values (relative bias of 5% and 9% for R n and G, respectively).

Flux Estimation at Low Resolution from ASTER LR Data and MODIS Data
ET was estimated using SEBS and TSEB by considering input data at LR derived from ASTER products or MODIS products in a similar way. ET maps at LR were compared to the reference maps obtained by aggregating ET obtained at high resolution from ASTER products ('agg-ASTER').

Available Energy
Available energy was computed with the same method in both models. Very good performances were obtained at LR as well from LR ASTER data as from MODIS data: when compared to 'Agg-ASTER' fluxes, relative biases were smaller than 0.001 (ASTER and MODIS) for Rn and 0.04 (ASTER) and 0.002 (MODIS) for G. The relative bias is the mean relative difference between both estimates normalized to the mean value of the reference value ('Agg-ASTER' in our case). The corresponding Root Mean Square Error (RMSE) values were 0.03 mm·day −1 for Rn (ASTER and MODIS) and 0.09 mm·day −1 (ASTER) and 0.11 mm·day −1 (MODIS) for G. These good performances implied that, for both models, evaluating ET retrievals corresponded to evaluate their capacity to partition the available energy between sensible and latent heat fluxes.

Estimation of ET from LR ASTER with SEBS
The comparison of 'LR-ASTER' ET maps obtained using SEBS to the reference ET maps 'agg-ASTER' showed fairly good performances over the entire dataset, with a relative bias of −0.044 and an RMSE value of 1.0 mm·day −1 .
Those performances decreased for individual dates along the season, showing larger biases and RMSE values. More precisely, 'LR ASTER' ET values were smaller than 'agg-ASTER' ET during vegetation growth and higher when the vegetation was senescent (

Flux Estimation at Low Resolution from ASTER LR Data and MODIS Data
ET was estimated using SEBS and TSEB by considering input data at LR derived from ASTER products or MODIS products in a similar way. ET maps at LR were compared to the reference maps obtained by aggregating ET obtained at high resolution from ASTER products ('agg-ASTER').

Available Energy
Available energy was computed with the same method in both models. Very good performances were obtained at LR as well from LR ASTER data as from MODIS data: when compared to 'Agg-ASTER' fluxes, relative biases were smaller than 0.001 (ASTER and MODIS) for R n and 0.04 (ASTER) and 0.002 (MODIS) for G. The relative bias is the mean relative difference between both estimates normalized to the mean value of the reference value ('Agg-ASTER' in our case). The corresponding Root Mean Square Error (RMSE) values were 0.03 mm·day −1 for R n (ASTER and MODIS) and 0.09 mm·day −1 (ASTER) and 0.11 mm·day −1 (MODIS) for G. These good performances implied that, for both models, evaluating ET retrievals corresponded to evaluate their capacity to partition the available energy between sensible and latent heat fluxes.

Estimation of ET from LR ASTER with SEBS
The comparison of 'LR-ASTER' ET maps obtained using SEBS to the reference ET maps 'agg-ASTER' showed fairly good performances over the entire dataset, with a relative bias of −0.044 and an RMSE value of 1.0 mm·day −1 .
Those performances decreased for individual dates along the season, showing larger biases and RMSE values. More precisely, 'LR ASTER' ET values were smaller than 'agg-ASTER' ET during vegetation growth and higher when the vegetation was senescent ( Figure 5). The relative biases were −0.

Estimation of ET from LR ASTER with TSEB
The averaged results showed that TSEB fluxes at low resolution were more conservative across the different scales than SEBS, with a relative bias of 0.009 and an RMSE of 0.44 mm·day −1 , so that daily ET can be derived with a satisfying precision from LR inputs ( Figure 6). However, if the total flux was accurately retrieved using the same parameterizations at high and low resolutions, this did not fully apply to the partition between soil evaporation and plant transpiration, which both presented higher biases ( Figure 6). The transpiration obtained from LR inputs was higher than the aggregated HR fluxes (relative bias = 0.065 and RMSE = 0.46 mm·day −1 ) and the soil evaporation was lower (relative bias = −0.066 and RMSE = 0.38 mm·day −1 ). Compensation between evaporation and transpiration bias resulted in a very low bias for ET.

Estimation of ET from MODIS Data with SEBS
Results averaged for the seven dates over the 4 km by 4 km (i.e., 2 by 2 pixels at 2000 m spatial resolution) showed performances very close to those obtained with 'LR ASTER'. When SEBS was used to estimate evapotranspiration with MODIS inputs, the relative bias was −0.08 and the RMSE was 1.03 mm·day −1 , which was quite similar to the results obtained with 'LR-ASTER' (in particular when considering RMSE). Again, the low biases at the seasonal scale did not reflect the date to date performances and, as for 'LR-ASTER', more contrasted results were obtained for the different vegetation development stages as shown on

Estimation of ET from LR ASTER with TSEB
The averaged results showed that TSEB fluxes at low resolution were more conservative across the different scales than SEBS, with a relative bias of 0.009 and an RMSE of 0.44 mm·day −1 , so that daily ET can be derived with a satisfying precision from LR inputs ( Figure 6).

Estimation of ET from LR ASTER with TSEB
The averaged results showed that TSEB fluxes at low resolution were more conservative across the different scales than SEBS, with a relative bias of 0.009 and an RMSE of 0.44 mm·day −1 , so that daily ET can be derived with a satisfying precision from LR inputs ( Figure 6). However, if the total flux was accurately retrieved using the same parameterizations at high and low resolutions, this did not fully apply to the partition between soil evaporation and plant transpiration, which both presented higher biases ( Figure 6). The transpiration obtained from LR inputs was higher than the aggregated HR fluxes (relative bias = 0.065 and RMSE = 0.46 mm·day −1 ) and the soil evaporation was lower (relative bias = −0.066 and RMSE = 0.38 mm·day −1 ). Compensation between evaporation and transpiration bias resulted in a very low bias for ET.

Estimation of ET from MODIS Data with SEBS
Results averaged for the seven dates over the 4 km by 4 km (i.e., 2 by 2 pixels at 2000 m spatial resolution) showed performances very close to those obtained with 'LR ASTER'. When SEBS was used to estimate evapotranspiration with MODIS inputs, the relative bias was −0.08 and the RMSE was 1.03 mm·day −1 , which was quite similar to the results obtained with 'LR-ASTER' (in particular when considering RMSE). Again, the low biases at the seasonal scale did not reflect the date to date performances and, as for 'LR-ASTER', more contrasted results were obtained for the different vegetation development stages as shown on  However, if the total flux was accurately retrieved using the same parameterizations at high and low resolutions, this did not fully apply to the partition between soil evaporation and plant transpiration, which both presented higher biases ( Figure 6). The transpiration obtained from LR inputs was higher than the aggregated HR fluxes (relative bias = 0.065 and RMSE = 0.46 mm·day −1 ) and the soil evaporation was lower (relative bias = −0.066 and RMSE = 0.38 mm·day −1 ). Compensation between evaporation and transpiration bias resulted in a very low bias for ET.

Estimation of ET from MODIS Data with SEBS
Results averaged for the seven dates over the 4 km by 4 km (i.e., 2 by 2 pixels at 2000 m spatial resolution) showed performances very close to those obtained with 'LR ASTER'. When SEBS was used to estimate evapotranspiration with MODIS inputs, the relative bias was −0.08 and the RMSE was 1.03 mm·day −1 , which was quite similar to the results obtained with 'LR-ASTER' (in particular when considering RMSE). Again, the low biases at the seasonal scale did not reflect the date to date performances and, as for 'LR-ASTER', more contrasted results were obtained for the different vegetation development stages as shown on

Estimation of ET from MODIS Data with TSEB
The averaged results for the 4 × 4 km 2 showed fairly good performances when TSEB was used to estimate ET with MODIS inputs: relative bias on daily ET was 0.03, RMSE 0.96 mm·day −1 . However, these results were significantly less favourable than those obtained with 'LR-ASTER'. Model performances regarding soil evaporation and transpiration separately were of similar magnitude as for the total flux when considering RMSE: 0.81 and 1.04 mm·day −1 for daily evaporation and transpiration, respectively ( Figure 8). As for the results obtained with LR ASTER, the transpiration calculated from MODIS input was higher than the aggregated HR fluxes (relative bias was 0.21) and the evaporation was lower (relative bias was −0.19). Again, compensation between evaporation and transpiration bias resulted in a lower bias for ET. As for the total flux (ET), transpiration and evaporation obtained from MODIS products presented higher discrepancies than from 'LR-ASTER' when compared to 'agg-ASTER'.

Test of Effective Roughness Length Parameterization at Low Resolution
Since TSEB was fairly conservative across scales for total ET, we focused on reducing scale discrepancies for SEBS, using roughness length as a key parameter to explain those differences.

Evaluation
The four aggregation rules of roughness length (linear, geometric i.e Equation (25), harmonic i.e., Equation 26 and conductance average i.e., Equation (30)) were implemented to derive input roughness lengths for SEBS at LR. The resulting ET calculated by SEBS was compared to the

Estimation of ET from MODIS Data with TSEB
The averaged results for the 4 × 4 km 2 showed fairly good performances when TSEB was used to estimate ET with MODIS inputs: relative bias on daily ET was 0.03, RMSE 0.96 mm·day −1 . However, these results were significantly less favourable than those obtained with 'LR-ASTER'. Model performances regarding soil evaporation and transpiration separately were of similar magnitude as for the total flux when considering RMSE: 0.81 and 1.04 mm·day −1 for daily evaporation and transpiration, respectively ( Figure 8). As for the results obtained with LR ASTER, the transpiration calculated from MODIS input was higher than the aggregated HR fluxes (relative bias was 0.21) and the evaporation was lower (relative bias was −0.19). Again, compensation between evaporation and transpiration bias resulted in a lower bias for ET. As for the total flux (ET), transpiration and evaporation obtained from MODIS products presented higher discrepancies than from 'LR-ASTER' when compared to 'agg-ASTER'.

Estimation of ET from MODIS Data with TSEB
The averaged results for the 4 × 4 km 2 showed fairly good performances when TSEB was used to estimate ET with MODIS inputs: relative bias on daily ET was 0.03, RMSE 0.96 mm·day −1 . However, these results were significantly less favourable than those obtained with 'LR-ASTER'. Model performances regarding soil evaporation and transpiration separately were of similar magnitude as for the total flux when considering RMSE: 0.81 and 1.04 mm·day −1 for daily evaporation and transpiration, respectively ( Figure 8). As for the results obtained with LR ASTER, the transpiration calculated from MODIS input was higher than the aggregated HR fluxes (relative bias was 0.21) and the evaporation was lower (relative bias was −0.19). Again, compensation between evaporation and transpiration bias resulted in a lower bias for ET. As for the total flux (ET), transpiration and evaporation obtained from MODIS products presented higher discrepancies than from 'LR-ASTER' when compared to 'agg-ASTER'.

Test of Effective Roughness Length Parameterization at Low Resolution
Since TSEB was fairly conservative across scales for total ET, we focused on reducing scale discrepancies for SEBS, using roughness length as a key parameter to explain those differences.

Evaluation
The four aggregation rules of roughness length (linear, geometric i.e Equation (25), harmonic i.e., Equation 26 and conductance average i.e., Equation (30)) were implemented to derive input roughness lengths for SEBS at LR. The resulting ET calculated by SEBS was compared to the

Test of Effective Roughness Length Parameterization at Low Resolution
Since TSEB was fairly conservative across scales for total ET, we focused on reducing scale discrepancies for SEBS, using roughness length as a key parameter to explain those differences.

Evaluation
The four aggregation rules of roughness length (linear, geometric i.e., Equation (25), harmonic i.e., Equation (26) and conductance average i.e., Equation (30)) were implemented to derive input roughness lengths for SEBS at LR. The resulting ET calculated by SEBS was compared to the 'agg-ASTER' ET fluxes. Evaluation of the comparisons is presented in Table 2 when considering LR ASTER dataset and in Table 3 when considering the MODIS dataset. In the two cases, high-resolution roughness lengths were derived from the roughness length maps in the HR ASTER dataset.
For all dates except the 30 December 2007 and the 13 May 2008, it was possible to solve Equation (30) and to derive effective roughness values that reduced the difference between the HR and the LR conductances to zero. For the first and the last dates, no solution to Equation (30) was found using the numerical solver and only a minimum value for the first term of Equation (30) was found (effective roughnesses corresponding to these minimum were used). Compared to the results obtained when roughness lengths were derived directly from LR remote sensing data (Section 5.2), significant reductions in relative biases and/or RMSE values were obtained for most dates when considering the first three types of aggregation methods (either with LR ASTER dataset, Table 2, or MODIS dataset, Table 3). On average, the best results were obtained with the 'harmonic' aggregation rule, with the lowest RMSE (0.52 mm·day −1 with LR ASTER and 0.78 mm·day −1 with MODIS) and very low relative biases (0.03 and −0.02). The linear aggregation rule resulted only in slight reductions in RMSE and in biases (which, on average, were even increased). The geometric aggregation method gave RMSE on average only slightly higher and relative biases higher than the harmonic aggregation method. For all dates but the 27 April 2008 (low relative bias of 0.04), the relative biases were lower than the previous values obtained in Section 5.2.2. For each date, the relative biases were reduced. This was especially true for the dates that presented biases exceeding 10%. Table 2. Relative biases for the 'agg ASTER' vs. 'LR ASTER' ET fluxes produced with SEBS at low resolution for different methods to estimate the LR surface roughness length parameter. z om -RS refers to the "remotely sensed" roughness length derived from LR NDVI values, "agg" uses aggregated values from HR to LR using the four methods ("lin": linear, "geo": geometric, "har": harmonic, "grad": deterministic averaging).  The fourth aggregation method relied on deriving the effective surface roughness lengths from the aggregation of aerodynamic conductance in neutral conditions (Section 4.5). Using the LR ASTER dataset, it was possible to solve Equation (30) for all dates, except for 30 December 2007, and 13 May 2008, and to derive effective roughness values that reduced the difference between the HR and the LR conductances to zero. For the first and the last dates, no solution to Equation (30) was found using the numerical solver and only a minimum value for the first term of Equation (30) was found (effective roughness lengths corresponding to these minimums were used). RMSE were not improved for most dates and on average, RMSE and relative bias were higher than those obtained directly from LR ASTER data ( Table 2). The results obtained when using the MODIS dataset (Table 3) presented lower RMSE and lower relative biases than those obtained from the LR ASTER dataset. The performances were always as good as the geometrical aggregation performances and even better at the beginning of the growing season and the beginning of the senescent period.
When considering a comparison of the two datasets for each aggregation methods, it appeared that the use of MODIS data usually resulted in similar or lower performances than with LR ASTER for the direct use of LR data and for the first three aggregation rules. Conversely, in most cases, the fourth aggregation method provided significantly lower RMSE and bias when using MODIS data rather than LR ASTER data.

Discussion on Flux Estimation at Low Resolution
The results obtained using SEBS and TSEB with low resolution input data showed that • TSEB had significantly higher or similar scaling performances (flux conservation across scales) as SEBS, respectively, with LR ASTER data and MODIS data; • SEBS had similar scaling performances with LR ASTER and MODIS data; • TSEB scaling performances were significantly lower with MODIS data than with LR ASTER data (for ET mapping as well as for ET partitioning in E and T); • With TSEB, relative biases between the soil and the vegetation offset each other when one considers the whole season.
Altogether, TSEB was more conservative through scaling than SEBS (see Figure 6 vs. Figure 5). For TSEB, the issue of roughness length aggregation is expressed differently than in SEBS [67]. Whereas SEBS's sensitivity to roughness is explained by the fact that the roughness length for heat exchange (Equation (5)) is proportional to the mechanical roughness length z om , this is not the case for TSEB. In the latter, however, the aerodynamic temperature is tightly linked to the component (soil and vegetation) energy budgets and therefore very sensitive to the vegetation cover fraction f c . In our case, this translated into different evaporation/transpiration partitions across scales but did not much affect the total flux, which remained quite conservative across scales.
Both models used NDVI to compute input parameters, in particular LAI, vegetation height and then aerodynamic roughness and the fraction cover (Equations (20)- (22)). Significant differences existed between the NDVI for the two sensors ASTER and MODIS (see Figure 7). These differences were related in particular to differences in NDVI construction and in sensors' spectral bands. ASTER NDVI were derived from reflectance products (instantaneous acquisition) while MODIS NDVI were obtained from a 16-day composite product. Red and Near InfraRed spectral bands were much larger for ASTER than for MODIS. Higher reflectances in the red band and lower reflectances in the NIR bands were expected for ASTER sensor resulting in lower NDVI. This induced differences in fraction cover (Equation (21)) and vegetation height (and therefore the roughness length, Equation (22)), somewhat enhanced by the fact that common maximum and minimum NDVI values are used for both sensors for LAI retrieval.
Even though NDVI differences between both sensors were significant, the impact on scaling between 'LR-ASTER' and 'MODIS' was not significantly different for SEBS. As explained in Section 3.2, surface roughness length influences also the additional resistance to ET (the kB −1 term in Equation (6)). Roughness lengths values are derived from NDVI products coming from reflectances measured by two different sensors at different spatial resolutions. Both sensors will differ in their representation of the subscale heterogeneity and its associated vegetation pattern. It is also clear that the model sensitivity to the surface roughness lengths induces the high ET flux differences in the middle of the season when the heterogeneity is the most pronounced.
For TSEB however, it was obvious that the use of MODIS inputs reduced the scaling performances obtained with LR ASTER inputs when looking at evaporation and transpiration separately. The partition was indeed very sensitive to fraction cover and thus to discrepancies in reconstructed NDVI values across scales. This illustrated the sensitivity of TSEB to the partition of the available energy between the two sources through the f c parameter.
As a result, there were higher transpiration and lower soil evaporation for MODIS compared to LR ASTER. However, these differences of opposite signs led to a fairly good estimate of the surface ET flux. For instance, evapotranspiration fluxes were highly biased on 13 May 2008 (relative bias = −0.13 and 0.37, RMSE = 0.49 and 0.84 mm·day −1 , respectively, for soil evaporation and transpiration fluxes), but ET was well estimated with MODIS data (see Figure 8).
On the 30 December 2007 and the 5 May 2008 most of the pixels were covered with a mix of green vegetation and bare soil. These two dates were representative of two phenological stages during the growing season (early growth and senescence) when the heterogeneity reached its maximum. The 30 December 2007 corresponded to a sowing period. Hence, depending on the vegetation type and the time-lapse since the sowing date, each plot presented different NDVI values at HR. The 5 May 2008 corresponded to the beginning of the senescence and was close to harvesting. On this day, the area was covered by a mix of more or less senescent vegetation and bare soil for plots already harvested with a wide range of NDVI values at HR. This information about the heterogeneity of the crop height parameter was somewhat taken into account by the LR ASTER inputs, albeit imperfectly, because it came from the HR ASTER inputs whose spatial resolution allowed a faithful representation of this heterogeneity. The MODIS NDVI values and their low spatial resolution were only able to represent roughly the main vegetation status present in a LR pixel (Figure 9): On this day, the area was covered by a mix of more or less senescent vegetation and bare soil for plots already harvested with a wide range of NDVI values at HR. This information about the heterogeneity of the crop height parameter was somewhat taken into account by the LR ASTER inputs, albeit imperfectly, because it came from the HR ASTER inputs whose spatial resolution allowed a faithful representation of this heterogeneity. The MODIS NDVI values and their low spatial resolution were only able to represent roughly the main vegetation status present in a LR pixel ( Figure 9): ASTER HR ASTER LR MODIS Figure 9. Discrepancies between LR and HR NDVI for three dates within the growing season.
These differences between LR ASTER NDVI and MODIS NDVI directly implied discrepancies in the respective fc values and, hence, biases in the available energy partition (Rnc relative biases = −0.36 and −0.52; RMSE = 1.00 and 1.42 mm·day −1 , respectively, for the 30 December 2007 and the 6 May 2008) that resulted in biases in transpiration between 'agg-ASTER' and MODIS instantaneous fluxes (LEc relative biases = 0.77 and 1.04; RMSE = 1.05 and 1.64 mm·day −1 , respectively, the 30 December 2007 and the 6 May 2008). Biases on soil evaporation were smaller and did not offset the bias in transpiration on that particular day, but this did not affect the seasonal average performance of the TSEB model.
Convergence of fc and roughness probability distributions between LR ASTER and MODIS using adapted maximum and minimum values of NDVI for MODIS could probably decrease this difference. Since the scaling analysis reies solely on the 'LR-ASTER' vs. 'agg-ASTER' . Biases on soil evaporation were smaller and did not offset the bias in transpiration on that particular day, but this did not affect the seasonal average performance of the TSEB model.
Convergence of f c and roughness probability distributions between LR ASTER and MODIS using adapted maximum and minimum values of NDVI for MODIS could probably decrease this difference. Since the scaling analysis reies solely on the 'LR-ASTER' vs. 'agg-ASTER' intercomparison, the 'MODIS' vs. 'agg-ASTER' intercomparison is meant only to provide an illustration of a "real world" application of the models to moderate resolution data; therefore, the intermediate exercise consisting in matching MODIS and 'agg-ASTER' NDVI distribution while keeping MODIS LST was not considered in this study.

Discussion on the Efficiency of Aggregation Rules
Results showed that if HR NDVI data were available, the linear aggregation rule and the deterministic method were less accurate than the two other averaging methods tested in this study. Both the geometric and harmonic aggregation methods reduce all daily biases and RMSE values with an advantage for the harmonic aggregation rule, which also reduces considerably the average bias. This is consistent with [43] who evaluated the geometric average as efficient to estimate effective surface roughness to reduce the error on fluxes estimation. The deterministic rule performed well but did not outperform the harmonic mean, which is a clear indication that aggregation of the roughness length is the main but not the sole element to account for in scaling from HR to LR.

Conclusions
EC measurements acquired in an extensive experiment in the Yaqui valley in Mexico over a 4 × 4 km 2 agricultural area were used to validate HR (100 m) ET maps computed with the SEBS and the TSEB models forced by ASTER data, local meteorological data and in situ measured crop heights in [22]. In order to bypass local crop height evaluation, we implemented a simple approach to derive crop heights (thus, indirectly surface roughness lengths) based on a priori maximum and minimum values scaled by NDVI levels. A common relationship valid for cereal covers in this area dominated by cereals or crops with similar height was derived. The performances of the SEBS and the TSEB models, when forced by remotely sensed vegetation height, were found to be similar to those obtained when in situ crop height measurements were used.
The satisfying intercomparison between measured ET and simulated ET at EC flux stations within the test area led to consider the HR ET maps as reliable datasets to study scaling properties between ET estimation at HR and LR (1000 m) in order to take advantage of the abundant availability of data at LR. A reference dataset was thus built by aggregating linearly ET fluxes from 100 m to 1000 m resolution maps for each model. Scaling properties were thus assessed by comparing ET fluxes computed from LR data only, and these reference maps deduced from the sole HR data. LR input data was first produced from HR ASTER data in order to investigate scaling properties with the same sensor, and then from the MODIS sensor in order to investigate the impact of the radiometric differences on the resulting LR ET product from one sensor to the other. We confirmed that the discrepancies between the reference maps deduced from SEBS HR outputs and all maps obtained by running SEBS with LR input data was mostly due to improper scaling of roughness length (i.e., vegetation height), which was consistent with the high model sensitivity to this parameter recognized in many previous studies [36,37]. TSEB was found to be more conservative between scales than SEBS, discrepancies on total ET flux being much smaller than for SEBS in particular when using LR ASTER dataset. However, the flux partitioning between evaporation and transpiration was very sensitive to small differences between fraction cover patterns at high and low resolutions.
Since surface properties such as vegetation height and fraction cover could be derived from time series of HR images because they can provide information for each field (Land Use type, LAI, and fraction cover), we investigated the use of aggregation rules to derive more spatially resolved estimates of those quantities to be used by either model at LR. Four aggregation rules were tested, linear, harmonic and geometric averaging, as well as a method that numerically found roughness lengths that minimize the difference between aggregated aerodynamic conductances in neutral conditions and the effective aerodynamic conductance at LR. The harmonic mean appeared to be the most accurate both for LR ASTER dataset and MODIS dataset. This exercise showed how the proper scaling of input parameters can improve the consistency of ET estimates across scales.
Those results must, however, be scaled to the large differences between both models as well as between the observations and the model outputs at High Resolution. By providing the best aggregation rule, one ensures that at least the aggregation error does not add up to the other error sources.
This study can be considered as representative of typical situations for semi-arid environment and irrigated crops. It will be interesting to evaluate these results in other climatic conditions, over an area presenting different irrigation practices or without irrigation practices and, hence, submitted to different stress conditions.
It would be of interest to evaluate the performances of these models on a study area with even more contrasting roughness values. The parameterization of the surface roughness lengths has been a real issue and its small range over the study area made the latter relatively more reachable. In an environment presenting crop heights ranging from bare soil to high trees as well as intermediate values (e.g., vegetables and cereals), the exercise would be more difficult.
Radar measurements could help in this way through their ability to go through different vegetation layers to estimate properly the crop height and the vegetation fraction cover parameters.
In the future, the model developed by [47] will be tested to estimate an effective surface roughness length through HR land use, surface temperature, LAI products, and local meteorological observations. This model includes an additional resistance representing the effects of the molecular diffusion through the viscous sub-layers on the sensible heat flux.