Evapotranspiration Estimation in the Sahel Using a New Ensemble-Contextual Method

In many tropical areas, evapotranspiration is the most important but least known component of the water cycle. An innovative method, named E3S (for EVASPA S-SEBI Sahel), was developed to provide spatially-distributed estimates of daily actual evapotranspiration (ETd) from remote sensing data in the Sahel. This new method combines the strengths of a contextual approach that is used to estimate the evaporative fraction (EF) from surface temperature vs. albedo scatterograms and of an ensemble approach that derives ETd estimates from a weighted average of evapotranspiration estimated from several EF methods. In this work, the two combined approaches were derived from the simplified surface energy balance index (S-SEBI) model and the EVapotranspiration Assessment from SPAce (EVASPA) tool. Main innovative aspects concern (i) ensemble predictions of ETd through the implementation of a dynamic weighting scheme of several evapotranspiration estimations, (ii) epistemic uncertainty of the estimation of ETd from the analysis of the variability of evapotranspiration estimates, and (iii) a new cloud filtering method that significantly improves the detection of cloud edges that negatively affect EF determination. E3S was applied to MODIS/TERRA and AQUA datasets acquired during the 2005–2008 period over the mesoscale AMMA-CATCH (Analyse Multidisciplinaire de la Mousson Africaine—Couplage de l’Atmosphère Tropicale et du Cycle Hydrologique) observatory in South-West Niger. E3S estimates of instantaneous and daily available energy, evaporative fraction, and evapotranspiration were evaluated at a local scale based on two field-monitored plots representing the two main ecosystem types in the area—a millet crop and a fallow savannah bush. In addition to these ground-based observations, the local scale evaluation was performed against continuous simulations by a locally-calibrated soil-vegetation-atmosphere transfer model for the two plots. The RMSE (root mean square error) from this comparison for E3S’s ETd estimates from combined AQUA/TERRA sources was 0.5 mm·day−1, and the determination coefficient was 0.90. E3S significantly improved representation of the evapotranspiration seasonality, compared with a classical implementation of S-SEBI or with the original EVASPA’s non-weighted ensemble scheme. At the mesoscale, ETd estimates were obtained with an average epistemic uncertainty of 0.4 mm·day−1. Comparisons with the reference 0.25◦-resolution GLEAM (global land evaporation Amsterdam model) product showed good agreement. These results suggested that E3S could be used Remote Sens. 2020, 12, 380; doi:10.3390/rs12030380 www.mdpi.com/journal/remotesensing Remote Sens. 2020, 12, 380 2 of 34 to produce reliable continuous regional estimations at a kilometric resolution, consistent with land and water management requirements in the Sahel. Moreover, all these innovations could be easily transposed to other contextual approaches.


Study Area and Nested Sites
The study area is the 20,800 km², rectangle meso-site of the AMMA-CATCH observatory in the southwest of the Republic of Niger (1.55°-3.15° E; 12.85-14.15° N; Figure 1a,b), encompassing the capital city of Niamey [8,49]. It is characterized by a tropical semi-arid climate with average annual rainfall decreasing from 570 mm to 470 mm from South to North (1990-2007 average) and a mean temperature of ~30°C. The West African monsoon brings precipitation within a single wet season from June to September essentially, followed by a long dry season. The meso-site is crossed in its western part by the Niger River valley and to the east by the large Dallol Bosso fossil valley (Figure 1b). Elevations range from 177 to 274 m above mean sea level. The soils are sandy and poor in nutrients in most of the study area. To the North and East of the Niger River, hydrology is essentially endoreic, whereas the entire South-West bank drains to the river. The main ecosystems are rainfed millet crop and fallow of shrubby savannah, alternating in an agropastoral rotation system. Two single-ecosystem plots in the ~2 km² Wankama pilot catchment-located in between the two alluvial valleys-were used as local-scale references for the study (Figure 1c). These 15-ha plots have been equipped for EC measurements by the AMMA-CATCH observatory since 2005. During the study period, ecosystems were millet crop and shrubby fallow savannah, respectively. A detailed description of the study area and nested sites can be found in Cappelaere et al. [49].  The meso-site is crossed in its western part by the Niger River valley and to the east by the large Dallol Bosso fossil valley (Figure 1b). Elevations range from 177 to 274 m above mean sea level. The soils are sandy and poor in nutrients in most of the study area. To the North and East of the Niger River, hydrology is essentially endoreic, whereas the entire South-West bank drains to the river. The main ecosystems are rainfed millet crop and fallow of shrubby savannah, alternating in an agropastoral rotation system. Two single-ecosystem plots in the~2 km 2 Wankama pilot catchment-located in between the two alluvial valleys-were used as local-scale references for the study (Figure 1c). These 15-ha plots have been equipped for EC measurements by the AMMA-CATCH observatory since 2005.
Remote Sens. 2020, 12, 380 5 of 34 During the study period, ecosystems were millet crop and shrubby fallow savannah, respectively. A detailed description of the study area and nested sites can be found in Cappelaereet al. [49]. Table 1 describes the 1-km resolution MODIS TERRA and AQUA products used as E3S inputs, for remotely-sensed leaf area index (LAI), normalized difference vegetation index (NDVI), albedo (α), surface temperature (Ts), and emissivity (ε) of land surface (Version 5.5, from the Land Processes Distributed Active Archive Center, https://lpdaac.usgs.gov). Temperature and emissivity were acquired every day at satellite overpass time, but they were subject to many temporal gaps due to cloud cover. To match that resolution, 4-day LAI was daily interpolated by cubic interpolation. The 8-day albedo and 16-day NDVI composites were taken as daytime values over each day within the 8-day or 16-day period. Additional information was required for the E3S method. Rainfall occurrence was obtained from the daily rainfields produced by the AMMA-CATCH observatory from its mesoscale raingauge network [50]. Short and longwave incoming radiations were taken from the ground data at the two Wankama EC stations (next section) and applied uniformly over the mesosite. This upscaling was made possible by the limited spatial variability of incoming radiation under the predominantly clear sky conditions required for image usability. Note that the ground-based information used here could be replaced with alternative sources, such as RS or global or regional modeling. The terrain was described using the DEM GTOPO30 (http://earthexplorer.usgs.gov/). It was used to filter out MODIS pixels based on maximum slope and elevation criteria, albeit with no effect in this study area.

Satellite and Other Mesoscale Data
Finally, the global land evaporation Amsterdam model (GLEAM) global ET products were used for mesoscale comparison with E3S results [47]. GLEAM was chosen for this comparison as it is one of the very few RS-ET products available daily and performing well in semi-arid African areas at a regional scale [48]. It is based on a modified Priestley-Taylor formulation with a stress factor based on soil moisture and vegetation water content, coupled with the Gash interception model [51]. In this study, both versions 3.1a and 3.1b were considered [46]. Both versions were available at a native resolution of 0.25 • and a daily timestep. They were downloaded in January 2018 from the GLEAM server (https://www.gleam.eu/).

Local Scale Data
The two reference local plots in Wankama are equipped with EC systems, measuring in particular high frequency (20Hz) 3D wind, air temperature, and water vapor, as well as four-component radiation and soil heat flux (Table 2). A full description of all sensors, setup, and other measured variables can be found in Ramieret al. [52] or Velluetet al. [4]. Turbulent fluxes, including ET, were derived as described by Ramieret al. [52], using EdiRe software (R. Clement, University of Edinburgh) and following CarboEurope recommendations [53]. All water and heat fluxes were aggregated to 30-min timestep. Mean energy balance residuals for raw daily daytime fluxes were 3.6 and 14.2 W·m −2 at the fallow and millet sites, respectively [52]. They were attributed, at least, in part to unaccounted-for Remote Sens. 2020, 12, 380 6 of 34 positive daytime storage in the shallow soil, for lack of shallow soil temperature measurements. EC measurements were not corrected for energy balance closure as residuals were reasonably small and could be attributed to a variety of sources. Because these field measurements inevitably included a significant fraction of gaps in the various time series, using the datasets directly for comparison with the RS-based estimates had limitations. This was especially true for daily-integrated fluxes because of the high dependency of the EC method upon aerologic conditions over the day, which seriously reduced the number of days available for direct comparison. Therefore, to allow for a more complete evaluation, the results of the E3S method were also compared with simulations by the physically-based SiSPAT land surface model over 2005-2008. These simulations might be considered as reliable proxies of the local plots fluxes, as the model was thoroughly calibrated and validated by Velluetet al. [4] against the multi-variable dataset from the two EC plots for the 7 years 2005-2012. The model was shown to perform very well against the observations over the wide range of conditions encountered during those 7 years, whether in the calibration or the validation sub-period. Table 3 displays the model's skill scores calculated for the present study period (2005)(2006)(2007)(2008), for 30-min net radiation, soil heat, and latent heat fluxes. All these values were highly similar to those obtained for the entire model calibration and validation periods (see Table B1 in Velluetet al. [4]). The excellent performance of the model for the very complete set of water and energy variables supported its use as a trustworthy extrapolator of field data for E3S evaluation. Further, the model was suggested as providing a reliable alternative to measurement correction techniques (such as energy balance closure enforcement), or estimation of non-directly measured variables, such as evaporative fraction and surface soil heat flux (at zero depth). Model outputs were also aggregated to the 30-min timestep. While both field-measured and modeled net radiation and evapotranspiration were used for E3S evaluation at satellite overpass, only model outputs were used for day-integrated fluxes. Table 3. Skill scores-bias, root mean square error (RMSE), and correlation coefficient (R) of SiSPAT-modeled 30-min net radiation (Rn), soil heat flux at 5-cm depth (G5cm), and latent heat flux (LE), against field observations for this study period (2005)(2006)(2007)(2008) and the two Wankama plots.

Millet
Shrubby Savannah

The S-SEBI Method
The S-SEBI model [17] was developed to solve the surface energy balance from RS data. Neglecting canopy storage, the instantaneous energy balance equation, at satellite overpass, could be written as: where Rn is the net radiation, G the surface soil heat flux, H the sensible heat flux, and LE the latent heat flux (all in W·m −2 ). In ordinary daytime situations (e.g., no rain, no oasis effect), H and LE are positive (upward flux) or nil. This and Equation (1) imply that the evaporative fraction EF (-), defined as LE relative to available energy Rn-G (Equation (2)), varies between 0 and 1 from very dry to very wet soil conditions [54].
In the S-SEBI method, LE is derived from Equation (2) by estimating Rn, G, and EF from RS data, given this above property for EF.
Rn is calculated from the radiation balance for incoming and outgoing radiation at the satellite overpass time: where α is the surface albedo, Ts is the surface temperature, ε is the emissivity, σ = 5.67 × 10 −8 W·m −2 ·K −4 is the Steffan-Boltzmann constant, and Rg and Ra are the incoming shortwave and longwave radiation, respectively. Used for the derivation of instantaneous but not daily ET, the instantaneous soil heat flux was approximated with an empirical relationship to Rn and the NDVI [55]: EF was derived from surface temperature Ts and surface albedo α. For each pixel "i" of a satellite scene associated with a given α i albedo value and Ts i surface temperature value, EF was assumed to vary linearly with Ts between two extreme temperatures Ts dry and Ts wet , corresponding to dry (EF = 0) and wet (EF = 1) areas respectively ( Figure 2): Remote Sens. 2020, 12, 380 7 of 33 In ordinary daytime situations (e.g., no rain, no oasis effect), H and LE are positive (upward flux) or nil. This and Equation (1) imply that the evaporative fraction EF (-), defined as LE relative to available energy Rn-G (Equation (2)), varies between 0 and 1 from very dry to very wet soil conditions [54].
In the S-SEBI method, LE is derived from Equation (2) by estimating Rn, G, and EF from RS data, given this above property for EF.
Rn is calculated from the radiation balance for incoming and outgoing radiation at the satellite overpass time: where α is the surface albedo, Ts is the surface temperature, ε is the emissivity, σ = 5.67 × 10 −8 W·m −2 ·K −4 is the Steffan-Boltzmann constant, and Rg and Ra are the incoming shortwave and longwave radiation, respectively. Used for the derivation of instantaneous but not daily ET, the instantaneous soil heat flux was approximated with an empirical relationship to Rn and the NDVI [55]: EF was derived from surface temperature Ts and surface albedo α. For each pixel "i" of a satellite scene associated with a given αi albedo value and Tsi surface temperature value, EF was assumed to vary linearly with Ts between two extreme temperatures Tsdry and Tswet, corresponding to dry (EF = 0) and wet (EF = 1) areas respectively ( Figure 2): Tsdry(αi) and Tswet(αi) were inferred from the upper and lower edges of the scatterplot in the (α, Ts) space for that scene, denoted as dry and wet edges, by assuming linear relationships to αi ( Figure 2): where adry, awet, bdry, and bwet are the linear fitting coefficients. For a given scatterplot, however, there are many different ways of setting these linear edges, yielding as many different implementations of the S-SEBI method.
Note that application of the S-SEBI method is based on three main assumptions: (i) uniformity of atmospheric conditions over each image; (ii) any variation in surface temperature for a given albedo is related to surface water status; (iii) simultaneous presence of a large range of wet and dry areas in each image. In comparison with other evapotranspiration mapping approaches, such as SEBAL [56] or its derivative METRIC [57], S-SEBI does not account for variations in aerodynamic behavior driven by roughness length, wind speed, and air temperature (in relation to the two first hypotheses).

Figure 2.
Principles of S-SEBI representation of the relationship between surface albedo, surface temperature, and an evaporative fraction (adapted from Roerinket al. [17]). Ts dry (α i ) and Ts wet (α i ) were inferred from the upper and lower edges of the scatterplot in the (α, Ts) space for that scene, denoted as dry and wet edges, by assuming linear relationships to α i ( Figure 2): where a dry , a wet , b dry , and b wet are the linear fitting coefficients. For a given scatterplot, however, there are many different ways of setting these linear edges, yielding as many different implementations of the S-SEBI method.
Note that application of the S-SEBI method is based on three main assumptions: (i) uniformity of atmospheric conditions over each image; (ii) any variation in surface temperature for a given albedo is related to surface water status; (iii) simultaneous presence of a large range of wet and dry areas in each image. In comparison with other evapotranspiration mapping approaches, such as SEBAL [56] or its derivative METRIC [57], S-SEBI does not account for variations in aerodynamic behavior driven by roughness length, wind speed, and air temperature (in relation to the two first hypotheses).

Implementation in the EVASPA Tool and Adaptation to the Sahel Context: EVASPA S-SEBI Sahel (E3S)
The S-SEBI method is implemented in the EVASPA tool among several RS-based ET estimation methods. EVASPA was developed by Gallego Elviraet al. [34] and Oliosoet al. [35] for continuous estimation of daily ET from RS data in the visible and thermal infrared spectral domains, for hydrological and agronomical applications. It was designed for computing an ensemble estimation of ET with an associated uncertainty from the combination of several surface energy balance models and input data from different space sensors. MODIS products were the first ones to be used as inputs in EVASPA, for models based on the derivation of the evaporative fraction from the combination of either albedo and surface temperature (as done in S-SEBI) or of NDVI or vegetation cover fraction and surface temperature (as done in the triangle approach). For each model, several methods for deriving the wet and dry edges have been implemented, and additional methods could be easily included in the EVASPA ensemble.
In this study, only S-SEBI was considered. To improve estimations in the Sahelian conditions, two significant modifications were introduced in the EVASPA processing. The first one concerns the handling of the wet and dry edge determination algorithms (Section 2.4.1): in the new E3S version, twelve additional algorithms were included for a total of seventeen (Appendix B), and prior ensemble member weighting was introduced. This weighting was set to depend on the stage in the seasonal cycle (Section 2.4.1). The second modification was the introduction of a cloud edge filtering step in the processing chain, to avoid outlier pixels that could distort the Ts-α scatterplot shape (Section 2.4.2). Additionally, a field-calibrated relationship was established to allow for flux upscaling from instant to day values (Section 2.4.3), as proposed by EVASPA. Figure 3 displays a comprehensive flowchart of E3S, showing the main processing steps, with MODIS input data used. In this study, all input variables required by E3S were taken from RS data except incoming radiation and rainfall fields, which were ground-based (Section 2.2.2). was introduced. This weighting was set to depend on the stage in the seasonal cycle (Section 2.4.1). The second modification was the introduction of a cloud edge filtering step in the processing chain, to avoid outlier pixels that could distort the Ts-α scatterplot shape (Section 2.4.2). Additionally, a field-calibrated relationship was established to allow for flux upscaling from instant to day values (Section 2.4.3), as proposed by EVASPA. Figure 3 displays a comprehensive flowchart of E3S, showing the main processing steps, with MODIS input data used. In this study, all input variables required by E3S were taken from RS data except incoming radiation and rainfall fields, which were ground-based (Section 2.2.2).

Ensemble Enrichment and Weighting of Edges Determination Methods
A major assumption of the S-SEBI method is that at any time of year, sufficient fully wet and dry pixels would coexist in the same image to allow for the correct determination of the true wet and dry edges. Because of the strong seasonal contrast in the Sahel climate, only one of the two edges can generally be correctly identified, namely the dry edge during the dry season and the wet edge during the wet season. For lack of sufficient informative pixels, application of the standard edge determination algorithms generally leads to a too high wet edge in the dry season and to a too low dry edge in the wet season (denoted as "observed wet edge" and "observed dry edge" in Figure 4a,b, respectively), through undue removal of seeming outliers. Only during the short transition between the wet season and the next dry season, the two extreme conditions coexist in sufficient numbers to allow for reliable determination of both limits with the existing algorithms ( Figure 4c). To improve edge determination during the dry or the wet season, new sets of algorithms were introduced, denoted as "dry" and "wet" methods, respectively. "Dry methods" associate the standard dry edge determination algorithms with a simplified wet edge definition consisting of a constant Ts wet equal to the absolute minimum observed Ts in the domain (denoted as "observed Ts min " in Figure 4a). Similarly, "wet methods" associate the standard wet edge algorithms with a constant Ts dry equal to the absolute maximum observed Ts for dry edge definition (denoted as "observed Ts max " in Figure 4b). Initial methods with the existing standard wet and dry edge determination algorithms are called "transition methods". Appendix B details the list of 17 edge determination algorithms used in this study. Note that despite this large number, this set does not strictly cover all existing methods in the current literature (e.g., [58]).
consisting of a constant Tswet equal to the absolute minimum observed Ts in the domain (denoted as "observed Tsmin" in Figure 4a). Similarly, "wet methods" associate the standard wet edge algorithms with a constant Tsdry equal to the absolute maximum observed Ts for dry edge definition (denoted as "observed Tsmax" in Figure 4b). Initial methods with the existing standard wet and dry edge determination algorithms are called "transition methods". Appendix A2 details the list of 17 edge determination algorithms used in this study. Note that despite this large number, this set does not strictly cover all existing methods in the current literature (e.g., [58]). To reflect the varying degree of suitability of the different algorithms with the seasonal stage, a seasonally-variable weighting procedure was applied. Through the dry season, only the six "dry methods" were given a weight of 1, all eleven others having a weight of 0 ( Figure 4d). Conversely, during the wet season, a weight of 1 was assigned to the five "wet methods", and the twelve other methods being assigned a weight of 0 ( Figure 4d). Whereas the shift from the dry to the wet season could be rather precisely dated to the onset of mesoscale rainfall, the presence of significant water storage in the root zone for several weeks after the last rainfall required to handle specifically the wet-dry transition, until the end of the vegetation cycle. At the beginning of this transition period, the concomitant presence of sufficient wet and dry pixels, due to differences in drying rate between vegetated and bare areas, allowed assigning a weight of 1 to the transition methods. Then, as vegetation gets sparser and wet pixels regress, the weight of the transition methods was gradually lowered based on mesoscale average MODIS LAI ( Figure 4d). Symmetrically, the weight of the dry methods was gradually raised from 0 to 1 over the transition period based on the same index, and the wet methods were weighted 0.
Accordingly, E3S best estimates were obtained as weighted averages of ensemble members. To express the associated epistemic uncertainty estimates, the range of ensemble members was used in this study, restricted to those members with non-zero weights only and denoted as "conditional ensemble range", i.e., Range (ET k | weight k > 0), where k designates the kth member in the ensemble.

Cloud Edge Filtering
The contextual methods require efficient outlier filtering, especially due to clouds. Unidentified cloudy pixels can generate false wet pixels and distort the Ts-albedo scatterplot (see Figure A1a,d in Appendix C), causing systematic biases in EF estimations. We, thus, developed two levels of cloud edge filtering: (i) The first level filter was based on the quality flag provided by the MOD11A1 and MYD11A1 Ts products for TERRA and AQUA, respectively. A preliminary study showed that cloud edges often corresponded to pixels flagged with a Ts error higher than 1 K, in agreement with Williamsonet al. [44]. Thus, all pixels bordering clouds and associated with a Ts error >1 K were removed (see Figure A1b,e in Appendix C); (ii) The first level filter was not always sufficient to eliminate all pixels potentially contaminated by clouds, especially during the dry season ( Figure A1b in Appendix C). A second level filter was thus introduced, extending the first level filter to cloud-bordering pixels with Ts below the first Ts quartile in the entire image ( Figure A1c,f in Appendix C). This quantile level selection was calibrated on all the available images outside the rainy seasons in 2005-2008, and we suggested that this could be site-and climate-specific.
Both filter levels needed to be implemented at a scale larger than the study area since pixels flagged as cloudy could be outside the study area. At least, doubling the size of the study area is advisable. The second level filter was only applied outside the rainy period to avoid underestimating wet-season Ts variability, as it was difficult to differentiate between true wet pixels and cloud contaminated pixels passing the first-level filter. After cloud edge filtering, images with less than 8% of remaining pixels (corresponding roughly to 2400 pixels found necessary for the convergence of all edge methods) were eliminated.

Upscaling from Instantaneous to Day-Integrated Fluxes
From the estimation of instantaneous EF and ET at satellite overpass, EVASPA provides time-integrated evapotranspiration for the corresponding day by assuming conservation of the evaporative fraction over the whole day [59]. Neglecting the ground heat flux relative to net radiation at daily scale, the mean daily evapotranspiration (ET d ) was then computed as scaled by the ratio Cdi (-) between mean daily net radiation (Rn d ) and instantaneous net radiation at satellite overpass (Rn): where λ is the latent heat of vaporization (λ = 2.45 MJ.kg −1 ). The Cdi coefficient varies with the day of the year (DOY) and the solar time at a given location according to Wassenaaret al. [60]: where a 1 , a 2 , and a 3 are parameters that depend on geographical location and solar time of satellite acquisition. In this study, these parameters were obtained by calibrating Equation (9) on the net radiation averaged from the two EC sites in Wankama from 20 April 2005 to 1 May 2013. Half-hourly and daily observed radiations were used for Rn and Rn d , respectively. Calibrated values and R 2 scores are shown for each half-hour of interest in Appendix D.

Minimizing Gaps in ET d by Combining RS Sources
In order to increase the number of days with available ET d estimates, the AQUA-and TERRAbased, ensemble space-time ET d distributions were merged into a single, blended ensemble ET d product. When and where both sources were available the same day, the two estimates were simply averaged for each ensemble member. This fusion of sources reduced gaps not only in time but also in space on bi-source days. Ensemble ET d estimates were retrieved at 398 dates from 535 MODIS images, including 99 from AQUA only, 162 from TERRA only, and 137 from both satellites. Time coverage by the blended product was 29%, for an average spatial coverage of 79%.

Evaluation Strategy of E3S Estimates
To compare E3S results with available series at the Wankama field plots, and in order to minimize the problem of footprint discrepancy between satellite pixels and EC footprints, it was decided to compare averages of the two adjacent MODIS pixels containing the flux towers with averages of either the measurements at the two plots or of the SISPAT simulations for the two plots. The rationale behind this choice is discussed in detail in Section 4.1.
As a first step, the benefit of the prior weighting of ensemble members introduced in E3S was evaluated. To do so, the E3S estimation was compared both to the use of a single edge determination algorithm, namely the ensemble member denoted as SPLIT, and to the original unweighted ensemble average of EVASPA, hereafter denoted as "EVASPA S-SEBI". The SPLIT algorithm of Verstraetenet al. [23] (see Appendix A) was used as a benchmark method as it is the most fully described method for the determination of the wet and the dry edges in the literature.
Estimations of blended ET d were finally compared with the GLEAM products, which might be considered as a reference ET d product at coarse resolution (0.25 • ) (Section 2.2.1). To this end, E3S estimations were first upscaled at a 0.25 • spatial resolution, considering a minimum threshold of 25% of available MODIS pixels in a GLEAM pixel to perform the upscaling by averaging available pixels. Available upscaled ET d was then compared to the corresponding values from the two latest versions of GLEAM (v3.1a and v3.1b). Figure 5 compares the values of the evaporative fraction (EF) obtained with the SPLIT algorithm, EVASPA S-SEBI, and E3S to those simulated by the SiSPAT model, as scatterplots for the whole study period (left) and as time series for the year 2008 (right). The single wet and dry edge determination algorithm of SPLIT systematically overestimated EF during the dry season (red dots in Figure 5a,b). This was induced by the difficulty in isolating sufficient wet pixels and, as a consequence, to determine the wet edge of the Ts-α scatterplot (Section 2.4.1). Conversely, underestimated EF was obtained during the wet season (blue dots in Figure 5a,b), reflecting the difficulty of the SPLIT approach to identify sufficient dry pixels during this period to determine the dry edge. During the transition period, although conditions of the application were better fulfilled, SPLIT was not able to provide a more accurate EF estimation. In general, the original EVASPA provided slightly better results than SPLIT, even if large EF overestimation in dry and transition periods and large underestimation in the wet period were still observed. Moreover, the uncertainty generated by the use of the unweighted 17 methods (ensemble range, represented as vertical bars, Figure 5d) was usually very large, close to the full 0-1 EF range in many cases. Such large uncertainties suggested that the two approaches were unsuitable to describe the seasonality of EF in the Sahel context. the application were better fulfilled, SPLIT was not able to provide a more accurate EF estimation. In general, the original EVASPA provided slightly better results than SPLIT, even if large EF overestimation in dry and transition periods and large underestimation in the wet period were still observed. Moreover, the uncertainty generated by the use of the unweighted 17 methods (ensemble range, represented as vertical bars, Figure 5d) was usually very large, close to the full 0-1 EF range in many cases. Such large uncertainties suggested that the two approaches were unsuitable to describe the seasonality of EF in the Sahel context. In comparison, E3S provided much better results. The seasonality of EF was well represented, and the weighted mean EF was close to SiSPAT simulations most of the time (Figure 5e,f). Very low EF was observed during the dry season, revealing the efficiency of the newly introduced "dry methods", with wet edge determination adapted to the lack of wet pixels, in strongly reducing EF overestimation compared with SPLIT and EVASPA. Similarly, E3S significantly reduced EF underestimation in the wet period by the introduction of "wet methods" adapted to the lack of dry pixels. Estimation was also significantly In comparison, E3S provided much better results. The seasonality of EF was well represented, and the weighted mean EF was close to SiSPAT simulations most of the time (Figure 5e,f). Very low EF was observed during the dry season, revealing the efficiency of the newly introduced "dry methods", with wet edge determination adapted to the lack of wet pixels, in strongly reducing EF overestimation compared with SPLIT and EVASPA. Similarly, E3S significantly reduced EF underestimation in the wet period by the introduction of "wet methods" adapted to the lack of dry pixels. Estimation was also significantly improved during the transition period, showing that the weighted averaging allowed minimizing the impact of seasonally less suitable methods in this period as well. E3S uncertainty, defined as the conditional ensemble range, i.e., the difference between the maximum and the minimum of all those ensemble members whose weight is non-zero (Section 2.4.1.), is shown as vertical bars in Figure 5f. During both the dry and the wet season, uncertainty was considerably reduced, providing much more confidence in the estimation. The uncertainty was still important during the transition period. This was, at least, partly explained by the larger proportion (>2/3) of ensemble members in the conditioned ensemble in that period than in each other period (~1/3), reflecting a larger intrinsic uncertainty (see also discussion in Section 4.3).

Impact of Edge Determination Approach
The E3S weighted mean of blended ensemble ET d estimates always showed better agreement with the SiSPAT reference than any single ensemble member whatever the period considered. ET d skill scores of each ensemble member are given in Appendix E, together with those of unweighted (EVASPA) and weighted (E3S) ensemble means. In the rest of this paper, members are no longer considered individually, and the expression "E3S estimates" refers to the weighted ensemble means. Table 4 displays skill scores of instantaneous and daily fluxes obtained with the new E3S method against both SiSPAT simulations and local plot observations for the 4-year study period, separately for TERRA and AQUA satellite sources. For more detail on E3S performance for daily ET, an analysis of the variations of residual ET d errors (estimated versus SiSPAT-simulated) with the main input variables, is provided in Supplementary Material S1. Table 4. Skill scores of E3S (EVASPA S-SEBI Sahel) flux estimates versus SiSPAT simulations or plot observations (when available, in brackets), for instantaneous fluxes at satellite overpass (top five lines) and day-integrated fluxes (last three lines), with TERRA and AQUA satellite sources, over 2005-2008. Determination coefficient (R 2 ) is given without unit whereas Bias and root mean square error (RSME) are given in W·m −2 or in mm·day −1 or without unit according the variable.

Instantaneous Fluxes
In general, the satisfying performance was obtained for instantaneous net radiation estimated with E3S against both ground observations and SiSPAT simulations, at TERRA as at AQUA overpass times. No significant differences were observed between TERRA and AQUA-based estimations, with RMSE (root mean square error) between 35 Wm −2 and 40 Wm −2 . A large part of the RMSE was produced by significant positive bias in E3S, in particular for the comparison to ground measurements. This significant Rn overestimation was mainly explained by a negative bias (~−0.01) and relatively high RMSE (~0.02) in MODIS albedo compared to ground observations and SiSPAT simulations from both local plots (not shown). E3S estimations of instantaneous latent heat flux were also in good agreement with observations ( Figure 6a SiSPAT simulations showed only very slightly larger values in bias and RMSE (Figure 6c,d), but correlation remained particularly good (R² of 0.81 and 0.86 for TERRA and AQUA-based estimations, respectively). Better estimation with AQUA, particularly against observations, was mainly due to more underestimation of instantaneous soil heat flux and hence larger overestimation of the available energy at TERRA overpass (bias of 43 W·m −2 in Rn-G). Note that poor correlation was obtained for both TERRA and AQUA-based instantaneous G estimates, showing the limitations of the empirical formulation used in this study.

Evaporative fraction (EF and EFd)
Instantaneous E3S estimation of EF showed good performance in comparison with SiSPAT-simulated instantaneous EF for the two MODIS satellites (Figure 7a,b), with maximum bias and RMSE of 0.12 and 0.16, respectively, and an R² of 0.83. No significant differences were obtained between the two satellite sources: both provided an overestimation of instantaneous EF, in relation to the overestimation of LE. The better agreement was observed with daily EFd simulated by SiSPAT (Figure 7bottom,d), with a large reduction in this overestimation. This might suggest that the hypothesis of conservation of EF over the daytime might not be fully satisfied (see discussion in Section 4.2, Subsection "Daily fluxes"). This observation was valid whatever the overpass time considered, as a comparison of AQUA and TERRA-based EF estimates (when both were available) showed low bias (−0.02) and RMSE (0.10).

Evaporative Fraction (EF and EF d )
Instantaneous E3S estimation of EF showed good performance in comparison with SiSPAT-simulated instantaneous EF for the two MODIS satellites (Figure 7a,b), with maximum bias and RMSE of 0.12 and 0.16, respectively, and an R 2 of 0.83. No significant differences were obtained between the two satellite sources: both provided an overestimation of instantaneous EF, in relation to the overestimation of LE. The better agreement was observed with daily EF d simulated by SiSPAT (Figure 7d), with a large reduction in this overestimation. This might suggest that the hypothesis of conservation of EF over the daytime might not be fully satisfied (see discussion in Section 4.2, Subsection "Daily fluxes"). This observation was valid whatever the overpass time considered, as a comparison of AQUA and TERRA-based EF estimates (when both were available) showed low bias (−0.02) and RMSE (0.10).  (Table 4) were well correlated to SiSPAT simulations (R² = 0.87), with a slight and similar overestimation for AQUA and TERRA (bias of 6 and 5 W·m −2 , respectively), and similar RMSE (13-14 W·m −2 ). Similar behavior was observed for daily evapotranspiration ETd, with an R² value of up to 0.89 for the TERRA source ( Figure 8). No significant difference in E3S skill scores was obtained between TERRA and AQUA, which was in agreement with results for the other variables used for ETd estimation, namely Rnd and EF. The RMSE of 0.5 mm·day −1 for ETd was equivalent to that for Rnd (converted into water evaporation). RMSE and R² scores obtained with the AQUA/TERRA blended ETd product against SiSPAT simulations were 0.5 mm.day −1 and 0.90, respectively, with virtually no bias. Quite similar results were obtained when considering each of the two Wankama plots with their corresponding satellite pixel individually (RMSE = 0.5-0.7 mm·day -1 , R² = 0.83-0.87, and absolute bias ≤ 0.1 mm.day -−1 ). The annual time series of the blended ETd product with its associated epistemic uncertainty is shown in Figure 9 for the period 2005 to 2008. Patterns of variation of ETd errors with albedo, surface temperature, global or atmospheric radiation, and ETd magnitude are described in Supplementary Material S1.  (Table 4) were well correlated to SiSPAT simulations (R 2 = 0.87), with a slight and similar overestimation for AQUA and TERRA (bias of 6 and 5 W·m −2 , respectively), and similar RMSE (13-14 W·m −2 ). Similar behavior was observed for daily evapotranspiration ET d , with an R 2 value of up to 0.89 for the TERRA source ( Figure 8). No significant difference in E3S skill scores was obtained between TERRA and AQUA, which was in agreement with results for the other variables used for ET d estimation, namely Rn d and EF. The RMSE of 0.5 mm·day −1 for ET d was equivalent to that for Rn d (converted into water evaporation). RMSE and R 2 scores obtained with the AQUA/TERRA blended ET d product against SiSPAT simulations were 0.5 mm·day −1 and 0.90, respectively, with virtually no bias. Quite similar results were obtained when considering each of the two Wankama plots with their corresponding satellite pixel individually (RMSE = 0.5-0.7 mm·day -1 , R 2 = 0.83-0.87, and absolute bias ≤ 0.1 mm·day -−1 ). The annual time series of the blended ET d product with its associated epistemic uncertainty is shown in Figure 9 for the period 2005 to 2008. Patterns of variation of ET d errors with albedo, surface temperature, global or atmospheric radiation, and ET d magnitude are described in Supplementary Material S1.      Figure 10a-c shows sample ET d maps produced by E3S (weighted ensemble mean) with the TERRA source over the study meso-site, in the dry season, in the wet season, and the transition period of 2007, respectively. During all dry seasons and for both RS sources, ET d remained very low over most of the study area, with the exception of the Niger River floodplain where ET d remained substantial throughout the year (see, e.g., Figure 10a). Conversely, during the wet seasons and transition periods, patterns of high ET d values were obtained over the whole study area, but never covering its entirety (e.g., Figure 10b,c). Maximum values were observed for the exoreic subarea (right bank of the Niger River) and in the southern part of the left bank of the Niger River, which is characterized by a shrub layer interspersed with bare soil and grass patches. The Dallol Bosso fossil valley, near the east edge of the mesosite, appeared as a dry area whatever the season considered (e.g., Figure 10b), in relation to high albedo values observed in this area. Corresponding albedo and Ts maps are provided in Appendix F for broader understanding. A North-South general positive gradient was also observed, especially during the transition period (Figure 10c), which is consistent with the latitudinal gradient of rainfall and vegetation cover fraction (not shown). In the wet season (Figure 10b), the high local-scale variability-with particularly sharp gradients on short distances-over large areas that shift in time, confirmed the need for at least the moderate spatial resolution used in this study.

E3S Daily ETd
Figure 10a-c shows sample ETd maps produced by E3S (weighted ensemble mean) with the TERRA source over the study meso-site, in the dry season, in the wet season, and the transition period of 2007, respectively. During all dry seasons and for both RS sources, ETd remained very low over most of the study area, with the exception of the Niger River floodplain where ETd remained substantial throughout the year (see, e.g., Figure 10a). Conversely, during the wet seasons and transition periods, patterns of high ETd values were obtained over the whole study area, but never covering its entirety (e.g., Figure 10b,c). Maximum values were observed for the exoreic subarea (right bank of the Niger River) and in the southern part of the left bank of the Niger River, which is characterized by a shrub layer interspersed with bare soil and grass patches. The Dallol Bosso fossil valley, near the east edge of the mesosite, appeared as a dry area whatever the season considered (e.g., Figure 10b), in relation to high albedo values observed in this area. Corresponding albedo and Ts maps are provided in Appendix A6 for broader understanding. A North-South general positive gradient was also observed, especially during the transition period ( Figure  10c), which is consistent with the latitudinal gradient of rainfall and vegetation cover fraction (not shown). In the wet season (Figure 10b), the high local-scale variability-with particularly sharp gradients on short distances-over large areas that shift in time, confirmed the need for at least the moderate spatial resolution used in this study.

ETd Uncertainty
For each E3S estimation date, a map of associated uncertainty, defined as the conditional ensemble range, was also produced. Figure 11 displays, for each period of the year, the spatial distribution of temporal mean uncertainty over the study period, for the AQUA/TERRA blended ETd product. It revealed patterns of highest uncertainty for ecosystems associated with the highest ET values, in particular, in the Niger River floodplain, the exoreic subarea on the Niger River's West bank, and the southern part of the East bank. Largest uncertainty contrasts were obtained in the transition period (Figure 11c).

ET d Uncertainty
For each E3S estimation date, a map of associated uncertainty, defined as the conditional ensemble range, was also produced. Figure 11 displays, for each period of the year, the spatial distribution of temporal mean uncertainty over the study period, for the AQUA/TERRA blended ET d product. It revealed patterns of highest uncertainty for ecosystems associated with the highest ET values, in particular, in the Niger River floodplain, the exoreic subarea on the Niger River's West bank, and the southern part of the East bank. Largest uncertainty contrasts were obtained in the transition period (Figure 11c).  Figure 10a-c shows sample ETd maps produced by E3S (weighted ensemble mean) with the TERRA source over the study meso-site, in the dry season, in the wet season, and the transition period of 2007, respectively. During all dry seasons and for both RS sources, ETd remained very low over most of the study area, with the exception of the Niger River floodplain where ETd remained substantial throughout the year (see, e.g., Figure 10a). Conversely, during the wet seasons and transition periods, patterns of high ETd values were obtained over the whole study area, but never covering its entirety (e.g., Figure 10b,c). Maximum values were observed for the exoreic subarea (right bank of the Niger River) and in the southern part of the left bank of the Niger River, which is characterized by a shrub layer interspersed with bare soil and grass patches. The Dallol Bosso fossil valley, near the east edge of the mesosite, appeared as a dry area whatever the season considered (e.g., Figure 10b), in relation to high albedo values observed in this area. Corresponding albedo and Ts maps are provided in Appendix A6 for broader understanding. A North-South general positive gradient was also observed, especially during the transition period ( Figure  10c), which is consistent with the latitudinal gradient of rainfall and vegetation cover fraction (not shown). In the wet season (Figure 10b), the high local-scale variability-with particularly sharp gradients on short distances-over large areas that shift in time, confirmed the need for at least the moderate spatial resolution used in this study.

ETd Uncertainty
For each E3S estimation date, a map of associated uncertainty, defined as the conditional ensemble range, was also produced. Figure 11 displays, for each period of the year, the spatial distribution of temporal mean uncertainty over the study period, for the AQUA/TERRA blended ETd product. It revealed patterns of highest uncertainty for ecosystems associated with the highest ET values, in particular, in the Niger River floodplain, the exoreic subarea on the Niger River's West bank, and the southern part of the East bank. Largest uncertainty contrasts were obtained in the transition period (Figure 11c).   Figure 12 displays the distributions of temporal mean uncertainty on ET d as boxplots per period, distinguishing the satellite sources (TERRA, AQUA, and combined TERRA/AQUA). As already reported in Section 3.1.1, uncertainties were significantly greater during the transition period. No significant differences were observed between uncertainties associated with TERRA-and AQUA-based estimates, with only slightly larger values for TERRA during the wet season. Uncertainties on the blended AQUA/TERRA ET d product were logically also very similar to those for the separate sources, with overall means of 0.2 mm·day -1 for the dry season, 0.3 mm·day −1 for the wet season, 1.2 mm·day −1 for the transition period, and 0.4 mm·day −1 for the whole-year period. Uncertainty estimates are discussed in Section 4.3.
Remote Sens. 2020, 12, 380 18 of 33 Figure 12 displays the distributions of temporal mean uncertainty on ETd as boxplots per period, distinguishing the satellite sources (TERRA, AQUA, and combined TERRA/AQUA). As already reported in Section 3.1.1, uncertainties were significantly greater during the transition period. No significant differences were observed between uncertainties associated with TERRA-and AQUA-based estimates, with only slightly larger values for TERRA during the wet season. Uncertainties on the blended AQUA/TERRA ETd product were logically also very similar to those for the separate sources, with overall means of 0.2 mm·day -1 for the dry season, 0.3 mm·day −1 for the wet season, 1.2 mm·day −1 for the transition period, and 0.4 mm·day −1 for the whole-year period. Uncertainty estimates are discussed in Section 4.3.  Figure 13 compares maps of mean seasonal ETd from E3S, GLEAMv3.1a, and GLEAMv3.1b homogenized products (see Section 2.4.5), for the dry season, the wet season, and the transition period. In all three products, spatial patterns were dominated by the regional North-South gradient, whatever the period. Despite the coarse resolution, the presence of the Niger River valley could also be detected, at least, in its northwestern part, in the dry season (Figure 13a,c). During the wet season, spatial variability was quite comparable between upscaled E3S and GLEAMv3.1b, but lower in GLEAMv3.1a. Conversely, upscaled E3S showed substantially lower spatial variability than both GLEAM products during the dry season and the transition period.  Figure 13 compares maps of mean seasonal ET d from E3S, GLEAMv3.1a, and GLEAMv3.1b homogenized products (see Section 2.4.5), for the dry season, the wet season, and the transition period. In all three products, spatial patterns were dominated by the regional North-South gradient, whatever the period. Despite the coarse resolution, the presence of the Niger River valley could also be detected, at least, in its northwestern part, in the dry season (Figure 13a,c). During the wet season, spatial variability was quite comparable between upscaled E3S and GLEAMv3.1b, but lower in GLEAMv3.1a. Conversely, upscaled E3S showed substantially lower spatial variability than both GLEAM products during the dry season and the transition period.   (Figures 14 and 15), E3S appeared to agree particularly well both spatially and temporally with GLEAMv3.1b during the wet season, when evapotranspiration was most important.

Limitations of Satellite-Versus-Field Estimates Comparison
A general difficulty for ground validation of RS products lies in the rarely sufficient availability of reliably-estimated, comparable field variables. Section 2.2.2 explains and justifies why both station data    (Figures 14 and 15), E3S appeared to agree particularly well both spatially and temporally with GLEAMv3.1b during the wet season, when evapotranspiration was most important.

Limitations of Satellite-Versus-Field Estimates Comparison
A general difficulty for ground validation of RS products lies in the rarely sufficient availability of reliably-estimated, comparable field variables. Section 2.2.2 explains and justifies why both station data

Limitations of Satellite-Versus-Field Estimates Comparison
A general difficulty for ground validation of RS products lies in the rarely sufficient availability of reliably-estimated, comparable field variables. Section 2.2.2 explains and justifies why both station data and process model simulations were used here to address this difficulty. In any case and as in any similar study, skill scores have to be interpreted with some caution. Uncertainty in field-measured or even more so, in model-simulated fluxes have to be kept in mind. Discrepancies in footprints between satellite pixels and flux tower footprints, both of which vary over time, may blur the direct comparison of these two sources of flux estimates [61,62]. Given the subkilometric scale of the fields mosaic in the study area (Figure 1b), MODIS pixels could hardly be expected to sample as homogeneous ecosystems as ground stations do. This is all the more true as the contributing area of a MODIS pixel is actually larger than the "theoretical" 1 km 2 area, and as its actual position may be somewhat different [63,64]. Hence, rather than comparing directly and separately the estimations for a MODIS pixel, supposedly including a flux station, with those derived from this specific station, it was decided for the comparison to lump estimates from the two stations on one hand and the two corresponding, theoretical MODIS pixels on the other hand, by simple averaging (Section 3.1). The two flux stations sample, respectively, the two main ecosystems (millet crop and shrubby fallow savannah) in the catchment [65] as in the study area. These ecosystems fill most of the surface area covered by the two MODIS pixels, in relatively similar fractions. The sensitivity of this distribution to possible errors in the delineation of these two pixels appears to be rather small. Furthermore, surface response contrasts between cover types are mild enough [4] to dampen possible differences in upscaling performance between averaging schemes of EC stations [66]. Hence, this simple averaging strategy, applied to both the satellite pixels and the EC stations, is expected to minimize possible biases that could arise with respect to lack of colocation or underlying ecosystem similarity in the compared satellite and ground series. Note, however, that E3S estimation performance for daily ET d was essentially conserved when measured individually for each pixel against its corresponding field plot (Section 3.1.2, Subsection Daily fluxes).

Quality of E3S-Estimated Flux Distributions
To our knowledge, previous applications of the S-SEBI method generally dealt with forests or well-watered croplands in higher latitudes, and this study was the first to test the method in the Sahelian context. In comparison with previous studies, the evaluation was performed over a larger number of dates (~400 days) distributed over a longer period (4 years).

Instantaneous Fluxes
E3S was found to provide acceptable performance for instantaneous net radiation (overall RMSE of 35-40 W·m −2 , Table 4), better than usually reported in the literature for RS estimates [67][68][69]. However, direct comparison is difficult as ground data were used here for incoming shortwave and longwave radiation, for which variables generally bring substantial uncertainty to Rn estimation [70]. Using this ground forcing, entailed a loss of spatial variability information relative to satellite-derived inputs, but this loss should be small (Section 2.2.1) and avoid the RS biases. Either source could be used for E3S implementation. The significant overestimation in Rn estimates from both TERRA and AQUA sources was mainly explained by the negative bias (~−1%) in MODIS albedo compared to albedo simulated by SiSPAT, as incoming radiation was the same. Bias and RMSE values estimated here for MODIS albedo (not shown) were similar to those reported for other Sahelian areas [71].
The instantaneous soil heat flux was always poorly estimated (Table 4). Other empirical G formulas, based on vegetation cover fraction [72], LAI [73], or evaporative fraction [74], were tested against SiSPAT simulations, showing a similar performance (not shown). In general, these empirical formulas were calibrated for specific eco-climatic conditions and specific overpass times, causing their lack of transferability. Large differences were observed in G estimates between TERRA and AQUA sources, with opposite bias and quite a good correlation for AQUA. This was due to the phase shift between G and Rn cycles that make the efficiency of empirical parameterizations, such as Equation (4), depend on the overpass time [75].
E3S estimates of evaporative fraction EF showed positive bias (Table 4), occurring in the dry season essentially ( Figure 5). In this period, soil moisture is very low in most of the area, except for the irrigated Niger River floodplain (also around ponds, but these areas are not significant within~1 km 2 MODIS pixels), and the wet edge is the most difficult to identify precisely. EF should be predominantly nil or close to, including in the Wankama evaluation pixels; consequently, errors tended mechanically to be generally positive, leading to this bias. However, error magnitudes were comparatively generally low (Figure 5e) because (i) the above few wet pixels, with appropriate EF values of 1 or near to (unstressed conditions, allowing evapotranspiration at potential rate), were relatively informative of wet edge location and (ii) the relative proximity of most pixels to the dry edge (Figure 4a) attenuated EF sensitivity to wet edge determination.
Oppositely, in the wet season, totally dry pixels rarely existed at MODIS resolution, leading to false zero EF values and an imprecise dry edge. Soil moisture and hence dependent Ts being more widely distributed between the two edges (Figure 4b), sensitivity to dry edge determination might be greater than in the dry season, leading to some more significant error magnitudes (Figure 5e). However, there no longer was any determinism on error signs, leading to smaller bias than in the dry season. This also generally applied to the transition period.
RS-based EF estimates are rarely evaluated in the literature, limiting possible comparison with results for similar eco-climatic and sensor conditions. RMSE values obtained here (0.15-0.16, Table 4) were higher than those reported by Verstraetenet al. [23] for European forests (in the~0.06-0.09 range depending on the site, using the S-SEBI approach and the SPLIT edge determination method). Applying S-SEBI with MODIS data to five contrasted sites in India, Eswaret al. [76] reported an RMSE value of 0.14, similar to the one obtained here.
The significant positive bias found in instantaneous latent heat flux (LE, Table 4) was mainly explained by the overestimation of both EF and available energy (Rn-G). Differences between TERRA and AQUA-based estimates (lower bias and RMSE values for AQUA than for TERRA) came from the above-mentioned difference in efficiency on G. RMSE values (43-56 W·m −2 ) fell in the range reported in the literature with the S-SEBI method (see Appendix A).

Daily Fluxes
Better evaporative fraction estimates were obtained when upscaling to daily EF d, with both the positive bias and the RMSE being reduced. This behavior should be related to the observation that the upscaling hypothesis of EF daytime invariance was not fully verified in SiSPAT simulation outputs: generally, higher modeled daily EF d than overpass-time EF (not shown, also reported by Deloguet al. [77] or Gentineet al. [78]) was thus less overestimated by E3S. Figure 7 shows that improvements were mainly obtained in the wet season (blue dots) and not in the dry season (red dots). This was in agreement with the reported better daytime conservation of EF in drier situations [77][78][79], water becoming then a stronger constraint than insolation.
Accuracy of Rn d estimates depends in part on the efficiency of the Cdi approximation (Equation (9)). One source of error is the temporal difference between instant Rn and discrete, time-integrated Cdi. A sensitivity analysis showed that accuracy sharply increased with Cdi resolution at times away from noon (RMSE of 14 and 32 W·m −2 for TERRA-based Rn d estimation at half-hourly and hourly resolution, respectively, but little impact on AQUA-based estimation). The Cdi polynomial formulation of [22] was also tested, showing poorer calibration than the sinusoidal formulation of [60] used here. Overall, E3S performance for Rn d was quite satisfying (Table 4), with better correlation (R 2 = 0.87) than for Rn, and low RMSE and bias (14 and 6 W·m −2 for TERRA, slightly better for AQUA).
Finally, ET d estimates showed an RMSE value (0.5 mm·day −1 ;~14 W·m −2 ) in the range of values reported in the literature for the S-SEBI method (see Appendix A). Remarkably, quasi unbiased ET d ( Table 4) resulted from the product of both slightly positively biased EF d and Rn d . At the mesoscale, the good agreement of E3S with the reference GLEAM product, together with its much finer spatial resolution, strongly encourages to extend its application to larger scales and to other settings in West Africa.

Benefits of Weighted Ensemble Approach
As documented in Appendix E and illustrated by Figure 5a,b, no single edge determination method did as well as E3S for estimation of instantaneous or daily EF or ET. Further, Figure 5c,d highlighted limitations of the unweighted ensemble approach, as the resulting estimate appeared to be strongly biased most of the year and the uncertainty to be quite large (Section 3.1.1). Allocating dynamic weights to ensemble members, depending on method suitability to prevailing conditions, allowed to enhance estimation accuracy and to reduce uncertainty drastically. The seasonal weighting approach proposed in this study always outperformed both non-ensemble and unweighted ensemble approaches, including in transition periods through gradually-varying weights.
Beyond this study case, weighted ensemble RS estimation of ET should be preferred to single estimation as it seems impossible to determine a "best method" for a sufficiently large range of eco-climatic contexts and for most conditions possibly encountered in a given context. We suggested that the seasonal weighting approach introduced here could be transposed to any other climatic region with marked seasonality, for ensemble RS-based ET estimation using any contextual method. It could be easily applied, for instance, with the Ts-VI Triangle method [16]. When possible, weight allocation might be performed through objective statistical techniques, such as BMA (e.g., [40]). However, the frequent lack of ground ET measurements is usually very limiting for the training of such statistical methods over a sufficient range of conditions, especially in data-poor regions like most of the African continent. In this respect, more empirical approaches for assigning weights heuristically according to prior knowledge on methods characteristics and behaviors, as introduced in this study, have great potential.

Uncertainty
During the dry and wet seasons, estimated uncertainty rarely exceeded E3S error computed from ground estimates (Figure 9). Oppositely, during the transition period, the uncertainty was generally much larger than the error. These observations highlighted the limitations of the current estimation of uncertainty by E3S and underlined that E3S uncertainty, as expressed by the conditional ensemble range, should by no means be interpreted as some sort of confidence interval. Indeed, Ts-α edge determination methods are not the only source of uncertainty in ET estimation, even though they are likely to be one of the largest [24]. Other sources, such as uncertainties in input data or on the degree of satisfaction of the method's underlying hypotheses, should also be accounted for. For epistemic uncertainty itself, contributions from the other component parameterizations (e.g., Equations (4), (5), (8) and (9)) should also be considered.
Regarding the satisfaction of hypotheses, window choice is an important uncertainty factor for contextual methods [80,81]. This choice must arise from a compromise between conflicting requirements for both meteorological uniformity and sufficient diversity in soil moisture status at all application times. Both are affected oppositely by window size, but the second one depends primarily on landscape configuration, as well as on-sensor spatial resolution. Interactions between these three factors would make it a heavy task to quantify effects on estimation accuracy.
If one sticks to uncertainty in Ts-α domain edges, as focused on here, using as a criterion the ensemble range certainly is a simple, convenient, and expressive way to convey the information; however, the range's drawback is to statistically increase with sample size. This contributes to the higher E3S uncertainty in the transition period (Figures 6f and 10); the conditional ensemble range accounting then for more non-zero weighted members (12) than in the two other periods (5 and 6 members). Another limitation of the conditional range criterion is that it only partially reflects the weighting scheme, assimilating all fractional weights to 1. This affects the transition period, during which all non-zero weights vary between 0 and 1: corresponding members, even with tiny weights, turn out always fully accounted for in the conditional range. This probably further unduly increases the estimated uncertainty in that period. Given these various limitations, better criteria should be investigated to more faithfully reflect the epistemic uncertainty associated with a weighted ensemble scheme, even if less readily interpretable than the range.
Whatever the statistical criterion, larger uncertainties in the transition period probably largely come from the fact that for half the non-zero weighted members, the algorithms vary for both edges, whereas, in the other periods, only one edge is member-dependent (Section 3.2.1 and Appendix A). This probably largely reflects an intrinsic uncertainty difference between periods. However, it would be interesting to introduce uncertainty also on the forced, non-member-dependent edge location in the wet and dry seasons, even if the sensitivity to that location should be smaller, particularly in the dry season (see discussion of instantaneous EF in Section 4.2).
In spite of all above-discussed limitations of the uncertainty definition used here, a more detailed error-versus-uncertainty analysis of E3S daily ET d ( Figure S2) showed that outside the transition period, the error range increased rather linearly with the estimated uncertainty, suggesting that during most of the year, the latter could be used as a meaningful indicator of the risk of error associated with a given ET d estimation.

Conclusions
E3S was developed as an adaptation of the EVASPA S-SEBI method of RS-based ET d estimation to the Sahel context. The original EVASPA ensemble was enriched with new Ts-α scatterplot edge determination methods. Prior, season-and time-dependent weighting of the ensemble members was introduced. The conditional range of ensemble estimates was used to provide an expression of epistemic uncertainty. Quality filters on Ts were included to remove residual cloudy pixels that could distort the Ts-α scatterplot and cause systematic EF biases. E3S was evaluated at local and mesoscale in South-West Niger from 2005 to 2008. ET d maps were produced for 398 days using MODIS images from TERRA and/or AQUA satellites. Local evaluation against SiSPAT model simulations showed RMSE values of 0.5 mm·day −1 , R 2 of 0.90, and no bias for the ET d "best estimate", combining the two satellite sources, even though there was some bias at instantaneous scale. The proposed ensemble weighting scheme improved restitution of ET d seasonality compared to the reference version of S-SEBI with the single-member SPLIT method, and to the original EVASPA method with unweighted ensemble averaging. It also strongly reduced epistemic uncertainty in ET d estimates, except in the short transition period. On average, uncertainty on E3S ET d estimates, expressed by the conditional ensemble range, was 0.4 mm·day -1 . Comparison to the GLEAM global evapotranspiration products showed good agreement in terms of mesoscale mean, as well as of wet season spatial patterns with the 3.1b version. These results (i) revealed the high potential of the S-SEBI approach for the Sahel region, and (ii) suggested that E3S could be used in the production of continuous long-term regional ET d estimates, with the important benefit of a finer resolution compared to the current continental reference that is GLEAM. In that perspective, interpolation schemes are being introduced in the E3S algorithm to provide a continuous ET d product that should prove useful for many applications, such as assimilation in spatially-distributed hydrological models. Evaluation of E3S is also being extended both South and North to a wetter and a drier site under Sudanian and Sahelo-Saharan climates, respectively; these three sites ensuring a good sampling of the West African eco-climatic gradient.
Errors are taken with respect to SiSPAT-modeled ET d . Uncertainty is defined as the conditional ensemble range. Colors are relative to the three defined periods (dry season, wet season, transition period). Funding: The first author's Ph.D. was financed by a student research grant from the GAIA Doctoral School at the University of Montpellier (https://gaia.umontpellier.fr/). This study also received support from the PNTS program of the French Centre National de Recherche Scientifique (PRETAO project-PRoduits d'EvapoTranspiration en Afrique de l'Ouest), as well as from CNES (French Centre National d'Études Spatiales) through the TOSCA projects EVASPA V3.0 and PITEAS (Production par Interpolation Temporelle de l'Evapotranspiration journalière issue de l'infrarouge thermique à l'Aide de données auxiliaireS), and from the CRITEX program of the Agence Nationale de la Recherche (grant ANR-11-EQPX-0011).

Acknowledgments:
The ground data set was obtained in the framework of the AMMA (African Monsoon Multidisciplinary Analysis) Program and the AMMA-CATCH observatory (www.amma-catch.org), with funding from IRD, CNRS-INSU, OREME, and HSM. Special thanks are due to the local IRD team in Niger as well as to F. Arpin-Pont, H. Barral, N. Boulain, G. Charvet, and D. Ramier for the ground data acquisition and processing. Provision of the MODIS products by the Land Processes Distributed Active Archive Center (LP DAAC) and of the GLEAM evapotranspiration products by the Global Land Evaporation Amsterdam Model team is gratefully acknowledged. The authors also wish to thank the three anonymous reviewers for their valuable comments.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.   Appendix D Table A3. Calibrated values of the parameters of the Cdi formulation used in this study.