Influence of Spatial Resolution on Remote Sensing-Based Irrigation Performance Assessment Using WaPOR Data

Abstract: This paper analyses the effect of the spatial assessment scale on irrigation performance indicators in small and medium-scale agriculture. Three performance indicators, adequacy (i.e., sufficiency of water use to meet the crop water requirement), equity (i.e., fairness of irrigation distribution), and productivity (i.e., unit of physical crop production or yield per unit water consumption), are evaluated in five irrigation schemes at three spatial resolutions: 250 m, 100 m, and 30 m. Each scheme has varying plot sizes and distributions, with average plot sizes ranging from 0.2 ha to 13 ha. The datasets are derived from the United Nations Food and Agriculture Organization (FAO) portal to monitor water productivity through open access of remotely sensed-derived data (the Water Productivity Open Access Portal, WaPOR). The indicators responded differently to resolution. For adequacy, all three resolutions show similar spatial trends for relative evapotranspiration (ET) across levels for all years. However, the estimation of relative ET is often higher at higher resolution. In terms of equity, all resolutions show similar inter-annual trends in the coefficient of variation (CV); higher resolutions usually have a higher CV of the annual evapotranspiration and interception (ETIa) while capturing more spatial variability. For productivity, higher resolutions show lower crop water productivity (CWP) due to higher aboveground biomass productivity (AGBP) estimations in lower resolutions; they always have a higher CV of CWP. We find all resolutions of 250 m, 100 m, and 30 m suitable for inter-annual and inter-scheme assessments regardless of plot size. While each resolution shows consistent temporal trends, the magnitude of the trend in both space and time is smoothed by the 100 m and 250 m resolution datasets. This frequently results in substantial differences in the irrigation performance assessment criteria for inter-plot comparisons; therefore, 250 m and 100 m are not recommended for inter-plot comparison for all plot sizes, particularly small plots (< 2 ha). Our findings highlight the importance of selecting the spatial resolution appropriate to scheme characteristics when undertaking irrigation performance assessment using remote sensing.


Introduction
Irrigation is typically performed in areas with arid climates, low precipitation, and/or frequent droughts, which makes water management both complex and important. On a continental scale, irrigation accounts for approximately 86% of freshwater withdrawals in Africa [1]. This is already higher than the global average and is expected to increase with increasing prosperity and therefore food demand. Water governance in Africa must provide for increasing demands for food and simultaneously increasing demands from industry, the environment, and municipalities.
Quantifying water balance components and land productivity at irrigation scheme scales has a wide variety of applications. These include, but are not limited to, initiating and evaluating water conservation practices, evaluating equitable water distribution [2][3][4][5], assessing water and land productivities [6,7], providing input to water policy and resource management [8,9], and improving irrigation management and systems [2,10].
Remote sensing is a powerful tool to understand agricultural performance at high spatial and temporal resolutions. The application of remote sensing in estimating agricultural performance indicators is becoming more prolific as it provides more information, in both time and space, than can be provided by traditional methods, such as water balance or ground measurements [11]. Remote sensing can provide insight into various aspects of agricultural production, including estimation of actual evapotranspiration (ETa) and biomass production. With the increase in open access to satellite imagery and retrieval algorithms, remote sensing provides effective yet spatially and temporally extensive options to estimate agricultural indices, which are especially beneficial for evaluating irrigation performance in data-scarce regions, like Africa.
The use of remote sensing to estimate and evaluate various indicators for irrigation performance at the irrigation scheme level (i.e., the irrigation perimeter) has been tested and reported for multiple spatial and temporal resolutions (spatial resolution refers to the pixel size, and temporal resolution refers to the satellite revisit time). These include studies on equity [2,6,12], adequacy [6,13,14], sustainability [2,13], and water productivity [15][16][17]. These studies used datasets with input sensors of varying resolution, including Landsat TM with a resolution of 30 m and a 16-day revisit, and the moderate resolution imaging spectroradiometer (MODIS) Terra with resolutions of 250 m and a 1-day revisit or 500 m and an 8-day revisit. The accuracy of these studies is often not reported, and when it is, it varies considerably [11]. Studies utilising global datasets show a large range in errors in ETa, net primary productivity (NPP) or biomass production, and crop water productivity (CWP) [11]. Local studies with parametrised models typically have a higher accuracy. However, it is not clear how the resolution of the dataset influences the accuracy or the potential application of irrigation performance indicators. For example, at what resolution can different indicators be assessed for a given scale of application (e.g., inter- or intra-scheme) or a given scale of the irrigation scheme (e.g., plot size)?
The variation in the satellite revisit period can lead to different irrigation performance indicator values when interpolating images, particularly in rainy periods and during the growing season [18]. Uncertainty in ETa of up to 40% has been attributed to the difference in a 16-day revisit as compared to 4-day revisit, depending on climate and season with no assimilation of daily meteorological data [19]. Likewise, spatial resolution is highly important when using non-linear models in heterogeneous areas due to pixel purity [20,21]. Therefore a higher spatial resolution, when analysing irrigation performance indicators, is expected to improve the assessment in areas of higher spatial heterogeneity [22].
Several studies have investigated the effect of spatial or temporal resolution on the accuracy of different input parameters [23,24] used to estimate irrigation performance by remote sensing, for example, the normalized difference vegetation index (NDVI) [25,26], energy balance components [27], ETa [28], or net primary productivity (NPP) and gross primary productivity (GPP) [29]. Further, much effort has been placed on continuously improving the resolution of these products [30] or aggregating images with varying resolutions to maximise information (i.e., aggregating high-spatial and low-temporal-resolution images with low-spatial and high-temporal-resolution images) [31]. A recent study compared the accuracy of wheat yield estimates at varying spatial and temporal resolutions and found that, for yield (or NDVI trends), a higher spatial resolution was preferred to a higher temporal resolution [20]. Another study suggested that, at a regional scale, Landsat ETM+ 30 m and MODIS 250 m data can both estimate agricultural area and NDVI to a reasonable level of accuracy [32].
However, despite some investigations into these parameters and spatial resolutions, there remains a gap in understanding the impact of image resolution on the quality and accuracy of irrigation performance indicators at medium to high resolutions for a given irrigation scheme or farm scale.
The recent online United Nations Food and Agriculture Organization (FAO) portal to monitor water productivity through open access of remotely sensed-derived data (the Water Productivity Open Access Portal, WaPOR) provides evapotranspiration and interception (ETIa) and yield-related datasets in Africa and the Middle East at dekadal time steps at three different spatial resolutions (250 m, 100 m, and 30 m), depending on geographic location. The availability of these datasets provides an opportunity to monitor farming practices and irrigation management from space and is thus ideal for regions with limited in-situ or local datasets. In this study, we utilize WaPOR datasets to derive three irrigation performance indicators (adequacy, equity, and productivity) in five irrigation schemes in Africa: Wonji, Metehara, Koga, Zankalon, and Office du Niger (ODN). We evaluate the impact of spatial resolution on each of the indicators at inter-plot and scheme levels and suggest the best resolution for each based on farm plot size.

Scheme Descriptions
Five irrigation schemes in Africa were selected: the Wonji (Ethiopia), the Metehara (Ethiopia), the Koga (Ethiopia), Zankalon (Egypt), and ODN (Mali). The Wonji and Metehara are located in the Awash Basin, the Koga scheme is located in the Upper Blue Nile, the Zankalon scheme is located in the Nile Delta Basin, and the ODN scheme is located in the Niger Basin (Figure 1). The description of each irrigation scheme, standard deviation (SD) of plot sizes used in the evaluation, the major crops cultivated, the scheme area used in the evaluation, and the minimum and maximum elevation of each scheme are given in Table 1.

In the Wonji and the Metehara, plot boundaries were provided by the irrigation scheme managers in AutoCAD [33] and converted to shapefile. The Koga, Zankalon, and ODN plots were digitally delineated using Google Earth. Statistics were derived using the R software [34], and visualizations were done in ArcMap Thematic Mapping [35].

Input Datasets
Three input datasets to assess the irrigation performance were collected in the five study areas: the actual evapotranspiration and interception (ETIa), the net primary productivity (NPP), and the reference ET (ETo). All data were sourced directly from the WaPOR portal (version 2) (https://wapor.apps.fao.org/home/WAPOR_2/1). The WaPOR datasets are produced by the FRAME Consortium, led by eLEAF and comprised of the Flemish Institute for Technological Research (VITO), the International Institute for Geo-Information Science and Earth Observation at the University of Twente, and the WaterWatch Foundation. WaPOR data are gap-filled and smoothed, so all products are free of data gaps [36]. The datasets were assessed at temporal scales from dekadal (approximately 10-day) to annual over a 10-year period (2009-2018).
The WaPOR database provides ETIa and NPP in three spatial resolutions (Table 2) for the five selected schemes (Table 1). Reflectance bands for the continental (L1), national (L2), and sub-national (L3) datasets are currently retrieved from the MODIS, MODIS/Project for On-Board Autonomy-Vegetation (PROBA-V), PROBA-V, and Landsat satellites, respectively (Table 2). Information is embedded in visible, near-infrared, and thermal infrared (Landsat only) bands, which were used to retrieve the land surface temperature (LST), vegetation indices, and atmospheric temperature.
WaPOR ETIa and NPP further rely on input from weather, digital elevation, precipitation, and transmissivity data from other sources. The weather data (i.e., air temperature, relative humidity wind 2014 [38]. The weather data are resampled to the 250 m resolution using a bilinear interpolation method. The temperature data are additionally resampled based on the SRTM DEM (90 m). Atmospheric transmissivity is taken from the Meteosat Second Generation (MSG) [36]. Precipitation data are used as a limiting factor of soil moisture availability and interception and are sourced from the Climate Hazards Group Infrared Precipitation with Stations (CHIRPS) dataset [39]. ETo is estimated using the weather, atmospheric transmissivity, and solar radiation data and has no satellite data input. Detailed information on the methodology and processing can be found in the WaPOR methodology documents [36].

Evapotranspiration and Interception
The ETIa in the WaPOR portal is based on a modified version of the ETLook model (ETLook-WaPOR) [40]. The ETLook model uses the Penman-Monteith (PM) equation, adapted to remote sensing input data, to estimate ETa [36]. The PM approach combines the energy balance equation and the aerodynamic equation and is described in the FAO-56 irrigation and drainage paper [41]. The ETIa defines soil evaporation and transpiration separately using Equations (1) and (2). The interception is a function of the vegetation cover, leaf area index (LAI), and precipitation. The ETI-WaPOR is then calculated as the sum of evaporation, transpiration, and interception.
$$\lambda E = \frac{\Delta (R_{n,soil} - G) + \rho_{air} C_p \frac{(e_s - e_a)}{r_a}}{\Delta + \gamma \left(1 + \frac{r_s}{r_a}\right)} \tag{1}$$
$$\lambda T = \frac{\Delta R_{n,canopy} + \rho_{air} C_p \frac{(e_s - e_a)}{r_a}}{\Delta + \gamma \left(1 + \frac{r_s}{r_a}\right)} \tag{2}$$
where E and T are the soil evaporation and plant transpiration rates (kg m^-2 s^-1), respectively, and λ is the latent heat of vaporization (MJ/kg). Rn,soil and Rn,canopy are the net radiation of the soil and canopy (MJ/m^2/day), and G is the ground heat flux (MJ/m^2/day). ρ_air is the density of air (kg/m^3), C_p is the specific heat of air (MJ/kg/°C), (e_s − e_a) is the vapour pressure deficit, VPD (kPa), r_a is the aerodynamic resistance (s/m), and r_s is the soil resistance or the canopy resistance (s/m) when the PM model is used to estimate evaporation or transpiration, respectively. Δ = d(e_sat)/dT (kPa/°C) is the slope of the curve relating saturated water vapour pressure to the air temperature, and γ is the psychrometric constant (kPa/°C). This approach partitions the WaPOR ETIa into evaporation and transpiration using the modified versions of PM, which differentiate the net available radiation and resistance formulas based on the vegetation cover according to the ETLook model [40].
A major difference between ETLook-WaPOR and ETLook is the source of remote sensing data for the soil moisture. In the original ETLook, soil moisture is derived from passive microwave, and in the WaPOR approach, soil moisture is derived from LST.
Interception is the process by which rainfall is captured by leaves and evaporates before it reaches the soil surface. Intercepted rainfall evaporates directly from the leaves and requires energy that is then not available for transpiration. Interception (mm/day) is a function of the vegetation cover, LAI, and precipitation.
$$I = 0.2 \, LAI \left(1 - \frac{1}{1 + \frac{F_c \, P}{0.2 \, LAI}}\right) \tag{3}$$
where Fc (-) is the vegetation cover calculated from NDVI, LAI (-) is the leaf area index converted from Fc, and P is precipitation (mm/day).
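As an illustration of how interception depends on Fc, LAI, and P, the following is a minimal sketch that assumes the saturating form I = 0.2·LAI·(1 − 1/(1 + Fc·P/(0.2·LAI))) described in the WaPOR methodology [36]. The function name, the zero-input guard, and the example values are illustrative assumptions and are not taken from this study.

```python
def interception_mm_per_day(fc: float, lai: float, precip_mm_per_day: float) -> float:
    """Daily interception (mm/day) from vegetation cover, LAI and precipitation.

    Assumed form: I = 0.2*LAI * (1 - 1 / (1 + Fc*P / (0.2*LAI))).
    """
    # Without canopy or rainfall there is nothing to intercept
    # (this guard also avoids a division by zero when LAI is 0).
    if fc <= 0.0 or lai <= 0.0 or precip_mm_per_day <= 0.0:
        return 0.0
    storage = 0.2 * lai  # upper limit of interception for this canopy (mm/day)
    return storage * (1.0 - 1.0 / (1.0 + fc * precip_mm_per_day / storage))


# Hypothetical dense canopy on a rainy day.
example_interception = interception_mm_per_day(fc=0.8, lai=3.0, precip_mm_per_day=10.0)
```

In this form, interception increases with rainfall but never exceeds 0.2·LAI, so heavier rain adds progressively less intercepted water.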

Aboveground Biomass Productivity
Remote Sens. 2020, 12, 2949
Net Primary Productivity (NPP) is a fundamental characteristic of an ecosystem, expressing the conversion of carbon dioxide into biomass driven by photosynthesis. NPP is the GPP minus autotrophic respiration, i.e., the losses caused by the conversion of basic products (glucose) to higher-level photosynthates (starch, cellulose, fats, proteins) and the respiration needed for the maintenance of the standing biomass. The NPP, as defined in WaPOR, is expressed as
$$NPP = S_c \cdot R_s \cdot \varepsilon_p \cdot fAPAR \cdot SM \cdot \varepsilon_{lue} \cdot \varepsilon_T \cdot \varepsilon_{CO_2} \cdot \varepsilon_{AR} \cdot \varepsilon_{RES} \tag{4}$$
where Sc (-) is the scaling factor from dry matter productivity (DMP) to NPP, Rs is the total shortwave incoming radiation (MJ_T/ha/day), and ε_p is the fraction of photosynthetically active radiation (PAR) (0.4-0.7 µm) in total shortwave, with a value of 0.48 (J_PAR/J_Total-sw). fAPAR (-) is the PAR fraction absorbed by green vegetation. SM (-) is the soil moisture stress reduction factor. ε_lue is the light use efficiency (LUE) at optimum (kgDM/GJ_PA, where DM = dry matter), ε_T (-) is the normalized temperature effect, ε_CO2 (-) is the normalized CO2 fertilization effect, ε_AR (-) is the fraction kept after autotrophic respiration, and ε_RES (-) is the fraction kept after residual effects (including soil moisture stress). A look-up table based on the land use classification is used to determine the LUE for a given pixel. When total biomass (TBP) or aboveground biomass productivity (AGBP) is derived from the continental NPP data (without prior information on crop type), the following conversions are used in the WaPOR database [36]:
$$TBP = 22.22 \cdot NPP \tag{5}$$
$$AGBP = 0.65 \cdot 22.22 \cdot NPP \tag{6}$$
where 0.65 is the conversion fraction from total to aboveground biomass, and 22.22 is the conversion from NPP in gC/m^2/day to DMP (above- and belowground dry biomass) in kg/ha/day, assuming a carbon fraction of 0.45 in the organic matter.
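The origin of the factor 22.22 can be checked with a short sketch: 1 gC/m^2 equals 10 kgC/ha, and dividing by the carbon fraction of 0.45 gives 22.22 kg of dry matter per hectare. The function and constant names below are illustrative and are not part of WaPOR.

```python
CARBON_FRACTION = 0.45       # carbon fraction of organic matter, as stated in the text
ABOVEGROUND_FRACTION = 0.65  # share of total biomass that is above ground, as stated in the text
G_PER_M2_TO_KG_PER_HA = 10.0  # 1 g/m2 = 10 kg/ha (10,000 m2 per ha, 1000 g per kg)


def npp_to_dmp(npp_gc_m2_day: float) -> float:
    """Convert NPP (gC/m2/day) to total dry matter productivity (kg DM/ha/day)."""
    return npp_gc_m2_day * G_PER_M2_TO_KG_PER_HA / CARBON_FRACTION


def npp_to_agbp(npp_gc_m2_day: float) -> float:
    """Convert NPP (gC/m2/day) to aboveground biomass productivity (kg DM/ha/day)."""
    return ABOVEGROUND_FRACTION * npp_to_dmp(npp_gc_m2_day)


# Conversion factor for one unit of NPP.
dmp_per_unit_npp = npp_to_dmp(1.0)
agbp_per_unit_npp = npp_to_agbp(1.0)
```

Summing the dekadal AGBP over a growing season (daily rate multiplied by the number of days in each dekad) would then give the seasonal dry aboveground biomass in kg/ha.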

Reference Evapotranspiration
The reference ETo, based on FAO-56 (1996), was derived from the same gridded meteorological data, transmissivity data, and digital elevation data described for the ETIa and NPP.
$$\lambda ET_o = \frac{\Delta (R_n - G) + \rho_a c_p \frac{(e_s - e_a)}{r_a}}{\Delta + \gamma \left(1 + \frac{r_s}{r_a}\right)} \tag{7}$$
where ρ_a is the air density (kg/m^3), c_p is the specific heat (J/°C), r_s is the bulk surface resistance (s/m) (as compared to the soil or canopy resistance described in Equations (1) and (2)), Δ is the slope of the saturation vapour pressure versus temperature curve (kPa/°C), and γ is the psychrometric constant (kPa/°C). r_s is taken as a constant 70 s/m, and r_a is taken as 208/u_obs, where u_obs is the observed wind speed (m/s) at 10 m. The ETo is provided at a 25 km daily resolution in the WaPOR portal.
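The reference ET calculation can be sketched numerically as the resistance form of Penman-Monteith with r_s = 70 s/m and r_a = 208/u. This is an illustrative sketch only, not the WaPOR processing chain: the default constants (γ, air density, specific heat, latent heat), the FAO-56 expression for Δ, and the example inputs are assumptions, and the wind speed is used as supplied without any height adjustment.

```python
import math

LATENT_HEAT = 2.45        # MJ/kg, assumed constant latent heat of vaporization
SECONDS_PER_DAY = 86400.0


def slope_vapour_pressure(temp_c: float) -> float:
    """Slope of the saturation vapour pressure curve (kPa/degC), FAO-56 expression."""
    e_sat = 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))
    return 4098.0 * e_sat / (temp_c + 237.3) ** 2


def reference_et(rn_minus_g: float, vpd_kpa: float, temp_c: float, wind_m_s: float,
                 gamma: float = 0.0665, rho_air: float = 1.2,
                 cp: float = 1.013e-3, r_s: float = 70.0) -> float:
    """Reference ET (mm/day) from the resistance form of Penman-Monteith.

    rn_minus_g: available energy Rn - G (MJ/m2/day)
    cp:         specific heat of air (MJ/kg/degC)
    wind_m_s:   wind speed (m/s), must be greater than zero
    """
    delta = slope_vapour_pressure(temp_c)
    r_a = 208.0 / wind_m_s  # aerodynamic resistance (s/m)
    # Aerodynamic term converted from per-second to per-day energy units.
    aerodynamic = rho_air * cp * vpd_kpa / r_a * SECONDS_PER_DAY
    latent_flux = (delta * rn_minus_g + aerodynamic) / (delta + gamma * (1.0 + r_s / r_a))
    return latent_flux / LATENT_HEAT  # 1 kg/m2 of water equals 1 mm depth


# Hypothetical warm, moderately windy day.
eto_example = reference_et(rn_minus_g=13.0, vpd_kpa=1.5, temp_c=25.0, wind_m_s=2.0)
```

With these hypothetical inputs, the sketch returns roughly 5.4 mm/day, which is close to what the standard FAO-56 daily equation gives for the same conditions.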

Performance Indicators
Three performance indicators (Table 3) were studied at different spatiotemporal resolutions. These indicators were selected based on the data available directly from the WaPOR portal:
• Adequacy: the sufficiency of water use to meet the crop water requirement (CWR) or potential evapotranspiration;
• Equity: the fairness of irrigation water distribution;
• CWP: the unit of physical crop production or yield per unit water consumed.
The indicators are applied to each scheme based on available data relating to crop information. The crop coefficient is only known for two schemes, and therefore, the adequacy indicator is only applied in those two schemes. In plots where the pixel size is less than the plot size, only pixels fully contained within the plot are considered. In plots where the pixel size is greater than the plot size, the data are extracted as point data at the middle of the plot.
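The pixel selection rule described above can be sketched as follows. This is a simplified illustration that assumes each plot is an axis-aligned rectangle and that grid rows are counted upward from the lower-left origin; real plot polygons would need a GIS library. The function name and arguments are hypothetical, and the fallback to the centre pixel is triggered when no pixel fits entirely inside the plot.

```python
import math


def select_pixels(plot_bbox, origin, pixel_size, shape):
    """Return (row, col) indices of the pixels used to represent a plot.

    plot_bbox:  (xmin, ymin, xmax, ymax) of the plot, in map units
    origin:     (x0, y0) of the lower-left corner of the grid
    pixel_size: pixel edge length, in map units
    shape:      (n_rows, n_cols) of the grid
    """
    xmin, ymin, xmax, ymax = plot_bbox
    x0, y0 = origin
    n_rows, n_cols = shape

    # Index ranges of pixels whose full extent lies inside the plot.
    first_col = max(0, math.ceil((xmin - x0) / pixel_size))
    end_col = min(n_cols, math.floor((xmax - x0) / pixel_size))  # exclusive
    first_row = max(0, math.ceil((ymin - y0) / pixel_size))
    end_row = min(n_rows, math.floor((ymax - y0) / pixel_size))  # exclusive

    inside = [(r, c) for r in range(first_row, end_row)
              for c in range(first_col, end_col)]
    if inside:
        return inside

    # No pixel fits inside the plot: sample the pixel under the plot centre.
    cx, cy = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
    return [(int((cy - y0) // pixel_size), int((cx - x0) // pixel_size))]


# A hypothetical 300 m x 200 m plot sampled on a 100 m grid and on a 250 m grid.
plot = (50.0, 50.0, 350.0, 250.0)
pixels_100m = select_pixels(plot, (0.0, 0.0), 100.0, (5, 5))
pixels_250m = select_pixels(plot, (0.0, 0.0), 250.0, (2, 2))
```

In this example the 100 m grid yields two fully contained pixels, while the 250 m grid falls back to the single pixel under the plot centre, which also covers land outside the plot.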

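Table 3. Irrigation performance indicators and the schemes in which each is applied.
Indicator | Measure | Definition | Schemes
Adequacy | Relative ET | ETIa / CWR | Wonji, Metehara
Equity | CV of ETIa | SD of ETIa / mean ETIa | All
Productivity | CWP | beneficial biomass / ETIa | All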
Adequacy was assessed using the relative evapotranspiration (relative ET) indicator, which is estimated as the ratio of seasonal ETIa to the seasonal CWR [2,42] and is assessed on an annual basis. The CWR was taken as the seasonal reference ET (ETo) multiplied by the average seasonal crop coefficient (Kc). Due to data limitations in the WaPOR database, there is insufficient information on the crop growth cycle to consider Kc at a higher temporal resolution. The single Kc method, rather than the dual crop coefficient method, was used, as the purpose of this manuscript is not to test the actual adequacy but the ability of datasets with different resolutions to capture the variation. This indicator was assessed in the Metehara and the Wonji, where the schemes are heavily dominated by one major crop type, at two levels: between irrigation plots and between irrigation schemes. The average annual Kc for a sugarcane ratoon crop was taken as approximately 0.95 [41].
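The adequacy calculation can be sketched in a few lines, taking the CWR as Kc multiplied by the seasonal reference ET (ETo). The function name and the example values are hypothetical; only the Kc of 0.95 comes from the text.

```python
def relative_et(etia_seasonal_mm: float, eto_seasonal_mm: float, kc: float = 0.95) -> float:
    """Relative ET (-): seasonal ETIa divided by the crop water requirement."""
    cwr_mm = kc * eto_seasonal_mm  # single crop coefficient method
    return etia_seasonal_mm / cwr_mm


# Hypothetical sugarcane plot: 1330 mm consumed against 2000 mm of reference ET.
relative_et_example = relative_et(etia_seasonal_mm=1330.0, eto_seasonal_mm=2000.0)
```

A value of 1 would indicate that consumption matched the crop water requirement; values below 1 indicate that less water was consumed than required.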
Equity is a measure of irrigation uniformity [43,44]. It is assessed through estimating the coefficient of variation (CV) of ETIa at two levels-between irrigation plots and between irrigation schemes. The ETIa is assessed at annual and dekadal scale.
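As a sketch of the equity measure, the CV is the standard deviation of plot ETIa divided by its mean. The population standard deviation is used here as an assumption, since the text does not state which form was applied, and the ETIa values are hypothetical. The second part averages neighbouring plots to mimic a coarser pixel.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV (-): population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# Hypothetical annual ETIa (mm) for four neighbouring plots at fine resolution.
fine_etia = [800.0, 1200.0, 900.0, 1100.0]

# A coarser pixel covering two plots at a time reports their average instead.
coarse_etia = [mean(fine_etia[0:2]), mean(fine_etia[2:4])]

cv_fine = coefficient_of_variation(fine_etia)
cv_coarse = coefficient_of_variation(coarse_etia)
```

In this toy case the coarse values are identical, so the CV collapses to zero even though the underlying plots differ, which illustrates how aggregation can hide plot-to-plot variation.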
CWP is defined here as the ratio of beneficial biomass to ETIa. The CWP can vary based not only on management and irrigation practices [45][46][47] but also on climate [48,49], crop cultivar [50], and other environmental factors [51][52][53]; therefore, it is assessed at only one level: between irrigation plots. The beneficial biomass is taken as the AGBP. The CWP is assessed at a seasonal or annual scale.
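A minimal sketch of the CWP calculation follows, assuming AGBP in kg/ha and ETIa in mm; 1 mm of water over 1 ha corresponds to 10 m^3. The function name and the example values are hypothetical.

```python
MM_TO_M3_PER_HA = 10.0  # 1 mm depth over 1 ha (10,000 m2) equals 10 m3


def crop_water_productivity(agbp_kg_per_ha: float, etia_mm: float) -> float:
    """CWP (kg/m3): aboveground biomass per unit of water consumed."""
    water_m3_per_ha = etia_mm * MM_TO_M3_PER_HA
    return agbp_kg_per_ha / water_m3_per_ha


# Hypothetical plot: 30 t/ha of dry biomass produced with 1500 mm of ETIa.
cwp_example = crop_water_productivity(agbp_kg_per_ha=30000.0, etia_mm=1500.0)
```

Because CWP is a ratio, a resolution-dependent shift in either AGBP or ETIa changes the CWP estimate for the same plot.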

Validation
Plot data were collected at two irrigation schemes: AGBP data in the Wonji and ETa data in Zankalon. The observed AGBP (AGBP_a) data were collected from farmers for the period 2009-2016 and were provided by the sugarcane estate managers in the Wonji. In-situ ETa data were collected through eddy covariance (EC) observations, which were operated by the University of Tsukuba in partnership with Cairo University, the National Water Research Center, Qalubia, Egypt, and the Agriculture Research Center, Giza, Egypt [54].
Farmer estimates of fresh AGBP_a were available for 66 plots in the Wonji within the WaPOR data timeframe. The reported AGBP_a (with planting and harvest dates) were used for a direct NPP comparison. The average dekadal value of the NPP pixels falling within each plot was aggregated over the growing season for that plot. The WaPOR NPP was converted to WaPOR AGBP (dry matter) using Equation (6) and then converted to a fresh AGBP (AGBP_e) using the conversion factors shown in Table 4 (i.e., AGBP_e = WaPOR AGBP multiplied by each of the conversion factors). The reported yield is fresh matter (whereas WaPOR AGBP is dry matter); therefore, a conversion for moisture content was also required.
The EC site is an irrigated site in the Nile Delta and is under rotation with three major summer crops (rice, maize, and cotton) and four major winter crops (wheat, berseem (Trifolium alexandrinum), fava beans, and sugar beet) [54]. The field is 200 m × 200 m and is surrounded by agricultural land extending at least 800 m in the dominant (northwest) wind direction. Suitable data for validation cover 2010-2013. The WaPOR-derived ETIa dataset was spatially averaged over a 750 m × 750 m pixel window covering the EC station, based on the assumption that the window represents the measurement footprint of the EC station. The WaPOR ETIa for the in-situ comparison was taken as the sum of soil evaporation, plant transpiration, and interception.
The ETa from the EC (ETa-EC) data were derived from the latent heat flux (LE) measurements and aggregated temporally to dekadal averages to match the temporal resolution of the WaPOR ETIa products. Footprint data was unavailable, which may introduce bias to the spatial analyses of ETa. At L1 resolution, plot size is smaller than the pixel resolution, and a single pixel extends beyond the experimental field. However, due to the high correlation between L1 and L2 (which fits within the field), we decided to include EC data for the validation at both levels.
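The temporal aggregation of the EC data can be sketched as below. The sketch assumes daily mean latent heat flux in W/m^2, a constant latent heat of vaporization of 2.45 MJ/kg, and dekads defined as days 1-10, 11-20, and 21 to month end; the function names and the input values are hypothetical and do not reproduce the processing used for the Zankalon station.

```python
from collections import defaultdict
from datetime import date, timedelta

LATENT_HEAT_MJ_PER_KG = 2.45  # assumed constant latent heat of vaporization
SECONDS_PER_DAY = 86400.0


def le_to_et_mm_per_day(le_w_m2: float) -> float:
    """Convert a daily mean latent heat flux (W/m2) to ET (mm/day)."""
    energy_mj_m2_day = le_w_m2 * SECONDS_PER_DAY / 1e6
    return energy_mj_m2_day / LATENT_HEAT_MJ_PER_KG  # 1 kg/m2 equals 1 mm


def dekad_key(day: date):
    """Return (year, month, dekad), where the third dekad runs to month end."""
    return (day.year, day.month, min((day.day - 1) // 10, 2) + 1)


def dekadal_mean_et(daily_le: dict) -> dict:
    """Average daily ET (mm/day) within each dekad."""
    groups = defaultdict(list)
    for day, le in daily_le.items():
        groups[dekad_key(day)].append(le_to_et_mm_per_day(le))
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Hypothetical month of constant latent heat flux (100 W/m2).
january = {date(2012, 1, 1) + timedelta(days=i): 100.0 for i in range(31)}
dekadal_et = dekadal_mean_et(january)
```

The third dekad of a month contains 8 to 11 days, so averaging rather than summing keeps the dekadal values comparable across the year.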

ETIa and AGBP
The mean annual ETIa and AGBP for each resolution and each scheme are shown in Table 5. The Wonji and Metehara have the highest AGBP; this is to be expected as both schemes are dominated by sugarcane. ODN has the highest ETIa, which is expected as it is dominated by rice. The ETIa has a difference of less than 6% between levels in the Wonji, the Metehara, and Zankalon; 5-9% in ODN; and 11-17% in the Koga. The DMP has a difference of less than 6% between levels in the Metehara, ODN and Zankalon, 8% in the Wonji, and 26-29% in the Koga.

Adequacy
Each resolution estimates a similar annual mean relative ET across the scheme for all years and follows similar inter-annual trends in both the Wonji and the Metehara (Figure 2). In the Wonji, the annual mean relative ET across all years (2009-2018) ranged from 0.61-0.78 for L3, 0.63-0.76 for L2, and 0.67-0.75 for L1. The Metehara had a higher mean relative ET and a higher variation in relative ET than the Wonji at all levels. In the Metehara, the mean annual relative ET across all years ranged from 0.88-0.94 for L3, 0.81-0.90 for L2, and 0.80-0.88 for L1.
Across levels, the L3 mean relative ET is frequently higher than the L2 and L1 relative ET in both the Wonji and the Metehara. The L3 relative ET variation is consistently higher in the Wonji, which is reflected in the larger quartile range shown in Figure 2. The variation between plots was the highest in 2018 for both the Wonji and the Metehara.
Figure 3 shows the relative ET in 2018 for sugarcane plots in the Wonji and in the Metehara derived from WaPOR. All three levels can pick up spatial trends in relative ET. All levels show lower relative ET in the north-eastern part of the Wonji scheme and higher relative ET in the eastern and north-western parts of the Wonji. In the Metehara, all levels show that the centre of the scheme has higher relative ET and the southern part of the scheme has lower relative ET. The higher resolutions capture plot-to-plot variability better. A clear example is in the western part of the Wonji scheme: L1 shows all plots in this area ranging from 0.55-0.80, whereas in the same region, L3 identifies plots ranging from 0.35-0.85.

Equity
The 2009-2018 annual scheme ETIa CV was always highest for the L3 dataset (Table 5). The L3 annual scheme ETIa CV was higher for all schemes in all years on all but two occasions (Figure 4). All levels show similar inter-annual trends in CV; however, the magnitude of inter-annual variability was greater for the L3 dataset. The Koga was an exception: while L2 and L1 followed the same inter-annual trend, the L3 CV showed an opposite trend in 2010-2011 and 2015-2018. Each level in the Koga showed similar mean annual ETIa trends; however, the L1 and L2 SD decreased in those years where the L3 SD either increased or remained similar. This may be a result of varying year-to-year agricultural practices that are better picked up by L3. Alternatively, it may be a result of processing issues: for example, a wetter year can increase cloud cover and therefore data gaps, particularly for L3, where the revisit frequency is already lower than for L1 and L2.

Figure 5 shows the 2018 annual evapotranspiration in Zankalon and the Metehara. The L3 dataset captures more spatial variability than the L1 and L2 datasets in both schemes. The plot size is significantly smaller in Zankalon, and L3 does not capture plot ETIa differences there as well as in the Metehara, as shown by the clusters of plots with similar values. In the Metehara, despite L1 and L2 having a similar annual CV in 2018, it is evident that L2 picks up more spatial variation than L1.
Figure 6 shows the dekadal ETIa CV for each scheme from 2009-2018. In most cases, a greater dekadal variation in the L3 CV compared to the L1 and L2 CV suggests that L3 captures seasonal variation better. The L3 dataset reports the largest intra-annual variation among the three datasets, particularly in schemes with smaller plots. This suggests that the lower resolutions do a better job of picking up variation in larger plots than in smaller plots, i.e., L1 may be suitable for identifying variation or equity in the Wonji but not in the Koga, where differences between L1 and L3 are largest. The magnitude of differences in CV between datasets was smaller for schemes with larger plots (e.g., Wonji and Metehara). However, the Wonji has the least consistent difference in the CV at a dekadal scale. The L2 CV mimics the L1 CV trend up to 2013, after which it deviates in four of the five schemes. This corresponds with the introduction of the PROBA-V sensor into the L2 dataset. The L3 dataset captured the magnitude of variability better than the L1 and L2 datasets. However, the L3 dataset may not have captured the dekad-to-dekad changes as well, as indicated by a frequently smoother L3 trend line compared to L1 and L2.

Productivity
The mean annual CWP values of 2009-2018 for each scheme are shown in Table 6. The L3 CWP is frequently lower than the L1 and L2 CWP due to the higher L1 and L2 AGBP estimates (Table 6). The difference between the L3 CWP and the other levels is greatest in the Koga. The L3 CWP CV is always greater than the L1 and L2 CV. The scheme-level inter-annual variation in CWP is consistent between levels, i.e., the scale, magnitude, and direction of change in CWP between years are the same for all levels. The CWP CV is smaller than the ETIa CV and DMP CV.
While all levels capture the scheme-average CWP similarly, the plot-to-plot variation differs considerably between levels (Figure 7). L3 captures the most variation between plots. In the Wonji, all datasets capture scheme areas with higher or lower CWP well; for example, each level shows that the north-eastern part of the scheme has the lowest CWP and that the eastern tip has the highest. However, the L3 dataset clearly captures more variation, with distinct plot-to-plot differences. The ODN shows distinct differences in plot-to-plot CWP variation between datasets, and the L3 dataset picks up more outlier plots. Zankalon, not shown here, showed similar results to the Koga, and the Metehara showed results more like the Wonji (also like the ETIa spatial distribution patterns shown in Figure 5).
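To make the link between AGBP, ETIa, and CWP concrete, the sketch below computes a biomass-based CWP as AGBP divided by the volume of water consumed. It assumes AGBP in kg/ha and ETIa in mm, and uses the identity that 1 mm of water over 1 ha equals 10 m³; the function name and input values are hypothetical and not taken from the paper.

```python
def crop_water_productivity(agbp_kg_per_ha, etia_mm):
    """Return CWP in kg of biomass per m3 of water consumed.

    1 mm of water over 1 ha equals 10 m3, so ETIa (mm) * 10 gives m3/ha.
    """
    if etia_mm <= 0:
        raise ValueError("ETIa must be positive")
    return agbp_kg_per_ha / (etia_mm * 10.0)

# Hypothetical values: the same water consumption with two different AGBP estimates.
cwp_low_agbp = crop_water_productivity(agbp_kg_per_ha=30_000.0, etia_mm=1_500.0)
cwp_high_agbp = crop_water_productivity(agbp_kg_per_ha=36_000.0, etia_mm=1_500.0)
print(f"{cwp_low_agbp:.2f} kg/m3 vs {cwp_high_agbp:.2f} kg/m3")
```

For a fixed ETIa, a higher AGBP estimate translates directly into a higher CWP, which is the mechanism the text gives for the higher L1 and L2 CWP values.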

Validation-Evaluation of the WaPOR Dataset
The AGBPe is plotted against the farmer-reported AGBP (AGBPa) in Figure 8. The L3 dataset has the highest performance when compared to the in-situ data in terms of both coefficient of determination (R²) and root mean square error (RMSE). The L3, L2, and L1 R² are 0.7, 0.4, and 0.5, respectively, and the L3, L2, and L1 RMSE are 33.9 ton/ha, 31.11 ton/ha, and 48.2 ton/ha, respectively, which equates to a normalized RMSE (NRMSE) of 22.5%, 21.8%, and 33.8%.
The AGBPa shows more variation between plots and years, with an AGBPa SD of 47 ton/ha compared to an AGBPe SD of 26 ton/ha. The AGBPe is comparable to the average global biomass yield, which is 69.8 ton/ha [57]. FAO WATER [57] reports that the cane yield varies from 50 ton/ha to 150 ton/ha depending on the variety and ratooning stages, which suggests that the WaPOR-derived values are within the reported range, while the farmer estimates are on the high side. The harvested biomass is related to the growing period. The average total growing period of sugarcane in the Awash L3 area is 585 days, with a range of 305-1037 days, or 0.8-2.8 years. WaPOR gives reasonable results for sugarcane, but a gap exists in linking biomass to yield in remote sensing-based estimates [11]. Often, a priori knowledge of the harvest index is the most useful way to link biomass and yield.
ETIa was compared to the EC data in the Zankalon irrigation scheme. The 10-day average daily ETa-EC and WaPOR ETIa for all three spatial resolutions at the EC site are shown in Figure 9. The L1 and L2 ETIa show high consistency with each other. The L3 ETIa consistently sits between the ETa-EC and the L1 and L2 ETIa. All levels capture the overall ETa-EC seasonal trends. The L3 data shows a lower R² (L3 = 0.36, L2 = 0.60, and L1 = 0.61), a noticeably lower bias (L3 = 1.06 mm/day, L2 = 1.7 mm/day, and L1 = 1.7 mm/day), and a lower RMSE (L3 = 1.0 mm/day, L2 = 2.2 mm/day, and L1 = 2.2 mm/day) when compared with the ETa-EC. The higher R² associated with the L1 and L2 ETIa reflects their ability to capture the temporal fluctuations of ETa-EC better than the L3 ETIa. An example of this is at dekad 1117 (2011-17), where the L1 and L2 ETIa capture the ETa-EC dip, whereas the L3 ETIa stays flat. These findings are consistent with other validations that compare WaPOR L1 ETIa data with EC data in other agricultural fields in both the Nile Delta and other arid regions [55]. It should be noted that the same study found lower suitability of WaPOR data in very wet, non-agricultural areas.
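The statistics quoted in this validation (R², bias, RMSE, NRMSE) can be illustrated with a short sketch. It assumes that R² is the squared Pearson correlation and that NRMSE is the RMSE divided by the mean of the observations; the paper's exact normalization is not restated here, so both are assumptions, and the series below are synthetic rather than the study data.

```python
import numpy as np

def validation_metrics(estimated, observed):
    """Return R2, bias, RMSE and NRMSE (%) of estimates against observations."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    diff = est - obs
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    r = np.corrcoef(est, obs)[0, 1]            # Pearson correlation
    return {
        "r2": float(r ** 2),                   # assumption: squared correlation
        "bias": float(np.mean(diff)),          # mean error, estimate minus observation
        "rmse": rmse,
        "nrmse_pct": 100.0 * rmse / float(np.mean(obs)),  # assumption: mean-normalized
    }

# Synthetic dekadal ET series (mm/day), for illustration only.
observed = np.array([3.0, 4.0, 5.0, 4.0, 3.0, 2.0])
offset = observed + 1.7                                   # follows every fluctuation, constant offset
dampened = np.array([3.6, 3.9, 4.1, 4.2, 3.8, 3.5])       # closer on average, flatter in time

m_offset = validation_metrics(offset, observed)
m_dampened = validation_metrics(dampened, observed)
print(m_offset)
print(m_dampened)
```

The two synthetic series reproduce the qualitative pattern reported above: a series can track temporal fluctuations well (high R²) while carrying a larger bias and RMSE, and a flatter series can sit closer to the observations on average while correlating less strongly with them.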

Discussion
This study shows the influence of spatial resolution on irrigation performance indicators, relevant to different plot sizes and schemes. All resolutions captured the seasonal trends. This was evident when observing indicators over time, e.g., equity (Figure 6). The higher resolution showed a higher accuracy, in terms of R² and RMSE, for AGBP compared to in-situ data. The relative AGBP errors (21%, 28%, and 28% for L3, L2, and L1, respectively) are in the upper 50th percentile of reported error ranges for non-parametrized remote-sensing-based estimates and are within the ranges of error associated with farmer-reported yields [11]. Some users may prefer a locally parameterized model to improve dataset accuracy.
The L3 ETIa had a lower bias, but its R² was lower than that of the L2 or L1 datasets when compared to EC data. While the L1 and L2 data had a better R², they had a higher bias. Further, the L1 and L2 data have a higher temporal resolution in terms of spectral input, and when comparing to daily flux data, the temporal gaps of the L3 data may influence the accuracy more than the gain in spatial resolution. A study on a 22-ha olive farm showed that datasets with a resolution coarser than L3 result in large discrepancies compared to actual evapotranspiration values, due to the aggregation of non-linear components and to the inclusion of non-agricultural areas in such aggregation [27]. Similarly, a study in wheat found that the 30 m NDVI dataset, as compared to 250 m and 1 km NDVI datasets, provided the most accurate yield estimates [20]. These comparisons concern absolute error, not the relative error in space and time that matters when looking at variation.
The consistency of ETIa and AGBP between levels does not seem related to the plot size at the annual, scheme level. However, the L3 dataset, at the scheme level, was able to capture the spatial variation better and provide more information on scheme-level equity. Validation showed that although differences between datasets were evident, the spatial and temporal trends were consistent. The magnitude of seasonal and spatial variability nevertheless varied between datasets, and the L1 and L2 may in some cases overestimate the ETIa. It should be noted that the schemes with smaller plots (i.e., Zankalon and Koga) also had the highest crop diversity. This will drive more variation in the field yields and water consumption and also in satellite interpretation, as these schemes are less homogeneous. This is considered to reflect reality, as smallholder agriculture is often associated with more diversity, for example in South Africa [58].
Adequacy was captured well by all resolutions at an annual, whole-of-scheme level in the Wonji and the Metehara. While the L3 data showed a higher relative ET CV, each level was able to capture the inter-annual trends. It is suggested that all resolutions are suitable to assess adequacy at the inter-annual or inter-scheme level for the schemes assessed. Inter-plot spatial variability of adequacy was captured well by all levels in the Wonji and the Metehara; however, the L3 dataset captured variability the best. L3 is therefore preferred for the schemes with smaller plots, i.e., Koga and Zankalon. This research is limited to assessing adequacy at an annual scale, and in practice adequacy should be assessed at various stages of critical crop growth. Further research should consider the impact of resolution on a shorter timescale.
Inter-annual equity was captured reasonably well by all levels in the Wonji, Metehara, ODN, and Zankalon. However, L3 was distinctly different to L1 and L2 in the Koga, suggesting that all levels are suitable for larger plots, with mixed results for smaller plots. This may be influenced by the heterogeneous landscape in the Koga, where irrigated and non-irrigated areas are mixed within pixels [27]. L1 and L2 are not suitable to assess equity at plot-to-plot scale in the Koga and Zankalon. Plot-to-plot equity was also not well captured by the L3 dataset in these schemes, as shown by the clusters of plots with similar values (Figure 5). However, the L3 and L2 datasets may be used to compare scheme plots or divisions.
Equity was also assessed on a dekadal scale. All datasets captured some level of seasonal variability in each scheme. The L3 dataset captured the magnitude of variability better than the L1 and L2 datasets. The difference in CV, or equity, between resolutions decreased with increasing plot size, with the exception of the Wonji, the scheme with the largest plots. The Wonji has the least consistent difference in CV of ETIa between spatial resolutions over space (e.g., adequacy in Figure 3) and time (e.g., equity in Figure 6). This is likely due to the introduction of Proba-V in March 2013, after which the 100 m data is no longer resampled from the 250 m NDVI (or L1) data. In terms of ETIa, this only has a large effect in the Wonji; the other schemes see only a slight deviation in results from the introduction of Proba-V. However, the L3 dataset did not capture the dekad-to-dekad changes as well, which was identified by a sometimes smoother L3 trendline. This may be a result of the revisit time: one or two days for the L1 and L2 satellite data inputs, compared to 16 days for the L3 dataset, which provides fewer NDVI inputs and smooths the results. This may become particularly influential during rainy seasons, when cloud cover further reduces data availability for satellites with less frequent revisits, and for users wishing to look beyond a seasonal scale. This effect may be reduced by the daily meteorological data used to interpolate image gaps.
Inter-annual scheme CWP was captured by all levels for all schemes, irrespective of dataset, which suggests that all resolutions are suitable for scheme-scale inter-annual comparison. This is comparable to another study, which showed that at a regional scale, the L3 resolution provided little extra value in determining agricultural boundaries or NDVI trends [32].
The three levels are derived from different sensors and therefore have different nominal band wavelengths, Landsat being the most dissimilar to the other levels. The primary use of optical bands in WaPOR is to estimate various vegetation indices (primarily NDVI). Studies in marshes and agricultural lands have shown that differences in nominal band wavelength limits between MODIS and Landsat do not greatly impact vegetation indices such as the enhanced vegetation index (EVI) or NDVI [59,60]. However, the WaPOR products are not currently inter-calibrated, and the impacts of varying nominal wavelengths in WaPOR are not well known. The product producer (WaPOR) is currently undertaking inter-calibration to improve dataset consistency, which is suggested for best-practice quality assessment [61,62].
Though it has been suggested that Landsat 30 m data is a valuable asset to water management, which may be true, it still may not be of high enough resolution for monitoring and evaluating small-scale agriculture [63]. Pixel purity represents the relative contribution of the surface of interest to the signal detected by the remote sensing instrument [21]. For smaller plots, the signal will be mixed by varying crops or non-agricultural areas. Therefore, when the plot size is smaller than the pixel size, pixel mixing will arise. We found that at the scheme level, for absolute indicators, such as productivity or adequacy, the lower resolution products are still suitable to assess scheme-level performance over time. However, for relative indicators (i.e., equity), the spatial variation is integral, and the higher resolution product is required.
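The relationship between plot size and pixel mixing can be illustrated with simple arithmetic. The sketch below computes an upper bound on pixel purity, that is, the largest share of a single pixel footprint that one plot could fill, for the three pixel sizes and for plot sizes spanning the range in this study. The function name is hypothetical, and the bound ignores plot shape, alignment with the pixel grid, and the sensor's point spread function, so actual purity will generally be lower.

```python
def max_pixel_purity(plot_area_ha, pixel_size_m):
    """Upper bound on the fraction of one pixel footprint a single plot can fill."""
    plot_area_m2 = plot_area_ha * 10_000.0      # 1 ha = 10,000 m2
    pixel_area_m2 = float(pixel_size_m) ** 2
    return min(1.0, plot_area_m2 / pixel_area_m2)

# Smallest and largest average plot sizes in the study, plus the 2 ha threshold.
for plot_ha in (0.2, 2.0, 13.0):
    for pixel_m in (30, 100, 250):
        purity = max_pixel_purity(plot_ha, pixel_m)
        print(f"plot {plot_ha:>4} ha, pixel {pixel_m:>3} m: max purity {purity:.3f}")
```

Under these assumptions, a 0.2 ha plot can fill at most 20% of a 100 m pixel and about 3% of a 250 m pixel, and even a 2 ha plot can fill at most about a third of a 250 m pixel, whereas a 13 ha plot can fully contain pixels at all three resolutions.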
Therefore, while accuracy is lower and variation is not as well captured at the plot level, at the scheme level the L1 and L2 resolutions appear suitable and save time and processing costs. For inter-plot comparison, the L3 data provided value in all schemes for all indicators at both a dekadal and an annual scale. For patchwork agricultural landscapes with smaller plots, i.e., <2 ha, utilisation of Sentinel (10 m visible bands) should improve interpretation and monitoring of inter-plot irrigation performance. This is recommended, as the L3 data had much more variation in the schemes with small plots (Koga and Zankalon), and therefore a higher resolution baseline or reference dataset would provide added insight into the suitability of the L3 data. As WaPOR moves toward utilization of the Sentinel 2 and 3 platforms, the relevant datasets will become readily available for comparison and application in assessing irrigation performance.

Conclusions
There is an increasing ability to monitor irrigation performance from space with available open-source datasets. It is important that the effectiveness of these datasets be understood based on irrigation scheme characteristics and the spatial resolution of the dataset itself. As the methodology for all datasets is the same, the resolution is the one varying factor against which to compare applicability in irrigation performance assessment. This study characterizes the application of WaPOR datasets with 30 m, 100 m, and 250 m spatial resolutions in irrigation performance assessment for schemes with plot sizes varying from small (<2 ha) to medium (>10 ha).
The following conclusions can be drawn from this study on the comparison of remote-sensing-based datasets of varying resolution in irrigation performance assessment:
• Spatial resolutions of 250 m, 100 m, and 30 m are suitable for inter-annual and inter-scheme assessments for adequacy, equity, and CWP, regardless of plot size.
• Spatial resolutions of 250 m and 100 m should not be used for inter-plot comparison for adequacy, equity, or CWP on plots <2 ha. The 30 m resolution may also be too coarse, and Sentinel-2 application should be considered.
• Spatial resolutions of 250 m and 100 m show general spatiotemporal trends for adequacy, equity, and CWP within a scheme, but not the full extent of plot-to-plot variation for all plot sizes tested.
It is suggested that these conclusions can be applied generally to irrigation areas that have similar plot sizes. However, further investigation into the resolution requirements to suitably undertake irrigation performance assessment in small plots, i.e., ≤2 ha, is needed.