Estimating Actual Evapotranspiration over Croplands Using Vegetation Index Methods and Dynamic Harvested Area

: Advances in estimating actual evapotranspiration (ETa) with remote sensing (RS) have contributed to improving hydrological, agricultural, and climatological studies. In this study, we evaluated the applicability of Vegetation-Index (VI) -based ETa (ET-VI) for mapping and monitoring drought in arid agricultural systems in a region where a lack of ground data hampers ETa work. To map ETa (2000–2019), ET-VIs were translated and localized using Landsat-derived 3- and 2-band Enhanced Vegetation Indices (EVI and EVI2) over croplands in the Zayandehrud River Basin (ZRB) in Iran. Since EVI and EVI2 were optimized for the MODerate Imaging Spectroradiometer (MODIS), using these VIs with Landsat sensors required a cross-sensor transformation to allow for their use in the ET-VI algorithm. The before- and after- impact of applying these empirical translation methods on the ETa estimations was examined. We also compared the effect of cropping patterns’ interannual change on the annual ETa rate using the maximum Normalized Difference Vegetation Index (NDVI) time series. The performance of the different ET-VIs products was then evaluated. Our results show that ETa estimates agreed well with each other and are all suitable to monitor ETa in the ZRB. Compared to ETc values, ETa estimations from MODIS-based continuity corrected Landsat-EVI (EVI2) (EVI MccL and EVI2 MccL ) performed slightly better across croplands than those of Landsat-EVI (EVI2) without transformation. The analysis of harvested areas and ET-VIs anomalies revealed a decline in the extent of cultivated areas and a loss of corresponding water resources downstream. The ﬁndings show the importance of continuity correction across sensors when using empirical algorithms designed and optimized for speciﬁc sensors. Our comprehensive ETa estimation of agricultural water use at 30 m spatial resolution provides an inexpensive monitoring tool for cropping areas and their water consumption.


Introduction
Evapotranspiration (ET) is the water flux to the atmosphere through evaporation (E) from the soil and transpiration (T) from plants. Accurate estimation of ET is needed to drought stress. Ks for well-watered conditions (irrigation) is set to 1. In the ET-VI method, a VI is used to estimate the product of Kc × Ks [10,24].
Bausch and Neale [25] explored estimating Kc using NDVI in Colorado; they validated Kc estimated by radiometric measurements of NDVI against lysimeters measurements. Kc derived from NDVI resembled measured ones using lysimeters. Nagler et al. [36] developed an empirical method for predicting ETa of riparian plants using remotely sensed EVI from MODIS and meteorological data and then produced a validated ETa algorithm using flux towers measurements [36]. This ETa algorithm was then improved with the use of sap flux data for native cottonwood trees [37], upland shrubs [38], and a nonnative riparian shrub species, tamarisk (Tamarix spp.) [22]. Murray et al. [29] applied the Nagler et al. [22] method to derive ETa in riparian woodlands of the lower Colorado River and ETa over the adjacent agricultural areas. Their work showed that ETa estimated for crops was similar to reported values for the irrigation districts. Nagler et al. [24] successfully developed a generalized prediction of ETa across crops and riparian zones in arid climates. The algorithm was calibrated using data from eddy covariance flux towers. This method was developed using 250 m resolution MODIS imagery and later successfully scaled to work with Landsat 30 m resolution, achieving the same accuracy [39]. However, that study [39] focused entirely on riparian zones excluding agricultural sites.
In this study, we assessed the performance of optical RS-based empirical ET-VI methods and whether they can be applied over large croplands, especially in the absence of systematic in-situ data. We evaluated an ET-VI method based on Landsat-derived VIs (EVI and EVI2). Two different RS-based ET-VIs estimations were applied across the Zayandehrud River Basin (ZRB) in Iran as an indicator of agricultural drought. To our knowledge, this study is the first to assess ETa over cultivated areas of a major irrigation district using only optical and scalable VI-based methods. Cropped areas tend to show considerable inter-annual variability, particularly in semi-arid regions such as in Iran, where drought is a major factor affecting rainfed and irrigated agriculture [40]. Since VIs and associated ETa are much higher for cropped areas than for fallow lands or grasslands, total ETa and mean ETa aggregated for larger regions differ depending on whether the cropland extent interannual dynamic is carefully considered. Not much attention is given to the aspect of scaling ETa in large irrigation districts. Therefore, in this study, annual changes of harvested areas were taken into consideration. Moreover, most studies to date focused only on applying optical-thermal satellite imagery methods to derive ETa in Iran [41,42].
The study investigates several additional and more specific questions, such as, (a) how to translate and localize MODIS-based EVI (EVI2) and ETa equations, (b) the difference in ETa caused by using static vs. dynamic harvested areas, (c) how well Landsat VIs and ETa capture the reported Kc curves and crop evapotranspiration (ETc), respectively, and (d) how these different ET-VIs empirical methods perform.

Study Area
With the second largest population in the Near East and North Africa, 90% of Iran's agricultural production depends on irrigation. The central, eastern, and southern parts of Iran have experienced drought leading to a challenged agriculture sector [43]. This work looks at a closed river basin, the ZRB, which is located in the arid region of central Iran. Agriculture is the main use of water and consumes about 80% of the water resources [44]. A long-term record of agricultural water use is lacking but is needed because it plays a key role in the water management of this region. Apart from the climatological characteristics of the region, there is a chronic lack of a centrally organized ETa data bank. The need for accurate and spatially explicit ETa data is critical for water management in this region, especially considering climate change and reoccurring droughts.
The ZRB covers an area of 26,917 km 2 ( Figure 1). The main river of the ZRB flows for 350 km from the Zagros Mountains to the Gavkhuni Swamp [45]. The Zayandehrud dam is the main reservoir and plays a key role in regulating the water supply [46]. Precipitation Remote Sens. 2021, 13, 5167 4 of 27 data from 53 stations (2003 to 2018, data were obtained from the Islamic Republic of Iran Meteorological Organization (IRIMO)) demonstrate that the average precipitation varies from 63 mm in the eastern part to 1281 mm in the western part ( Figure 1a). Precipitation occurs mainly from November to April. The mean annual temperature is 14.2 • C. The elevation of the study area varies from 1442 m to 3927 m. According to the Isfahan Agricultural Organization (IAO), more than 40 different crops are cultivated in the region, with wheat, barley, alfalfa, maize, potatoes, onions, and rice being the main staple crops. Owing to the low precipitation in the eastern and central parts of the ZRB, irrigated agriculture is dominant. Water paucity, drought, and low rainfall are the main challenges in the ZRB, and these have resulted in fewer water releases from the dam downstream of the ZRB. In this study, downstream refers to the area from the dam to the Gavkhuni swamp, and from the dam upwards, where it rains more frequently, is referred to as upstream. Twelve counties cover about 92% of the basin's area and are mainly located in arid and semi-arid regions (Figure 1b, Köppen-Geiger climate zones [47]).
Meteorological Organization (IRIMO)) demonstrate that the average precipitation varies from 63 mm in the eastern part to 1281 mm in the western part ( Figure 1a). Precipitation occurs mainly from November to April. The mean annual temperature is 14.2 °C. The elevation of the study area varies from 1442 m to 3927 m. According to the Isfahan Agricultural Organization (IAO), more than 40 different crops are cultivated in the region, with wheat, barley, alfalfa, maize, potatoes, onions, and rice being the main staple crops. Owing to the low precipitation in the eastern and central parts of the ZRB, irrigated agriculture is dominant. Water paucity, drought, and low rainfall are the main challenges in the ZRB, and these have resulted in fewer water releases from the dam downstream of the ZRB. In this study, downstream refers to the area from the dam to the Gavkhuni swamp, and from the dam upwards, where it rains more frequently, is referred to as upstream. Twelve counties cover about 92% of the basin's area and are mainly located in arid and semi-arid regions (Figure 1b, Köppen-Geiger climate zones [47]). (a)

Localization of MODIS-Based EVI(EVI2) and ETa
Different VIs were calculated using images from Landsat 5, 7, and 8 at a spatial resolution of 30 m ( Figure S1 displays the spatial footprint of Landsat scenes over the ZRB) and a temporal resolution of 16 days and from MODIS (daily 250 m MOD09GQ and 500 m MOD09GA land surface reflectance products). Landsat images were processed using the Google Earth Engine (GEE) platform (https://earthengine.google.com, accessed on 20 June 2021). GEE is a cloud-based platform for geospatial analysis with massive computational capabilities and direct access to RS data [48]. MOD09GQ and MOD09GA products [49] were acquired to derive the transformation equations for EVI and EVI2 across MODIS and Landsat ( Figure 2).
To calculate ETa, ETo was first computed by applying the Global Crop Water Model (GCWM) using the FAO-PM method [50] and hourly ERA5-reanalysis data with 0.25 degrees spatial resolution aggregated to daily climate input. The model was executed in

Localization of MODIS-Based EVI(EVI2) and ETa
Different VIs were calculated using images from Landsat 5, 7, and 8 at a spatial resolution of 30 m ( Figure S1 displays the spatial footprint of Landsat scenes over the ZRB) and a temporal resolution of 16 days and from MODIS (daily 250 m MOD09GQ and 500 m MOD09GA land surface reflectance products). Landsat images were processed using the Google Earth Engine (GEE) platform (https://earthengine.google.com, accessed on 20 June 2021). GEE is a cloud-based platform for geospatial analysis with massive computational capabilities and direct access to RS data [48]. MOD09GQ and MOD09GA products [49] were acquired to derive the transformation equations for EVI and EVI2 across MODIS and Landsat ( Figure 2).
To calculate ETa, ETo was first computed by applying the Global Crop Water Model (GCWM) using the FAO-PM method [50] and hourly ERA5-reanalysis data with 0.25 degrees spatial resolution aggregated to daily climate input. The model was executed in daily time steps with a spatial resolution of 5 arc-minutes (about 8 Km) [51]. Statistical analysis and visualization of the results were conducted in R [52] and QGIS [53].

Preparation and Generation of MODIS-Like Landsat Images VIs
Landsat surface reflectance images (2000-2019) were corrected for topography and bidirectional reflectance distribution function (BRDF). Clouds, shadow, snow, water, and poor-quality pixels were masked out. Landsat 7 missing values due to the Scan Line Corrector error were simply ignored and only available pixels within the scene were used. Problems along the edges of Landsat 5 images which resulted in missing data were also removed by applying an inwards-buffer mask. Furthermore, there were small differences between the spectral characteristics of Landsat 5, 7, and 8 which created a bias [54]. To address the first research question regarding translation and localization of MODIS-based EVI(EVI2) and ET equations, we investigated the differences resulting from the spectral characteristics of Landsat 5, 7, and 8. We did this by comparing two translations approaches and their impacts on ETa: 1. Roy et al. [54] developed transformation equations to reduce the effects of these small differences and produce a sensor-independent, long-time series of Landsat composites. These functions, which are also available on the GEE platform, were applied on Landsat images to calculate continuity-corrected Landsat (ccL) VIs and incorporate them into the ETa equation directly to derive ET-VIs (ET-EVIccL, and ET-EVI2ccL).
2. The ETa equation applied in this study was originally developed using MODIS images. The second approach to reduce the effects of these differences is to translate VIs derived from Landsat images into MODIS-like VIs. For this purpose, MODIS

Preparation and Generation of MODIS-Like Landsat Images VIs
Landsat surface reflectance images (2000-2019) were corrected for topography and bidirectional reflectance distribution function (BRDF). Clouds, shadow, snow, water, and poor-quality pixels were masked out. Landsat 7 missing values due to the Scan Line Corrector error were simply ignored and only available pixels within the scene were used. Problems along the edges of Landsat 5 images which resulted in missing data were also removed by applying an inwards-buffer mask. Furthermore, there were small differences between the spectral characteristics of Landsat 5, 7, and 8 which created a bias [54]. To address the first research question regarding translation and localization of MODIS-based EVI(EVI2) and ET equations, we investigated the differences resulting from the spectral characteristics of Landsat 5, 7, and 8. We did this by comparing two translations approaches and their impacts on ETa: 1. Roy et al. [54] developed transformation equations to reduce the effects of these small differences and produce a sensor-independent, long-time series of Landsat composites. These functions, which are also available on the GEE platform, were applied on Landsat images to calculate continuity-corrected Landsat (ccL) VIs and incorporate them into the ETa equation directly to derive ET-VIs (ET-EVI ccL , and ET-EVI2 ccL ). 2. The ETa equation applied in this study was originally developed using MODIS images. The second approach to reduce the effects of these differences is to translate

Calculation of Vegetation Indices
EVI was developed to address some issues in NDVI, including sensitivity to soil background, saturation over dense canopies, and sensitivity to the residual atmosphere. EVI uses the ratio of NIR, R, and B bands to extract canopy greenness (Equation (1)).
where G is a gain factor, C1, C2 are the coefficients of the aerosol resistance, which uses the B band to correct for aerosol effects in the R band, and L functions as the soil adjustment factor. The values adopted for MODIS are L = 1, C1 = 6, C2 = 7.5, and G = 2.5 [55]. EVI is quite sensitive to variations in the B band reflectance and is useless in sensors without a blue band, which hampers its consistency across different sensors. EVI2 was introduced to address these challenges and requires only NIR and R bands (Equation (2)) [56]. EVI2 may become slightly susceptible to aerosols due to omitting the B band. However, modern RS data applies very strict quality assurance to assist with poor-quality data removal. The differences between EVI and EVI2 were reported to be roughly ±0.02 [31,56].
The following translation and conversion equations were applied to the VIs time series to generate ET-EVI MccL  To understand how these empirical translation methods applied to Landsat VIs impact the resulting ETa estimation, VIs were compared before and after applying MODIS-based equations. Two Landsat scenes (14 June 2002 and 1 August 2008) during the growing season were selected with maximum coverage of the basin and the least cloud contamination. To demonstrate the differences between the VIs over the scene, four combinations of pixelwise differences were applied between EVI ccL and EVI2 ccL , EVI MccL  method was applied over riparian and agricultural areas in the southwestern U.S. and later adapted to Landsat in urban green spaces [5], riparian zones [39], and, in this study, for the first time, ET-VIs were developed for croplands at 30 m resolution (considering harvested areas' changes) at the basin scale using GEE. The ETa equation: where (1 − e [−b×VI] ) is derived from the Beer-Lambert Law to estimate light absorption by a canopy. Using this ETa equation, four ETa products (ET-EVI MccL , ET-EVI2 MccL , ET-EVI ccL , and ET-EVI2 ccL ) were calculated for all available Landsat images over the study area at an annual scale.

Crop Cover Dynamics
To assess the difference in ETa caused by considering a static or dynamic cropping area, annual maximum NDVI layers were derived from Landsat images for two decades using GEE. The annual maximum NDVI layers were developed by applying the maximum value composite (MVC) method, which generates a composite image over a given period by preserving for each pixel the maximum NDVI value of images. Maxwell et al. [57] used variable thresholds ranging 0.61 to 0.71 for MVC to detect ever-cropped lands over the 27 years in southwestern Kansas at county-level [57]. To exclude all but cropped lands from MVC, we excluded all NDVI values using three empirical thresholds less than 0.4, 0.5, and 0.6. After visual inspection and spatial analysis of all MVC layers and comparing croplands and non-cropland areas using Google Earth maps, we found out that by removing values less than 0.4 some rangelands remained in the cropland MVC layers, and by using 0.6 some cropping areas were excluded; therefore, 0.5 was used to minimize rangelands and other covers.
Pixels with high NDVI are agricultural areas. Good and medium rangelands located upstream mostly had NDVI values of about 0.5. We also used a land use map prepared by the Ministry of Energy of Iran (https://moe.gov.ir/?lang=en-US, accessed on 20 February 2021) to evaluate our cropland layers. It depicts fallow land, orchards, rainfed cropland, spring, and fall irrigated agricultural areas. This land use map was produced (2015) using an interpretation of Landsat 8 images. Visualizing and overlaying the land use map on the Google Earth and MVC maps showed that not all green spaces and trees could be masked out, but the extent of this vegetation was insignificant compared to the croplands. After the preparation of MVC, annual cropland layers were then used to derive the ETa over the study period. To examine the impact of cropping areas' changes on ETa, we also extracted a static cropping area for the whole study period using MVC to identify land that was cropped at least once during the study period (ever-cropped land) using the time series of Landsat satellite imagery. The static layer was used to calculate ETa and was then compared against ETa derived by considering annual changes of cropping area (i.e., dynamic cropping area).

Comparison of Landsat VIs and ETa with Ground Data
To compare VIs and ET-VIs with in-situ data, estimated ET-VIs were compared to long-term average ETc values reported by the Isfahan Agricultural and Natural Resources Research and Educational Center (IANREC) at the county scale for major crops. IANREC has used the dual crop coefficient approach (Kc = Kcb + Ke) to calculate ETc. The basal crop (Kcb) and Ks values reported by FAO56 were adjusted based on the methodology proposed by [35] using ground data, i.e., minimum relative humidity, wind speed, and the average plant height for the year 2013. The soil evaporation coefficient (Ke) was calculated considering soil physical properties, irrigation intervals, and irrigation depth. Kcb and Ke were combined with ETo values to calculate ETc (ETc = (Kcb + Ke) × ETo). The ETc values obtained from IANREC were only reported as an average value for each crop and the whole study period. To detect the major crops in each county, annual statistics for crop type, production, and cultivated area were acquired (https://agri-es.ir/Default.aspx?tabid=1925 accessed on 1 June 2021). Wheat and barley observations were obtained for the growing season 2018-2019 from IAO in order to evaluate RS-VIs against typical curves of Kc values used by FAO. To compare the VIs, monthly VIs were extracted (Equation (8)): Four scaled VIs (EVI MccL , EVI2 MccL , EVI ccL , and EVI2 ccL ) were calculated and compared with FAO-Kc curves. Control points of two crops (wheat and barley) in Isfahan County, collected over February-March 2019, were used to determine if the estimated scaled VIs represent reported ones. Kc and VIs are sensitive to LAI and fractional ground cover (reviewed in [37,38,58]). Their minimum occurs at the early stages of crop cultivation (initial (ini)), changes with crop development, reaches the maximum at grain filling (mid-season (mid)), and decreases at the end season (end) [59]. A point-to-pixel method was applied to extract the corresponding pixel values at each point. Considering the crop calendar of wheat and barley (November-June), scaled VIs were compared to those of FAO56.
To assess the performance of scaled VIs against reported FAO-Kc values for wheat and barley, the RMSE was applied at three quartiles (25, 50, and 75). Also, reported long-term average ETc values of major crops in each county were compared with the longterm average of ET-VIs estimates within the counties. ETc was calculated under no-stress conditions. Local ETc values are the only available ground data at the county level for the ZRB; therefore, they were used as a proxy for ET-VIs' performance evaluation.

Performance Analysis of ET-VIs
ET-VIs were compared using annual average values, minimum and a maximum of annual ET-VIs, visualization of annual ET-VIs, and ETa anomaly at basin level. In addition, the pixel-wise spatial correlation was calculated between: The correlation coefficient (R 2 ) at the pixel level was calculated and used to compare the differences between ET-VIs. The Mann-Kendall (MK) test was applied to detect the existence of trends in ETa estimations in each county. MK is a non-parametric test used to identify monotonic trends present in time series. The null hypothesis considers no trend among the observations and the alternative hypothesis represents a monotonic trend [60]. Since the MK test cannot provide the slope of the trend (magnitude), the Sen method, a nonparametric estimator [61], was used to determine the magnitude of the trend [62]. Negative S values show a downward trend and positive values present an upward trend.

Localization of MODIS-Based EVI(EVI2)
Calculated VIs were compared on two dates, 14  and EVIMccL-EVI2MccL. In some parts of sparsely vegetated areas, the EVI2MccL and EVI2ccL values were slightly larger than EVI values. While these differences are still well within the expected noise in the VI dynamic range, [31] explained that EVI2 values are somewhat higher than EVI values due to the higher ratio of B/R in the EVI equation.

Impacts of the Static and Dynamic Cropping Areas on ETa Estimation
Static cropping areas resulted in the considerable underestimation of mean ETa (mm) and the overestimation of the total ET (km 3 ) due to the consideration of non-cultivated areas in ETa derivation over the years (Figure 4). Furthermore, inappropriate NDVI threshold results in over-or underestimation. By removing values of less than 0.4, some rangelands remained in the MVC layers, and by using 0.6 some cropped lands were excluded; therefore, 0.5 was used to exclude rangelands from croplands. Static cropping areas resulted in the considerable underestimation of mean ETa (mm) and the overestimation of the total ET (km 3 ) due to the consideration of non-cultivated areas in ETa derivation over the years (Figure 4). Furthermore, inappropriate NDVI threshold results in over-or underestimation. By removing values of less than 0.4, some rangelands remained in the MVC layers, and by using 0.6 some cropped lands were excluded; therefore, 0.5 was used to exclude rangelands from croplands.   The average difference in mean ET between ET estimated based on static versus dynamic cropping area ranges from 221.7 mm (ET-EVI2 MccL ) to 239.7 mm (ET-EVI ccL ). When comparing the volume of ETa in km 3 with regard to static cropland, we underestimated the ETa intensity in mm but we overestimated total ETa in km 3 since the area of static cropland is larger than that of dynamic areas (Figure 4b). To find out to what extent our method in cropland detection is reliable, we compared our static cropland layer and MVC of 2015 with the available land use map (Appendix A, Figures A1-A3). This land use map was generated in 2015 and does not capture recent changes in agricultural areas. About 60 percent of our static layer covered the agriculture land use map. This is due to the fact that the static layer considers changes in croplands for the whole period (i.e., 2000 to 2019), while the land use map reflects the changes that have happened within a few years up to 2015. The main discrepancy between the static layer and land use map was over rainfed areas in the upstream area. Moreover, the MVC of 2015, as expected, showed lower cultivated areas compared to the agriculture land use map. The reason is that the MVC of 2015 presented only cultivated areas of a given year i.e., 2015.

Evaluation of Scaled VIs with FAO-Kc
Kc methods are typically derived from crops grown under optimal conditions, whereas actual field crops can be subjected to various limitations and water stress [24,63]. The monthly average of scaled VIs curves versus FAO-Kc at three quartiles for wheat and barley have relatively close RMSE values ( Figure 5). The average difference in mean ET between ET estimated based on static versus dynamic cropping area ranges from 221.7 mm (ET-EVI2MccL) to 239.7 mm (ET-EVIccL). When comparing the volume of ETa in km 3 with regard to static cropland, we underestimated the ETa intensity in mm but we overestimated total ETa in km³ since the area of static cropland is larger than that of dynamic areas (Figure 4b). To find out to what extent our method in cropland detection is reliable, we compared our static cropland layer and MVC of 2015 with the available land use map (Appendix A, Figures A1-A3). This land use map was generated in 2015 and does not capture recent changes in agricultural areas. About 60 percent of our static layer covered the agriculture land use map. This is due to the fact that the static layer considers changes in croplands for the whole period (i.e., 2000 to 2019), while the land use map reflects the changes that have happened within a few years up to 2015. The main discrepancy between the static layer and land use map was over rainfed areas in the upstream area. Moreover, the MVC of 2015, as expected, showed lower cultivated areas compared to the agriculture land use map. The reason is that the MVC of 2015 presented only cultivated areas of a given year i.e., 2015.

Evaluation of Scaled VIs with FAO-Kc
Kc methods are typically derived from crops grown under optimal conditions, whereas actual field crops can be subjected to various limitations and water stress [24,63]. The monthly average of scaled VIs curves versus FAO-Kc at three quartiles for wheat and barley have relatively close RMSE values ( Figure 5). EVI ccL has the lowest RMSE values at almost all growth stages compared to other VIs while the highest RMSE belongs to EVI2 MccL . The curves of scaled VIs estimates were almost similar to that of FAO-Kc. The initial stages in the VI-and FAO-based Kc curves have considerable differences for wheat. Scaled VIs for wheat have larger RMSE rates at all growth stages, particularly at initial and mid-season stages. All scaled VIs' RMSE rates decrease when approaching the end-season. The reason can be the evaporation after irrigation or rainfall and sensitivity of reflectance to the soil top-layer wetness, whereas at the end of the season soil evaporation is diminished. The VI method considers transpiration from green vegetation, and only a small extent of evaporation from bare soil; therefore, underestimations happen during the initial and developing stages of the crop cycle.

Performance Analysis of ET-VIs
To compare the performance of the ET-VIs over the ZRB, annual and long-term averages of ET-VIs were provided ( Figure 6). Since the differences between ET-VIs, owing to the small area of cultivation and close changes of ET-VIs, are not visible in the maps, we subtracted ET-VIs from the average of all ET-VIs presented in Figure 6a (Table S1). By contrast, ET-EVI ccL and ET-EVI2 ccL have the highest maximum and average numbers in most of the counties (Figure 6b).
Cultivation density should be considered when comparing irrigated and rainfed regions, as irrigated crops mostly have a higher vegetation density. To show the performance of the ET-VIs over croplands, we selected two counties ( Figures S6 and S7): one with the highest rainfed area in upstream (Chadegan) and one with the highest ETa values in the ZRB (Khomeinishar). The ETa decreases with the distance from the river, reflecting reduced access to water.
Performance of the ET-VIs over Chadegan: with a cropping area of 34,000 ha and a 357 mm average precipitation, Chadegan has rainfed areas with no access to surface and groundwater. With no supplementary irrigation, precipitation is expected to be higher than ETa. In this county, ET-EVI MccL and ET-EVI2 MccL estimates were lower than those of ET-EVI ccL and ET-EVI2 ccL . Although the amount of estimated ETa for all VIs was higher over irrigated areas than rainfed areas, the ET-VIs exceeded precipitation in rainfed croplands. The average lowest and highest difference values belong to ET-EVI MccL (135 mm) and ET-EVI2 ccL (156 mm) respectively.
Performance of the ET-VIs over Khomeinishar: with a cropping area of 5000 ha and a 117 mm average precipitation, Khomeinishar is the smallest county located in the arid part of the ZRB. Two high consumptive crops, pear, and rice, might be the reason for high Performance of the ET-VIs over Khomeinishar: with a cropping area of 5000 h a 117 mm average precipitation, Khomeinishar is the smallest county located in th part of the ZRB. Two high consumptive crops, pear, and rice, might be the reason fo ETa estimates (907-979 mm) over this region. ET-EVIccL resulted in the highest ETa ET-EVI2MccL was lower.
A pixel-wise correlation was applied to present the differences between the E across the ZRB. Annual graphs of ET-VIs are also depicted in Figure 7 to appraise th VIs' curves. Figure 7 shows that ET-EVIMccL and ET-EVI2MccL perfectly match each except in 2011. Although ET-EVIccL often estimated ETa higher than that of ET-EVI2c A pixel-wise correlation was applied to present the differences between the ET-VIs across the ZRB. Annual graphs of ET-VIs are also depicted in Figure 7 to appraise the ET-VIs' curves. Figure 7 shows that ET-EVI MccL and ET-EVI2 MccL perfectly match each other except in 2011. Although ET-EVI ccL often estimated ETa higher than that of ET-EVI2 ccL , ET-VI ccL constantly tended to be higher than ET-VI MccL . Figure 7b shows that both ET-VI ccL and ET-VI MccL are strongly correlated at more than 80% over cultivated areas especially alongside the river and in irrigated areas. The lowest agreement between ET-EVIs and between ET-EVI2s are observed across areas with lower vegetation cover; that is, lower correlation and large scattering between the two ETa products (i.e., ET-VI ccL and ET-VI MccL ). As explained by Nagler et al. [24], the applied algorithm was developed for monitoring dryland irrigation districts and riparian areas and cannot be applied to other landcover types without accounting for that in the empirical derivation.
Remote Sens. 2021, 13, x FOR PEER REVIEW 17 o VIccL constantly tended to be higher than ET-VIMccL. Figure 7b shows that both ET-VIccL a ET-VIMccL are strongly correlated at more than 80% over cultivated areas especia alongside the river and in irrigated areas. The lowest agreement between ET-EVIs a between ET-EVI2s are observed across areas with lower vegetation cover; that is, low correlation and large scattering between the two ETa products (i.e., ET-VIccL and E VIMccL). As explained by Nagler et al. [24], the applied algorithm was developed monitoring dryland irrigation districts and riparian areas and cannot be applied to oth landcover types without accounting for that in the empirical derivation.    ETa has a sharp increase in normal to wet years due to sufficient precipitation and enough available water. From 2000 to 2002, due to drought events [64], negative ET-VIs anomalies persisted throughout the basin, particularly in rainfed areas (Figures S8-S11). In recent years, most rainfed areas have been supplemented with water in the event of water shortage; therefore, the drought impacts and water shortage are not significant upstream compared to downstream. All ET-VIs showed a considerable positive ETa anomaly across the basin from 2004 to 2007 associating with sufficient rainfall and water availability (Figure 8). In 2008, a drought event occurred in the basin, particularly upstream where the river originates, and consequently a significant reduction in ET-VIs is apparent. Similarly, in 2010 and 2017, the amount of precipitation was low and in subsequent years, i.e., 2011 and 2018, a considerable shrinkage in the croplands was conspicuous. It should be noted that the negative ETa anomalies do not correspond to precipitation reduction over the annual scale. Since the rainfall deficit means meteorological drought, any drop of ETa indicates agricultural drought i.e., a decrease in soil moisture. Nevertheless, years with persistent or severe meteorological drought result in negative ETa anomalies. To illustrate the existence and magnitude of trends in ET-VIs at basin and county levels, a summary of trend analysis is presented in Table 3. ET-EVI2 MccL in 10 counties had the lowest S values. As mentioned earlier, ET-EVI2 MccL tends to estimate ETa to a lesser degree than those of other ET-VIs. All products presented a downward nonsignificant trend in Isfahan, Lenjan, and NajafAbad. This can be due to less water availability and consequently less soil moisture in the root zone. At the ZRB scale, it was found that the annual ET-VIs, except ET-EVI ccL , decreased. ET-EVI ccL presented an upward trend with 0.89 mm/year, which was not statistically significant. A maximum nonsignificant upward trend based upon S belongs to Freydan with 3.47 mm/year. NajafAbad has the highest nonsignificant decreasing trend with 2.46 mm/year. Lenjan is the main rice producer and, due to water shortages, the cultivation of rice there has decreased. Total rice cultivation in the ZRB is about 5% of the total cultivated area. ETa has a sharp increase in normal to wet years due to sufficient precipitation and enough available water. From 2000 to 2002, due to drought events [64], negative ET-VIs anomalies persisted throughout the basin, particularly in rainfed areas (Figures S8-S11). In recent years, most rainfed areas have been supplemented with water in the event of water shortage; therefore, the drought impacts and water shortage are not significant upstream compared to downstream. All ET-VIs showed a considerable positive ETa anomaly across the basin from 2004 to 2007 associating with sufficient rainfall and water availability (Figure 8). In 2008, a drought event occurred in the basin, particularly upstream where the river originates, and consequently a significant reduction in ET-VIs is apparent. Similarly, in 2010 and 2017, the amount of precipitation was low and in subsequent years, i.e., 2011 and 2018, a considerable shrinkage in the croplands was conspicuous. It should be noted that the negative ETa anomalies do not correspond to precipitation reduction over the annual scale. Since the rainfall deficit means meteorological drought, any drop of ETa indicates agricultural drought i.e., a decrease in soil moisture. Nevertheless, years with persistent or severe meteorological drought result in negative ETa anomalies. To illustrate the existence and magnitude of trends in ET-VIs at basin and county levels, a summary of trend analysis is presented in Table 3. ET-EVI2MccL in 10 counties had the lowest S values. As mentioned earlier, ET-EVI2MccL tends to estimate ETa to a lesser degree than those of other ET-VIs. All products presented a downward nonsignificant trend in Isfahan, Lenjan, and NajafAbad. This can be due to less water availability and consequently less soil moisture in the root zone. At the ZRB scale, it was found that the annual ET-VIs, except ET-EVIccL, decreased. ET-EVIccL presented an upward trend with 0.89 mm/year, which was not statistically significant. A maximum nonsignificant upward trend based upon S belongs to Freydan with 3.47 mm/year. NajafAbad has the highest nonsignificant decreasing trend with 2.46 mm/year. Lenjan is the main rice producer and, due to water shortages, the cultivation of rice there has decreased. Total rice cultivation in the ZRB is about 5% of the total cultivated area.  Table 3. The trend in different counties. The critical Z value at the 5% confidence level is ±1.96, and there is a trend if the MK's Z value is greater than the Z critical value. Also, if the p-value is less than the significance level, the null hypothesis is rejected, meaning that there is a trend in the time series [60]. 1: Tiran, 2: Freydan, 3: Chadegan, 4: Isfahan, 5: Mobarakeh, 6: Borkhar, 7: Khomeinishahr, 8: Shahinshahr, 9: Falavarjan, 10: Lenjan, 11: Dehaghan, 12: NajafAbad. ET-EVI2 MccL and ET-EVI MccL : ET calculated using 2-and 3-band Enhanced Vegetation Index (MODIS continuity corrected Landsat) respectively; ET-EVI2 ccL and ET-EVI ccL : ET calculated using 2-and 3-band Enhanced Vegetation Index (continuity-corrected Landsat) respectively.

Discussion
In this study, amongst different methods of ETa estimation, the VI-based method (EVI and EVI2) was applied and assessed to estimate annual ETa in the ZRB. The applicability of these RS-based ETa models has the potential to facilitate and improve water management and drought monitoring, where access to in-situ data is constrained. In the ZRB, except a few short-term [65,66] or coarse-resolution [67] studies, there was no study which assessed the application of empirical ET-VI methods over croplands at basin scale using Landsat images. Most studies do not consider changes in croplands' extent over the years; however, it is critical to incorporate the impact of fallow and non-cultivated lands on the estimation of ETa.
Long-term observation of land cover usually results from multiple sensors with dissimilar characteristics [68][69][70], making the time series inconsistent and requiring continuity transformation. Many studies have emphasized the necessity of cross-sensor transformation [68,71]. We derived translation equations for the MODIS' EVI and EVI2 and formulated a MODIS-based EVI and EVI2 for Landsat sensors. We also compared pairs of MODISbased ET-VIs against non-MODIS-based ET-VIs that will permit the direct application of MODIS-VI ETa models. Nevertheless, to the knowledge of the authors, there was no study on comparison of the impact of applying cross-sensor transformation on ETa estimates over croplands. Our results showed that ET-VI ccL was highly correlated with ET-VI MccL over croplands. This is in line with the findings of Jarchow et al. [71] and Nagler et al. [39]. In Jarchow et al. [71], they found a high correlation between Landsat-EVI and MODIS-EVI for agricultural fields. In Nagler et al. [39], a high correlation was reported between Landsat-EVI (and EVI2) and MODIS-EVI (and EVI2) for riparian reaches of the lower Colorado river.
Intra-annual variation of the ETa over croplands emphasizes the importance of an accurate assessment of ET as a water management tool where in-situ data are scarce. The croplands are inescapably and significantly reliant on irrigation and must compete with other water users in the region. Maxwell et al. [57] used maximum NDVI composites to identify ever-cropped lands and to evaluate the inter-and intra-annual dynamics of cropped and non-cropped lands. Their results indicated that maximum NDVI composites were useful for identifying land use changes over time [57]. In line with Maxwell et al. [57], our findings showed the capability of MVC in capturing cropped and non-cropped lands. The shrinkage of cropping areas during drought and dry periods is significant, particularly downstream. This can lead to the underestimation of ETa when a static cropping area is only considered for ETa prediction.
Data obtained from a time series of VIs helped us to understand the phenological stages of the crops. We evaluated the phenological behavior of wheat and barley using scaled-VIs compared with FAO-Kc. The Kc mid and Kc end values extracted for these crops compared well with reported values by [72]. They collected the applied research on FAO-Kc and appraised and updated the single Kc mid and Kc end [72]. ET-VIs could capture the drought, water shortage, and shrinkage of cropping areas, particularly in the lower reaches of the basin. Various studies have reported a strong correlation between VIs and ET over a wide variety of biomes [11,22,37,38,73]. Glenn et al. [11] reported that the difference between empirical methods and measured ETa has a root mean square error of up to 30% of mean ETa across different biomes. In our study, the comparison of ET-VIs with long-term averaged ETc values at the basin-scale varied from 14.81% (ET-EVI2 MccL ) to 21.11% (ET-EVI ccL ), and at the county-level this ranged from 2.49% (ET-EVI2 MccL ) to 33.67% (ET-EVI ccL ) ( Table 1). These differences revealed that all ET-VIs overestimated the ETa. ET-EVI2 tended to have lower ETa estimates in comparison to their corresponding ET-EVIs in the majority of the counties. Nouri et al. [5] reached a similar conclusion after analyzing estimated ETa over urban green spaces using EVI, EVI2, and NDVI from Landsat and MODIS. Their findings also showed that ET-Landsat (EVI2) and ET-MODIS (EVI2) consistently had lower estimates compared to that of ET-Landsat (EVI) and ET-MODIS (EVI). The anomalies' graphs perfectly depicted that from 2010 onwards the ET-VIs had negative anomalies mostly in the tail part of the ZRB, showing a decrease in the area of cultivation. The results on the anomaly of ET-VIs corroborate the study by Sarvari et al. [74]. They demonstrated that in the 2010s, the lower parts of the ZRB dried out due to several years of water shortage, causing damage to the agriculture sector. The upstream of the ZRB encounters less water shortage due to better access to blue water resources and more rainfall. Accordingly, Sharifi et al. [75] claimed that, due to water demand, frequent droughts, and the transformation of the geographical boundaries (political boundaries replaced river basin boundaries as the basis for regional water resources management, 2008-2009) of water allocation authorities within the basin, the fulfillment of agricultural water rights was hindered. Overlooking drought events during 2000 and 2001 and their impact on cultivated areas, the ET-VIs revealed an expansion of cultivation areas from 2004 to 2007. Sharifi et al. [75] reported that the basin's total cultivation area peaked at 310,000 ha in 2006. Overall, our findings show that the ET-VIs approach can be utilized not only for agricultural water management, but also for drought monitoring, mapping, and water and food security policies.

I.
Uncertainties: our ET-VIs' uncertainties are rooted in (1) errors in the derivation of ETo (limited access to field observations of ETo and the lack of well-distributed climatic stations in ZRB hampered the application of PM to estimate ETo for the whole basin; therefore, alternative climate datasets like the global gridded datasets with coarse resolution were used in this study to map ETo); (2) errors and uncertainties in ground measurements of ETc; (3) RS methods are themselves subjected to uncertainty such as parameterization, cloud cover, errors in scaling approaches, impacts of land cover, and meteorological forcing [29,76,77]. The major hindrance of applying ET-VIs is that they cannot capture stress effects or soil evaporation [78]. In the case of the ZRB, due to the cloud and aerosol contamination, some parts of the ZRB had either excessive missing images or missing pixel values due to the presence of stripes in Landsat 7 images, for example. While missing values were reconstructed by averaging introduces some errors and uncertainties to NDVI, VIs, and ETa (4), our cross-sensor transformation and translation equations could also have introduced biases since they are based on filtered data only. II. Limitations: the accuracy of the ground data for calibration or validation is the main constraint. Since ET-VI methods cannot identify early signals of moisture stress, they are therefore not useful for real-time irrigation planning. Nevertheless, on monthly time steps, they can help relate crop water requirements to crop growth and development [11]. The validation of the ETa estimates produced in this paper is hampered by the scarcity of good quality spatial data on the availability of ground truth data like observed ETa estimates, Kc values, data on agricultural water consumption from surface and groundwater, and a precise land use map to exclude trees, green spaces, and rangelands. We assume that the current modified rainfed map, as being valid for only rainfed areas over other years, could result in some misclassification of irrigated and rainfed fields, as some irrigated lands may have been converted to rainfed, while rainfed areas may have experienced irrigation expansion. Missing images in 2003 also made us exclude this year from our evaluations. III. Recommendation: RS-based approaches have the potential to be used for national and provincial water management projects, such as drought mitigation in fulfilling food security at a national scale and, on a smaller scale, irrigation management of different counties. Considering cross-sensor differences, transformation methods should be applied before ETa calculation in order to reduce the impacts of these disparities on ETa estimates. Certain spatial characteristics may be lost due to the aggregation method, such as cropping practices and rotations. Furthermore, an accurate projection of the water consumption of crops during the growing season, along with images with the finer temporal and spatial resolution are both needed. Apart from reliable field measurements and crop-specific comparisons of ET-VIs to improve the accuracy and spatiotemporal resolution of ETa estimations, further studies should evaluate hybrid approaches combining different ETa methods by considering their corresponding advantages and limitations. We recommend comparing ET-VIs with other VI and energy balance methods as well as available RS-based ETa products, such as Operationalized Simplified Surface Energy Balance (SSEBOp) [14] or Water Productivity through Open access of Remotely sensed derived data (WaPOR) [79].

Conclusions
The development of RS-based ETa can provide an effective tool for fast hydrological monitoring, agricultural management, and climate change studies.
Vegetation-index-based ETa estimation provides robust results for agricultural crops grown in arid regions and may therefore play a key role in informing land and water management in regions lacking in-situ data. ETa calculation is improved by applying cross-sensor transformation equations before calculating ETa. Interannual variability in cropland extent affects ETa considerably; therefore, the use of static cropland extent cannot be recommended. Further research is still needed to evaluate how ET-VIs perform at seasonal and monthly scales and to compare them to alternative methods, such as energybalance based indices. Future studies should evaluate the performance of these empirical RS-based ET-VIs at different time scales and especially for rainfed systems.