An NDVI-Based Statistical ET Downscaling Method

This study proposes a new method for downscaling ETWatch 1-km actual evapotranspiration (ET) products to a spatial resolution of 30 m using Landsat8 normalized difference vegetation index (NDVI) data. The NDVI is employed as an indicator of land-surface vegetation, which displays periodic spatial patterns on the land surface. A 30-m-resolution ten-day ET dataset is then calculated primarily using the NDVI and the historical ratio of coarse NDVI and ET that considers different land cover types. Good agreement and correlations were obtained between the downscaled data and observations from three flux sites in two study areas. The mean bias (MB) per ten-day period ranges from 4.21 mm in Guantao to 1.55 mm in Huazhaizi, and the coefficient of determination (R2) varies from 0.87 to 0.95. The downscaling results show good consistency with the original ETWatch 1-km data over both temporal and spatial scales for different land cover types, with R2 values ranging from 0.82 to 0.98. In addition, the downscaled results capture the progression of vegetation growth well. This study demonstrates the applicability of the new “de-pixelation” downscaling method in the management of water resources.


Introduction
Evapotranspiration (ET) is one of the major processes in the hydrologic cycle and plays an essential role in controlling energy and water exchanges between the land surface and the atmosphere.ET is also a key component in water resource management across multiple scales for agricultural and ecological applications [1].ET is the summation of evaporation and transpiration, which consume water to different degrees in areas with different land cover types; thus, this parameter is significantly affected by the land-surface characteristics.
Given the dwindling supplies of available water resources due to over-withdrawal of groundwater and water pollution, water resource management has shifted from traditional large-scale approaches to approaches that rely on ET to estimate temporal changes, which concentrate to a greater degree on identifying temporal changes and the spatial distribution of water consumption, especially in intensely irrigated farming areas [2].Furthermore, the results of water-saving techniques, such as drip irrigation and sprinkler irrigation, could be monitored using ET data, thus improving water-use efficiency (WUE) and enhancing the ability of farmers to efficiently manage available irrigation water supplies [3][4][5][6].
Gridded ET from remote sensing (RS)-based methods, however, suffers from the temporal and spatial limitations from RS sensors.For example, Moderate Resolution Imaging Spectroradiometer (MODIS) products (which have a spatial resolution of 1 km) can be obtained twice daily, and approximately 100 images are available each year after the images influenced by clouds are removed.In contrast, only one Landsat8 image with a resolution of 30 m is taken every 16 days, regardless of cloud cover.Consequently, trade-offs exist between different RS-based methods, and combinations of different ET datasets and methods are required to improve the success of water resources management [7].
Given the limitations of RS images, a combination of images from diverse sensors can be used to obtain a complete time series of stable and higher-resolution gridded ET data.Martha C. Anderson interpolated ET snapshots obtained at the time of clear-sky Landsat overpasses [8].In this work, the ratio of actual ET to a reference ET (fRET) obtained from Landsat data after spline interpolation, ALEXI fRET data obtained by applying the Savitsky-Golay filter [9], and daily meteorological data were used as inputs.Similarly, many methods employ actual ET (ETa) and potential ET from cloud-free days and assume that the ratio of fRET or the evaporative fraction (EF) is the same or changes in an orderly fashion.Such methods include the reference ET fraction [10][11][12][13], Todorovic [14][15][16], and evaporative fraction [17,18] methods.For Landsat8 data, the effects of clouds and the satellite repeat interval both contribute to image-sparse areas.Traditional temporal gap-filling methods may filter out some high-frequency signals, such as irrigation, during these long time intervals.
Another approach involves downloading coarser-scale datasets that combine the benefits of diverse kinds of remote sensors.Downscaling is defined as an increase in spatial resolution following disaggregation of the original dataset based on fitting a statistical relationship [19,20], which requires information at a desired fine resolution [21][22][23].Liu and Wu used the spatial and temporal adaptive reflectance fusion model (STARFM) to downscale 1-km ETa data [24].Yasir H. Kaheil presented an algorithm using the discrete wavelet transform (DWT) and support vector machines (SVMs) to downscale and forecast ET values at a spatial resolution of 15 m, which is similar to the resolution used by Ke [25,26].In addition, the U.S. Geological Survey investigated the downscaling potential of simplified surface energy balance (SSEB)-derived ETa data from 1 km to 250 m by correlating the ET data with Normalized Difference Vegetation Index (NDVI) values from MODIS [27].
A correlation exists between vegetation and ET.Transpiration through the stomata of plant leaves accounts for the majority of ET, especially in arid and semi-arid areas, due to the low soil moisture content.For example, the study by Zhuang [28] concluded that the fractional vegetation cover (fvc) and canopy resistance (Rc) retrieved from images displaying NDVI values control the sensible heat flux between the surface and the atmosphere.Additionally, NDVI has also been suggested as a major factor controlling the inter-annual variations in ET over relatively broad scales [29].Many studies have highlighted the advantages of using vegetation indices (VIs) to estimate ET in previous decades [30][31][32][33][34][35][36].Edward P. Glenn summarized correlation coefficients between ET estimates based on VIs combined with relevant variables and in situ observations [36].
This study aims to introduce Landsat8 30-m NDVI data as a readily available and applicable source of information on spatial patterns combined with land cover information at the same spatial scale.In addition, regression coefficients determined only over small areas (one MODIS pixel) were employed to downscale the ETa data.Three in situ flux sites in two study areas are then used to compare with the results of the so-called "de-pixelation" method, which disaggregates the ETWATCH 1-km ETa dataset to a resolution of 30 m.

Study Area
In this research, three sites within the two study areas, the city of Zhangye and Guantao County, which feature different climatic conditions, were chosen to validate the performance of the "de-pixelation" method in semi-arid and semi-moist climate regimes.

1.
The Heihe Basin is the second largest river in China.This river flows through Qinghai, Gansu and Inner Mongolia in China, and the catchment area covers approximately 128,000 km 2 .The middle reach (98   2 .This region includes the main irrigated agricultural zone and the main water consumption area in the basin and features a semi-arid climate with hot summers and cold winters.This area receives an average rainfall of 110.9 mm year −1 (1980-2010) and has a mean annual temperature of 7 • C.
According to the 30-m land cover data from ChinaCover [37] generated using an object-based method with an accuracy of 94%, irrigated cropland (mainly wheat, cotton and maize) occupies the largest proportion (approximately 49%) of this region, followed by grassland (32%), bare land and desert (9%), artificial surfaces (7%), forest (2%), and water areas (1%).The dataset is revised every five years, and the data from 2010 were used in this research.Detailed percentages for the diverse land cover types in Zhangye City are given in Table 1.A position and land cover map of the middle reach (complete with the two flux stations used in this research, i.e., Daman and Huazhaizi) is shown in Figure 1.The Daman site is cropped with large tracts of maize (approximately 200 hm 2 ) surrounded by shelterbelts, while the Huazhaizi site is located in the Gobi desert area that surrounds the Zhangye oasis and features sparse vegetation.
Water 2017, 9, 995 3 of 18 climate with hot summers and cold winters.This area receives an average rainfall of 110.9 mm year −1 (1980-2010) and has a mean annual temperature of 7 °C.
According to the 30-m land cover data from ChinaCover [37] generated using an object-based method with an accuracy of 94%, irrigated cropland (mainly wheat, cotton and maize) occupies the largest proportion (approximately 49%) of this region, followed by grassland (32%), bare land and desert (9%), artificial surfaces (7%), forest (2%), and water areas (1%).The dataset is revised every five years, and the data from 2010 were used in this research.Detailed percentages for the diverse land cover types in Zhangye City are given in Table 1.A position and land cover map of the middle reach (complete with the two flux stations used in this research, i.e., Daman and Huazhaizi) is shown in Figure 1.The Daman site is cropped with large tracts of maize (approximately 200 hm 2 ) surrounded by shelterbelts, while the Huazhaizi site is located in the Gobi desert area that surrounds the Zhangye oasis and features sparse vegetation.

ETWATCH 1-km ETa Dataset
ETWATCH is a robust method for calculating actual evapotranspiration that combines surface energy balances and the Penman Monteith (P-M) method and has been calibrated and validated in the Heihe and Haihe basins [18,38,39].In this research, absolutely cloud-free Aqua MODIS images from 87 days in the Heihe Basin and 142 days in the Haihe Basin were used for ET calculation, with the time interval ranging from one day (dry season) to a maximum of eight days (wet season).The instantaneous energy balance and evapotranspiration fraction (EF) are calculated by the "residue approach" using MODIS data on clear days.In the process of transpiration, the energy exchange between the land surface and the atmospheric boundary layer (ABL) was simulated based on the ABL conditions using a separate aerodynamic roughness length method, which combines information on vegetation, topography and the radar-based roughness that describes the surface structure [40].This method was used instead of the traditionally used lookup table method.Gap-filled daily MOD07 atmospheric profile dataset combined with the Feng approach were employed to detect ABL height.There is an abrupt decrease of the water vapor mixing ratio (MR) profile at the height of ABL, by which air temperature, dew-point temperature and air pressure were extracted at the certain layer [41].Once the instantaneous net radiation, soil heat flux and sensible heat flux (Rn, G and H) are derived, the latent heat (LE) flux is calculated as the residual according to the energy balance equation.
The LE for the whole day is then based on the extended EF from instantaneous data and net radiation data without the energy from the daily soil heat flux (R n -G 0 ), assuming that the EF is constant throughout the day [42].During this step, calculation of daily net radiation is based on the traditional method recommended by the UN Food and Agriculture Organization (FAO) for meteorological data that considers topography [38,43,44].Daily interpolated meteorological data provided by the China National Meteorological Bureau, together with sunshine hour data derived from the FengYun-2D/E (FY-2D/E) satellite with empirically determined parameters for different kinds of clouds, are also used.Moreover, this study introduces an improved approach for estimating instantaneous and daily soil heat flux that includes the surface parameters of vegetation indices, shortwave infrared reflectance, solar zenith angle, surface temperature, and soil moisture content.This approach has been verified at six sites within the Heihe Basin and two sites within the Haihe Basin [45,46].The resulting datasets have exhibited high accuracies.
The resistance on a clear day was then calculated using the P-M method based on the weather conditions measured using routine AWS observations (maximum, minimum and average temperature, air humidity, wind speed, and number of sunshine hours from the FY2 dataset) [47].To obtain a complete set of daily ET data using the P-M method, this work used a temporal gap-filling model for surface resistance that contains information including smoothed leaf area index (LAI) values, soil moisture content, and daily constraint functions describing the response of stomatal apertures to the minimum air temperature (Tmin) and the vapor pressure deficit (VPD) [18].
In this research, ETWatch data from the Heihe Basin over two years (2012-2013) were analysed.However, only data from 2013 were downscaled for validation due to the lack of in situ observations.Data from Guantao Country from 2013 are also used.All datasets are given as ten-day intervals and then combined into month-scale values for comparison and mapping.

Landsat8 Data
The oasis of Zhangye, which lies in the middle reach of the Heihe Basin, is covered by only one Landsat8 image at path 133 and row 33.First, Landsat8 data for the study area during the growing season (April to October) in 2013 were downloaded from the Earth Explorer web site [48].Images with sparse clouds or, at a minimum, clouds that did not cover the croplands were chosen for pre-processing.Two months of Landsat8 images (August and September) were not used due to cloud contamination.Thus, data collected by the HuanJing (HJ-1A and HJ-1B) satellites were used as a substitute source of NDVI data.These data were obtained from the website of the China Center for Resources Satellite Data and Application [49] to achieve the goal of one image per month.HJ images have the same spatial resolution (30 m) as Landsat images but have a shorter time interval (two days).A comparison of the spectral and revisiting characteristics of the two satellites is provided in Table 3. Peer-reviewed studies have investigated the normalization of images from these two satellites [50].To maintain consistency between the two data sources, HJ data were transformed into the same coordinate system as that used by Landsat8.The root mean square error (RMSE) of this rectification is less than one pixel.
Similar to the conditions in Zhangye, the Landsat8 image at path 123 and row 35 was collected for Guantao County during the same period in 2013 (April to October).Fortunately, a sufficient number of Landsat images were obtained, and the substitution of HJ data was not required.Detailed acquisition dates of the Landsat data for both sites can be found in Table 4.

MOD16 ET Data
For coarse ET, there are currently multiple ET products with diverse scale, such as the MOD16 [51,52] global ETa product [53], which has a 1-km spatial resolution and an 8-day or one-month temporal resolution, which is similar to ETWatch.This product, with the applied P-M equation, has been validated in several countries and continentals.Details on the method and global validation results can be found in the literature [51,52,[54][55][56].Hence, in this research, a comparison of downscaled results using MOD16 monthly data and ETWatch inputs was performed.The MOD16A2 dataset was processed following the scale factors and added offsets in the original HDF file.

Method
In principle, the ETa values within a given coarse pixel (usually a MODIS 1-km pixel) should be nearly the same for homogeneous surface conditions, given that meteorological factors typically remain constant over short distances.However, due to the heterogeneity of the land surface, which contains different land cover types and topography, ET values vary among small patches.
One type of output downscaling that is currently widely used can be called the "correlation" approach.This process uses the relationship between ETa data from two MODIS images (or other high temporal resolution datasets) separated by a given time interval [7,23,57].To advance this type of strategy by building relationships based on Landsat ET over small areas, in this study, spatial patterns from NDVI data within individual fine pixels are highlighted.Assuming that the 1-km ETa data used in this research provide a good representation of the actual conditions, the following equation can be used: where ET 30m is the downscaled ET result at a resolution of 30 m; ET 1km is the ETa calculated from MODIS data at a resolution of 1 km (ETWatch in this research); and P 1km and p 30m are driving factors calculated from the 30-m resolution images.Downscaling using this scheme can ensure that the sum of the downscaled ET will be the same as that of the entire coarse pixel.A monthly 30-m NDVI image is combined with the land cover map at the same spatial resolution to calculate the indicators P 1km and p 30m .The transpiration associated with vegetation is greater than the evaporation associated with soil during the growing season, and the magnitude of the transpiration varies among different vegetation types.Hence, the historical ratio of NDVI/ETa was used to diminish the potential deviation because of heterogeneity in the land surface.
A lower ratio of NDVI/ETa (usually occurring within cropland) may lead to an increase in the downscaled ET using NDVI only for cropland inside individual mixed 1-km pixels.Therefore, the NDVI obtained directly from Landsat8 must be modified to eliminate this deviation.The relevant equation can be written as follows: where Ra (i,m) is the factor that offsets the NDVI due to the different NDVI/Eta ratios, which depend on the land cover type i and the month m.NDV I (i,m) is the average NDVI value for land cover type i and month m, ETa (i,m) is the average ET value for land cover type i and month m, and is the summation of NDVI/ET for all land cover types in month m.In Equation (3), p 30m is the downscaling indicator used in Equation (1), NDV I (30m,m) is derived from satellite data from month m, and Ra (m) is the offset factor in Equation ( 2) for month m.In this strategy, a lower ratio of NDVI/ETa leads to a lower offset factor of Ra within a given area.This difference may lead, in turn, to a lower fine-scale ET value compared with the NDVI-downscaled result because the offset was added to the NDVI.This effect is especially pronounced within mixed pixels.This study assumes that the vegetation conditions described by the NDVI do not change greatly within individual months.Thus, monthly NDVI information is used to represent the spatial pattern of the land surface and estimate the ten-day ET.Therefore, ten-day 30-m ET data can be obtained at the same temporal scale as the 1-km ETa data with the following model, which is similar to Equation (1): where p 30m,monthly and p 1km,monthly are the monthly NDVI data after offsetting them by Ra; ET 30m,ten−day represents the ten-day 30-m ET data; and ET 1km,ten−day represents the ten-day 1-km ET data in a given month.A ten-day temporal interval is chosen because a ten-day period is close to the 8-day sampling interval of the MODIS evapotranspiration product (MOD16).In addition, ten-day data can easily be converted into 30-day data, which agrees with the sampling frequency of the NDVI data source.
Given the needs of water resource management for data on cropland in particular areas and to meet the demand for data high temporal and spatial resolutions, the evaluation of the downscaling method is performed in terms of accuracy using in situ data (EC) and in terms of consistency with the original ETWATCH ETa dataset in the following section.

Results
In this research, the ratio of historical NDVI/ETa was calculated from the original 1-km dataset.Differences exist among the different land cover types; however, these differences remain stable over time (2012-2013 in the Heihe Basin).Comparisons of the ratios of the two main land cover types in the two study areas are provided in Figure 3. Based on these comparisons, the differences are more obvious in Zhangye City than in Guantao County because Zhangye is an oasis that relies on irrigation and is surrounded by desert and bare soil.Thus, the water consumption of Zhangye cropland differs from that of the surrounding plant-sparse area.However, the difference diminished considerably in the semi-moist area, where precipitation is relatively abundant.These favourable moisture conditions ensure the growth of green plants in towns and villages, leading to a small difference between the NDVI and ETa values for the cropland.

Results
In this research, the ratio of historical NDVI/ETa was calculated from the original 1-km dataset.Differences exist among the different land cover types; however, these differences remain stable over time (2012-2013 in the Heihe Basin).Comparisons of the ratios of the two main land cover types in the two study areas are provided in Figure 3. Based on these comparisons, the differences are more obvious in Zhangye City than in Guantao County because Zhangye is an oasis that relies on irrigation and is surrounded by desert and bare soil.Thus, the water consumption of Zhangye cropland differs from that of the surrounding plant-sparse area.However, the difference diminished considerably in the semi-moist area, where precipitation is relatively abundant.These favourable moisture conditions ensure the growth of green plants in towns and villages, leading to a small difference between the NDVI and ETa values for the cropland.A lookup table for the Ra values calculated based on ETWatch data in the two study areas is provided in Table 5.Only two main land cover types were considered in Guantao County because cropland and artificial surfaces together occupy over 95% of the entire county; thus, other types were ignored in this study.Therefore, the NDVI values in areas of forest, bare soil and water are not modified.Using the Ra lookup table above, 1-km ET was then decomposed into 30-m ET for the two regions in 2013 with a ten-day time interval.Figure 4 shows the spatial distribution of the downscaled results and the ETWatch 1-km ETa map for the entire growing season (interpolated to the same resolution).The downscaled results show good spatial consistency with the original ETWatch coarse data.There is a valley in the northeast of the middle section of the Heihe Basin (Figure 4a,b).The ET in the valley is relatively higher due to the efficient water supply.This tendency is reflected in the A lookup table for the Ra values calculated based on ETWatch data in the two study areas is provided in Table 5.Only two main land cover types were considered in Guantao County because cropland and artificial surfaces together occupy over 95% of the entire county; thus, other types were ignored in this study.Therefore, the NDVI values in areas of forest, bare soil and water are not modified.Using the Ra lookup table above, 1-km ET was then decomposed into 30-m ET for the two regions in 2013 with a ten-day time interval.Figure 4 shows the spatial distribution of the downscaled results and the ETWatch 1-km ETa map for the entire growing season (interpolated to the same resolution).The downscaled results show good spatial consistency with the original ETWatch coarse data.There is a valley in the northeast of the middle section of the Heihe Basin (Figure 4a,b).The ET in the valley is relatively higher due to the efficient water supply.This tendency is reflected in the downscaled result.Finer-scale details, such as highways and field patches, can be recognized in the downscaled map as well.This phenomenon is more obvious in Guantao Country.A small village in this region surrounded by cropland is shown in the downscaled map but is obscure in the coarse map.The spatial patterns provided by Landsat images are therefore suitable for downscaling.
Water 2017, 9, 995 9 of 18 downscaled result.Finer-scale details, such as highways and field patches, can be recognized in the downscaled map as well.This phenomenon is more obvious in Guantao Country.A small village in this region surrounded by cropland is shown in the downscaled map but is obscure in the coarse map.The spatial patterns provided by Landsat images are therefore suitable for downscaling.

Comparison with the Original ETWATCH Dataset
Downscaled ET values over all three sites show good agreement with the original ETWATCH dataset (Figure 5).The R 2 values resulting from this comparison range from approximately 0.98 at Daman (DM) and Guatntao (GT) to 0.82 at Huazhaizi (HZZ).This good agreement demonstrates that the accuracy of the downscaled dataset depends primarily on the quantity of coarse ETa data available, especially within relatively homogeneous areas.Considering that many of the crop fields in the study area are relatively homogenous (i.e., not many sparsely distributed, fine-scale crop pixels are present) and have similar NDVI values, this good agreement is consistent with our expectations.The smallest R 2 value and the lowest degree of temporal agreement are obtained for site HZZ because of the irregular distribution of bare soil, rock and plants near the site, resulting in abrupt changes in

Comparison with the Original ETWATCH Dataset
Downscaled ET values over all three sites show good agreement with the original ETWATCH dataset (Figure 5).The R 2 values resulting from this comparison range from approximately 0.98 at Daman (DM) and Guatntao (GT) to 0.82 at Huazhaizi (HZZ).This good agreement demonstrates that the accuracy of the downscaled dataset depends primarily on the quantity of coarse ETa data available, especially within relatively homogeneous areas.Considering that many of the crop fields in the study area are relatively homogenous (i.e., not many sparsely distributed, fine-scale crop pixels are present) and have similar NDVI values, this good agreement is consistent with our expectations.The smallest R 2 value and the lowest degree of temporal agreement are obtained for site HZZ because of the irregular distribution of bare soil, rock and plants near the site, resulting in abrupt changes in the NDVI (site HZZ lies within a sparse grassland that is located 150 m from a sand-covered area and 800 m from cropland).
The analysis given above suggests that the accuracy of the fine dataset after the "de-pixelation" method depends strongly on the coarse ETa data because of the high degree of correlation, especially within homogenous areas, and partly on the complexity of the land surface.Generally, lower vegetation cover fractions and more complex land cover distributions result in lower accuracy.
Water 2017, 9, 995 10 of 18 the NDVI (site HZZ lies within a sparse grassland that is located 150 m from a sand-covered area and 800 m from cropland).The analysis given above suggests that the accuracy of the fine dataset after the "de-pixelation" method depends strongly on the coarse ETa data because of the high degree of correlation, especially within homogenous areas, and partly on the complexity of the land surface.Generally, lower vegetation cover fractions and more complex land cover distributions result in lower accuracy.

Validation Using In Situ Observation
The ten-day dataset obtained by applying the "de-pixelation" method at three flux sites (DM, HZZ and GT), as summarized in Section 2.1, has been extracted for comparison with the in situ observations.The ET results obtained after "de-pixelation" for all three sites show good agreement with the in situ data, whereas the R 2 values range from 0.87 (GT) to 0.95 (DM) and the RMSE ranges from 2.28 (HHZ) to 5.34 (GT) per ten-day interval (Figure 6).The model performs well, especially at the HZZ and DM sites (after eliminating one abnormal EC record of 33 mm in the first ten-day period of April, which contrasts with the values of only 10 mm obtained for the other two ten-day periods in the same month).The model also performs well at the GT site, which is positioned within a relatively uniform crop field.Maize is planted at the DM site, and two-season crops are planted at the GT site.Compared with the other two sites, the HZZ site shows little overestimation of the ET values compared with the observations (Figure 7).Therefore, the "de-pixelation" model displays better performance within crop fields with relatively broad and homogeneous areas than in areas of bare soil and sparse vegetation.

Validation Using In Situ Observation
The ten-day dataset obtained by applying the "de-pixelation" method at three flux sites (DM, HZZ and GT), as summarized in Section 2.1, has been extracted for comparison with the in situ observations.The ET results obtained after "de-pixelation" for all three sites show good agreement with the in situ data, whereas the R 2 values range from 0.87 (GT) to 0.95 (DM) and the RMSE ranges from 2.28 (HHZ) to 5.34 (GT) per ten-day interval (Figure 6).The model performs well, especially at the HZZ and DM sites (after eliminating one abnormal EC record of 33 mm in the first ten-day period of April, which contrasts with the values of only 10 mm obtained for the other two ten-day periods in the same month).The model also performs well at the GT site, which is positioned within a relatively uniform crop field.Maize is planted at the DM site, and two-season crops are planted at the GT site.Compared with the other two sites, the HZZ site shows little overestimation of the ET values compared with the observations (Figure 7).Therefore, the "de-pixelation" model displays better performance within crop fields with relatively broad and homogeneous areas than in areas of bare soil and sparse vegetation.A comparison of the 1-km dataset and the 30-m product with the in situ observations is shown in Table 6.At all three sites, the 30-m dataset shows good consistency, in terms of its R 2 value, when compared with the original 1-km product.This finding agrees well with the conclusion given in Section 4.1, which states that the downscaled dataset strongly resembles the original coarse dataset.The progression of the time series effectively captures the different trends (one seasonal crop at the DM site, a similar trend at the HZZ site but without crop cover, and two seasonal crops with two peaks in the ET curve at the GT site).Little improvement was observed at Huazhaizi Site.The reason is mainly because the mixing of coarse data was decomposed by 30-m surface information.

Comparison among the Results from Different Data Sources
There are two types of input data in the downscaling model, coarse ETa (1 km) and fine NDVI data (30 m).In this section, to evaluate the performance of the "de-pixelation" method under different data-input circumstances, MOD16A2 monthly ETa data are combined with 30-m NDVI data for use in the same downscaling experiment as the ETWatch data.Comparisons of the scatter and progression are shown in Figure 7.The analysis was not performed for the HZZ site because MOD16A2 data did not cover the bare soil type at this site.
Due to the presence of cloud cover, this analysis used HJ images, which have a revisit period of two days, as a substitute for the Landsat8 data in Zhangye.The data from these two satellites have the same 30-m spatial resolution.Research on the performance of these two sensors shows a good agreement for the measured reflectances in the red and near-infrared (NIR) bands.Data from these bands are needed to calculate the NDVI, especially within areas of cropland and bare soil [58].In addition, the existence of this data source with a short repeat interval is one of the reasons why NDVI data are used as the driving factor in the model.To assess the performance of the "de-pixelation" method using NDVI data from different satellites, rather than spectral features, Figure 8 compares the two datasets obtained using different sensors with the in situ observations.
There is an obvious underestimation of ET in the original MOD16 data and, consequently, in the downscaled results (Figure 8).As proposed in Section 4.1, there is a strong correlation between the downscaled results and the original datasets.The mean bias (MB) per month ranges from 41.57 mm at the GT site to 54.62 mm at the DM site, which could lead to serious errors when estimating the actual water consumption.Hence, it is better to choose more accurate coarse ET data or to calibrate the dataset specifically for the research area in order to obtain a better downscaled result.
Based on the comparison of different fine NDVI inputs (Figure 9), Landsat 8 and HJ, both datasets show good agreement with the ETa from EC. Similar coefficients of determination were obtained (0.96 and 0.93), and the RMSE changes only slightly, from 2.92 to 3.53 per ten-day period.Thus, the HJ satellite may represent a reliable source of NDVI data.This comparison indicates that the NDVI values from different satellites affect the spatial pattern; thus, they influence the ET results.However, this effect has little effect on the ET results, according to the EC validation.This result enables the construction of a stable and continuous downscaled and finer-resolution dataset, based on further research into the normalization of the two sensors.

Comparison of Different Downscaling Approaches
To evaluate the performance of different models, similar experiments were carried out in the same study area using two other methods, specifically the subtraction method and the LinZi method.The same coarse ETa dataset covering the same time interval was used as the input.In principle, the 30-m Landsat ETa data are needed for both methods as the basic data used to evaluate the correlation between the coarse ETa data at different times.To evaluate the performance of the two approaches on a site-by-site basis, several in situ observations of ET are employed instead of the fine-scale ETa values.The comparison is shown in Table 7.

Comparison of Different Downscaling Approaches
To evaluate the performance of different models, similar experiments were carried out in the same study area using two other methods, specifically the subtraction method and the LinZi method.The same coarse ETa dataset covering the same time interval was used as the input.In principle, the 30-m Landsat ETa data are needed for both methods as the basic data used to evaluate the correlation between the coarse ETa data at different times.To evaluate the performance of the two approaches on a site-by-site basis, several in situ observations of ET are employed instead of the fine-scale ETa values.The comparison is shown in Table 7.

Comparison of Different Downscaling Approaches
To evaluate the performance of different models, similar experiments were carried out in the same study area using two other methods, specifically the subtraction method and the LinZi method.The same coarse ETa dataset covering the same time interval was used as the input.In principle, the 30-m Landsat ETa data are needed for both methods as the basic data used to evaluate the correlation between the coarse ETa data at different times.To evaluate the performance of the two approaches on a site-by-site basis, several in situ observations of ET are employed instead of the fine-scale ETa values.The comparison is shown in Table 7. Table 7 shows that all three methods display better performance for the relatively homogeneous cropland site (DM) than for the other two sites.In addition, due to the similarities between the subtraction method and the LinZi method, the R 2 results are nearly the same at the three sites.The values of the coefficient of determination obtained at Daman and Guantao using the "de-pixelation" method are slightly higher than those obtained using the other two methods.However, an obviously higher accuracy is observed at the Huazhaizi site.The R 2 value is 0.90 (0.81 for the other two methods), and the MB is 1.55 mm (2.35 for the subtraction method and 5.36 for the LinZi method) at this site.This pattern occurs mainly because the correlation methods concentrate only on the average difference between coarse ETa images, which may filter out some of the extremely high-frequency vegetation signal within semi-arid areas.Therefore, the "de-pixelation" method is more suitable for crop-sparse areas than other downscaling methods.
In regard to the differences in the MB obtained using the three methods, the "de-pixelation" method exhibits the most robust performance among the three methods, with the exception of the LinZi results at Guantao (4.21 mm compared with 4.00 mm).The subtraction method performs well at the Huazhaizi site but poorly at the other two.With several in situ ETa measurements substituting for the fine-scale Landsat ETa input, poorer results are expected for the two correlation methods.

Discussion
In this study, a new downscaling approach named "de-pixelation" method, which employs NDVI data and a land cover map with high spatial resolution, is introduced to downscale the ETWatch 1-km ETa data product to a spatial resolution of 30 m.There is a kind of downscaling strategy named "correlation" approaches that are currently widely used, which relies on the relationship between ETa data from two MODIS images (or other high temporal resolution datasets) separated by a given time interval.This relationship is evaluated pixel by pixel or over a small area using either a first-order linear correlation or the difference after subtraction.The linear correlation or the difference is then applied to the fine-scale resolution ET data retrieved from Landsat images.Every Landsat pixel is related to the MODIS pixel it falls inside [7,23,57].This kind of strategy, including both subtraction and regression (slope-intercept and LinZi) schemes, focuses to a greater degree on the differences among coarse pixels and assumes that all of the finer-scale ET pixels within a given coarse pixel (or within a small area for the slope-intercept method) share the same correlation.These methods do not distinguish between different land cover types; thus, to some extent, they weaken the differences among different land cover types, especially in areas with complex surfaces.Thus, based on the comparison in Section 4.4, our newly proposed method performs better and more robustly than the other two methods (the two correlation approaches) in crop-sparse areas, such as the HZZ site.Therefore, the new "de-pixelation" method is more suitable for areas with complex land cover distributions.
In this study, monthly Landsat8 NDVI images are used to represent the condition of vegetation at a spatial resolution of 30 m × 30 m. ET products with finer spatial resolutions can be obtained from multi-spectral satellites, such as RapidEye, and additional research is required.Such advances could make it possible to monitor the water use of every land parcel [59].In addition, our future work will concentrate on improvements in temporal resolution using the STARFM downscaling method to extend the model to a daily scale by including information from the MODIS 8-day NDVI product.This method could provide additional temporal detail over crop areas [26,60].
For the three EC sites within the two study areas examined in this study, representing semi-arid and semi-humid climate types, NDVI saturation did not have an appreciable influence during the growing season because of the limited amount of irrigation water in the Heihe Basin and the two short growing periods in the Haihe Basin.However, further research should be carried out to address the NDVI saturation effect in areas where such conditions occur (especially for crops).

Conclusions
Monitoring evapotranspiration using RS-based methods could improve the level of water planning and management through the generation of datasets with both fine temporal and spatial scales.In this paper, a new ET downscaling strategy which combines images from different resolution is proposed.The downscaled results obtained using the "de-pixelation" method show good agreement with the original ETWatch 1-km dataset in terms of temporal changes.Based on the R 2 values between the ETWatch data and the results of the "de-pixelation" method, which range from 0.98 to 0.82, the accuracy of the fine-scale dataset is mainly determined by the corresponding coarse dataset, similar to the findings of other researchers [26].
Output results also show good agreement with in situ flux observations at three sites, indicating that the new approach represents a reliable substitute for the 30-m ETa dataset in areas with poor coverage by Landsat8 data or other data resources.The new "de-pixelation" method performs well at two relatively homogeneous sites, DM and GT, achieving similar R 2 values with respect to the original 1-km ETWatch dataset.However, an obvious improvement occurs at the HZZ site because the original coarse dataset may mix and eliminate some information at the boundaries between vegetation and bare soil.
A comparison between the results obtained using different coarse ETa data sources, MOD16 and ETWatch, is presented based on the analysis of the influence of the accuracy of the coarse data sources.The ETWatch dataset used in this study has been effectively adapted for the two study areas and has been well calibrated; therefore, it performs well.For other data sources, such as the MOD16A2 ET product, improving the adaptability of the original ET dataset to particular regions is expected to increase the accuracy of the downscaled product [61,62].Because the accuracy of the coarse input ET data influences the accuracy of output downscaled dataset, using low-accuracy ET data as an input may cause errors.
Additionally, because of the degree of cloud contamination, two HJ satellite images were substituted for the Landsat8 images in August and September at the two Zhangye sites.A comparison of the results from the two sensors is proposed to provide a validation of this substitute data source.The substitute NDVI data from the HJ satellite does not exert much influence on the final results.This finding makes it possible to generalize the method to areas where Landsat8 images are often affected by clouds through the substitution of other satellite data.Finally, a comparison between "De-pixelation" method and two correlation methods was proposed.The new method performs better at the HZZ site with more complex land cover surrounded.Above all, new downscaling method is suitable for heterogeneous area, to provide robust fine evapotranspiration dataset.

Figure 1 .
Figure 1.(a) land cover map of the middle reach of the Heihe Basin in 2013, (b) photograph of the eddy covariance (EC) site at Daman, (c) photograph of the EC site at Huanzhaizi.

Figure 1 .
Figure 1.(a) land cover map of the middle reach of the Heihe Basin in 2013; (b) photograph of the eddy covariance (EC) site at Daman; (c) photograph of the EC site at Huanzhaizi.

Figure 2 .
Figure 2. (a) land cover map of Guantao County, (b) photograph of the EC tower.Figure 2. (a) land cover map of Guantao County; (b) photograph of the EC tower.

Figure 2 .
Figure 2. (a) land cover map of Guantao County, (b) photograph of the EC tower.Figure 2. (a) land cover map of Guantao County; (b) photograph of the EC tower.

Figure 3 .
Figure 3. Comparisons of the ratio of NDVI/ET for the two main land cover types in (a) Guantao and (b) Zhangye.

Figure 3 .
Figure 3. Comparisons of the ratio of NDVI/ET for the two main land cover types in (a) Guantao and (b) Zhangye.

Figure 4 .
Figure 4. Downscaled ET distributions in the growing season in (a) Heihe and (c) Guantao and the interpolated map in (b) Heihe and (d) Guantao.

Figure 4 .
Figure 4. Downscaled ET distributions in the growing season in (a) Heihe and (c) Guantao and the interpolated map in (b) Heihe and (d) Guantao.

Figure 5 .
Figure 5.Comparison of ten-day Eta values obtained from the original ETWATCH product and the downscaled 30-m ET values obtained by applying the "de-pixelation" method to three sites: (a) Daman; (b) Huazhaizi; and (c) Guantao (n = 21).

Figure 5 .
Figure 5.Comparison of ten-day Eta values obtained from the original ETWATCH product and the downscaled 30-m ET values obtained by applying the "de-pixelation" method to three sites: (a) Daman; (b) Huazhaizi; and (c) Guantao (n = 21).

Figure 6 .Figure 7 .
Figure 6.Comparison of the ten-day ETa values from in situ observations and downscaled 30-m ET values obtained using the "de-pixelation" method at three sites: (a) Daman; (b) Huazhaizi; and (c) Guantao.

Figure 6 .Figure 6 .Figure 7 .
Figure 6.Comparison of the ten-day ETa values from in situ observations and downscaled 30-m ET values obtained using the "de-pixelation" method at three sites: (a) Daman; (b) Huazhaizi; and (c) Guantao.

Figure 8 .Figure 9 .
Figure 8. Comparisons of downscaled 30-m ET data from the MOD16 with in situ observations at the Daman Site (a,b) and the Guantao site (c,d).

Figure 8 .
Figure 8. Comparisons of downscaled 30-m ET data from the MOD16 with in situ observations at the Daman Site (a,b) and the Guantao site (c,d).

Figure 8 .Figure 9 .
Figure 8. Comparisons of downscaled 30-m ET data from the MOD16 with in situ observations at the Daman Site (a,b) and the Guantao site (c,d).

Figure 9 .
Figure 9.Comparison of ten-day downscaled 30-m ET data from the (a) Landsat8 and (b) HJ satellites with in situ observations.

Table 1 .
Percentages of different land cover types in Zhangye provided by ChinaCover.

Table 2 .
Percentages of different land cover types in Guantao County provided by ChinaCover.

Table 1 .
Percentages of different land cover types in Zhangye provided by ChinaCover.

Table 2 .
Percentages of different land cover types in Guantao County provided by ChinaCover.

Table 3 .
Band comparison of Landsat8 with HJ 1 A/B.

Table 4 .
Acquisition date of 30-m images.

Table 5 .
List of Ra values used in this research (Calculated by historical data).

Table 5 .
List of Ra values used in this research (Calculated by historical data).

Table 6 .
Comparison of the 1-km dataset and the 30-m product with the in situ values.

Table 7 .
Comparison among the results of the three downscaling methods using in situ data from the study areas.