Downscaling of MODIS One Kilometer Evapotranspiration Using Landsat-8 Data and Machine Learning Approaches

This study presented a MODIS 8-day 1 km evapotranspiration (ET) downscaling method based on Landsat 8 data (30 m) and machine learning approaches. Eleven indicators including albedo, land surface temperature (LST), and vegetation indices (VIs) derived from Landsat 8 data were first upscaled to 1 km resolution. Machine learning algorithms including Support Vector Regression (SVR), Cubist, and Random Forest (RF) were used to model the relationship between the Landsat indicators and MODIS 8-day 1 km ET. The models were then used to predict 30 m ET based on Landsat 8 indicators. A total of thirty-two pairs of Landsat 8 images/MODIS ET data were evaluated at four study sites including two in United States and two in South Korea. Among the three models, RF produced the lowest error, with relative Root Mean Square Error (rRMSE) less than 20%. Vegetation greenness related indicators such as Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Soil Adjusted Vegetation Index (SAVI), and vegetation moisture related indicators such as Normalized Difference Infrared Index—Landsat 8 OLI band 7 (NDIIb7) and Normalized Difference Water Index (NDWI) were the five most important features used in RF model. Temperature-based indicators were less important than vegetation greenness and moisture-related indicators because LST could have considerable variation during each 8-day period. The predicted Landsat downscaled ET had good overall agreement with MODIS ET (average rRMSE = 22%) and showed a similar temporal trend as MODIS ET. Compared to the MODIS ET product, the downscaled product demonstrated more spatial details, and had better agreement with in situ ET observations (R2 = 0.56). However, we found that the accuracy of MODIS ET was the main control factor of the accuracy of the downscaled product. Improved coarse-resolution ET estimation would result in better finer-resolution estimation. This study proved the potential of using machine learning approaches for ET downscaling considering their effectiveness and ease of implementation. Future research includes development of the spatial-temporal fusion models of Landsat data and MODIS ET in order to increase temporal resolution of downscaled ET.


Introduction
Land surface evapotranspiration (ET) is an important hydrological process that controls water and energy transfer between soil, vegetation, and atmosphere.Quantification of ET is extensively needed in many agricultural, forest, and climatic applications.At local to regional scales, information on the spatial and temporal variations of ET is vital for drought monitoring, ecosystem health assessment, water requirement evaluation, and agricultural water allocation and management [1].At regional to global scales, quantifying ET is essential for mesoscale and global circulation modeling as it affects water and energy cycles between land, ocean, and atmosphere [2].As satellite remote sensing techniques can provide viable approaches for land surface process monitoring over large areas, various approaches have been developed for ET estimation from moderate to low resolutions based on remotely sensed data during the last few decades.These approaches generally include empirical/statistical methods that link land surface temperature (LST) and/or vegetation indices (VI) with in situ ET measurements, and more physically-based models including surface energy balance (SEB) models such as Surface Energy Balance Algorithm for Land (SEBAL) [3,4], Mapping EvapoTRanspiration with Internalized Calibration (METRIC) [5], Atmosphere-Land Exchange Inverse (ALEXI) model [6], and Surface Energy Balance System (SEBS) models [7], and Penman-Monteith models [8].
The Moderate Resolution Imaging Spectroradiometer (MODIS) global evapotranspiration products (MOD16) are the first 1 km land surface ET dataset for the 109,030,000 km 2 global vegetated land surface at 8-day, monthly, and annual intervals [9].The products were developed using a Penman-Monteith approach driven by MODIS derived surface albedo, land cover, Leaf Area Index (LAI), the Fraction of absorbed Photosynthetically Active Radiation (FPAR), and daily surface meteorological parameters [9,10].Various studies have evaluated the products over different regions and shown that they agree well with the other satellite or model-based ET datasets, such as the EUropean Organisation for the Exploitation of METeorological SATellites (EUMETSAT) Satellite Application Facility on Land Surface Analysis (LSA-SAF), ET product from Meteosat Second Generation (MSG) [11] and the European Centre for Medium-Range Weather Forecasts (ECMWF) (GLDAS) ET images [12].The evaluation against ground measurements showed that the errors of MODIS ET products were around 0.31 mm/day, or 24% of the ET measured from 46 AmeriFlux eddy covariance flux towers [10].It was also reported that the products had better accuracy in temperate and fully humid climates, but underestimated ET in semiarid climates compared to 15 CarboEurope eddy covariance tower sites [11].Currently, the MODIS ET products have been used in many applications such as terrestrial gross primary productivity (GPP) estimation [13], drought monitoring [14], groundwater recharge quantification [15,16], and watershed runoff simulation [17].
Although MODIS ET products have provided a great opportunity for routine monitoring of land surface conditions at mesoscale, the 1 km ET estimates tend to be inaccurate and are not applicable at field, local, or basin scale because of a high level of spatial heterogeneity of land cover within a pixel.Especially for agriculture applications, MODIS resolution of 1 km is too coarse and may cause significant ET errors since a single pixel is generally larger than individual crop fields.During the last five years, a few methodologies have been proposed for the downscaling of ET from coarser resolution remotely sensed data using information provided by higher resolution imagery.Some studies [18][19][20] introduced disaggregation schemes that have been applied in thermal data fusion by examining the linear relationship between lower and higher resolution ET from multisensor datasets; other research [21,22] made use of image fusion methods such as the spatial and temporal adaptive reflectance fusion model (STARFM) that was designed for surface reflectance blending.Some studies aimed at improving spatial details of coarse resolution ET based on fine resolution remotely sensed data [19,21], while others aimed at generating both high spatial and temporal resolution ET [21,22].A comprehensive review by Ha et al. [23,24] stated that although downscaling methods have been widely developed to retrieve high resolution air temperature, precipitation and soil moisture, the approaches specifically for ET downscaling have not been fully explored.The study reviewed 28 downscaling methods used in various disciplines including hydrology, meteorology, image processing, and geostatistics, and addressed the potential of introducing these methods for ET downscaling.
The existing approaches developed specifically for ET downscaling mainly involved generation of ET from both MODIS and Landsat images, and then the relationship between the two ET products was evaluated to derive daily [18,22], monthly [19] or seasonal [20] ET at Landsat resolution.This indicates that the accuracies of downscaled ET are dependent on both MODIS and Landsat ET calculations, which are mostly based on SEB models such as METRIC and SEBAL.Although SEB models have been widely used for Landsat-series data including Landsat 5 Thematic Mapper (TM) and Landsat 7 Enhanced Thematic Mapper plus (ETM+) [3][4][5][6][7][8], they have not been used for recently launched Landsat 8 data.Research has revealed that estimation of daily ET at pixel level could be very sensitive to LST, meteorological input parameters such as net radiation, air temperature, and reference ET calculated with these parameters [25,26]; some parameters are satellite sensor specific, which have been well established for MODIS and Landsat TM/ETM+ sensors, but not for Landsat 8 Operational Land Imager (OLI) or Thermal Infrared Sensor (TIRS) sensors yet.Conventional SEB models that were tuned for Landsat TM/ETM+ sensors must be used with caution because the new sensors vary with their predecessors in that the spectral bands and the instrument performances are different [27].
In this paper, we introduced machine learning approaches for MODIS ET downscaling.Although machine learning algorithms have been successfully used in downscaling land parameters in other topics such as downscaling of General Circulation Models (GCMs) outputs [28] or latent heat flux estimates from Noah land surface model [29], and also have been used in downscaling satellite-derived soil moisture [30], they have not been applied in remotely sensed ET downscaling [23,24].The objective of this study is to develop a machine learning-based method in order to generate 8-day ET maps at 30 m resolution, based on MODIS 8-day ET product (MOD16A2) and Landsat 8 data, aiming to produce higher resolution ET that is consistent with the MODIS ET product.The method assumed that vegetation and surface wetness conditions do not change considerably within an 8-day interval.We examined three machine learning approaches-Cubist, Support Vector Regression (SVR), and Random Forest (RF)-for downscaling of MODIS 1 km 8-day ET product (MOD16A2) using eleven 30-m Landsat 8 data-derived variables (indices), including surface albedo, LST, Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Soil Adjusted Vegetation Index (SAVI), Modified Soil Adjusted Vegetation Index (MSAVI), Normalized Difference Moisture Index (NDMI), Normalized Difference Water Index (NDWI), Depth of Landsat 8 OLI band 6 (D 1609 ), Normalized Difference Infrared Index-Landsat 8 OLI band 7 (NDIIb7) and Temperature Vegetation Dryness Index (TVDI).Among the eleven indices, LST has been successfully used for ET estimation due to its sensitivity to local moisture status; surface albedo has been used as an input parameter for various SEB models; vegetation greenness indices including NDVI, EVI, SAVI and MSAVI are associated with growing status of vegetation; and wetness indices including NDWI, NDMI, D 1609 , and NDIIb7 are highly correlated with water consumption that is reflected from ET processes.The method assumed that vegetation and surface wetness conditions do not change considerably within an 8-day interval, and the growing status of vegetation is highly correlated with water consumption that is reflected from ET processes.Different from previous ET downscaling methods, the machine learning models in our study do not need to calculate Landsat-based ET and thus do not rely on the accuracy of meteorological parameters.The models were tested at four study sites within different climate zones in the United States and South Korea, and evaluated using in situ flux tower measurements.

Study Areas
Four study areas, including two sites in the United States and two sites in South Korea, were explored in this study (Figure 1).The two U.S. sites are located near Lamont, Oklahoma (Site 1: 97.89 ˝W, 36.27˝N) and Twitchell Island, California (Site 2: 121.65 ˝W, 37.58 ˝N), and feature two distinct climate conditions based on the Köppen-Geiger climate classification map [31].Site 1 experiences a humid subtropical climate (Köppen climate map classified as Cfa) with hot, wet summers and cold, dry winters.The average temperature is around 16.    1).Because of the geometric excellence of Landsat 8 OLI and TIRS images [32], all images had geometric accuracy, i.e., root mean square error (RMSE) less than 1 pixel.A subset of each image was extracted to cover a variety of land cover types and less non-vegetated area.1).Because of the geometric excellence of Landsat 8 OLI and TIRS images [32], all images had geometric accuracy, i.e., root mean square error (RMSE) less than 1 pixel.A subset of each image was extracted to cover a variety of land cover types and less non-vegetated area.1).Because of the geometric excellence of Landsat 8 OLI and TIRS images [32], all images had geometric accuracy, i.e., root mean square error (RMSE) less than 1 pixel.A subset of each image was extracted to cover a variety of land cover types and less non-vegetated area.(5-day or 6-day for the end of the year) at 1 km resolution from the year 2000 to 2014 [33].MOD16A2 ET is the sum of ET during each 8-day time periods within a year (mm/8 days, mm/5 days or mm/6 days for 361 composite data).In this study, the datasets covering Landsat image dates were selected, with starting dates of 1-4 days before the Landsat image was acquired (Table 1).ET and ET_QC layers were subset within the study area boundary, and the quality flag layer was used to exclude low quality ET.All ET products were re-projected to be the same projection system with Landsat 8 imagery, i.e., UTM WGS 1984 14N for Site 1, 10N for Site 2 and 52N for Sites 3 and 4.

In Situ ET Calculation
In situ ET measurements were obtained from AmeriFlux towers at the two U.S. sites and flux towers operated by the Hydrological Survey Centre at the South Korean sites.AmeriFlux is a network composed of over 110 sites across North and South America that continuously measure meteorological variables such as CO 2 , water and energy fluxes using an eddy covariance technique.In this study, each U.S. site has an AmeriFlux tower station, namely US Atmospheric Radiation Measurement Program Southern Great Plains Main (US-ARM SGP Main) at Site 1, and US Twitchell Island (US-Twt) at Site 2 (Figure 2a,b).US-ARM flux tower is located at an open cattle pasture surrounded by a rain fed winter wheat field [34].Summer crops such as maize, soybean, or sorghum are commonly planted after wheat is harvested [34,35].US-Twt site is a rice paddy field.Level 2 data with half-hourly flux measurements during 2013-2014 were downloaded and processed as higher level data is not available for recent years.Air temperature, relative humidity (RH) and air pressure (Pa) measurements were used to calculate water vapor for LST computation [36], and LE (W/m 2 ) measurements were used to derive 8-day ET measurements (mm/8-day) following [10].As the AmeriFlux tower observations are collected every 30 minutes, ET (mm/30 min) was first calculated as λ " p2.501 ´0.002361 ˆTn q ˆ10 6  (1) where T n is air temperature for nth 30-minute observation in degree Celsius and λ is the latent heat of vaporization (J/kg).When the number of valid 30-minute measurements of LE and T n (N) is no less than 40, daily ET was calculated using Equation (3).When N is less than 40, daily ET was estimated by linear interpolation with the two nearest daily ET values.The 8-day ET is a sum of daily ET during an 8-day interval with the same starting date of MODIS ET product.
ET " Similar to the AmeriFlux system, two flux towers in South Korea (i.e., Seolmacheon in Site 3 and Cheongmicheon in Site 4) measure meteorological and soil parameters and use an eddy covariance method to calculate ET (Figure 2c,d).The Seolmacheon flux tower was designed to observe ET over a mixed forest with trees of 30-50 years at an altitude of about 420 m, while the Cheongmicheon was constructed to monitor ET over agricultural lands in a rice paddy area at an altitude of 310 m.

Methodology
The ET downscaling method mainly consists of three components (Figure 3), including (1) derivation of eleven Landsat 8 data-based indicators that are associated with vegetation growth, surface temperature and moisture, and upscaling of the indicators so as to be consistent with the MODIS 8-day ET product; (2) development of machine learning models using samples with Landsat 8-derived indicators as predictor variables and MODIS 8-day ET as a response variable; and (3) generation of ET with 30 m resolution based on the developed models, and accuracy assessment through comparison between the downscaled ET, MODIS and in situ ET measurements.

Methodology
The ET downscaling method mainly consists of three components (Figure 3), including (1) derivation of eleven Landsat 8 data-based indicators that are associated with vegetation growth, surface temperature and moisture, and upscaling of the indicators so as to be consistent with the MODIS 8-day ET product; (2) development of machine learning models using samples with Landsat 8-derived indicators as predictor variables and MODIS 8-day ET as a response variable; and (3) generation of ET with 30 m resolution based on the developed models, and accuracy assessment through comparison between the downscaled ET, MODIS and in situ ET measurements.

Landsat 8 VIs Calculation
Surface albedo and vegetation indices were derived from Landsat 8 OLI surface reflectance.Although surface reflectance data for the Landsat 8 OLI sensor are provided from USGS, the internal atmospheric correction algorithm has not been published and validated [37].Research has found that different atmospheric correction methods, such as Second Simulation of a Satellite Signal in the Solar Spectrum Vector (6SV), radiative transfer algorithm, Dark Object Subtraction (DOS) and Fast Lineof-sight Atmospheric Analysis of Hypercubes (FLAASH) algorithm do not produce significant differences in vegetation index calculation from Landsat 8 OLI data [27].Therefore, in this study we adopted the FLAASH algorithm in ENVI software to perform atmospheric correction for the Landsat 8 images.Surface reflectance of bands 1-7 were used to calculate albedo and vegetation indices, and a brightness temperature of band 10 was used to generate LST.Cloud and cloud shadows were detected using the "Fmask" algorithm [38] in order to create a mask layer.The mask layer, in combination with Landsat 8 OLI L1T-provided Quality Assurance (QA) data were used to remove contamination by cloud/cirrus/shadow.Eight vegetation indices showing vegetation greenness and water content including NDVI, EVI, SAVI, MSAVI, NDMI, NDWI, NDIIb7, and D1609 were calculated from surface reflectance for each cloud-clear pixel (see Table 2 [39][40][41][42][43][44][45][46] for equations).Although surface reflectance data for the Landsat 8 OLI sensor are provided from USGS, the internal atmospheric correction algorithm has not been published and validated [37].Research has found that different atmospheric correction methods, such as Second Simulation of a Satellite Signal in the Solar Spectrum Vector (6SV), radiative transfer algorithm, Dark Object Subtraction (DOS) and Fast Line-of-sight Atmospheric Analysis of Hypercubes (FLAASH) algorithm do not produce significant differences in vegetation index calculation from Landsat 8 OLI data [27].Therefore, in this study we adopted the FLAASH algorithm in ENVI software to perform atmospheric correction for the Landsat 8 images.Surface reflectance of bands 1-7 were used to calculate albedo and vegetation indices, and a brightness temperature of band 10 was used to generate LST.Cloud and cloud shadows were detected using the "Fmask" algorithm [38] in order to create a mask layer.The mask layer, in combination with Landsat 8 OLI L1T-provided Quality Assurance (QA) data were used to remove contamination by cloud/cirrus/shadow.Eight vegetation indices showing vegetation greenness and water content including NDVI, EVI, SAVI, MSAVI, NDMI, NDWI, NDIIb7, and D 1609 were calculated from surface reflectance for each cloud-clear pixel (see Table 2 [39][40][41][42][43][44][45][46] for equations).* The constant c represents the relative position of band 6 to band 7 and band 5 calculated using the mean wavelength of Landsat bands 5 (0.865 µm), 6 (1.609 µm) and 7 (2.201 µm).

Landsat 8 Broadband Surface Albedo Estimation
To date, operational algorithms have not been developed for Landsat 8 OLI broadband surface albedo computation.Compared to its predecessors Landsat TM and ETM+ sensors, Landsat 8 OLI incorporates two additional spectral bands, one deep blue band (0.43-0.45 µm) and one shortwave infrared band (1.36-1.39µm).Besides, reflective wavebands corresponding to the previous Landsat ETM+ bands have narrower waveband widths, i.e., different spectral response functions.Therefore, algorithms and parameters developed for albedo calculation for Landsat TM/ETM+ sensors cannot be directly used for the Landsat 8 OLI sensor.In this study, the albedo calculation strategy widely used for SEB models was employed [47], where broadband surface albedo is calculated as the integration of at-surface reflectance across the shortwave spectrum, as shown in Equation (4).α " where α is albedo, ρ b is surface reflectance for a given band b, which can be derived from any atmospheric correction method such as FLAASH used in this study.w b is the weighting coefficient representing the surface solar radiation fraction within the spectral range for band b, and is calculated as: where R sλ is at-surface spectral hemisphere solar radiation for wavelength λ (µm); and UP b and LO b are upper-and lower-wavelength bounds assigned to Landsat 8 band b, respectively.Following similar procedures in [47], the Simple Model of the Atmospheric Radiative Transfer of Sunshine (SMARTS) radiative transfer model [48] was used for the calculation of at-surface solar radiation levels for three combinations of land elevations, precipitable water content, and solar angles.Regardless of the wide range of solar radiation settings, w b did not have significant change.Table 4 presents the calculated w b for Landsat 8 OLI in comparison with Landsat 7 ETM+ sensor.Although Landsat 8 TIRS sensor measures thermal infrared radiance with two bands (band 10 and band 11) located at 10 and 12 µm, significant calibration errors caused by stray light have been reported for band 11 [49].As suggested by United States Geological Survey (USGS), a single channel algorithm by Jiménez-Muñoz et al. [50] was adopted to derive land surface temperature from band 10 in this study.Water vapor content required by the algorithm was derived from the measurements at weather stations or the MODIS water vapor product (MOD05L2 Collection 5.1) if weather station measurements were not available within the study area [36].
TVDI, which represents water stress of vegetation, was estimated on the basis of the triangular shape of LST vs. NDVI scatter plots [51].

TVDI "
Ts ´Ts min a `b¨NDV I ´Ts min (6) where Ts is LST for each pixel in the image, Ts min is the minimum LST, i.e., wet edge in the LST-NDVI space.Coefficients a and b are determined by a linear fit between maximum LST for a given NDVI (Ts max ) and NDVI at the dry edge [51].

Upscaling of Landsat 8 Indices
Landsat 8 data-derived indicators were upscaled to 1 km resolution by means of a simple aggregation approach.First, MODIS 8-day 1 km ET product (MOD16A2) was re-projected with the same projection system as Landsat 8 data.For each indicator, pixels within each MODIS 1 km grid were averaged to generate Landsat 1 km indicators.If a considerable amount of 30 m pixels (>30%) were covered with cloud/cirrus/cloud shadow within a 1 km grid, the grid was excluded from modeling.Ke et al. [27] has shown that the simple aggregation method produced good consistency between Landsat 8 derived NDVI and MODIS NDVI at MODIS resolution.MOD16A2 QA masks were also used to remove grids with low quality caused by cloud/aerosol states or algorithm failure [33].

Machine Learning-Based Downscaling Models
Three machine learning algorithms including Cubist, Support Vector Regression (SVR) and Random Forest (RF) were examined in this study.Cubist is a software package that implements a modified regression tree algorithm, which combines instance-and model-based techniques to create rule-based multivariate regression models from training data [52].It allows overlap between multivariate regression models, thus predictions from multiple rules are averaged to produce the final output.Cubist has been widely used in different applications of remote sensing, such as biomass estimation, impervious surface quantification, drought monitoring, water quality estimation, and land cover estimation [53][54][55][56].
SVR is a regression version of Support vector machines (SVMs).The basic principle of SVM is that independent variables are mapped to a higher dimensional space where nonlinear correlations are transformed to be linearly separable using kernel functions, and the optimal linear separator is selected to maximize the margin between classes [57].SVR, as an extension of SVM, uses the same principles as SVM for classification, while it can produce real numbers as output given a set of independent input variables by defining a margin of tolerance (ε).Both SVM and SVR have been proven robust for classification and the estimation of continuous properties in remote sensing [58][59][60][61][62][63][64].In this study, LibSVM (version 3.20) tools developed by Chang and Lin [65] were adopted to implement SVR, for which the radius basis function (RBF) kernel was used and the optimal combination of parameters such as cost, gamma and ε were selected using a grid search approach.
RF is an improved algorithm based on a classification and regression trees (CART) method, by combining a large set of CARTs in an ensemble.It has been widely used for various remote sensing applications for both classification and regression [66][67][68].Compared to the CART algorithm, RF uses a bootstrap aggregating technique to improve model performance.Each tree is built from a random subset of training data with a random subset of predictor variables, and the aggregation of output from many trees may smooth the variance between trees and yield more accurate prediction results [69].Variable importance is obtained by the percentage increase in the mean squared error (MSE), which is calculated using out-of-bag (OOB) data when each variable is permuted.In this study, randomForest add-on package provided in R statistical software was used to implement the method.
Using the eleven upscaled 1 km Landsat indices as predictor variables and MODIS 1 km ET product as the response variable, prediction models were established based on Cubist, SVR and RF algorithms, respectively.The developed models were then applied to the eleven 30 m Landsat indices in order to generate 30 m ET product.In order to evaluate the model performances, ten-fold cross validation with fixed subsets of samples was conducted for each model, and the average RMSE and relative RMSE (rRMSE) as a percentage of mean MODIS 8-day ET of validation datasets were assessed and compared.The developed machine learning models were then applied to the original 30 m Landsat indicator layers to predict downscaled ET at 30 m resolution.The models were also applied to the upscaled indicators to create Landsat-modeled 1 km ET.The downscaled 30 m ET was then compared with MODIS 1 km ET and Landsat modeled 1 km ET in terms of the ET spatial patterns.The predicted 30 m ET was aggregated to 1 km resolution and compared to MODIS ET.All four products, including MODIS 1 km ET, Landsat-downscaled 30 m ET, Landsat-modeled 1 km ET, and Landsat-aggregated 1 km ET were assessed using in situ ET measurements.

Results from Three Machine Learning Algorithms
Among all three models, SVR models resulted in the lowest performances, as the RMSE and rRMSE were always the highest at all study sites (Figure 4).RF yielded similar or slightly better performances than Cubist, with RMSE less than 2.5 mm/8 days for US sites and 5.7 mm/8 days for Korean sites; rRMSEs were less than 20% at the US sites and less than 15% at Korea sites, which is within the range of the MODIS ET product accuracy (rRMSE around 25%) [10].Performances of all models have similar temporal patterns, with increasing errors (both RMSE and rRMSE) during growing seasons (June-September) and decreasing errors during leaf-off seasons (October-December).Korean sites; rRMSEs were less than 20% at the US sites and less than 15% at Korea sites, which is within the range of the MODIS ET product accuracy (rRMSE around 25%) [10].Performances of all models have similar temporal patterns, with increasing errors (both RMSE and rRMSE) during growing seasons (June-September) and decreasing errors during leaf-off seasons (October-December).Figure 5 shows the average importance of the eleven Landsat indicators resulting from all RF models and their ranks.Importance of a variable is measured by the increase in MSE when the variable is permuted.Greater increase in MSE indicates higher importance of the variable.Table 5 shows the mean and standard deviation of linear correlation coefficients ( ) between Landsat Figure 5 shows the average importance of the eleven Landsat indicators resulting from all RF models and their ranks.Importance of a variable is measured by the increase in MSE when the variable is permuted.Greater increase in MSE indicates higher importance of the variable.Table 5 shows the mean and standard deviation of linear correlation coefficients (ρ) between Landsat indicators and MODIS ET calculated at all dates or only during growing seasons.From Figure 5 and Table 5, the variable importance in RF models was largely consistent with the correlations between the variables and MODIS ET.Among the eleven indictors, NDVI, EVI, and SAVI that are related to vegetation greenness and growth conditions were ranked the most important in ET modeling.Permuting values of NDVI could lead to as much as a 7.2 mm/8 days ET bias.The correlation coefficients between these variables and MODIS ET were also highest with positive relationships (ρ > 0.69 when averaged at all dates and ρ > 0.73 during growing seasons, p-value < 0.0001).
Remote Sens. 2016, 8, 215 12 of 26 indicators and MODIS ET calculated at all dates or only during growing seasons.From Figure 5 and Table 5, the variable importance in RF models was largely consistent with the correlations between the variables and MODIS ET.Among the eleven indictors, NDVI, EVI, and SAVI that are related to vegetation greenness and growth conditions were ranked the most important in ET modeling.Permuting values of NDVI could lead to as much as a 7.2 mm/8 days ET bias.The correlation coefficients between these variables and MODIS ET were also highest with positive relationships ( > 0.69 when averaged at all dates and > 0.73 during growing seasons, p-value < 0.0001).Variables representing vegetation moisture status including NDIIb7 and NDWI were ranked the fourth and fifth.As is shown in Table 5, correlations between NDIIb7, NDWI and MODIS ET were slightly weaker than NDVI/EVI/SAVI but still strong and significant ( = 0.68 and −0.67, respectively, for all seasons and = 0.79 and −0.76 for growing seasons).LST and TVDI are also negatively related to ET as increasing temperature usually corresponds to dry conditions resulting in less  Variables representing vegetation moisture status including NDIIb7 and NDWI were ranked the fourth and fifth.As is shown in Table 5, correlations between NDIIb7, NDWI and MODIS ET were slightly weaker than NDVI/EVI/SAVI but still strong and significant (ρ = 0.68 and ´0.67, respectively, for all seasons and ρ = 0.79 and ´0.76 for growing seasons).LST and TVDI are also negatively related to ET as increasing temperature usually corresponds to dry conditions resulting in less evapotranspiration.Although previous studies have shown that temperature-related variables have a dominant impact on satellite-based daily ET estimation, the correlation coefficients between LST and ET are only ´0.59 on average in our study and those between TVDI and ET were only ´0.32 when all seasons were considered.LST/TVDI were overall ranked less important than vegetation greenness and moisture indices.However, during growing seasons the relationships between LST/TVDI and ET were much stronger, as the correlation coefficients increased to ´0.73 and ´0.60, respectively, and the standard deviations declined.Albedo has little impact on the modeling results.The relationship between albedo and ET was weak (ρ = ´0.38 for all seasons and ´0.44 for growing season).The modified SAVI (MSAVI) was not superior over SAVI in this study, as the correlation was weaker and was not identified as an important variable.

Agreement between Landsat and MODIS ET
As RF outperformed SVR and it produced similar or slightly better accuracy than Cubist, it was used to generate the downscaled 30 m ET and Landsat modeled 1 km ET.In order to evaluate the overall agreement of Landsat downscaled ET against MODIS ET, the 30 m product was aggregated to 1 km grids (Landsat-aggregated 1 km ET), and RMSE and rRMSE were calculated for each site.As shown in Figure 6, RMSE of Landsat-aggregated 1 km ET was within 2 mm/8 days and rRMSE within 23% at Site 1. Site 2 produced slightly higher RMSE (0.8-2.9 mm/8 days) and rRMSE (15%-26%).Korean sites have higher RMSE and similar rRMSE as US sites.The average rRMSE from the four sites is 22%.Sites 1 and 2 showed obviously similar temporal trends, as the aggregated ET had a decreasing agreement with MODIS ET during the growing season compared to leaf-off season.The agreement was slightly lower than, if not similar as, that between the Landsat modeled 1 km ET and MODIS ET shown in Figure 4.This means that the model performance could be used as an indicator to explain the resultant accuracy of downscaled ET with MODIS ET as reference data.
evapotranspiration.Although previous studies have shown that temperature-related variables have a dominant impact on satellite-based daily ET estimation, the correlation coefficients between LST and ET are only −0.59 on average in our study and those between TVDI and ET were only −0.32 when all seasons were considered.LST/TVDI were overall ranked less important than vegetation greenness and moisture indices.However, during growing seasons the relationships between LST/TVDI and ET were much stronger, as the correlation coefficients increased to −0.73 and −0.60, respectively, and the standard deviations declined.Albedo has little impact on the modeling results.The relationship between albedo and ET was weak ( = −0.38 for all seasons and −0.44 for growing season).The modified SAVI (MSAVI) was not superior over SAVI in this study, as the correlation was weaker and was not identified as an important variable.

Agreement between Landsat and MODIS ET
As RF outperformed SVR and it produced similar or slightly better accuracy than Cubist, it was used to generate the downscaled 30 m ET and Landsat modeled 1 km ET.In order to evaluate the overall agreement of Landsat downscaled ET against MODIS ET, the 30 m product was aggregated to 1 km grids (Landsat-aggregated 1 km ET), and RMSE and rRMSE were calculated for each site.As shown in Figure 6, RMSE of Landsat-aggregated 1 km ET was within 2 mm/8 days and rRMSE within 23% at Site 1. Site 2 produced slightly higher RMSE (0.8-2.9 mm/8 days) and rRMSE (15%-26%).Korean sites have higher RMSE and similar rRMSE as US sites.The average rRMSE from the four sites is 22%.Sites 1 and 2 showed obviously similar temporal trends, as the aggregated ET had a decreasing agreement with MODIS ET during the growing season compared to leaf-off season.The agreement was slightly lower than, if not similar as, that between the Landsat modeled 1 km ET and MODIS ET shown in Figure 4.This means that the model performance could be used as an indicator to explain the resultant accuracy of downscaled ET with MODIS ET as reference data.At cropland, which is mainly irrigated in Site 2, ET increased from April to July, and then decreased until the end of the year.ET in grassland showed a decreasing trend since April.As grass growth in this area is mainly controlled by the amount of precipitation, it is expected that ET is lower during the hot and dry summer.Spatial variation of ET at cropland was higher than the other vegetated land covers for all three products, and at cropland the Landsat downscaled product showed even more detailed spatial distribution of ET than MODIS and the Landsat modeled 1 km ETs.Since only three dates were available for Site 4, the temporal patterns of ET were not captured in details.However, ET at cropland was a bit lower than that in forests in the middle of the growing season (i.e., June) while ET was very low during the off-growing season (i.e., November)  As grass growth in this area is mainly controlled by the amount of precipitation, it is expected that ET is lower during the hot and dry summer.Spatial variation of ET at cropland was higher than the other vegetated land covers for all three products, and at cropland the Landsat downscaled product showed even more detailed spatial distribution of ET than MODIS and the Landsat modeled 1 km ETs.Since only three dates were available for Site 4, the temporal patterns of ET were not captured in details.However, ET at cropland was a bit lower than that in forests in the middle of the growing season (i.e., June) while ET was very low during the off-growing season (i.e., November) regardless of land cover type.The good agreement between Landsat and MODIS ET in terms of spatial and temporal patterns proved the potential of our downscaling approach.
regardless of land cover type.The good agreement between Landsat and MODIS ET in terms of spatial and temporal patterns proved the potential of our downscaling approach.

Spatial Variation of the Downscaled ET Product
It is obvious that the Landsat-scale ET map in Figures 7 and 8 showed more spatial details than coarse resolution ET.Zoomed views of Landsat downscaled ET at Site 2 (Figure 9) also demonstrated that the product was able to represent the spatiotemporal dynamics of ET at the patch level, with different patches showing varying trends of ET change.

Spatial Variation of the Downscaled ET Product
It is obvious that the Landsat-scale ET map in Figures 7 and 8 showed more spatial details than coarse resolution ET.Zoomed views of Landsat downscaled ET at Site 2 (Figure 9) also demonstrated that the product was able to represent the spatiotemporal dynamics of ET at the patch level, with different patches showing varying trends of ET change.In order to evaluate the improvement of spatial information represented by the downscaled ET, the differences between the 30 m and 1 km MODIS ETs within the corresponding 1 km grids was calculated and the statistics of the differences was analyzed for each land cover type.First, a 1 km land cover layer was generated using 30 m land cover products.The USGS National Land Cover Database (NLCD) 2011 [70] was used for US sites and three land cover products (MODIS Global 500m Terra and Aqua combined land cover product (MCD12Q1 IGBP) [71], European Space Agency (ESA) Global Land Cover Map 2009 product (GlobCover 2009) [72], and the Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) [73] were combined for Korea sites to ensure accurate land cover information [27].The three products were combined to create a new ensemble 30 m land cover data only using the pixels that agreed from all three land cover layers.Grids with pure land cover types, i.e., a single land cover occupying over 80% of the 1 km grid area were selected for further analysis.Boxplots in Figure 10 show the distribution of the differences between 30 m Landsat downscaled ET and MODIS ET within crop, forest, grass, and shrub land.For all sites, the median of the differences was around zero, indicating an overall unbiased estimation of ET using MODIS ET as a reference.Dispersion of the biases against MODIS ET increased from spring to summer and decreased during winter, implying that the downscaled ET revealed a higher spatial variation during growing seasons.At Site 1, the majority of the differences (25th-75th percentile) was within ±5 mm/8 days, and the extreme difference values were around ±10 mm/8 days.Cropland and forests showed a slightly greater dispersion of the differences than grassland.At Site 2, the difference values were on average within ±5 mm/8 days; the extreme values were on average ±20 mm/8 days at croplands and forest, and ±6 mm/8 days at grassland and shrub area.Sites 3 and 4 showed a similar pattern.For both sites, the majority of the differences (25th-75th percentile) were within ±10 mm/8 days, and the extreme difference values were around ±20 mm/8 days.Dispersion of the biases considerably decreased during the leaf-off season for Site 4, while it was maximized in the middle of the growing season for Site 3. In order to evaluate the improvement of spatial information represented by the downscaled ET, the differences between the 30 m and 1 km MODIS ETs within the corresponding 1 km grids was calculated and the statistics of the differences was analyzed for each land cover type.First, a 1 km land cover layer was generated using 30 m land cover products.The USGS National Land Cover Database (NLCD) 2011 [70] was used for US sites and three land cover products (MODIS Global 500m Terra and Aqua combined land cover product (MCD12Q1 IGBP) [71], European Space Agency (ESA) Global Land Cover Map 2009 product (GlobCover 2009) [72], and the Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) [73] were combined for Korea sites to ensure accurate land cover information [27].The three products were combined to create a new ensemble 30 m land cover data only using the pixels that agreed from all three land cover layers.Grids with pure land cover types, i.e., a single land cover occupying over 80% of the 1 km grid area were selected for further analysis.Boxplots in Figure 10 show the distribution of the differences between 30 m Landsat downscaled ET and MODIS ET within crop, forest, grass, and shrub land.For all sites, the median of the differences was around zero, indicating an overall unbiased estimation of ET using MODIS ET as a reference.Dispersion of the biases against MODIS ET increased from spring to summer and decreased during winter, implying that the downscaled ET revealed a higher spatial variation during growing seasons.At Site 1, the majority of the differences (25th-75th percentile) was within ˘5 mm/8 days, and the extreme difference values were around ˘10 mm/8 days.Cropland and forests showed a slightly greater dispersion of the differences than grassland.At Site 2, the difference values were on average within ˘5 mm/8 days; the extreme values were on average ˘20 mm/8 days at croplands and forest, and ˘6 mm/8 days at grassland and shrub area.Sites 3 and 4 showed a similar pattern.For both sites, the majority of the differences (25th-75th percentile) were within ˘10 mm/8 days, and the extreme difference values were around ˘20 mm/8 days.Dispersion of the biases considerably decreased during the leaf-off season for Site 4, while it was maximized in the middle of the growing season for Site 3.

Comparisons between Satellite-Based and in Situ ET
Time series of satellite-based ET and in situ observations at the flux towers (US-ARM, US-Twt, Seolmacheon, and Chungmicheon) within the four study sites are summarized in Figure 11.At the four sites, both satellite-based and in situ ETs showed a similar seasonal trend, with a peak season around July and a low season in December at Sites 1, 2, and 4, and ET peak was found to be in the middle of the growing season at Site 3. Compared to the other sites, the flux tower at Site 2 (US-Twt) observed greater ET in growing seasons.This is because the flux tower is located within a rice paddy that consumes a greater amount of water than the wheat does at Site 1 (similarly rice paddy at Site 4).Each year, the field is flooded after rice is planted in spring, and drained when rice is harvested in early fall.Thus, ET experiences a rapid rise from April to June and a sharp descent from September.

Comparisons between Satellite-Based and in situ ET
Time series of satellite-based ET and in situ observations at the flux towers (US-ARM, US-Twt, Seolmacheon, and Chungmicheon) within the four study sites are summarized in Figure 11.At the four sites, both satellite-based and in situ ETs showed a similar seasonal trend, with a peak season around July and a low season in December at Sites 1, 2, and 4, and ET peak was found to be in the middle of the growing season at Site 3. Compared to the other sites, the flux tower at Site 2 (US-Twt) observed greater ET in growing seasons.This is because the flux tower is located within a rice paddy that consumes a greater amount of water than the wheat does at Site 1 (similarly rice paddy at Site 4).Each year, the field is flooded after rice is planted in spring, and drained when rice is harvested in early fall.Thus, ET experiences a rapid rise from April to June and a sharp descent from September.Although satellite-based products captured the temporal trend of ET variation at the four study sites, the values of ET had large discrepancies from the in situ ET measurements, especially during growing seasons.Figures 11a, b  Although satellite-based products captured the temporal trend of ET variation at the four study sites, the values of ET had large discrepancies from the in situ ET measurements, especially during growing seasons.Figure 11a, b, and d show a large underestimation of ET at the crop sites, especially for the MODIS-based, Landsat modeled 1 km, and Landsat aggregated 1 km products, with the bias as great as ´20 mm/8 days.The downscaled 30 m ET is similar to MODIS ET at US-ARM (Site 1, during the year 2013) and Cheongmicheon (Site 4), while at US-Twt (Site 2) it is on average 13 mm/8 days higher than MODIS ET during growing seasons, which is closer to in situ ET.Further examination indicates that compared to Sites 1 and 4, the crop patches are more heterogeneous within a 1 km grid around the flux measurement tower at Site 2 (Figure 11f).The measurement tower is located within a rice patch, with higher vegetation greenness and wetness values and lower LST observation compared to the corn, alfalfa and fallow/idle crop patches surrounding the tower [74], thus higher ET estimates than the 1 km averaged ET within the MODIS 1 km grid.The higher and more accurate downscaled ET estimation at US-Twt site shows that the downscaling models could capture the ET spatial variation.Although it would be ideal to select homogeneous lands for our model evaluation, US-ARM and US-Twt are the only crop sites that provide flux measurements going up to 2014, and they represent typical climate types.Cheongmicheon (Site 4) is surrounded by rice paddy fields and a small portion of forested areas.Cheongmicheon showed a similar pattern with Sites 1 and 2, but the differences between the satellite-based and in situ ETs in Site 4 were smaller (~10 mm/8 days for the growing season and ~1 mm/8 days for the off-growing season) than in Sites 1 and 2, which could be due to relatively homogeneous crop areas and nearby forests.Unlike other sites, satellite-derived ETs were higher than in situ ET measurements for all three dates in the growing season at Site 3 (Figure 11c), where forest is the dominant land cover.Hu et al. [11] reported similar findings that the MOD16 ET product tends to overestimate forest ET mainly due to high LAI values used as input to ET estimation models.It was not able to look into seasonal variation, as there were only three dates of satellite-derived ETs all of which were from the growing season.While there were also three dates of satellite-derived ETs in Site 4, one was from the growing season and the other two from the off-growing season.Both in situ and satellite-derived ET showed a similar seasonal pattern at Sites 1 and 2.
Figure 12 shows the scatterplots of satellite-based ET estimates against in situ observations at each of the four sites, and Table 6 lists their linear agreement.Note that linear regression was not calculated for Sites 3 and 4 because there were only three measurements for the sites.At Site 13mm/8days higher than MODIS ET during growing seasons, which is closer to in situ ET.Further examination indicates that compared to Sites 1 and 4, the crop patches are more heterogeneous within a 1 km grid around the flux measurement tower at Site 2 (Figure 11f).The measurement tower is located within a rice patch, with higher vegetation greenness and wetness values and lower LST observation compared to the corn, alfalfa and fallow/idle crop patches surrounding the tower [74], thus higher ET estimates than the 1 km averaged ET within the MODIS 1 km grid.The higher and more accurate downscaled ET estimation at US-Twt site shows that the downscaling models could capture the ET spatial variation.Although it would be ideal to select homogeneous lands for our model evaluation, US-ARM and US-Twt are the only crop sites that provide flux measurements going up to 2014, and they represent typical climate types.Cheongmicheon (Site 4) is surrounded by rice paddy fields and a small portion of forested areas.Cheongmicheon showed a similar pattern with Sites 1 and 2, but the differences between the satellite-based and in situ ETs in Site 4 were smaller (~10 mm/8 days for the growing season and ~1 mm/8 days for the off-growing season) than in Sites 1 and 2, which could be due to relatively homogeneous crop areas and nearby forests.Unlike other sites, satellite-derived ETs were higher than in situ ET measurements for all three dates in the growing season at Site 3 (Figure 11c), where forest is the dominant land cover.Hu et al. [11] reported similar findings that the MOD16 ET product tends to overestimate forest ET mainly due to high LAI values used as input to ET estimation models.It was not able to look into seasonal variation, as there were only three dates of satellite-derived ETs all of which were from the growing season.While there were also three dates of satellite-derived ETs in Site 4, one was from the growing season and the other two from the off-growing season.Both in situ and satellite-derived ET showed a similar seasonal pattern at Sites 1 and 2.
Figure 12 shows the scatterplots of satellite-based ET estimates against in situ observations at each of the four sites, and Table 6 lists their linear agreement.Note that linear regression was not calculated for Sites 3 and 4 because there were only three measurements for the sites.At Site 1, downscaled ET yielded the highest R 2 (0.76) and lowest RMSE (11.8 mm/8 days) compared to coarse resolution products.At Site 2, although downscaled ET produced lower R 2 than MODIS ET, it still gave the lowest RMSE.

Novelty, Performances and Opportunities
In this paper, machine learning approaches were first introduced for ET downscaling from coarse resolution to fine resolution by combining MODIS 8-day ET product and Landsat 8-derived eleven indicators.The rRMSEs of RF models were less than 20% and within the error range of the MODIS ET product (rRMSE around 25%), indicating the reliability of the downscaling model.Among the eleven indicators, vegetation greenness-related indices such as NDVI/EVI/SAVI were the most important variables used in the machine learning modeling.In Penman-Monteith model that was used to generate MOD16 ET products, remotely-sensed LAI is an important input parameter for quantification of wet canopy surface evaporation and plant transpiration [33].As LAI and NDVI/EVI/SAVI are highly correlated, it is reasonable that these variables were highly important.These indices have also long been recognized as important indicators for ET estimation, especially at crop areas that were dominant land cover types in study Sites 1, 2 and 4. NDVI is on average the most important indicator among the VIs.Although NDVI is known to have saturation issues and it can be influenced by soil background, the use of EVI and other vegetation greenness related indices such as SAVI could help to avoid these issues and ensures the reliability of the models.Vegetation moisture-based indicators such as NDIIb7 and NDWI were also important because these variables describe soil moisture and leaf water content, higher leaf moisture results in greater amount of water transpired from leaves, especially during growing seasons.Interestingly, LST-related indicators were less important.Unlike SEB models that rely on LST to derive ET, the Penman-Monteith model did not use remotely sensed LST as an input.On the other hand, LST derived from Landsat 8 imagery only represents the instant thermal conditions at the image acquisition date/time.Although our investigation showed that the instant Landsat-derived LST had high correlation with the MODIS 8-day average LST product (MOD11A2) (ρ > 0.8 and mean R 2 = 0.83), the surface temperature can vary considerably during the 8-day period.Therefore, it is not surprising that LST has less of a contribution to the modeling of 8-day aggregated ET.
Our results showed that the downscaled ET is overall consistent with MODIS ET product, and it demonstrates detailed spatial variation and temporal trend at 30 m resolution.The downscaling approach assumes that Landsat-derived indicators characterize vegetation conditions during each 8-day period.The overall consistency between downscaled and MODIS ET proved that this assumption is reasonable even when the resolution of MODIS ET product is over 1000 times (33 ˆ33) coarser than Landsat data.Greater variation of the downscaled ET over cropland was observed as crop ET varies significantly with irrigation and growth.This also demonstrated the capability of the product to represent the spatial and temporal heterogeneity of ET.The ET estimates have better agreement with in situ measurements, especially at Site 2 where ET heterogeneity is high within 1 km grids.Because the downscaling models rely on the MODIS ET estimation, the accuracy of the MODIS product highly impacts the finer-resolution estimation.With the improved accuracy of coarse resolution global or regional ET products, it is expected that our downscaling approach can achieve better performance in various regions with different climate and environmental characteristics.
The downscaling method is easy to implement; it does not require computation of Landsat daily ET, hence does not need parameter tuning for ET estimation models such as SEB, and the method is not impacted by the accuracy of meteorological parameters such as solar radiation and wind speed.All indicators used in the machine learning models can be easily derived based on Landsat 8 OLI and TIRS data.Because the satellite-based indicators are spatially explicit, the method can be applied over large areas or basins for ET estimation.It is also expected that this approach can also be applied to the downscaling of daily MODIS ET or ET from other coarse-resolution satellites such as MERIS.

Challenges
One major limitation of the proposed approach is that the accuracy of downscaled ET is greatly impacted by the accuracy of coarse-resolution ET.As many methods have been developed for improving MODIS ET [75,76], better MODIS ET estimation is expected to increase the accuracy of the downscaled product.Another disadvantage of the approach is that downscaled ET can only be produced on cloudless image acquisition days.In areas with more frequent cloud cover, such as South Korea, the approach might not be able to describe the temporal trends of ET at higher spatial resolution.As Landsat 8 collects images in an 8-day offset from Landsat 7, our approach may provide ET at higher temporal resolution using the combination of the two datasets.In our future research, spatial-temporal fusion of Landsat data and MODIS ET will be explored to increase the temporal resolution of downscaled ET.The proposed approach may require post processing over the areas overlapped between adjacent Landsat scenes or the boundary areas of MODIS tiles where there could be spatial discrepancy of ET estimates.One possible solution would be to use merging algorithms typically used to mosaic images considering spatiotemporal variations of ET estimates.An alternative is to use fusion algorithms that are often used for downscaling such as Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM; [22,77]) to mitigate the variations of ET estimates over such overlapped areas.

Conclusions
In this study, we proposed a machine learning-based method for MODIS ET downscaling with Landsat 8 data.Eleven indicators, including albedo, LST, and VIs such as NDVI, EVI, SAVI, MSAVI, NDMI, NDWI, D1609, NDIIb7 and TVDI, derived from Landsat 8 data, were used as predictor variables and MODIS 8-day 1 km ET product (MOD16A2) as a response variable to build machine learning models.Among the three machine learning algorithms including SVR, Cubist, and RF examined, RF and Cubist resulted in higher model accuracies.rRMSEs from RF models were less than 20% for the four study sites in the US and South Korea, which were within the error range of the MODIS ET product (rRMSE around 25%).The variable importance analysis showed that vegetation growth-related indicators such as NDVI, EVI, and SAVI were most important in 8-day ET modeling by RF, while vegetation moisture-based indicators such as NDIIb7 and NDWI had slightly lower importance.LST-related indicators representing temperature at the time of image acquisition were less important than either vegetation greenness or wetness indicators.The predicted Landsat downscaled ET had an overall good agreement with MODIS ET, as the average rRMSE was around 22%.The downscaled ET product showed a similar temporal trend as MODIS ET.In addition, the product demonstrated higher spatial details; especially in crop areas, the downscaled product could represent ET variation at patch scale.Neither MODIS ET nor Landsat downscaled ET product yielded a very good agreement with in situ ET probably because of land cover heterogeneity around flux tower sites, however, the downscaled ET estimates were closer to in situ observations (R 2 = 0.76 and 0.77, RMSE = 11.8 and 12.5 mm/8 days for Site 1 and Site 2, respectively).The results showed the potential of using machine learning approaches for ET downscaling considering their effectiveness and ease of implementation.Our next study will explore spatial-temporal fusion methods that combine the advantages of high spatial resolution of Landsat data and high temporal resolution of MODIS ET in order to generate 8-day ET at 30 m resolution.

3
˝C and annual precipitation is around 880 mm.Twitchell Island is within the Sacramento-San Joaquin Delta approximately 100 km inland from the Pacific Ocean and a Mediterranean climate zone, with hot, dry summers, and cool, wet winters (Köppen climate map classified as Csb).The mean air temperature is 15.1 ˝C and the mean annual precipitation is 335 mm.Two sites in South Korea are Seolmacheon in Paju (Site 3: 126.95 ˝E, 37.93 ˝N) and Cheongmicheon in Yeoju (Site 4: 126.65 ˝E, 37.15 ˝N).The climatic characteristics of the two sites are similar.Both are of humid climate and hot in summer while cold in winter.Annual mean temperature of the sites is about 11.5 ˝C and annual precipitation is about 1150 mm.While Seolmacheon is located in a mixed forest, Cheongmicheon is located in rice paddy fields.Remote Sens. 2016, 8, 215 4 of 26 cool, wet winters (Köppen climate map classified as Csb).The mean air temperature is 15.1 °C and the mean annual precipitation is 335 mm.Two sites in South Korea are Seolmacheon in Paju (Site 3: 126.95°E, 37.93°N) and Cheongmicheon in Yeoju (Site 4: 126.65°E, 37.15°N).The climatic characteristics of the two sites are similar.Both are of humid climate and hot in summer while cold in winter.Annual mean temperature of the sites is about 11.5 °C and annual precipitation is about 1150 mm.While Seolmacheon is located in a mixed forest, Cheongmicheon is located in rice paddy fields.

Figure 1 .
Figure 1.Location of the study Sites 1, 2, 3 and 4 over (a) US and (b) South Korea.

A
total of 32 Landsat 8 Level 1 (L1T) images with minimum cloud cover (<5% within the study area) during the year 2013-2014 were obtained from USGS Global Visualization Viewer (http://glovis.usgs.gov/),including 11 images over Site 1 from 3 June 2013 to 25 August 2014; 15 images over Site 2 from 16 April 2013 to 9 August 2014, and three images over each of Sites 3 and 4 during the growing season (Figure 2, Table

Figure 1 .
Figure 1.Location of the study Sites 1, 2, 3 and 4 over (a) US and (b) South Korea.
cool, wet winters (Köppen climate map classified as Csb).The mean air temperature is 15.1 °C and the mean annual precipitation is 335 mm.Two sites in South Korea are Seolmacheon in Paju (Site 3: 126.95°E, 37.93°N) and Cheongmicheon in Yeoju (Site 4: 126.65°E, 37.15°N).The climatic characteristics of the two sites are similar.Both are of humid climate and hot in summer while cold in winter.Annual mean temperature of the sites is about 11.5 °C and annual precipitation is about 1150 mm.While Seolmacheon is located in a mixed forest, Cheongmicheon is located in rice paddy fields.

Figure 1 .
Figure 1.Location of the study Sites 1, 2, 3 and 4 over (a) US and (b) South Korea.

A total of 32
Landsat 8 Level 1 (L1T) images with minimum cloud cover (<5% within the study area) during the year 2013-2014 were obtained from USGS Global Visualization Viewer (http://glovis.usgs.gov/),including 11 images over Site 1 from 3 June 2013 to 25 August 2014; 15 images over Site 2 from 16 April 2013 to 9 August 2014, and three images over each of Sites 3 and 4 during the growing season (Figure 2, Table

Figure 2 .
Figure 2. Landsat 8 OLI color-infrared images for: (a) Site 1, image collected on 6 August 2013; (b) Site 2, image collected on 21 July 2013; (c) Site 3, image collected on 11 May 2013; and (d) Site 4, image collected on 5 June 2013.Yellow outlines show the boundary of the study area, blue rectangles show the areas around flux towers of sites 2 and 4, and green triangles indicate the location of flux towers.

Figure 2 .
Figure 2. Landsat 8 OLI color-infrared images for: (a) Site 1, image collected on 6 August 2013; (b) Site 2, image collected on 21 July 2013; (c) Site 3, image collected on 11 May 2013; and (d) Site 4, image collected on 5 June 2013.Yellow outlines show the boundary of the study area, blue rectangles show the areas around flux towers of sites 2 and 4, and green triangles indicate the location of flux towers.

Figure 3 .
Figure 3. Flow diagram of the machine learning-based ET downscaling model.

Figure 3 .
Figure 3. Flow diagram of the machine learning-based ET downscaling model.

Figure 5 .
Figure 5. Importance of the 11 variables represented by the average increase in Mean Squared Error (MSE) of evapotranspiration (ET) resulted from all RF models at the four sites.

Figure 5 .
Figure 5. Importance of the 11 variables represented by the average increase in Mean Squared Error (MSE) of evapotranspiration (ET) resulted from all RF models at the four sites.

Figures 7
Figures7 and 8show time series of the ET maps at the northeast part of Site 2 (US-Twt) and the southwest part of Site 4 (Cheongmicheon) through 2013 and the corresponding land cover types.Both Landsat modeled 1 km and Landsat downscaled 30 m ET showed very similar spatiotemporal patterns as MODIS ET.At cropland, which is mainly irrigated in Site 2, ET increased from April to July, and then decreased until the end of the year.ET in grassland showed a decreasing trend since April.As grass growth in this area is mainly controlled by the amount of precipitation, it is expected that ET is lower during the hot and dry summer.Spatial variation of ET at cropland was higher than the other vegetated land covers for all three products, and at cropland the Landsat downscaled product showed even more detailed spatial distribution of ET than MODIS and the Landsat modeled 1 km ETs.Since only three dates were available for Site 4, the temporal patterns of ET were not captured in details.However, ET at cropland was a bit lower than that in forests in the middle of the growing season (i.e., June) while ET was very low during the off-growing season (i.e., November)

Figures 7 and 8
Figures 7 and 8 show time series of the ET maps at the northeast part of Site 2 (US-Twt) and the southwest part of Site 4 (Cheongmicheon) through 2013 and the corresponding land cover types.Both Landsat modeled 1 km and Landsat downscaled 30 m ET showed very similar spatiotemporal patterns as MODIS ET.At cropland, which is mainly irrigated in Site 2, ET increased from April to July, and then decreased until the end of the year.ET in grassland showed a decreasing trend since April.As grass growth in this area is mainly controlled by the amount of precipitation, it is expected that ET is lower during the hot and dry summer.Spatial variation of ET at cropland was higher than the other vegetated land covers for all three products, and at cropland the Landsat downscaled product showed even more detailed spatial distribution of ET than MODIS and the Landsat modeled 1 km ETs.Since only three dates were available for Site 4, the temporal patterns of ET were not captured in details.However, ET at cropland was a bit lower than that in forests in the middle of the growing season (i.e., June) while ET was very low during the off-growing season (i.e., November) regardless of land cover type.The good agreement between Landsat and MODIS ET in terms of spatial and temporal patterns proved the potential of our downscaling approach.

Figure 7 .
Figure 7. MODIS 8-day ET, Landsat 1 km ET and downscaled Landsat 30 m ET in the northeast part of Site 2 (refer to Figure 2b blue outline for the area boundary) during the year of 2013.

Figure 7 .
Figure 7. MODIS 8-day ET, Landsat 1 km ET and downscaled Landsat 30 m ET in the northeast part of Site 2 (refer to Figure 2b blue outline for the area boundary) during the year of 2013.

Figure 8 .
Figure 8. MODIS 8-day ET, Landsat 1 km ET and downscaled Landsat 30 m ET in the southwest part of Site 3 (refer to Figure 2f blue outline for the area boundary) during the year of 2013.

Figure 8 .
Figure 8. MODIS 8-day ET, Landsat 1 km ET and downscaled Landsat 30 m ET in the southwest part of Site 3 (refer to Figure 2f blue outline for the area boundary) during the year of 2013.

Figure 9 .
Figure 9. Zoomed view of Landsat 30 m downscaled ET (mm/8-days) in the northeast part of Site 2 during the year of 2013.

Figure 9 .
Figure 9. Zoomed view of Landsat 30 m downscaled ET (mm/8-days) in the northeast part of Site 2 during the year of 2013.

Figure 10 .
Figure 10.Boxplots of difference between Landsat downscaled 30 m ET and MODIS ET in forest, crop, shrub and grass areas for: (a) Site 1; (b) Site 2; (c) Site 3; and (d) Site 4.Figure 10.Boxplots of difference between Landsat downscaled 30 m ET and MODIS ET in forest, crop, shrub and grass areas for: (a) Site 1; (b) Site 2; (c) Site 3; and (d) Site 4.

Figure 11 .
Figure 11.Comparison with in-situ ET at: (a) Site 1 (US-ARM flux tower); (b) Site 2 (US-Twt flux tower); (c) Site 3; and (d) Site 4. (e-h) The close view of the tower sites in July 2013 for Sites 1-3 and May 2013 for Site 4, with a black square denoting 1 km grid around each tower location: (e) Site 1; (f) Site 2; (g) Site 3; and (h) Site 4.
, and d show a large underestimation of ET at the crop sites, especially for the MODIS-based, Landsat modeled 1 km, and Landsat aggregated 1 km products, with the bias as great as −20 mm/8 days.The downscaled 30 m ET is similar to MODIS ET at US-ARM (Site 1, during the year 2013) and Cheongmicheon (Site 4), while at US-Twt (Site 2) it is on average

Figure 11 .
Figure 11.Comparison with in-situ ET at: (a) Site 1 (US-ARM flux tower); (b) Site 2 (US-Twt flux tower); (c) Site 3; and (d) Site 4. (e-h) The close view of the tower sites in July 2013 for Sites 1-3 and May 2013 for Site 4, with a black square denoting 1 km grid around each tower location: (e) Site 1; (f) Site 2; (g) Site 3; and (h) Site 4.

Table 1 .
Landsat 8 images and MOD16A2 products for all sites.

Table 1 .
Landsat 8 images and MOD16A2 products for all sites.

Table 3
lists the band settings of Landsat 8 OLI in comparison with Landsat 7 ETM+ sensor.Landsat 8 OLI Cirrus band (band 9: 1.36-1.39µm) was not included in calculation due to its high absorption and low reflective characteristics.

Table 3 .
Landsat 8 OLI and Landsat 7 ETM+ band width and UP b and LO b in Equation (5).

Table 4 .
Weighting coefficients based on at-surface solar radiation derived from SMARTS.

Table 5 .
Correlation coefficients between Landsat variables and MODIS ET.

Table 5 .
Correlation coefficients between Landsat variables and MODIS ET.

June-September) Average ρ Standard Deviation of ρ Average ρ Standard Deviation of ρ
1, downscaled ET yielded the highest R 2 (0.76) and lowest RMSE (11.8 mm/8 days) compared to coarse resolution products.At Site 2, although downscaled ET produced lower R 2 than MODIS ET, it still gave the lowest RMSE.

Table 6 .
Linear regression results between satellite-based ET and in situ ET at Sites 1 and 2.