Developing a Gap-Filling Algorithm Using DNN for the Ts-VI Triangle Model to Obtain Temporally Continuous Daily Actual Evapotranspiration in an Arid Area of China

Abstract: Temporally continuous daily actual evapotranspiration (ET) data play a critical role in water resource management in arid areas. As a typical remotely sensed land surface temperature (LST)-based ET model, the surface temperature-vegetation index (Ts-VI) triangle model provides direct monitoring of ET, but its estimates are temporally discontinuous due to cloud contamination. In this work, we present a gap-filling algorithm (TSVI_DNN) that uses a deep neural network (DNN) with the Ts-VI triangle model to obtain temporally continuous daily actual ET at the regional scale. The TSVI_DNN model is evaluated against in situ measurements in an arid area of China during 2009–2011 and shows good agreement with eddy covariance (EC) observations. The temporal coverage improved from 16.1% with the original Ts-VI triangle model to 67.1% with the TSVI_DNN model. The correlation coefficient (R), root mean square error (RMSE), bias, and mean absolute difference (MAD) are 0.9, 0.86 mm d⁻¹, −0.16 mm d⁻¹, and 0.65 mm d⁻¹, respectively. Compared with the National Aeronautics and Space Administration (NASA) official MOD16 version 6 ET product, the RMSE of the TSVI_DNN ET estimates is approximately 49.2% lower. The method presented here can potentially contribute to enhanced water resource management in arid areas, especially under climate change.


Introduction
Evapotranspiration (ET) is a critical component of the water cycle and water balance because it links a number of ecological and hydrological processes [1]. In arid areas, actual ET is a vital consumptive use of water derived from precipitation and irrigation, thereby affecting crop yield [2]. Hence, actual ET is a key measurement for irrigation programs, particularly where there is insufficient precipitation to meet crop growth requirements [3,4]. However, as a result of increased irrigation, drinking water demands, and urban water usage, groundwater levels have decreased significantly through over-pumping [5][6][7]. To improve both water use efficiency and the level of water resource management to balance these different water demands, it is necessary to obtain temporally continuous daily actual ET data to more accurately calculate total water consumption.
The surface temperature-vegetation index (Ts-VI) model is a typical remotely sensed land surface temperature (LST)-based ET model [8]. It depends primarily on LST and is presently most applicable for
The primary crop is spring maize, which grows well as a result of at least five irrigation events every year. There are more than ten eddy covariance (EC) stations in the area, each with a footprint of 200-300 m. In addition to EC data, in situ data have been collected by the Watershed Allied Telemetry Experimental Research (WATER) [20] since 2008. In this work, two in situ datasets were used (Figure 1b): (1) a three-year EC observation dataset at the Yingke station (YK), and (2) an EC network of six EC stations with measurements taken during May-September 2012. Daily ET was aggregated from 30-min observations after removing the days that had gaps due to rainfall and instrument malfunctions. These datasets were provided by the National Tibetan Plateau Data Center (http://data.tpdc.ac.cn) [32][33][34].

Data
Thirteen remote sensing and meteorological forcing datasets were used in this study (Table 1). While LST and NDVI were used in the Ts-VI triangle model, NDVI, albedo, surface soil moisture, wind speed (WS), air temperature (AT), relative humidity (RH), air pressure (AP), downward shortwave radiation (DSR), and downward longwave radiation (DLR) were used in DNN training. Precipitation and land cover data were used in data pre-processing, and MOD16 ET was used for comparison. Details of these datasets are given as follows.

Remote Sensing Data
Moderate Resolution Imaging Spectroradiometer (MODIS) LST (MOD11A1), NDVI (MOD13A2), and ET (MOD16A2) data were collected from the National Aeronautics and Space Administration (NASA) [35]. We used high-quality MODIS LST products, as determined from the quality control files, and removed images in which valid data covered less than 90% of the study area or for which precipitation was greater than 1 mm d⁻¹. Gap-free NDVI products with a temporal resolution of 1 day were obtained using Harmonic Analysis of Time Series (HANTS) [27,36]. LST and NDVI are the two main parameters used in the Ts-VI triangle model.
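The image screening rule described above (at least 90% valid LST pixels and no more than 1 mm d⁻¹ of precipitation) can be illustrated with a minimal sketch. The function names, the use of `None` for a missing pixel, and the use of a single precipitation value per image are assumptions made for illustration; the source does not describe how the rule was implemented.

```python
from typing import Optional, Sequence


def valid_fraction(lst_pixels: Sequence[Optional[float]]) -> float:
    """Fraction of study-area pixels that carry a valid LST value (None = missing)."""
    if not lst_pixels:
        return 0.0
    return sum(v is not None for v in lst_pixels) / len(lst_pixels)


def keep_image(lst_pixels: Sequence[Optional[float]],
               precip_mm_per_day: float,
               min_valid: float = 0.90,
               max_precip: float = 1.0) -> bool:
    """Keep an image only if enough pixels are valid and the day was nearly dry.

    precip_mm_per_day is a hypothetical single value per image (for example an
    area mean); the thresholds follow the text above.
    """
    return (valid_fraction(lst_pixels) >= min_valid
            and precip_mm_per_day <= max_precip)
```

An image with 95% valid pixels on a dry day passes, whereas the same image on a day with 2 mm of precipitation is rejected.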
Albedo data related to net radiation are a component of the Global LAnd Surface Satellite (GLASS) products generated by the inquiry team at Beijing Normal University. The dataset provides gap-free albedo data with a spatial resolution of 1 km and a temporal resolution of 8 days. These data were retrieved from the MODIS data using an angular bin algorithm and a statistics-based temporal filtering method [37].
The MOD16A2 ET version 6 product was utilized as a benchmark to compare with our results. This product is an 8-day composite ET dataset produced at a 500-m spatial resolution. The MOD16 ET algorithm is based on the Penman-Monteith model driven by daily meteorological forcing data from MERRA GMAO at about 0.5° × 0.6° resolution and MODIS land surface parameters, including NDVI, LAI (leaf area index), albedo, and land cover [10,38].
All of these datasets were re-projected and resampled to 0.01° and, with the exception of MOD16 data, to 1 day using a bilinear resampling method.

Meteorological Forcing Data
Meteorological forcing data, including WS, AT, RH, precipitation, AP, DSR, and DLR, were provided by the Data Assimilation and Modeling Center for Tibetan Multi-spheres (DAMCTM), Institute of Tibetan Plateau Research, Chinese Academy of Sciences [41,42], with a spatial resolution of 0.1° and a temporal resolution of 3 h. Station data from the China Meteorological Administration (CMA), TRMM precipitation (3B42), GEWEX-SRB radiation, and GLDAS data were used to produce this dataset.
The dataset was downscaled to a spatial resolution of 0.01° using a semi-empirical relationship and a digital elevation model (DEM) [43,44], with the exception of precipitation and wind speed, which were downscaled using a bilinear resampling method. All of the data were then aggregated temporally from 3 h to 1 day.
The final temporally continuous reference dataset, with a spatial resolution of 0.01° and a temporal resolution of 1 day, was used to train the DNN.
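As an illustration of the temporal aggregation from 3 h to 1 day, the following minimal sketch groups eight 3-hourly values into one daily value. The source does not state which statistic was used for each variable; averaging state variables and fluxes and summing precipitation amounts are assumptions here, and the function name is hypothetical.

```python
from typing import List, Sequence

STEPS_PER_DAY = 8  # eight 3-hourly records make one day


def daily_from_3h(values: Sequence[float], how: str = "mean") -> List[float]:
    """Aggregate a 3-hourly series to daily values.

    how="mean" suits quantities such as air temperature or radiation flux;
    how="sum" suits per-step precipitation amounts (both are assumptions).
    """
    if len(values) % STEPS_PER_DAY != 0:
        raise ValueError("series must contain whole days of 3-hourly records")
    days = [values[i:i + STEPS_PER_DAY]
            for i in range(0, len(values), STEPS_PER_DAY)]
    if how == "mean":
        return [sum(day) / STEPS_PER_DAY for day in days]
    if how == "sum":
        return [sum(day) for day in days]
    raise ValueError("how must be 'mean' or 'sum'")
```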

Methods
The workflow for this study (Figure 2) is as follows. First, daily ET estimates are obtained from the Ts-VI triangle model under clear-sky conditions. Second, the DNN is trained using the ET estimates from the first step. Finally, temporally continuous daily actual ET is obtained by running the trained DNN with all-sky reference data, including remote sensing information and meteorological forcing data. The correlation coefficient (R), root mean square error (RMSE), bias, and mean absolute difference (MAD) were used as error metrics.
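The four error metrics named above can be computed as in the following minimal sketch. It uses the standard definitions, with bias taken as estimate minus observation so that negative values indicate underestimation; the function name and the dictionary layout are illustrative choices and are not taken from the source.

```python
import math
from typing import Dict, Sequence


def error_metrics(estimated: Sequence[float],
                  observed: Sequence[float]) -> Dict[str, float]:
    """Return R, RMSE, bias and MAD for paired estimates and observations."""
    n = len(observed)
    if n != len(estimated) or n < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [e - o for e, o in zip(estimated, observed)]
    bias = sum(diffs) / n                            # mean signed difference
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # root mean square error
    mad = sum(abs(d) for d in diffs) / n             # mean absolute difference
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    # Pearson correlation; undefined (division by zero) for a constant series.
    r = cov / math.sqrt(var_e * var_o)
    return {"R": r, "RMSE": rmse, "bias": bias, "MAD": mad}
```

For example, estimates that are uniformly 1 mm d⁻¹ below the observations give a bias of −1, an RMSE and MAD of 1, and an R of 1, which shows why R alone cannot reveal a systematic underestimation.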

Remote Sens. 2020, 12, 1121
The relationship between LST and VI can be used to describe the evaporative ability of the land surface based on the assumption that LST varies for a given VI based primarily on soil moisture availability rather than atmospheric forcing differences over a relatively flat area. Hence, a Ts-VI triangular shape can be obtained when there are a number of pixels over a flat area formed by two physical bounds: the upper decreasing dry and lower horizontal wet edges, representing zero ET and potential ET, respectively [8,15,45]. In this study, NDVI was replaced by the fraction of vegetation (Fc), which appears to be more representative of the relative proportionality between soil and vegetation within a pixel [45].
where NDVImin and NDVImax are the minimum and maximum NDVI, which are assumed to be 0.1 and 0.9, respectively, in this work based on the MOD13A2 data.
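A minimal sketch of the NDVI-to-Fc conversion is given below. The squared scaling is a commonly used form and is an assumption here, because the exact expression is not reproduced in this text; the bounds of 0.1 and 0.9 are the minimum and maximum NDVI, and clamping to the range 0-1 is an added safeguard. The function name is hypothetical.

```python
def fractional_vegetation_cover(ndvi: float,
                                ndvi_min: float = 0.1,
                                ndvi_max: float = 0.9,
                                exponent: float = 2.0) -> float:
    """Scale NDVI between its bounds and raise it to an assumed exponent.

    exponent=2.0 is an assumption (a commonly used squared form);
    setting exponent=1.0 gives a purely linear scaling instead.
    """
    scaled = (ndvi - ndvi_min) / (ndvi_max - ndvi_min)
    scaled = min(1.0, max(0.0, scaled))  # clamp values outside the NDVI bounds
    return scaled ** exponent
```

With these defaults, an NDVI equal to the minimum gives Fc = 0, an NDVI equal to the maximum gives Fc = 1, and NDVI = 0.5 gives Fc = 0.25.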
Using the Ts-VI triangle model consisting of a spatial relationship between LST and Fc, the Priestley-Taylor equation was extended by Jiang et al. [8], such that the latent heat flux (λET) could be calculated as follows:
$$\lambda ET = EF\,(R_{n,ins} - G_{ins}) = \phi\,\frac{\Delta}{\Delta + \gamma}\,(R_{n,ins} - G_{ins})$$
where EF is the evaporative fraction (dimensionless); λ is the latent heat of vaporization; φ is a combined-effect parameter that is similar to α in the Priestley-Taylor equation and ranges from 0 to 1.26; Δ is the slope of the saturation vapor pressure curve with respect to air temperature (kPa °C⁻¹); γ is the psychrometric constant (kPa °C⁻¹); and R_{n,ins} and G_{ins} are instantaneous net radiation (W m⁻²) and soil heat flux (W m⁻²), respectively. The EF is obtained from the Ts-VI triangle method using the input LST as a surrogate for air temperature, while φ is obtained from the Ts-VI triangle space using a two-step interpolation scheme [31]. Assuming that EF is constant during a single day under clear-sky conditions, daily ET could be estimated using the following equation [46]:
$$ET_{daily} = \frac{EF\,(R_{n,daily} - G_{daily})}{\lambda}$$
where R_{n,daily} is daily net radiation (W m⁻²) and G_{daily} is daily soil heat flux (W m⁻²; normally assumed negligible) [4].
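The EF-based steps in this section can be sketched as follows, assuming that EF equals φΔ/(Δ + γ) and that daily ET is EF times the daily available energy divided by λ. The slope formula is the FAO-56 expression for the saturation vapor pressure curve, and the fixed λ of 2.45 MJ kg⁻¹ and the conversion from W m⁻² to mm d⁻¹ are assumptions added for illustration; they are not stated in the text.

```python
import math

LAMBDA_J_PER_KG = 2.45e6   # latent heat of vaporization (assumed constant)
SECONDS_PER_DAY = 86400.0


def slope_svp(t_air_c: float) -> float:
    """Slope of the saturation vapor pressure curve (kPa per deg C), FAO-56 form."""
    es = 0.6108 * math.exp(17.27 * t_air_c / (t_air_c + 237.3))
    return 4098.0 * es / (t_air_c + 237.3) ** 2


def evaporative_fraction(phi: float, delta: float, gamma: float) -> float:
    """EF = phi * delta / (delta + gamma), with phi between 0 and 1.26."""
    return phi * delta / (delta + gamma)


def daily_et_mm(ef: float, rn_daily: float, g_daily: float = 0.0) -> float:
    """Daily ET in mm per day from EF and daily mean Rn and G in W m^-2.

    1 kg of water per square metre equals a depth of 1 mm, so the energy flux
    is divided by lambda and multiplied by the number of seconds in a day.
    """
    return ef * (rn_daily - g_daily) * SECONDS_PER_DAY / LAMBDA_J_PER_KG
```

For instance, an EF of 0.8 with a daily net radiation of 150 W m⁻² and negligible soil heat flux corresponds to roughly 4.2 mm d⁻¹.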
In this work, a Ts-VI triangle model with enhanced edge determination was adopted to calculate EF [45]. This model is highly suitable for arid areas where the wet edge is difficult to find in images due to the surface's high evaporative capacity.

Training DNN Using ET Estimates from the Ts-VI Triangle Model under Clear-Sky Conditions
A four-layer, fully connected DNN (9-128-128-1) was employed (Figure 3). The input layer includes nine inputs: NDVI, albedo, WS, AT, RH, AP, DSR, DLR, and surface soil moisture (SM). The two hidden layers have 128 neurons each, a size that makes effective use of GPU acceleration. The output layer has a single neuron representing daily ET. To reduce the risk of overfitting, we limited the number of hidden layers to two and added a weight regularization term to the loss function. Meanwhile, about 20% of the data were randomly selected as validation data. With these steps, dropout was not needed, since our experiments showed that there was little risk of overfitting. Training was terminated after 1000 epochs. The root mean squared error (RMSE) was taken as the cost function. The learning rate was set to 0.001, with a decay factor of 0.9 applied every 200 epochs. The activation function is the rectified linear unit (ReLU), which preserves information about relative intensities as the information travels through multiple layers [47]. The 'Adam' optimization scheme was used to improve computational efficiency and reduce memory requirements, as it is well suited to problems that have large datasets [48]. Three years of reference data over the entire study area were used to train the DNN.
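The network shape and the learning-rate schedule described above can be made concrete with a small, dependency-free sketch. It is not the authors' implementation: the weights are random, training (Adam, the regularization term, the RMSE loss) is omitted, the step-wise reading of the learning-rate decay is an assumption, and all function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

LAYER_SIZES = [9, 128, 128, 1]  # nine inputs, two hidden layers, one ET output
Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def count_parameters(sizes: Sequence[int]) -> int:
    """Number of weights and biases in a fully connected network."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


def learning_rate(epoch: int, base: float = 0.001,
                  factor: float = 0.9, step: int = 200) -> float:
    """Step decay: multiply the rate by `factor` once every `step` epochs."""
    return base * factor ** (epoch // step)


def init_network(sizes: Sequence[int], seed: int = 0) -> List[Layer]:
    """Random small weights and zero biases (illustrative initialization)."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
              for _ in range(n_out)], [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(network: List[Layer], inputs: Sequence[float]) -> float:
    """Forward pass with ReLU on hidden layers and a linear output neuron."""
    activations = list(inputs)
    last = len(network) - 1
    for i, (weights, biases) in enumerate(network):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        activations = z if i == last else [max(0.0, v) for v in z]
    return activations[0]
```

Under these assumptions the 9-128-128-1 network has 17,921 trainable parameters, and the learning rate falls from 0.001 to about 0.00066 over the final 200 of the 1000 epochs.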
Daily ET estimates from the Ts-VI triangle method under clear-sky conditions were used in the DNN training process. Unlike conventional gap-filling methods, machine learning depends more on the quality of the training data than on its quantity. Hence, optimizing the application conditions of the Ts-VI triangle model is critical in improving the DNN performance (Section 2.2.1), especially its generalization ability.

Figure 3. Structure of the DNN used in the study.

Obtaining Temporally Continuous Daily ET by Driving the Trained DNN Using Reference Data
In the final step, the trained DNN is driven by all-sky temporally continuous daily reference input data. However, since the Ts-VI triangle model is not suitable for soil that is frozen (daily average air temperature less than 0 °C) or covered with snow (albedo greater than 0.3) [49], we limited predictions to days when the soil was neither frozen nor covered with snow. To maintain consistency in the results, the estimates of ET from the original Ts-VI model were discarded.
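The two exclusion criteria above translate directly into a per-day mask. The following minimal sketch uses the thresholds from the text; the function name and its scalar inputs are hypothetical.

```python
def is_predictable(air_temp_c: float, albedo: float,
                   frozen_below_c: float = 0.0,
                   snow_albedo: float = 0.3) -> bool:
    """True when the soil is neither frozen nor snow-covered.

    Frozen: daily average air temperature below 0 deg C.
    Snow-covered: albedo greater than 0.3.
    """
    frozen = air_temp_c < frozen_below_c
    snow = albedo > snow_albedo
    return not frozen and not snow
```

A day with an air temperature of 12 °C and an albedo of 0.18 is kept, whereas a day at −3 °C or with an albedo of 0.45 is skipped.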

Results
A model named TSVI_DNN was obtained by training the DNN using ET estimates from the original Ts-VI triangle model (hereafter labelled TSVI_Ori) for 2009-2011. The performance of the TSVI_DNN model clearly depends on the accuracy of the TSVI_Ori model, specifically on the availability of high-quality ET outputs from the TSVI_Ori model. Therefore, we first validated the quality-controlled ET estimates from the TSVI_Ori model and then evaluated the consistency between the TSVI_DNN and TSVI_Ori models. Next, performance was evaluated by a direct comparison of TSVI_DNN estimates with in situ measurements of ET and by an intercomparison with the NASA official MOD16 ET product, which was used as a benchmark. To focus on the objective, only daily results were validated. These results contain uncertainty from both the Ts-VI triangle model and the temporal upscaling from instantaneous to daily estimates.

Comparison with the Original Ts-VI Triangle Model
Using three years of satellite observations over the study area, there were sufficient samples to train the DNN. Strict quality control was applied when estimating ET with the TSVI_Ori model. Because the quality of the ET estimates from TSVI_Ori is critical in the DNN training process, we first evaluated them at the YK station using three years of in situ measurements. Figure 4a shows that ET estimates from TSVI_Ori plotted against in situ measurements lie very close to the 1:1 line, and an R of 0.9 (P < 0.01) indicates the model's ability to capture the high variability in the data. Accuracy was acceptable, with an RMSE of 0.94 mm d⁻¹ and an MAD of 0.71 mm d⁻¹. These results are comparable to those from previous studies of the Ts-VI triangle model [45]. Since bias was low, there was little underestimation of ET. When using satellite imagery, there are other important issues to consider, such as coarser spatial resolution, overpass frequency, and the possibility of cloud cover at overpass time (imagery delivery time). These issues sometimes limit the effectiveness of the aforementioned methods for mapping daily ET at a very high resolution (crop fields) and on a regular basis for near real-time irrigation scheduling [21].
Approximately 6.8 × 10⁵ high-quality ET estimates were obtained from the TSVI_Ori model for training the DNN, and 1.5 × 10⁵ were randomly selected as validation data. ET estimates from TSVI_DNN were compared with those from TSVI_Ori during 2009-2011. As shown in Figure 4b, ET estimates from TSVI_DNN are highly consistent with estimates from TSVI_Ori at the YK station. Regional performance is shown in Figure 5, which reveals that, for the most part, the ET estimates from TSVI_DNN are highly consistent with those from TSVI_Ori. The greatest difference between them is less than 1 mm d⁻¹.
Note that the DNN configuration, which was chosen to avoid overfitting, does not guarantee that the ET estimates from TSVI_DNN are the same as those from TSVI_Ori, even though a traditional DNN has this ability. For example, the difference between the outputs of the two models was relatively large when the ET estimates from TSVI_Ori were extremely high. This also indicates that there is a risk of missing extreme values when using the TSVI_DNN model, which is a focus of future work.
Figure 5. (a) A scatterplot of regional ET estimates from TSVI_DNN vs. those from TSVI_Ori; (b) a histogram of the difference between ET estimates from the TSVI_DNN and TSVI_Ori models.
To show the spatial consistency between the TSVI_DNN and TSVI_Ori models, ET images for DOY 204 in 2009 are shown for both models, because both images had valid values during the growing season (Figure 6). The spatial pattern of ET estimates from the TSVI_DNN model was similar to that of TSVI_Ori, with ET decreasing from southeast to northwest. This ET spatial pattern is also consistent with those of previous studies [50][51][52]. In summary, the performance of both models during the period of overlap was highly consistent from point to regional scales, suggesting that the DNN may be highly accurate.
The temporal coverage improved from 16.1% with the TSVI_Ori model to 67.1% with the TSVI_DNN model during 2009-2011. Apart from days when the surface was covered by snow or the soil was frozen, TSVI_DNN is temporally continuous at the daily scale.
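Temporal coverage can be read as the share of days that carry an ET estimate. The sketch below computes it for a single daily series in which `None` marks a gap; whether the reported percentages were computed per pixel, per station, or over the whole region is not stated, so this is only one possible reading, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def temporal_coverage(daily_values: Sequence[Optional[float]]) -> float:
    """Percentage of days with an ET estimate (None marks a missing day)."""
    if not daily_values:
        raise ValueError("series is empty")
    valid = sum(v is not None for v in daily_values)
    return 100.0 * valid / len(daily_values)
```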

Comparison with MOD16 ET Product and In Situ Measurements
Although the TSVI_DNN model performs similarly to the TSVI_Ori model, its performance should also be evaluated against additional data. We first evaluated TSVI_DNN against in situ measurements taken at the YK station during 2009-2011. The results show good agreement (Figure 7) between the TSVI_DNN estimates and measured ET, where an R value of 0.9 (P < 0.01) indicates the model's ability to capture high variability in the data. The RMSE, bias, and MAD are 0.86 mm d⁻¹, −0.16 mm d⁻¹, and 0.65 mm d⁻¹, respectively, further showing high accuracy. These results are comparable to those between TSVI_DNN and TSVI_Ori. Compared with Figure 4a, these results are based on 562 samples, considerably more than the 147 samples of TSVI_Ori. Considering that water resource management generally operates on a monthly or yearly scale, the bias is small compared with the water demand for maize of 4.3 mm d⁻¹ [53]. Hence, it is reasonable to say that the TSVI_DNN model could be used in water resource management.
To further verify our method, we compared the results with the NASA official MOD16 ET version 6 product. Previous studies indicate that MOD16 ET products have been widely evaluated and have shown good performance [10,38]. Hence, we selected MOD16 as a benchmark. The original 500-m resolution MOD16 ET was resampled to 0.01° spatial resolution, as mentioned previously. However, the temporal resolution of MOD16 ET is 8 days, which is coarser than the 1-day resolution of the in situ observations and our results. For comparison, we aggregated the ET estimates from the TSVI_DNN model and the in situ measurements of ET from 1 day to 8 days.
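The aggregation of daily values to the 8-day step of MOD16 can be sketched as a block average. Averaging (rather than summing) and skipping missing days are assumptions, since the text only states that the series were brought from 1 day to 8 days. MODIS 8-day periods restart on day 1 of each year and the last period of a year is shorter, so the sketch should be applied to one calendar year at a time, starting at DOY 1. The function name is hypothetical.

```python
from typing import List, Optional, Sequence


def eight_day_means(daily: Sequence[Optional[float]],
                    window: int = 8) -> List[Optional[float]]:
    """Average consecutive blocks of `window` days, ignoring missing days.

    A block with no valid day yields None; the final block may be shorter
    than `window`, as is the case at the end of a calendar year.
    """
    means: List[Optional[float]] = []
    for start in range(0, len(daily), window):
        block = [v for v in daily[start:start + window] if v is not None]
        means.append(sum(block) / len(block) if block else None)
    return means
```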
The time series of ET estimates from TSVI_DNN and MOD16 ET against in situ measured ET are shown in Figure 8. Both estimates can capture the annual variation shown by in situ ET measurements. However, MOD16 ET seriously underestimated measured values, while TSVI_DNN greatly reduced this underestimation. As shown in Table 2, when compared with MOD16 ET, the TSVI_DNN model reduced the RMSE with respect to in situ measurements from 1.30 to 0.66 mm d⁻¹, the bias from −0.89 to −0.23 mm d⁻¹, and the MAD from 1.02 to 0.50 mm d⁻¹. At the same time, R increased from 0.84 to 0.93. Thus, TSVI_DNN reduced the RMSE by 49.2% relative to MOD16, while also providing estimates closer to actual measurements, thereby reducing underestimation. We found that the latest version of MOD16 is of higher quality than the previous version (1 km, 8 day), which greatly underestimated ET at the YK station (data not shown) [52]. However, our results also show that there is still considerable room for improvement in the MOD16 ET product.
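The 49.2% figure follows from the two RMSE values given above as a relative reduction, (1.30 − 0.66)/1.30 ≈ 0.492. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which an error metric falls relative to the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - improved) / baseline


# RMSE against in situ ET: MOD16 1.30 mm/d, TSVI_DNN 0.66 mm/d
rmse_gain = relative_reduction(1.30, 0.66)
```

The same function can be applied to the MAD or to the absolute bias if a comparable percentage is wanted for those metrics.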
As the scatterplot of regional ET estimates from TSVI_DNN vs. those from MOD16 during 2009-2011 shows, TSVI_DNN is highly correlated with MOD16 ET (R = 0.74, P < 0.01) (Figure 9a). However, MOD16 underestimated ET by about 1.06 mm d⁻¹ more than TSVI_DNN (Figure 9b). As an example, shown in Figure 9c,d, the spatial pattern of ET estimates from TSVI_DNN is similar to that from MOD16, with ET also decreasing from southeast to northwest.
Ideally, the TSVI_DNN model should extend well to other spatial and temporal scales. We therefore used the TSVI_DNN model trained on 2009-2011 data to predict ET in 2012 and found that the quality of the variability and spatial pattern is comparable to that of the TSVI_Ori model (Figure 10

Advantages of the TSVI_DNN Model
The most significant contribution of the TSVI_DNN model is filling the gaps in the ET estimates from the TSVI_Ori model using the DNN. It provides a practical way to obtain temporally continuous daily ET over the arid area when the soil is unfrozen, and the results show that the temporal coverage improved significantly from 16.1% to 67.1% compared with the TSVI_Ori model. Meanwhile, the spatial pattern of TSVI_DNN is consistent with previous studies for the same area, which showed that spatial variation is small and ET is always high due to irrigation when using process-based and LST-based models [50][51][52].
In arid areas, actual ET is limited by soil moisture, especially root zone soil moisture, and irrigation events often introduce additional uncertainty. According to the mechanism of the TSVI_Ori model, surface and root zone soil moisture contribute greatly to its ET estimates. Hence, with the help of the DNN, the TSVI_DNN model provides a new way to extend ET estimates that carry this soil moisture contribution to different spatio-temporal scales. Other mechanistic models, however, are limited either by the coarse resolution of surface soil moisture products or by a lack of fine-resolution and reliable root zone soil moisture, and using atmospheric vapor pressure deficit as a surrogate might introduce additional uncertainty due to its low spatial variation [11,54,55].
In addition, the TSVI_DNN model relies on inputs from remotely sensed and forcing data, but does not need ground data. This is another main difference from the widely used crop coefficient-based Penman-Monteith method, where the crop coefficient is obtained through field experiments and varies across vegetation development stages [56]. Moreover, it also differs from machine learning methods driven by ground data [24].
Furthermore, this study used the DNN to establish relationships between daily ET estimates from the Ts-VI triangle model and reference data, and then obtained temporally continuous daily ET by overcoming the issue of gaps on cloudy days. The DNN has a stronger relationship-mining ability than more traditional neural networks, especially for large training datasets. As the size of the training dataset increases, the established relationship becomes more stable.

Limitations of the TSVI_DNN Model
A number of limitations of the Ts-VI triangle model should be stated to further improve the quality of TSVI_DNN in the future. (1) The error in daily ET estimates used as training data from the Ts-VI triangle model has at least two components: errors in the Ts-VI triangle model and errors in the scaling/transforming process. In this study, we limited the clear-sky conditions to relative stable weather based on the quality of LST and precipitation. The total error (RMSE) of the daily ET estimates of 0.94 mm d −1 suggests that this method was effective. (2) Several factors can reduce the performance of the Ts-VI triangle model. For example, heavy rainfall on the last day might lead to an inaccurate determination of the dry and wet edges [45]. Although there are a number of papers indicating that the coefficient of the P-T equation equals 1.26, values ranging from 1.0 to 1.5 have also been reported [1,45,57]. (3) There are at least five methods that can be used to transform instantaneous remote sensing latent heat flux or evaporation observations to daily estimates [18,21]. Most methods, such as the reference ET-based method, need more instantaneous information, which is difficult to obtain at satellite overpass time. Hence, in this study, we used the simple EF-based method. Despite its limitations, the Ts_VI triangle model is a good choice to estimate ET estimation based on results from a number of published studies. Independent validation in this study also proves that the use of the Ts_VI triangle model can generate acceptable estimates of daily ET.
Other limitations include the low reliability and coarse resolution of forcing data from global land models, e.g., GLDAS [58]. In this study, correcting the data with weather station observations and downscaling them with empirical methods to a spatial resolution of 0.01° can reduce this error to a degree [42]. This might be a compromise compared with the more complex dynamic downscaling method using the Weather Research and Forecasting (WRF) model at the mesoscale [59].
Estimates from other LST-based ET models can also be used to train the DNN. For example, the SEBS and Norman-95 (N95) models are widely used for regional ET estimation with remotely sensed LST [13,60]. They share the advantages of reasonably good accuracy and direct monitoring, and the disadvantage of being constrained by clouds because of the sensitivity of LST. Our results using the Ts-VI triangle model present a satisfactory case. It should be noted that different ET models may produce noticeably different estimates. Quality control is critical when extending the approach in this way, because the quality of the training data is more important than the number of data points. Meanwhile, the DNN is still less able to capture extreme values (very high and very low ET) than traditional neural networks. Future studies will explore methods to improve the DNN.

Conclusions
LST-based ET models are confronted with difficulties in estimating temporally continuous daily actual ET. In this study, we developed a gap-filling algorithm using DNN for the Ts-VI triangle model to obtain temporally continuous daily actual ET in an arid area of China. High-quality but temporally discontinuous daily ET was obtained from the TSVI_Ori ET model under clear-sky conditions using high-quality LST observations. We took advantage of these high-quality ET estimates, along with meteorological forcing data, to train the DNN. Finally, the trained TSVI_DNN model was used to estimate temporally continuous daily actual ET at the regional scale in an arid area. We found that: (1) ET estimates from the TSVI_DNN model captured the high variability in ET, with an R of 0.9. The results were accurate, with an RMSE of 0.86 mm d⁻¹ and an MAD of 0.65 mm d⁻¹. The tendency towards underestimation was small (bias of −0.16 mm d⁻¹); (2) the ET estimates from TSVI_DNN are highly consistent with the ET estimates from the TSVI_Ori model, although it was necessary to avoid overfitting of the DNN by adopting methods designed for that task; (3) compared with the NASA official MOD16 version 6 ET product, the accuracy (RMSE) of the TSVI_DNN estimates improved by 49.2%. The comparison with in situ measurements and the intercomparison with the MOD16 product showed that TSVI_DNN performed well. In summary, given the high non-stationarity and non-linearity of the ET process, the DNN provides a practical alternative for estimating temporally continuous daily ET based on the LST-based ET model.
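For readers who wish to reproduce this kind of evaluation on their own data, the statistics quoted above (R, RMSE, bias, and MAD) and a relative RMSE improvement can be computed as in the sketch below. The definitions are assumptions rather than the study's exact formulas: bias is taken as the mean of estimated minus observed values, so negative values indicate underestimation, and the improvement is taken as the relative reduction in RMSE with respect to a reference product. The function names and the sample numbers in the usage lines are hypothetical.

```python
import math


def validation_metrics(estimated, observed):
    """Return R, RMSE, bias, and MAD for paired daily ET values (mm d-1).

    Pairs in which either value is NaN (e.g., a gap day) are skipped.
    """
    pairs = [(e, o) for e, o in zip(estimated, observed)
             if not (math.isnan(e) or math.isnan(o))]
    n = len(pairs)
    diffs = [e - o for e, o in pairs]
    mean_e = sum(e for e, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in pairs)
    var_e = sum((e - mean_e) ** 2 for e, _ in pairs)
    var_o = sum((o - mean_o) ** 2 for _, o in pairs)
    return {
        "R": cov / math.sqrt(var_e * var_o),               # Pearson correlation
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),  # root mean square error
        "bias": sum(diffs) / n,                            # mean(estimated - observed)
        "MAD": sum(abs(d) for d in diffs) / n,             # mean absolute difference
    }


def rmse_improvement_percent(rmse_new, rmse_reference):
    """Relative RMSE reduction (%) of a new product against a reference."""
    return 100.0 * (rmse_reference - rmse_new) / rmse_reference


if __name__ == "__main__":
    est = [1.0, 2.0, 3.0, 4.0, float("nan")]   # illustrative values only
    obs = [1.5, 2.5, 2.5, 4.5, 3.0]
    print(validation_metrics(est, obs))
    print(rmse_improvement_percent(0.5, 1.0))
```

Skipping NaN pairs keeps the comparison restricted to days on which both the model estimate and the tower observation are available, which is the usual convention when validating against eddy covariance records.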