A Scheme to Estimate Diurnal Cycle of Evapotranspiration from Geostationary Meteorological Satellite Observations

The diurnal cycle of evapotranspiration (ET) is significant in studying the dynamics of land–atmosphere interactions. The diurnal ET cycle can be considered as an indicator of dry/wet surface conditions. However, the accuracy of current models in estimating the diurnal ET cycle is generally low. This study developed an improved scheme to estimate the diurnal cycle of ET by solving the surface energy balance equation combined with simplified parameterization, with daily ET as the constraint. Meteosat Second Generation (MSG) land surface temperature, and longwave and shortwave radiation products were the primary inputs. Daily ET was from the remote sensing-based ETMonitor model. The estimated instantaneous (30 min) ET from the improved scheme outperformed the official MSG instantaneous ET product when compared with in situ half-hourly measurements at 35 flux sites from the FLUXNET2015 dataset, and was also comparable with European Center for Medium-Range Weather Forecasts (ECMWF) ERA5 ET data, with an R2 of 0.617 and root mean square error (RMSE) of 65.8 W/m2 for the improved scheme. Results were largely improved compared with those without daily ET as the constraint. The improved method was stable for the estimation of ET’s diurnal cycle at the similar atmospheric conditions and the accuracy was comparative at different land cover surfaces. Errors in the input variables and the simplification of surface heat flux parameterization affected surface energy balance closure, which can lead to instability of the solution of constants in the simplified parameterization and further to the uncertainty of ET’s diurnal cycle estimation. Measurement errors, different source areas in measured variables, and inconsistent spatial representativeness between remote sensing and site measurements also impacted the evaluation.


Introduction
Turbulent fluxes of sensible and latent heat represent the principal means by which the surface cools and exchanges moisture with the atmosphere [1]. Evapotranspiration (ET) is the leading way of surface water loss from the land surface. Latent heat flux due to ET is a significant component in the surface energy balance equation [2,3]. ET and latent heat flux are different expressions of the same process and can be converted to each other. Accurate ET estimation is critical in many fields, such as atmospheric and hydrological processes, climate change, crop irrigation management, drought monitoring, and water resources management [4][5][6][7][8]. Remote sensing (RS) techniques provide an approach to obtain ET on the regional and global scales. Current RS-based ET models are mainly classified as land surface changes [36]. The third version of the Global Land Evaporation Amsterdam Model (GLEAM v3) (https://www.gleam.eu/) was developed. Two datasets on the daily scale, differing in their forcing and temporal coverage, are currently available [37]. ETMonitor is a process-based ET model with mainly satellite observation datasets as inputs. It has the capacity to generate global daily ET at a high spatial resolution of 1 km [38]. Considering these available daily ET products, the objective of this study was to develop an improved scheme from that of Lu et al. (2014) to estimate ET's diurnal cycle with daily ET as the constraint. Daily ET data were from ETMonitor model, and MSG land surface temperature and radiation products were the input data sources. All in situ measurements in this study were from the FLUXNET2015 dataset, which was used to evaluate the estimated results. Results were also compared with MSG ET products and European Center for Medium-Range Weather Forecasts (ECMWF) ERA5 ET data. The method and data are detailed in Section 2. Section 3 gives the evaluation and comparison results and analyzes the error sources for the improved method. Conclusions are summarized in Section 4.

Methods
The principle of the parameterization scheme for estimating continuous surface fluxes, developed by Lu et al. (2014), is that the surface sensible heat flux (H), latent heat flux (LE), and soil heat flux (G) are first simplified as the functions of LST, air temperature, and some constants in one day period derived from bulk heat transfer equation and the heat conductivity equation on the basis of some reasonable assumptions. With known net radiation, unchangeable constants in the parameterization scheme in one day can be solved by a minimization technique based on the surface energy balance equation. H, LE, and G were, respectively, parameterized as (1) where T s and T a are land surface temperature and air temperature, respectively (K); P s (T s ) is the saturated vapor pressure at T s (hPa); ∂T T=T s is the derivative of the saturated vapor pressure at T s (hPa/K); t is time in the unit of hour; and d 1 -7 are the unknown constants. Combining Equations (1)-(3), the surface energy balance equation (R n = H + LE + G) can be rewritten as where R n is net surface radiation (W/m 2 ). ϕ k (T s , T a , t) are functions related to T s and T a within square brackets in Equations (1-3), i.e., ϕ k (T s , T a , t) = With known R n , T s , and T a , Equation (4) is converted into a linear system as where A ij = ϕ i (T s , T a , t j ), 1 ≤ i ≤ 7, 1 ≤ j ≤ n, n is the number of measurements, and → b j = R n (t j ). Because of errors in the simplified parameterization schemes and input variables, Equation (5) essentially solves an ill-posed problem. It is difficult to obtain true solutions for the ill-posed problem. In the initial version of this parameterization scheme, the least-squares method with weak constraints was used to solve Equation (5). Constraints included d k (k = 1, 2, 3, 4, 6, 7) > 0 and d 5 < 0. The weak constraint sometimes brought an unstable solution for d k , leading to large LE estimation errors, especially at dry surface condition.
The accurate solution of the ill-posed problem needs the assistance of more prior knowledge. With the release of some available ET products at the daily scale, actual daily ET can be considered as prior knowledge to constrain the solution of Equation (5) and further improve the accuracy of the diurnal cycle of LE estimation. This is the improvement of this study compared with the study of Lu et al. (2014). Equation (5) was also solved by the least-squares method in this study, but with new constraint conditions implemented by the lsqlin function in MATLAB software, i.e., where d k (k = 1, 2, 3, 4, 6, 7) > 0 and d k (k = 5) < 0 are the weak constraints in the original version. (d 3 ϕ 3, j + d 4 ϕ 4,j + d 5 ) nighttime = 0 is the assumption of no ET at nighttime.
<= ET daily is the main improvement in this study, i.e., known daily ET was used to restrain LE estimation at the instantaneous scale. In theory, because there are seven unknown variables in Equations (1)-(3), at least seven sets of known values of T s , T a , and R n during the daytime are required as inputs of Equation (6). After obtaining the unknown constants of d 1 -7 , the instantaneous LE in one day is calculated by Equation (2) with known d 3 , d 4 , d 5 , T s, and T a . In this study, input variables of T s and R n were from MSG products at a half-hourly scale; input T a was from ECMWF ERA5 data at hourly temporal resolution, and linearly interpolated to 30 min interval; and daily ET was from the ETMonitor model. In situ measurement data for evaluation and comparison were from the FLUXNET2015 dataset. All these data and the data process are detailed in Section 2.2. The flow of this study is shown in Figure 1, and details are given in Section 2.2.

MSG Data
Various MSG-derived operational products were produced by the LSA-SAF partners and distributed via their website (https://landsaf.ipma.pt). The MSG products can be found in four geographical regions (Euro, NAfr, SAfr, and SAme) and/or on one area covering the four regions (MSG-Disk) in the current LSA-SAF system. MSG Euro products were downloaded and used in this study, including operational products of MSG land surface temperature (MLST) at a temporal resolution of 15 min, MSG downward surface shortwave flux (MDSSF) and downward surface longwave flux (MDSLF) at the temporal resolution of 30 min, MSG daily surface albedo (MDAL), MSG evapotranspiration (MET) at a temporal resolution of 30 min, and MSG daily evapotranspiration (DMET) in 2013. The internal product of daily land surface emissivity (EM) was also downloaded for R n calculation with MLST, MDSLF, MDSSF, and MDAL products. MDSSF and MDAL were used to calculate net shortwave radiation (R ns ), and EM, MLST, and MDSLF were used to calculate net longwave radiation (R nl ). Net radiation is then the sum of R ns and R nl . This study used all these instantaneous products at a temporal frequency of 30 min, coincidental with in situ measurements. The spatial resolution of MSG products was 3.1 km. Calculated instantaneous R n , MLST, and ERA5 T a , as inputs of Equation (6), were used to estimate LE's diurnal cycle. Instantaneous LE estimation was then evaluated by in situ LE measurements from the FLUXNET2015 dataset, and compared with MSG MET product and ERA5 ET data. The MET, DMET, and MLST products and the calculated R n were also assessed by in situ measurements. The process can be found in the flowchart in Figure 1. The units of the released MET and DMET products, mm/h and mm/day, respectively, were converted into W/m 2 by the latent heat of vaporization of 2.45 × 10 6 J/kg.

ETMonitor Daily ET Data
ETMonitor is a process-based model with remote sensing data as inputs [13,39,40] that was developed on the basis of the Shuttleworth-Wallace dual-source model. It includes five main modules of soil evaporation, vegetation transpiration, canopy interception evaporation, water body evaporation, and ice and snow sublimation. Daily ET is the sum of the evaporation of the five components. Global daily ET products for 2013 and 2014 were released by the National Tibetan Plateau Data Center [41]. Only 2013 data were used for this study. The ET version was generated by using ECMWF ERA5 reanalysis data as forcing driving data, and remote sensing-based net radiation, dynamical area of water bodies, leaf area index, soil moisture, and land use and land cover were also used as inputs. The temporal and spatial resolutions of this ET dataset were 1 day and 5 km, respectively. The unit of this dataset is mm/day, which in this study was converted into the daily average latent heat flux in the unit of W/m 2 by the latent heat of vaporization of 2.45 × 10 6 J/kg. ETMonitor daily ET data were assessed by FLUXNET2015 in situ measurement data and compared with MSG DMET products.

FLUXNET2015 Dataset
The FLUXNET2015 dataset is the most recent dataset (last updated on 6 February 2020) produced by the FLUXNET research community, and it includes over 1500 site-years of data from 212 sites of multiple regional flux networks, such as AmeriFlux, AsiaFlux, and EuroFlux. The FLUXNET2015 dataset includes several improvements to data quality control protocols and the data processing pipeline compared to previous versions of FLUXNET Marconi dataset (2000) and the FLUXNET LaThuile dataset (2007) (https://fluxnet.fluxdata.org/data/fluxnet2015-dataset/). The FULLSET Data Product for the FLUXNET2015 release includes all data product variables with a standardized data format, a uniform data quality flag, highly vetted gap filling, and many variables generated by intermediate processing steps to allow for the in-depth understanding of individual processing steps and their effect in the final data products. Taking MSG cover for the Euro part and the available FLUXNET2015 dataset in 2013 into consideration, there were 35 sites with high data quality that were lastly selected in this study. General information for the 35 selected sites is listed in Table 1, and the location is shown in Figure 2.  There were 2 mixed forests (MF), 6 croplands (CRO), 6 grasslands (GRA), 11 evergreen needleleaf forests (ENF), 5 water (WET), 4 deciduous broadleaf forest (DBF), and 1 evergreen broadleaf forest (EBF) sites. Mean annual precipitation for these sites varied from about 470 to 1650 mm. Observed incoming and outgoing longwave radiation was used to calculate LST on the basis of the Stefan-Boltzmann law to compare against LST retrieval from MSG data. The non-closure of the energy balance is a problem of eddy covariance flux data. This study directly used the measured H and LE without any correction for the evaluation of the estimated results and the comparison with MSG and ERA5 ET data. To reduce the cloud's influence, those days with at least 10 valid MSG-LST values during daytime were used to evaluate the performance of the improved method. Lastly, 1431 days were available, and the number of the selected days for each site is given in Table 1.
For the comparison of RS-based results with in situ measurements, the RS-based value is from the pixel value containing the site location. The spatial representativeness of RS data used in this study was about 3-5 km. The ground-based observation was commonly conducted with a smaller spatial representativeness, from several hundred meters to several kilometers for the tower-based eddy covariance measurement [42]. The footprint of the eddy covariance measurement depends on measurement height, wind direction/velocity, atmospheric stability, and underlying surface conditions [43]. This mismatch of spatial representativeness may be ignorable for homogeneous regions but may introduce large bias for results in heterogeneous regions. To test the homogeneity of the RS pixels with flux site, the proportions of the predominant land cover classes in the pixel with flux station was obtained by using MCD12C1 data (Table 1). MCD12C1 data are the MODIS Land Cover Climate Modeling Grid data product, which provided the sub-pixel proportions of each land cover class in each 0.05 degree pixel, similar to the spatial resolution of RS-based data in this study. The proportions of the predominated land cover class varied from 29% to 98%. There are 30 sites with the proportion of the predominated land cover class larger than 50%, which could mean that the land cover in the pixel with flux site was relatively single. For those sites with the proportion of the predominated land cover class less than 50%, the number of land cover classes was at least 4 and up to 10 at CZ-wet site with a maximum proportion of 29%, which indicated that the surface conditions at these sites were heterogeneous. It is commonly challenging to obtain the measurement at the pixel scale for a heterogeneous surface [44]. Therefore, the in situ measurements from FLUXNET2015 was directly used to validate the value of RS-based data in this study, although some bias may be introduced in the heterogeneous regions.

ERA5 Reanalysis Data
ERA5 is the fifth generation ECMWF atmospheric reanalysis of the global climate. Daily updates are available 5 days behind real time. The two variables of air temperature and evapotranspiration used in this study were from the "ERA5 hourly estimates of variables on single levels" dataset with a spatial resolution of 0.25 • , downloaded from https://cds.climate.copernicus.eu/cdsapp#!/dataset/ reanalysis-era5-single-levels?tab = overview. Air temperature is the temperature of the air at 2 m above land surface, and it is calculated by interpolating between the lowest model level and Earth's surface, considering atmospheric conditions. Evapotranspiration is the hourly accumulated amount of water (in m unit) that evaporated from Earth's surface, including a simplified representation of transpiration from vegetation into vapor in the air. Data on the hourly scale were linearly interpolated to the half-hourly scale in this study. Half-hourly air temperature data were the input of the improved method, while half-hourly evapotranspiration data were used to compare and evaluate the estimated results.

Evaluation Indices
Evaluation indices used in this study were the determination coefficient (R 2 ), root mean square error (RMSE), mean absolute bias (BIAS), and the error frequency distribution histogram. Equations for these indices are listed in Table 2.

Index Annotations
) 2 X and Y are the estimated and observed (referenced) variables, respectively. Error here is the difference of the estimated results to the observed or referenced variables.

Evaluation of Instantaneous LE estimation
Following the flowchart of Figure 1, with MSG LST, ERA5 T a , MSG-based R n , and ETMonitor daily ET as the inputs into the improved scheme, estimated instantaneous LE for the selected days is shown in Figure 3a. Compared with in situ LE measurements from the FLUXNET2015 dataset, R 2 was 0.617, RMSE was 65.8 W/m 2 , with BIAS of 7.6 W/m 2 , which was better than the estimated results without daily ET as the constraint (i.e., the original scheme), shown in Figure 3b. The original scheme generally overestimated instantaneous LE by 61.8 W/m 2 with R 2 of 0.567 and RMSE of 120.9 W/m 2 . The estimated instantaneous LE from the improved scheme was better than MSG ET (MET) product shown in Figure 3c. Some evaluations and comparisons for MET and DMET products were made with in situ measurements, RS-based ET products and global reanalysis data [45,46]. The MSG ET product was generally satisfied. When the MET product for selected days was compared with in situ LE measurements from the FLUXNET2015 dataset in this study, it was underestimated by 14.4 W/m 2 , with R 2 of 0.546 and RMSE of 69.8 W/m 2 . ERA5 ET data were also compared with in situ LE, shown in Figure 3d. Evaluation index values appeared better than the results from the improved method, but there was slight overestimation by ERA5 ET data when ET was large. The error frequency distribution histograms for the four instantaneous LE estimations are shown in Figure 3e. The overestimation of the original method is evident. Error distribution for the improved method was similar to that of the MET product. The majority of errors, 76% for the improved method, 81% for the MET product, and 84% for ERA5 ET data, were between −50 and 50 W/m 2 .  Comparisons of the estimated instantaneous LE by the improved method, the MET product, and ERA5 ET data with in situ measurements for each site are given in the bar chart in Figure 4. For most of these sites, more than 60%, estimated results were consistent with in situ measurements with good correlation of R 2 > 0.7. The MET product had similar performance. The serious underestimation by more than 50 W/m 2 was from the MET product at three Italian sites, namely the IT-BCi croplands site, IT-Tor grassland site, and IT-SR2 evergreen needleleaf forest site, which could be evidently reduced by the improved scheme and ERA5 ET data, except for the IT-Tor site. At the IT-Tor site, the improved method underestimated LE by 46 W/m 2 , and ERA5 ET data gave an LE underestimation by 43 W/m 2 . The RMSE for the improved method and MET products was more than 100 W/m 2 , and 86 W/m 2 for the ERA5 ET data. No matter which method, the immense RMSE value occurred at this site. The IT-Tor site is located in the north of Italy with an elevation of 2160 m, and the mountain is the predominant terrain in this region. The complex underlying surface can lead to more uncertainties for remote sensing-based ET estimates because of heterogeneous surface characteristics [47]. Overestimations from the improved method at BE-Lon, DE-Kli, DE-SfN, DK-Sor, and FR-Pue sites were due to overestimations of daily ET from ETMonitor. Figure 4 also shows that the accuracy of LE estimation did not change with different land covers. Instantaneous LE at the IT-BCi and IT-CA3 sites for nine consecutive days (13)(14)(15)(16)(17)(18)(19)(20)(21) June 2013) is shown in Figure 5. During these nine consecutive days, similar atmospheric and underlying conditions were found. The improved scheme, MET product, and ERA5 ET data could smoothly simulate LE's diurnal cycle except for several outliers from the improved method because of LST uncertainty. However, the LE from the improved scheme was closer to in situ observations than the MET product and ERA5 ET data were for the two sites. For similar atmospheric conditions, the estimated LE was also similar at the same site, implying that the improved scheme was stable for the diurnal cycle of LE estimation. The MET product and ERA5 ET data generally underestimated LE, more seriously at the IT-BCi site. The assumption of no ET at nighttime in the improved scheme was consistent with the MET product at nighttime.

Impact of Input Variables on Diurnal Cycle of ET Estimation
The input variables of the improved scheme for instantaneous LE estimation were instantaneous T a , LST, and R n and daily ET. The error in these variables was an error source for instantaneous LE estimation from the improved scheme. Evaluation results for instantaneous ERA5 T a , MSG LST, and MSG-based R n by FLUXNET2015 are shown in Figure 6a-c. These RS-based variables had good agreement with in situ measurements from the FLUXNET2015 dataset, with an R 2 larger than 0.870. RMSEs for T a and LST were 4.6 and 4.0 K, respectively, and 80.7 W/m 2 for R n . Their error frequency histograms are given in Figure 6d. More than 73% of errors ranged from −4 to 4 K for T a and LST, and from −80 to 80 W/m 2 for R n . Inconsistent spatial representativeness between remote sensing spatial resolution and site observations is a key problem for remote sensing validation [48,49], which is also the main reason for the inconsistency of RS-based variables compared with in situ measurements in this section, as well as for the results shown in Section 3.1.
Daily LE is required as a constraint in the improved method. Although it is not a direct input variable for instantaneous LE estimation, it constrains the solution of Equation (6) and further affects instantaneous LE. Figure 7a shows that ETMonitor daily ET was in good agreement with in situ measurements, with an R 2 of 0.719, RMSE of 27.6 W/m 2 (0.96 mm/day), and slight overestimation, with a BIAS of 6.5 W/m 2 . The result was better than that of the MSG daily ET product, as shown in Figure 7b. As with the MSG instantaneous ET product shown in Figure 3c, the MSG daily ET product underestimated ET by 10 W/m 2 , with an R 2 of 0.490 and RMSE of 35 W/m 2 (1.42 mm/day).  To analyze the sensitivity of the instantaneous LE estimates to the error of the input variable, we used each input variable from site observations to replace each input from remote sensing into the improved method. The estimated instantaneous LE are compared with in situ measurements, and the R 2 , RMSE, and BIAS are shown by the bar chart in Figure 8. The black mark is the result shown in Figure 3a, i.e., all inputs from remote sensing data. When site T a , LST, and R n are, respectively, the input, the estimated LE is not improved, even becoming a little worse, especially for estimation based on site LST as input. Surface net radiation, especially for the outgoing longwave radiation, is related to land surface temperature. When site LST is the input, the input R n is still from MSG-based R n . Inconsistent inputs cause the energy balance non-closure because of the inconsistent radiation source, while surface energy balance is the principal foundation of the method of Lu et al. (2014). When site daily LE instead of ETMonitor daily LE is regarded as the constraint into Equation (6), the estimated instantaneous LE is optimal with R 2 of 0.761, RMSE of 48.5 W/m 2 , and a slight underestimation of 1.5 W/m 2 . The results indicate that the accuracy of the instantaneous LE estimation from the improved scheme depends on the daily ET error and is not very sensitive to the error of T a , LST, and R n . The higher is the accuracy for the daily ET, the closer are the real values to the instantaneous ET estimation.

Uncertainties from Simplified Parameterization
H, LE, and G were simplified as functions of some unknown constants in one-day period, land surface temperature, and air temperature on the basis of the bulk heat transfer and heat conductivity equations, that is, the expressions in Equations (1)-(3). These unknown constants would be solved by the surface energy balance equation, which had been converted to solve the ill-posed problem of Equation (6). The accuracy of simplified parameterization affected the stability of the solution of the ill-posed problem. If simplified parameterization is not perfect, it leads to vast differences between the parameterized net radiation and the one as input, and directly influences the solution of the ill-posed equation. To test the adaptability of these simplified parameterizations at the sites in this study, H, LE, and G parameterizations, given in Section 2.1, were assessed using in situ measurements (Figure 9a-c). Parameterized H, LE, and G were calculated by the fit of the least-squares method according to Equations (1)-(3) using in situ measurements. The parameterized LE was more related to in situ measurements with an R 2 of 0.845 and RMSE of 35.3 W/m 2 . The simplified scheme for H was not very well compared with in situ measurements, which may be attributed to the improper simplification in H parameterization [35]. The deficiency of H parameterization requires more studies in the future. The parameterized G had slight underestimation. Measurement errors, the imperfect parameterization, and the inconsistent source area of the observed variables led to significant discrepancy between the sum of parameterized H, LE, and G and measured R n in the surface energy balance equation. Therefore, even though all inputs in the improved scheme were from in situ measurements, the estimated instantaneous LE was not in good agreement with the in situ measurements shown in Figure 9d, and not better than the results from RS-based inputs shown in Figure 3a. The better results from the RS-based inputs may be attributed to the relatively consistent spatial representativeness from remote sensing data. MET products at the half-hourly scale were used to evaluate the accuracy of LE parameterization. LE parameterization was close to the MET product, with an R 2 of 0.948 and RMSE of 19.0 W/m 2 (Figure 10a). This can easily be understood because the MET product was mainly generated by bulk transfer formulation, which is also the foundation of the simplified parameterization. Generally, the simplification of LE parameterization is reasonable, i.e., LE could be parameterized as the function of three constants, land surface temperature, and air temperature. When all inputs in Equation (6) were from MSG products, estimated instantaneous LE was strongly related to the MSG ET product, as expected, with an R 2 of 0.841 and RMSE of 33.2 W/m 2 (Figure 10b). ET estimation in the MSG ET scheme was based on surface energy balance. Meanwhile, all inputs were from MSG products, implying the same spatial representativeness and good surface energy balance closure. Therefore, under the condition of good surface energy balance closure, the improved scheme could estimate instantaneous LE well. Any factors leading to energy balance non-closure influence the solution of Equation (6). This may be the primary error source of the improved scheme.

Conclusions
The diurnal ET cycle is helpful in the study of the dynamics of the atmospheric boundary layer. Geostationary meteorological satellites can provide continuous surface observation, which is very important for continuous surface flux estimation. Differing from the conventional RS-based ET method, the study used MSG land surface temperature and radiation data as inputs to develop an improved scheme to estimate the diurnal ET cycle by solving the surface energy balance equation with ETMonitor daily ET as the constraint, i.e., a scheme from daily ET to the diurnal cycle of ET. Results were evaluated by measurements at 35 sites located in Europe from the FLUXNET2015 dataset. The estimated half-hourly LE agreed well with in situ measurements, with an R 2 of 0.617 and RMSE of 65.8 W/m 2 , better than those of the MSG ET product, and comparable with ECMWF ERA5 ET data. The improved scheme for estimating the diurnal cycle of ET is not very sensitive to errors in air temperature, land surface temperature, and net radiation, but depends on daily ET errors. When daily ET is from in situ measurements, estimated instantaneous LE can be improved. Errors leading to surface energy balance non-closure influenced the accuracy of the improved scheme. When all inputs were from MSG products, estimated instantaneous LE was strongly related to the MSG ET product. Because land surface temperature is an input of the method, obtaining continuous land surface temperature data, especially during cloudy days, is crucial for this method, which is also key to current LST-based ET methods. With the development of methods for LST retrieval for cloudy days and accuracy improvement of daily ET estimation, the method of the diurnal cycle of ET estimation developed in this study has the application potential.