The Use of Remote Sensing-Based ET Estimates to Improve Global Hydrological Simulations in the Community Land Model Version 5.0

Abstract: Terrestrial evapotranspiration (ET) is a critical component of the water and energy cycles, and improving the simulation of global land ET is one of the challenges in the development of land surface models (LSMs). In this study, we apply a bias correction approach to the Community Land Model version 5.0 (CLM5) globally by utilizing a remote sensing-based ET dataset. Results reveal that the correction approach can alleviate both overestimation and underestimation of ET by CLM5 over the globe. The adjustment to overestimation is generally effective, whereas the effectiveness for underestimation is determined by the ET regime, namely water-limited or energy-limited. In areas with abundant precipitation, the underestimation is effectively corrected by increasing ET without the water supply limit. In areas with scarce precipitation, however, increasing ET is limited by water supply, which leads to an unsatisfactory correction effect. Compared with the ET simulated by CLM5, the bias correction approach reduces the global-averaged relative bias (RB) and root mean square error (RMSE) by 51.8% and 65.9%, respectively, against Global Land Evaporation Amsterdam Model (GLEAM) ET data. Meanwhile, the correlation coefficient (CC) is improved from 0.93 to 0.98. Continentally, the most substantial ET improvement occurs in Asia, with the RB and RMSE decreased by 69.7% (from 7.04% to 2.14%) and 70.2% (from 0.312 mm day−1 to 0.093 mm day−1, equivalent to from 114 mm year−1 to 34 mm year−1), respectively, and the CC increased from 0.92 to 0.99. Consequently, benefiting from the improvement of ET, the simulations of runoff and soil moisture are also improved over the globe and each of the six continents, and the improvement varies with region. This study demonstrates that the use of satellite-based ET products is beneficial to hydrological simulations in land surface models over the globe.


Introduction
Evapotranspiration (ET) is a critical component of the Earth system, linking the energy, water, carbon, and nitrogen cycles. Over half of the precipitation on the land surface is returned to the atmosphere through ET [1,2]. There are two different concepts of ET: potential evapotranspiration and actual evapotranspiration. The former is a measure of the ability of the atmosphere to remove water from the surface by evaporation and transpiration, assuming no limit on the water supply. In contrast, the latter is the amount of water evaporating from the terrestrial surface and transpiring from plants, which is limited by the amount of water stored in these reservoirs. ET here refers to actual ET, and it mainly consists of plant transpiration, soil evaporation, and interception loss. Accurate quantification of global terrestrial ET is necessary for understanding variability in the global water cycle [3]. However, it is challenging to accurately measure and simulate ET,

Model and Forcing Data
The CLM is the land component of the Community Earth System Model (CESM), which has been developed through continuous updates and improvements by the National Center for Atmospheric Research (NCAR) in the United States. The model is structured as follows: each grid cell within CLM is divided into multiple subgrid land units, and each land unit consists of multiple snow/soil columns occupied by diverse plant functional types (PFTs). The land surface processes for each subgrid land unit, column, and PFT are simulated independently, and each subgrid unit maintains its own prognostic variables.
CLM5 is the latest version of the CLM. Scientific justification and evaluation related to CLM5 have been conducted [28]. In this study, CLM5 is used in its offline mode with prescribed vegetation phenology. The surface datasets required by CLM5 were obtained from a variety of sources. Both the percent PFTs and the prescribed PFT leaf area index were derived from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data [44]. Prescribed canopy top and bottom heights were retrieved from Bonan [22] as described in Bonan et al. [23]. The soil color was obtained from Lawrence and Chase [45], and the soil texture was derived from the International Geosphere-Biosphere Programme (IGBP) soil dataset, which includes approximately 5000 soil mapping units. The maximum fractional saturated area, slope, and elevation were all retrieved from the USGS HYDRO1K 1-km dataset [46]. The description and model code of CLM5 are available from https://www.cesm.ucar.edu/models/clm (accessed on 20 May 2020).
Remote Sens. 2021, 13, 4460
In CLM, ET mainly consists of three major components in vegetated areas, namely interception loss, plant transpiration, and soil evaporation. Snow sublimation is also a component of ET, but its magnitude is much smaller than that of the three major components in most areas of the global land surface. Hence, the bias correction approach is applied to interception loss, plant transpiration, and soil evaporation only.
The forcing data used in this study are the WATCH Forcing Data methodology applied to ERA5 (WFDE5). They were generated by applying the sequential elevation and monthly bias correction approach to half-degree aggregated ERA5 reanalysis products. Evaluations against meteorological observations at station sites show that WFDE5 performs better than the ERA-Interim-based WFDEI for all variables [47]. The application to an uncalibrated hydrological model (WaterGAP) over several large river basins indicates that forcing the model with WFDE5 leads to more plausible global hydrological water balance components than forcing it with the raw ERA5 data [47]. In this study, the WFDE5 data from 1980 to 2018 at hourly temporal resolution and 0.5° spatial resolution are selected as the driving force for CLM5, as the 1979 data are not complete.

Methodology
Land surface models are capable of capturing the general pattern of ET. However, substantial biases still exist due to uncertainties in the meteorological forcing data, surface data, and model structure and parameters [42,43,48]. The bias in ET may further propagate to other variables through the terrestrial hydrothermal processes. Following Parr et al. [42] and Wang et al. [43], we improve the simulations of CLM5 by utilizing remote sensing-based ET over the globe. The technical process consists of three parts, as shown in Figure 1. In the first part, namely ET calibration, we simulate ET using the default CLM5 and obtain the ET climatology for a historical period. The GLEAM-based ET is rescaled to match the resolution of the model simulations. The ET scaling factor is then calculated as the ratio of the rescaled GLEAM-based ET to the modeled ET for the same historical period. This calculation is implemented for each month and each grid cell. In the second part, namely ET correction, we obtain a modeled ET for any other period (historical or future) by running the default CLM5, and a corrected ET by multiplying the modeled ET by the scaling factor for each grid cell. Then, the corrected ET is fed into CLM5 to replace the modeled ET for the same period, and the simulated hydrological processes are adjusted accordingly. Subject to the physical mechanisms in the model, the ET and other hydrological variables are adjusted automatically. In the third part, the corrected ET, runoff, and soil moisture are output and validated against the reference data. The underlying assumption is that the relationship between GLEAM-based ET and modeled ET remains unchanged from one period to another, while having its own seasonal cycle and spatial variability.
To implement the method, we carry out two types of simulations, namely "CLM" and "CLMET". The former denotes the simulation with the default CLM5 configuration, while the latter denotes the simulation in which the model-generated ET is replaced with the corrected ET. In the CLM simulation, the modeled ET components, namely vegetation transpiration, soil evaporation, and interception loss, are recorded at the PFT level for every time step (1 h). Then, these three components are corrected by multiplying them by the scaling factor. In the CLMET simulation, CLM5 is re-run for the same period as CLM, but the three ET components are overwritten by the corresponding corrected ones. The correction is therefore applied to the calculation at every time step of CLM5. The hourly simulation is aggregated into temporal averages at different time scales (namely monthly, seasonal, and annual), which are evaluated in the following analysis.
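The calibration and correction steps described above can be sketched in a few lines of Python. This is a minimal illustration at the grid-cell level, assuming NumPy arrays of shape (12, nlat, nlon) for the monthly ET climatologies. The function names and the array layout are hypothetical and are not taken from the CLM5 code, where the correction is applied at the PFT level inside the model.

```python
import numpy as np


def monthly_scaling_factor(gleam_clim, model_clim):
    """Ratio of GLEAM to modeled monthly ET climatology, shape (12, nlat, nlon).

    Cells where the modeled ET is zero are left as NaN (no factor defined).
    """
    gleam_clim = np.asarray(gleam_clim, dtype=float)
    model_clim = np.asarray(model_clim, dtype=float)
    factor = np.full(gleam_clim.shape, np.nan)
    valid = model_clim != 0.0
    factor[valid] = gleam_clim[valid] / model_clim[valid]
    return factor


def correct_et_components(components, factor, month):
    """Scale each ET component of one time step that falls in `month` (1-12).

    `components` maps a component name to a (nlat, nlon) array.
    Assumption: where the factor is undefined or negative, ET is not scaled
    (factor treated as 1.0), following the handling of negative factors
    described in the text.
    """
    f = factor[month - 1]
    usable = np.isfinite(f) & (f >= 0.0)
    f = np.where(usable, f, 1.0)
    return {name: np.asarray(value, dtype=float) * f
            for name, value in components.items()}
```

A call such as `correct_et_components({"transpiration": tr, "soil_evaporation": es, "interception": ei}, factor, month=7)` would return the three scaled fields for a July time step; the component names here are placeholders.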
Figure 1. Workflow of implementing the ET bias correction approach into CLM5. The variables in the figure denote, respectively, the month- and grid-specific relationship (i.e., scaling factor) between the model-based simulation and the GLEAM-based observation; the GLEAM-based ET value for a specific time period; the model-based value for the same time period as the GLEAM-based ET; and the simulated ET components for each time step in the default CLM5 over the period of ET correction. SM denotes soil moisture.
In this study, given the span of the meteorological forcing data (1980-2018), we divide the entire duration into two periods, namely 1980-1999 and 2000-2018. For each period, the first 6 years of the run are disregarded as spinup to ensure that the model reaches an equilibrium state, so that 1986-1999 is taken as the calibration period and 2006-2018 as the validation period. First, the CLM type simulation is conducted for the 20-year period 1980-1999. Then, we conduct both CLM type and CLMET type simulations during the validation period, with the initial condition of 1 January 2000 recorded from the calibration period. It is worth noting that overwriting ET may break the water balance. Therefore, we check whether the water stored in the vegetation canopy meets the demand for interception loss and whether the water stored in the soil meets the demand for soil evaporation and plant transpiration throughout the model time step. If not, the interception loss (soil evaporation and plant transpiration) rate is set equal to the available water stored in the vegetation canopy (soil) divided by the model time step. Figure 1 illustrates the workflow of implementing the ET bias correction approach into CLM5.
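The water availability check can be illustrated with a small sketch. It assumes fluxes in mm s−1, storages in mm, and a 1 h time step; the function name and the example numbers are hypothetical and only mirror the rule stated above (the rate is capped at the available water divided by the time step).

```python
def limit_flux_by_storage(demand_rate, storage, dt_seconds):
    """Cap a flux (mm s-1) so that one time step cannot remove more
    water than is currently stored (mm)."""
    max_rate = max(storage, 0.0) / dt_seconds
    return min(demand_rate, max_rate)


# Hypothetical example: corrected interception loss versus canopy water.
dt = 3600.0                 # model time step of 1 h, in seconds
canopy_water = 0.5          # mm stored on the canopy (illustrative value)
corrected_rate = 2.0e-4     # mm s-1, would remove 0.72 mm in one step
applied_rate = limit_flux_by_storage(corrected_rate, canopy_water, dt)
# applied_rate equals canopy_water / dt because the demand exceeds the storage
```

The same function could be applied to the sum of soil evaporation and plant transpiration against the soil water storage; how the capped total is split between the two components is not specified in the text.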
Four statistics, namely bias, relative bias (RB), root-mean-square error (RMSE), and correlation coefficient (CC), are selected to evaluate the performance of the two types of simulations by comparing the modeled variables with the reference datasets. In the definitions of these statistics, N represents the total number of grid cells, and S_i and R_i represent the spatial mean of the modeled and reference values for grid cell i, respectively; cov(S_i, R_i) represents the covariance between S_i and R_i, and σ_{S_i} and σ_{R_i} represent the standard deviations of S_i and R_i, respectively. Meanwhile, RMSE and CC are also used to evaluate the temporal performance of the two types of model simulations.
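The equations themselves are not reproduced in the text above. Assuming the conventional definitions of these four statistics, and using the symbols defined in the preceding paragraph, they can be written as follows. The exact form used in the study (for example, the normalization of RB) may differ.

```latex
\begin{align}
\mathrm{Bias} &= \frac{1}{N}\sum_{i=1}^{N}\left(S_i - R_i\right) \\
\mathrm{RB}   &= \frac{\sum_{i=1}^{N}\left(S_i - R_i\right)}{\sum_{i=1}^{N} R_i}\times 100\% \\
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_i - R_i\right)^{2}} \\
\mathrm{CC}   &= \frac{\operatorname{cov}\left(S_i, R_i\right)}{\sigma_{S_i}\,\sigma_{R_i}}
\end{align}
```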

Data
Two remote sensing-based ET products, GLEAM ET and MODIS ET [7,10], and the observation-based FLUXNET Multi-Tree Ensemble (MTE) ET [11,49] are used in this study for model calibration or validation. The GLEAM ET during 1980-1999 is used to derive the ET scaling factors, and the GLEAM ET during 2000-2018 is used to assess the effectiveness of the correction method. MODIS ET and MTE ET are adopted as independent data for model assessment. The observation-based runoff data from the University of New Hampshire-Global Runoff Data Centre (GRDC) and the soil moisture data from the Soil Moisture Active Passive (SMAP) mission are also taken as reference data to assess the impact of the correction method on the hydrological processes. We upscale the finer-resolution data to 1° grid cells by simple arithmetic averaging to be in line with the model output.
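Upscaling by simple arithmetic averaging can be sketched as a block mean. The example below assumes a regular latitude-longitude grid whose dimensions are exact multiples of the aggregation factor (for instance, a factor of 4 to go from 0.25° to 1°); ignoring missing values with `nanmean` is an assumption, and the function name is hypothetical.

```python
import numpy as np


def upscale_block_mean(field, factor):
    """Average non-overlapping factor x factor blocks of a 2-D field."""
    field = np.asarray(field, dtype=float)
    nlat, nlon = field.shape
    if nlat % factor or nlon % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    # Split each axis into (coarse index, position within block).
    blocks = field.reshape(nlat // factor, factor, nlon // factor, factor)
    # Assumption: missing cells (NaN) are skipped rather than propagated.
    return np.nanmean(blocks, axis=(1, 3))
```

For a 0.25° GLEAM field of shape (720, 1440), `upscale_block_mean(field, 4)` would return a (180, 360) array on a 1° grid.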

GLEAM ET
The GLEAM version 3.5a [6,50,51] is used to calculate the ET scaling factors and to validate the simulated ET in the two types of simulations. It was derived from reanalysis radiation and air temperature, a combination of gauge-based, reanalysis, and satellite-based precipitation, and satellite-based vegetation optical depth, spanning the period from 1980 to 2020. The potential evaporation in GLEAM 3.5a is calculated using the Priestley-Taylor (PT) equation based on surface net radiation and near-surface air temperature, and is converted into actual evaporation using the evaporative stress factor [52,53]. Recent evaluations showed that the GLEAM ET product performs better than other remotely sensed ET products in estimating land surface evapotranspiration [6,32]. The GLEAM dataset is provided at a spatial resolution of 0.25° and at daily, monthly, and yearly time scales.
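For reference, the PT equation mentioned above is commonly written in the form below, with actual evaporation obtained by scaling the potential value with a stress factor. The symbols are not defined in this section and are introduced here only for illustration: λ is the latent heat of vaporization, α the PT coefficient, Δ the slope of the saturation vapor pressure curve, γ the psychrometric constant, R_n the surface net radiation, G the ground heat flux, and S the evaporative stress factor between 0 and 1. The specific parameter values and any additional terms used in GLEAM 3.5a are not given here.

```latex
\begin{align}
\lambda E_p &= \alpha\,\frac{\Delta}{\Delta + \gamma}\left(R_n - G\right) \\
E &= S\,E_p
\end{align}
```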

MODIS ET and FLUXNET-MTE ET
To evaluate the effectiveness of the improvement, MODIS ET and MTE ET are also used for independent assessments of the performance of the two types of simulations. The MODIS-based ET dataset used in this study was developed by the Numerical Terradynamic Simulation Group (NTSG), University of Montana. The monthly MODIS-based ET product is available from 2000 to 2014 with a spatial resolution of 0.5°. The MTE ET dataset was derived by upscaling eddy covariance measurements from the global network of eddy covariance flux towers. It has been used to improve the simulation of soil evaporation, evaluate the performance of ecosystem models, and assess evapotranspiration variability [28,54,55]. The monthly MTE ET product is available from 1982 to 2011 at 0.5° resolution. In this study, both ET datasets are applied to independently validate the model performance.

University of New Hampshire-GRDC Runoff
The climatological monthly runoff over the globe used in this study is the University of New Hampshire-Global Runoff Data Centre (GRDC) Composite Runoff Fields V1.0. It was derived by combining the observed river discharge gauge data from the GRDC with the outputs of a water balance model driven by observational meteorological forcing data. The combined runoff fields preserve the accuracy of the discharge measurements as well as the spatial and temporal distribution, and are considered the "best estimate" of terrestrial runoff over the globe [56]. The GRDC runoff data provide multi-year averaged annual and monthly runoff with a spatial resolution of 0.5°.

SMAP Soil Moisture
The Soil Moisture Active Passive (SMAP) mission [57] was designed to provide a global-scale mapping of soil moisture measured by L-band (1.41 GHz) passive and active microwave sensors since 2015. Since the L-band is more sensitive to soil moisture and more easily penetrates vegetation layers than other bands, the SMAP dataset is considered one of the most promising products for soil moisture. SMAP generates a range of products and soil moisture retrievals [58]. In this study, the surface soil layer (0-5 cm) and root zone soil layer (0-100 cm) soil moisture data of SMAP L4 EASE-Grid (version 5) from April 2015 to December 2018 with a 9 km spatial resolution are adopted to evaluate the performance in simulating soil moisture in these two layers [59].

ET Scaling Factor
According to the methods described in Section 2, the ET scaling factors for each month over the global land are calculated based on the CLM simulation and the GLEAM data for the period 1986-1999. Their spatial patterns and continental averages are shown in Figure 2 and Table 1, respectively. It should be noted that the GLEAM-derived dew may not be consistent with the CLM-modeled dew in some high-latitude areas of the Northern Hemisphere, which results in a negative scaling factor. Where that happens, we do not scale ET and mask these areas out in Figure 2. In general, during May, June, and July, the scaling factors are approximately 1.0, indicating that the model simulations are in line with the GLEAM estimates. By contrast, the differences between the simulations and the GLEAM estimates are large during the boreal winter, and the scaling factors are substantially larger than 1.0, especially for December and January. Overall, the global land averaged ET simulated by CLM is lower than the GLEAM estimate, as the ET scaling factors are greater than 1.0 for all 12 months. Despite the overall underestimation in the global average, overestimations still occur in some months over several continents. For example, overestimations are pronounced during May, June, and July over Asia and Europe. The magnitudes of the scaling factors vary remarkably with continent and season. For instance, the area-averaged scaling factors for January are 1.50, 3.30, 0.90, 1.99, 3.50, and 1.04 for Africa, Asia, Australia, Europe, North America, and South America, respectively. For the same continent, the difference in the scaling factors among months is also large; for example, over Asia the maximum value in January is almost four times as large as the minimum value in May. Overall, the scaling factors vary greatly with month and continent, which indicates that the difference between GLEAM ET and CLM-simulated ET is evident. Therefore, there is a strong potential to improve CLM in simulating ET.
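Area-averaged values such as the continental scaling factors quoted above could be computed along the following lines. This is a sketch under stated assumptions: the text does not say how the area averages were formed, so cosine-of-latitude weighting on a regular latitude-longitude grid is assumed here, and the function name and arguments are hypothetical. Cells with undefined values, or excluded by the mask (for example, negative scaling factors or cells outside a continent), are skipped.

```python
import numpy as np


def area_weighted_mean(field, lat_deg, mask=None):
    """Cosine-latitude weighted mean of a (nlat, nlon) field.

    `lat_deg` holds the grid-cell center latitudes (length nlat).
    `mask` is an optional boolean array; only True cells are averaged.
    """
    field = np.asarray(field, dtype=float)
    lat = np.asarray(lat_deg, dtype=float)
    weights = np.broadcast_to(np.cos(np.deg2rad(lat))[:, None], field.shape)
    valid = np.isfinite(field)
    if mask is not None:
        valid &= np.asarray(mask, dtype=bool)
    if not valid.any():
        return float("nan")
    return float(np.sum(field[valid] * weights[valid]) / np.sum(weights[valid]))
```

For one month of scaling factors `f`, a call like `area_weighted_mean(f, lat, mask=(f >= 0) & continent_mask)` would give the continental average while leaving out the negative values; `continent_mask` is a placeholder for a user-supplied boolean array.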

Evaluation
The performance of CLM and CLMET is evaluated based on the various reference datasets described in Section 3. Four statistical metrics, namely bias, RB, RMSE, and CC, are used as the quantitative criteria. In the following evaluations, Greenland, Antarctica, the Sahara, the Arabian Peninsula, and the Taklimakan are excluded from the analysis and masked out in the figures, as some of the reference data do not cover these areas.
Table 2 presents the temporal evolution of simulated ET from the two types of simulations (CLM and CLMET) over the globe against GLEAM-based ET during 2006-2018. CLM generally overestimates ET during this period, and the most notable overestimation occurs in 2018, with an RB of 9.3% (Table 2). CLMET effectively improves the ET simulation by alleviating the general overestimation, and the RBs of most years are within 4%. The RMSE is also significantly reduced during this period.
Spatially, Figure 3 illustrates the multi-year averages (2006-2018) of ET derived from GLEAM and simulated by CLM and CLMET, and the RBs of these two simulations against GLEAM. Generally, both CLM and CLMET reasonably capture the spatial patterns of ET compared with the GLEAM data, e.g., higher ET in Amazonia, central Africa, and the Indonesian islands, and lower ET in western America, Alaska, northern Siberia, and southeast Australia, as shown in Figure 3a-c. In addition, the global-averaged ET values are almost the same among GLEAM, CLM, and CLMET. However, both overestimation and underestimation exist at the regional scale in CLM; the areas with overestimation and underestimation each account for nearly half of the land (Figure 3e). CLMET effectively alleviates both the overestimation and underestimation, obtaining ET values closer to GLEAM. The global-averaged RB in CLM is 5.27%, with a substantial portion of areas where RBs exceed 10%. In CLMET, the global-averaged RB is reduced to 2.54%, and RBs are within 10% in most areas of the global land. The RMSE value in CLM is 0.346 mm day−1 (126 mm year−1), which is reduced to 0.118 mm day−1 (43 mm year−1) in CLMET. Likewise, the CC is improved from 0.93 in CLM to 0.98 in CLMET. The improvement is more remarkable over Asia and North America than over the other continents, with the RMSE reduced by 70.2% and 67.8%, respectively (Table 3).
The latitudinal profile of RMSE is shown in Figure 4. The largest RMSE value of CLM-simulated ET against GLEAM occurs near the equator, where the magnitude of ET is relatively large. The value gradually decreases with increasing latitude toward the two poles. CLMET significantly reduces RMSE at almost all latitudes, with a greater improvement in low-latitude areas.
The statistics of the two types of simulations during different seasons over six continents are shown in Table 3, and the relative difference of multi-year seasonal averaged ET is presented in Figure 5. CLM overestimates the global-averaged ET in March-April-May (MAM) and June-July-August (JJA), while it underestimates ET in September-October-November (SON) and December-January-February (DJF). The RBs are 16.39%, 6.84%, −4.56%, and −12.68% for MAM, JJA, SON, and DJF, respectively. At the continental scale, the largest overestimation occurs in Asia during MAM, with an RB as large as 23.95%, whereas the largest underestimation occurs in North America during DJF, with an RB of −39.87%. CLMET significantly improves the model's performance, as almost all the statistics in CLMET are superior to those in CLM. The improvement from CLM to CLMET is more substantial for JJA and SON than for DJF and MAM. The RB is reduced from 6.84% in CLM to 3.51% in CLMET during JJA, and from −4.56% in CLM to −1.56% in CLMET during SON. As for continents, the greatest improvement occurs over Asia, with the RB and RMSE reduced by 69.6% (from 7.04% in CLM to 2.14% in CLMET) and 70.2% (from 0.312 mm day−1 in CLM to 0.093 mm day−1 in CLMET, equivalent to from 114 mm year−1 to 34 mm year−1), respectively. Meanwhile, the CC over Asia is also significantly improved.
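The percentage reductions and unit conversions quoted in this section follow from simple arithmetic, which the short sketch below reproduces from the reported values. The helper names are hypothetical, and a 365-day year is assumed for the conversion from mm day−1 to mm year−1.

```python
def relative_reduction(before, after):
    """Percentage reduction in the magnitude of a statistic."""
    return (abs(before) - abs(after)) / abs(before) * 100.0


def mm_per_day_to_mm_per_year(rate, days_per_year=365.0):
    """Convert a rate in mm day-1 to mm year-1 (365-day year assumed)."""
    return rate * days_per_year


# Values reported for Asia: RB 7.04% -> 2.14%, RMSE 0.312 -> 0.093 mm day-1.
rb_reduction = round(relative_reduction(7.04, 2.14), 1)        # 69.6
rmse_reduction = round(relative_reduction(0.312, 0.093), 1)    # 70.2
rmse_clm_yearly = round(mm_per_day_to_mm_per_year(0.312))      # 114
rmse_clmet_yearly = round(mm_per_day_to_mm_per_year(0.093))    # 34
```

Applying the same function to the global values (RB from 5.27% to 2.54%, RMSE from 0.346 to 0.118 mm day−1) gives reductions of 51.8% and 65.9%.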

ET
We also assess CLM and CLMET with two independent reference ET datasets, MODIS-based and MTE-based ET, during the overlap period (2006-2011) between MODIS and MTE. Figure 6 shows the global maps of the multi-year averaged ET in CLM and CLMET and their RBs against either MODIS- or MTE-based ET. In terms of global-averaged ET, CLM and CLMET perform similarly relative to the reference data, with RBs of less than 2%. However, in most continents the RBs in CLMET are smaller than those in CLM, and the CCs in the former are higher than those in the latter, when compared with MODIS or MTE (Table 4). For example, an evident ET underestimation occurs in Europe, with an RB of −6.303% and an RMSE of 0.127 against MTE, and CLMET substantially reduces the RB and RMSE by 32.2% and 23.6%, respectively. Meanwhile, the CC is improved from 0.92 in CLM to 0.96 in CLMET. This is consistent with the result obtained from the validation against the GLEAM-based ET. Additionally, the latitudinal profiles of multi-year averaged RMSE values during the overlap period (2006-2018) between CLM-/CLMET-simulated ET and MODIS-/MTE-derived ET are provided in Figure A1. At the seasonal time scale, CLM overestimates the global-averaged ET in MAM and JJA, while it underestimates ET in the other seasons compared to MODIS- and MTE-based ET (Figures A2 and A3 and Tables A1 and A2), which is also in line with the results obtained from the comparison with the GLEAM-based ET. CLMET reduces biases for all seasons except for MAM when the reference dataset is MTE. The difference between model-simulated ET and remote sensing-derived ET remains large during DJF; nevertheless, the improvement is still significant for most continents during this season.
Furthermore, we analyze the climatological seasonal cycles of ET from the two types of simulations and GLEAM over the global land and six continents to validate the improvement from CLM to CLMET, as shown in Figure 7. Globally, the CLMET simulation is closer to GLEAM than CLM is, with the RMSE value reduced from 0.06 to 0.02 (Figure 7a), which demonstrates the effectiveness of the correction. The improvement can be attributed to CLMET's ability to mitigate the underestimation for SON and the overestimation for JJA that occur in most continents in the CLM simulation. The model performance varies with continent. For instance, the improvement is relatively small in areas where the seasonal ET variability is small, such as Africa and Australia, whereas it is large in areas where the ET seasonality is strong, such as Asia, Europe, and North America.
The time series of the simulated ET in CLM and CLMET against GLEAM and the corresponding spatial RMSE values over the globe and six continents are illustrated in Figure 8. The proposed method is usually effective in correcting the overestimation in CLM over most continents and seasons, which is consistent with the finding of Wang et al. (2017). However, the effectiveness for underestimation depends on whether the area is dominated by a water-limited regime, which can be verified by comparing South America and Australia. In South America (Figure 8g), ET is high all year round, with values greater than 2.0 mm day−1 (Figure 7g), because of the abundant rainfall and radiation. With sufficient water supply, ET can be increased in CLMET without a water limit, which corrects the ET underestimation in CLM. However, the adjustment from lower to higher values is restricted by the water supply in the water-limited regimes of Australia (Figure 8d). When this adjustment is implemented, the model checks whether the water stored in the soil layer and vegetation canopy can satisfy the demand for elevating ET. The extent to which ET can be increased relies on the availability of water stored in these reservoirs. As a result, the limited bias correction for underestimation in Australia may result from the limited water supply. The correction effectiveness is largely determined by this water supply-controlled mechanism and varies with continent.
Figure 9 shows the boxplots of the RMSE of monthly ET simulated by CLM and CLMET against GLEAM during the period 2006-2018 globally and over six continents. Both the medians and the ranges of RMSE in CLM differ greatly among the six continents. The median in Australia is 0.16 mm day−1 (4.87 mm month−1), whereas the value in South America is as large as 0.47 mm day−1 (14.30 mm month−1). The ranges in these two continents are 0.39 mm day−1 (11.86 mm month−1) and 1.33 mm day−1 (40.45 mm month−1), respectively. The medians and ranges of RMSE are greatly reduced by CLMET. For instance, the range of RMSE in South America becomes 0.40 mm day−1 (12.17 mm month−1), which is a 71% reduction from CLM to CLMET. The improvement in ET simulations from CLM to CLMET against the MODIS or MTE ET is similar to the improvement with GLEAM as the reference data (Figures A4-A6

Runoff
In this study, we use the total runoff retrieved from the University of New Hampshire-GRDC to assess the performance of CLM and CLMET in simulating runoff. The global-averaged total runoff simulated by CLM and CLMET is similar to the GRDC-based estimate, with values of 0.88, 0.86, and 0.84 mm day −1 (321, 314, and 307 mm year −1 ) for CLM, CLMET, and GRDC, respectively. Regionally, CLM evidently overestimates total runoff over more than half of the global land, such as the central United States, Argentina, central Africa, the Indian Peninsula, southern Europe, northeast China, and most areas of Australia, while underestimations occur over Alaska, central Amazon, south Africa, and northern Siberia (Figure 10e). CLMET effectively alleviates both the overestimations and underestimations, with the global average RB reduced from 18.66% to 16.77%, RMSE reduced from 0.661 mm day −1 (241 mm year −1 ) to 0.415 mm day −1 (151 mm year −1 ), and CC increased from 0.83 to 0.94. The effectiveness of the improvement in CLMET varies with the continent, and a smaller RB and RMSE can be found in CLMET (Table 5). For example, a substantial alleviation occurs in Europe, where RB and RMSE are reduced by 32.2% and 40.1% (from 25.52% in CLM to 17.30% in CLMET, and from 0.367 mm day −1 in CLM to 0.220 mm day −1 in CLMET, equivalent to from 134 mm year −1 to 80 mm year −1 ), respectively, and CC is improved by 0.1 (from 0.85 to 0.95). Additionally, the evaluation result of runoff in Europe is consistent with the result of ET. ET in CLM is underestimated over Europe, which leaves more water available for runoff and leads to its overestimation. In contrast, CLMET alleviates the underestimation of ET by elevating the amount of ET, and consequently obtains more reasonable runoff.
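The percentage reductions and unit conversions quoted above for Europe can be checked with a few lines of arithmetic. The following is a minimal sketch; the helper names are hypothetical, and the 365-day year is an assumption that happens to reproduce the quoted annual totals.

```python
DAYS_PER_YEAR = 365  # assumption: a 365-day year, not stated in the text


def percent_reduction(before, after):
    """Relative decrease of a metric, in percent of its original value."""
    return (before - after) / before * 100.0


def per_day_to_per_year(value_mm_day):
    """Convert a rate in mm per day to mm per year."""
    return value_mm_day * DAYS_PER_YEAR


# Europe runoff statistics quoted in the text (CLM -> CLMET)
rb_drop = percent_reduction(25.52, 17.30)      # RB values are already in %
rmse_drop = percent_reduction(0.367, 0.220)    # RMSE values in mm per day

print(f"RB reduction:   {rb_drop:.1f}%")
print(f"RMSE reduction: {rmse_drop:.1f}%")
print(f"RMSE per year:  {per_day_to_per_year(0.367):.0f} -> "
      f"{per_day_to_per_year(0.220):.0f} mm")
```

Note that the reductions are relative changes of the metrics themselves, so the RB reduction is a percentage of a percentage.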
Because the in situ observed discharge data used in the University of New Hampshire (UNH)-GRDC dataset do not have a consistent temporal coverage (most of them are from 1980-1999), only the multi-year mean annual and monthly runoff data are available for this runoff dataset. To show the difference in simulated runoff between CLM and CLMET, we select twelve regions with a variety of climate regimes and investigate the seasonal cycles of runoff in the two types of simulations (Figure 11). The GRDC-based data show that the runoff seasonality is strong in West Siberia, East Siberia, Canada, and the Amazon basin (Figure 11a-d). CLM substantially underestimates the seasonality by simulating much lower runoff values in the rainy season in these regions. The low runoff in CLM is increased by CLMET, resulting in more realistic seasonal cycles of runoff. CLM reasonably simulates the seasonal cycle of runoff in Congo and India, but overestimates runoff by a similar magnitude in all seasons compared with the GRDC-based data. This overestimation is reduced in CLMET, bringing the runoff magnitude closer to the reference data. In Central Europe, Sahara-Arabia, and Australia, CLMET alleviates the overestimation of dry-season runoff that exists in CLM.
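Since the reference dataset provides only multi-year mean monthly values, a simulated monthly series has to be reduced to a climatological seasonal cycle before the two can be compared. The sketch below illustrates one way to do this with NumPy on synthetic data; the function names and the input layout (whole years, starting in January) are assumptions for illustration, not the processing used in the study.

```python
import numpy as np


def monthly_climatology(monthly_series):
    """Average a monthly series by calendar month.

    Assumes the series starts in January and spans whole years.
    Returns an array of 12 multi-year monthly means.
    """
    values = np.asarray(monthly_series, dtype=float)
    if values.size % 12 != 0:
        raise ValueError("expected a whole number of years of monthly values")
    return values.reshape(-1, 12).mean(axis=0)


def rmse(simulated, reference):
    """Root mean square error between two equally shaped arrays."""
    diff = np.asarray(simulated, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic two-year runoff series: a seasonal cycle plus a year-dependent
# offset that cancels in the multi-year mean.
months = np.arange(24)
sim = 1.0 + 0.5 * np.sin(2 * np.pi * months / 12) + np.where(months < 12, 0.1, -0.1)
ref_cycle = 1.0 + 0.5 * np.sin(2 * np.pi * np.arange(12) / 12)

sim_cycle = monthly_climatology(sim)
print("RMSE between seasonal cycles:", rmse(sim_cycle, ref_cycle))
```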

Soil Moisture
Changes in ET will inevitably influence the soil moisture. Using the SMAP-based soil moisture data as the reference, we compare the CLM- and CLMET-simulated soil moisture to assess the impact of the ET improvement on soil moisture. The comparisons of soil moisture within the surface layer (top 0-5 cm) during April 2015-December 2018 at the global and continental scales are shown in Figure 12 and Table 6. Both CLM and CLMET realistically simulate the spatial pattern and the global average of surface soil moisture compared with the SMAP-based data (Figure 12a-c). However, substantial overestimation and underestimation do exist at the regional scale. Both models overestimate surface soil moisture over the Southern Hemisphere, the low latitudes, and most of the midlatitudes of the Northern Hemisphere, while underestimations occur over the high latitudes of the Northern Hemisphere. These biases are reduced from CLM to CLMET, which is supported by a smaller global average RB against the SMAP-based soil moisture (16.03% versus 20.55%), as shown in Figure 12e-f. The improvement in soil moisture resulting from the corrected ET in CLMET can also be found over each continent (Table 6) and in all seasons (Table A3). Overall, the reduction of RB values is mostly about 20% over different continents, which is not as great as that in ET and runoff. According to the water balance, ET and runoff are balanced by total precipitation at longer time scales. Since total precipitation is kept the same in the offline CLM5, the impact of the correction method on runoff has a similar magnitude to the impact on ET. For soil moisture, the change is more complicated. From the perspective of water balance, the change of soil water content per time step, ∆SM/∆t, is balanced by precipitation − (ET + runoff). The sum of ET and runoff does not change much between the two types of simulations, as their changes are usually in opposite directions.
As a result, the impact of the correction method on soil moisture is smaller compared with the impact on ET and runoff.
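The water-balance argument above can be written compactly. This is only a sketch: the symbols $P$ (precipitation), $R$ (total runoff), and $\delta$ (the CLMET minus CLM difference of a quantity) are introduced here for illustration, and other storage terms such as canopy water, snow, and groundwater are omitted for simplicity.

```latex
\begin{align}
  \frac{\Delta SM}{\Delta t} &= P - (ET + R), \\
  \delta\!\left(\frac{\Delta SM}{\Delta t}\right) &= -\,(\delta ET + \delta R)
    \quad \text{since } \delta P = 0 \text{ in the offline runs}, \\
  \delta R &\approx -\,\delta ET
    \quad \text{at long time scales, where } \frac{\Delta SM}{\Delta t} \to 0 .
\end{align}
```

Under these assumptions, an increase in ET is largely offset by a decrease in runoff, so the residual that drives the change in soil moisture is small relative to the individual changes in ET and runoff.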
Soil moisture in the root zone soil layer (top 0-100 cm) derived from the SMAP data and simulated by two models, and the statistics of CLM and CLMET against the SMAP are shown in Figure A7, Tables 6 and A4. CLMET improves the simulations of root zone soil moisture at the global and continental scales, but the improvements are relatively smaller compared with the improvements in surface layer soil moisture.

Discussion
In this study, we extend our previous study from CONUS to the globe. In our previous study (Wang et al. 2017), we found that the bias correction method reduces the RMSE of CLM4.5-simulated annual ET by approximately 50% over CONUS with GLEAM ET as the reference data. This study reports a larger reduction in the RMSE of annual ET against GLEAM ET, which is as large as 65.9%, from 0.346 to 0.116 mm day −1 (126 to 42 mm year −1 ), over the globe. The largest improvement occurs in Asia, where the RMSE is reduced from 0.312 to 0.093 mm day −1 (114 to 34 mm year −1 ) for annual ET. Moreover, we found in this study that the correction method is effective when the underestimation of ET occurs in areas with plenty of precipitation (e.g., the Amazon), whereas in our previous study (Wang et al. 2017) the correction of ET underestimation was unsatisfactory because of the water limit in most CONUS areas. This indicates that the effectiveness of the bias correction method is largely determined by the climate regime of the region, i.e., a water-limited or an energy-limited regime. Although the correction method significantly improves the simulation of ET in most areas over the globe, several issues are still worth discussing. Firstly, the scaling factor derived from GLEAM ET and CLM5-simulated ET is at the monthly scale instead of the daily or hourly scale. There are two reasons for this treatment. One is that the scaling factor at short time scales fluctuates strongly, which may result in a poor correction effect. The other is that long-term, high-quality remote sensing-based ET products at the daily or hourly scale are rare. Even for daily GLEAM ET data, further assessments are still needed. In contrast, the year-to-year variation of monthly ET is much smaller, and the remote sensing data at the monthly scale are more reliable. Therefore, we decided to estimate monthly scaling factors.
When these factors are applied to the hourly simulation in CLM5, the corrected ET does not always improve, because of the temporal scale mismatch between the scaling factor and the hourly simulation. However, the results aggregated to longer time scales (monthly or annual) with the correction method are superior to the original simulations. Secondly, we calculate the scaling factors by using the averaged ET over the entire calibration period, instead of decomposing the averaged ET into the related terrestrial variables. ET is a complex hydrothermal process that is connected with many terrestrial variables (e.g., air temperature, air humidity, wind, runoff, soil moisture, and LAI). The relationships between ET and these variables are usually nonlinear. Therefore, it is very challenging to decompose ET into these variables, which is worth investigating in a future study. Thirdly, the entire period of 1980-2018 is evenly divided into the calibration period and the validation period, with 1980-1999 for the former and 2000-2018 for the latter. The assumption in applying this method is that the relationship between the model ET and observational ET remains unchanged from the calibration period to the validation period. This is a reasonable assumption, which is supported by the study of Wang et al. (2017), who found that changes in the scaling factor between two periods were within 10% over CONUS. However, under increasingly intensified climate change and human activities in the future, further studies are needed to examine whether the time-invariant relationship can reasonably hold.
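The monthly scaling-factor procedure discussed above can be sketched as follows. This is a minimal illustration, not the CLM5 implementation: it assumes that the factor for each calendar month is the ratio of the calibration-period mean reference ET to the mean simulated ET, and it represents the water-supply check with a hypothetical `available_water` cap. All function and variable names are invented for this example.

```python
import numpy as np


def monthly_scaling_factors(ref_et, sim_et, eps=1e-6):
    """One scaling factor per calendar month from calibration-period means.

    ref_et, sim_et: arrays of shape (n_years, 12) with monthly ET.
    ASSUMPTION: factor = mean reference ET / mean simulated ET.
    """
    ref_mean = np.asarray(ref_et, dtype=float).mean(axis=0)
    sim_mean = np.asarray(sim_et, dtype=float).mean(axis=0)
    factors = np.ones(12)
    safe = sim_mean > eps  # leave months with ~zero simulated ET uncorrected
    factors[safe] = ref_mean[safe] / sim_mean[safe]
    return factors


def apply_correction(sim_et_step, month_index, factors, available_water=None):
    """Scale the ET of one model time step by the factor of its month.

    available_water is a hypothetical stand-in for the water in the soil and
    canopy that can supply an ET increase (same units as sim_et_step).
    Decreases of ET are never limited by water supply.
    """
    corrected = sim_et_step * factors[month_index]
    if available_water is not None and corrected > sim_et_step:
        corrected = min(corrected, sim_et_step + available_water)
    return float(corrected)


# Synthetic calibration data in which the model underestimates ET by half.
gleam_like = np.array([[2.0] * 12, [2.2] * 12])
clm_like = np.array([[1.0] * 12, [1.1] * 12])
factors = monthly_scaling_factors(gleam_like, clm_like)

unlimited = apply_correction(1.0, 0, factors)                     # ample water
limited = apply_correction(1.0, 0, factors, available_water=0.3)  # water-limited
print(factors[0], unlimited, limited)
```

In this toy setting the water-limited case receives only part of the intended increase, which mirrors the contrast between South America and Australia described in the results.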
In this study, we select the GLEAM data as the remote sensing-based ET product to derive the ET scaling factor because of its relatively higher quality and longer record. In addition, the MODIS-based and MTE-based ET are used as reference data to validate the improvement. All of these remote sensing-based ET products have similar global averages. However, GLEAM-based ET is evidently higher than both MODIS-based and MTE-based ET over the Amazon region. This higher ET in GLEAM may introduce an "over-correction", turning CLM-based "underestimations" into CLMET-based "overestimations" over the Amazon region. This is indicated by the statistical metrics of the model performance against GLEAM, MODIS, and MTE over South America. The RB values in CLM before correction against GLEAM, MODIS, and MTE ET are 3.875%, −1.023%, and −3.186%, but they become 0.756%, 5.163%, and 3.319% in CLMET after correction, respectively. Generating higher-quality ET products by blending multiple datasets might help in obtaining a robust correction effect with the method, which can be explored in a future study. In addition, it should be noted that the temporal coverage of runoff from CLM is different from that of the UNH-GRDC dataset. The inconsistency in the temporal coverage may have an impact on the evaluation of runoff. Therefore, up-to-date and high-quality reference datasets (e.g., runoff, soil moisture) are also required for the assessment of land surface models in future studies.
Accurate global hydrological simulations in CLM inevitably depend on a profound understanding of hydrological mechanisms and realistic parameterizations of hydrological processes, both of which take a long time to achieve. The bias correction algorithm used in this study provides a simple alternative by taking advantage of high-quality remote sensing-based ET products without considering the physical mechanisms of CLM. The prominent correction effectiveness indicates that there is great potential for CLM to improve hydrological simulations. Admittedly, over regions where the correction method does not improve the estimates of ET and other related variables, great and continuous efforts should be made to understand the physical processes and to develop the associated parameterizations.

Conclusions
We apply a bias correction approach to the Community Land Model version 5.0 (CLM5) globally by utilizing the remote sensing-based ET dataset. Results reveal that the correction approach can alleviate both the overestimation and underestimation of ET by CLM5 over the globe. The adjustment to overestimation is generally effective, whereas the effectiveness for underestimation is determined by the ET regime, namely water-limited or energy-limited. In areas with abundant precipitation, the underestimation is effectively corrected by increasing ET without the water supply limit. In areas with rare precipitation, however, increasing ET is limited by water supply, which leads to an undesirable correction effect. Compared with the ET simulated by CLM5, the bias correction approach can reduce the global-averaged relative bias (RB) and the root mean square error (RMSE) by 51.8% and 65.9% against Global Land Evaporation Amsterdam Model (GLEAM) ET data, respectively. Meanwhile, the correlation coefficient (CC) can also be improved from 0.93 to 0.98. Continentally, the most substantial ET improvement occurs in Asia, with the RB and RMSE decreased by 69.7% (from 7.04% to 2.14%) and 70.2% (from 0.312 mm day −1 to 0.093 mm day −1 , equivalent to from 114 mm year −1 to 34 mm year −1 ), and the CC increased from 0.92 to 0.99, respectively. Consequently, benefiting from the improvement of ET, the simulations of runoff and soil moisture are also improved over the globe and in each of the six continents, and the improvement varies with region. This study highlights that the use of a satellite-based ET dataset is beneficial to hydrological simulations in land surface models over the globe, which will have a great impact on the development of earth system models in the future.

Table A3. Spatial evaluations of simulated soil moisture in the top 0-5 cm layer from two types of simulations (CLM and CLMET) against SMAP-based data over the globe and each continent seasonally during the period of April 2015 to December 2018.