1. Introduction
Evapotranspiration (ET) is a critical process of water and energy exchange among the hydrosphere, atmosphere, and biosphere, and constitutes an essential component of the regional water cycle [1,2]. Variations in ET not only directly influence runoff generation but also have significant impacts on regional climate regulation and vegetation productivity [3,4]. Therefore, accurate estimation of ET is essential for advancing the understanding of hydrological processes [5]. However, conventional ground-based observation methods are limited by sparse spatial coverage and inadequate site representativeness, especially in high-altitude regions characterized by complex topography and climatic conditions, so accurate estimation of ET remains a major challenge [6].
In recent years, remote sensing-based and reanalysis-driven ET data products have developed rapidly, becoming important alternatives for studying the spatial distribution and dynamic changes of regional ET, particularly in data-scarce regions [7]. Widely used ET datasets currently include MOD16 ET products derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) [8], GLEAM (Global Land Evaporation Amsterdam Model) products [9], PML-V2 (Penman-Monteith-Leuning V2) [10], GLDAS-Noah (Global Land Data Assimilation System-Noah model) [11], and ERA5 (5th Generation of European Reanalysis-Land part) [12], among others. However, due to substantial differences in model structure, input data, and computational methods, these datasets exhibit considerable uncertainties in their ET estimates across different regions and ecosystems [13,14]. For instance, Zuo et al. [15] compared the performance of six ET data products in China and found significant differences among them. Therefore, a systematic evaluation of the applicability of different ET products in specific regions not only helps to clarify their accuracy characteristics and application scopes but also provides important references for ET data fusion and hydrological process simulation [16,17]. Particularly against the backdrop of intensifying global climate change and an increasingly pronounced water supply-demand imbalance, obtaining reliable and suitable ET data has become an urgent requirement for water resource management, ecosystem maintenance, and water security in complex basins [18].
The Yarlung Zangbo River is one of the major river systems on the Tibetan Plateau (TP) and an important transboundary river in South Asia. The Yarlung Zangbo River basin (YZB) is characterized by pronounced topographic undulations, a large altitude gradient, unique and highly variable climatic conditions, diverse vegetation types, and complex underlying surfaces. This environmental heterogeneity results in substantial spatial variations in ET within the basin, posing considerable challenges for accurate ET estimation. Given the distinct advantages and limitations of each product, along with their varying performance under specific environmental conditions, no consensus exists regarding a universally superior product across all study areas [19,20].
Many studies have evaluated ET data in the TP. For example, the evaluation of eight major ET products by Meng et al. [21] found that CLM-BGCDV (Community Land Model–biogeochemical dynamic vegetation) better represented the response of ET to climate change by incorporating dynamic vegetation processes, whereas the ITP (Institute of Tibetan Plateau Research) products (i.e., Han-ET in this study) significantly overestimated and MOD16 underestimated the seasonal amplitude of ET. Li et al. [22] integrated terrestrial and atmospheric water balance approaches to evaluate four ET products (AVHRR: Advanced Very High Resolution Radiometer, GLEAM, MOD16, GLDAS-Noah) across the five major river basins of the TP. Their results revealed that in the YZB, all products exhibited substantial deviations from the water balance baseline, which was attributed to the general neglect of sublimation processes in current ET estimation methods. Cheng et al. [23] concluded that remote sensing products exhibited significantly lower uncertainty than land surface model and reanalysis products. They attributed performance differences to variations in algorithm structure and input data, noting that remote sensing products are better able to capture actual surface conditions, while model-based products are more sensitive to uncertainties in parameterization schemes. Liu [24] systematically evaluated three remote sensing ET products (GLEAM, ZHANG, CSIRO) across 16 river basins in the TP. Their findings indicated that in the YZB, all products generally overestimated summer ET and demonstrated weaker performance in capturing interannual variations than in simulating multi-year mean patterns. Product performance was influenced not only by climatic conditions but also by local surface characteristics and algorithmic structure. Li et al. [25] further demonstrated in their evaluation of ET products (JRA: Japanese 25-year Reanalysis, ZHANG, GLDAS, MODIS) in the TP that while the seasonal dynamics of each ET product were consistent, significant discrepancies in magnitude were present. These deviations mainly originated from uncertainties in input data; for example, the overestimation of ET by MODIS in the upper reaches of the Yellow River resulted from excessive short-wave radiation inputs. From these studies, it is evident that the performance of ET products varies significantly across different regions of the TP due to the combined influence of regional climate conditions, underlying surface characteristics, algorithms, and driving data, with different dominant factors across regions. Substantial uncertainty exists among the various ET products, especially in areas with large glacier cover. However, systematic comparative analyses specifically focused on the YZB, a distinct geographical unit, remain limited. In particular, existing evaluation studies mostly focus on the overall scale of the TP, making it difficult to precisely capture the spatial variability of ET within the basin that arises from differences in topography, climate, and underlying surface characteristics. This limitation hinders more in-depth and detailed hydrological research in the YZB.
Consequently, this study aims to systematically evaluate the performance of multi-source ET datasets in the YZB. It investigates the accuracy and applicability of different ET datasets in the YZB, including terrestrial water balance-based estimates, remote sensing-based inversion products, and reanalysis products. The specific objectives are as follows: (1) estimating ET in the YZB using GRACE data and the terrestrial water balance (TWB) method; (2) comparing the spatio-temporal variation characteristics of different ET datasets within the YZB; (3) quantitatively evaluating the performance of these ET datasets in the YZB based on TWB-ET. The results of this study can provide reliable data support for ET and hydrological research in the YZB, and offer valuable references for the selection and fusion of ET datasets in high-altitude and complex regions.
4. Discussion
4.1. Performance Differences
The comparison indicates that notable differences exist in the performance of the ten datasets in the YZB. These differences primarily stem from essential distinctions in model algorithms, input data, and the parameterization of key processes. Such differences become more pronounced in high-altitude regions with complex terrain, such as the YZB.
Specifically, the reanalysis products (ERA5-ET, MERRA2-ET) exhibit moderate overall performance but perform relatively poorly in the high-altitude western regions. This may be attributed to inherent characteristics of the reanalysis products themselves and to the complex terrain and climate conditions of the YZB. Their ET estimates rely on assimilated meteorological fields. However, because ground-based observations are sparse and satellite data are limited in the YZB, and the terrain is complex, the assimilation constraints are insufficient, potentially resulting in large errors in the meteorological forcing data. In comparison, ERA5-ET performs better, whereas the trend of MERRA2-ET differs significantly from those of the other datasets. This may be because MERRA2 simulates and outputs ET through its land surface process module (Catchment CN) within the reanalysis system. This model was developed based on large-scale and uniform underlying surfaces. Its coarser spatial resolution (0.5°), along with uniform parameterization schemes, may not accurately describe land surface processes in areas of complex terrain, thereby further increasing ET deviations. Overall, reanalysis products show limited accuracy in high-altitude mountainous regions, which is consistent with the findings reported by Liu et al. [44] and Qian et al. [45].
Remote sensing-based datasets, such as GLEAM-ET, MOD16-ET and GLASS-ET, generally perform well, although they differ in the regions where changes are identified and in the significance levels of those changes. GLEAM-ET estimates potential ET (PET) based on the Priestley-Taylor equation and reduces PET to ET using soil moisture stress factors in a simple water balance framework. With relatively few parameters, it better reflects the relationship between energy and water stress [45]. This may be an important reason for its better correlation with TWB-ET. Many studies have also demonstrated that GLEAM-ET performs relatively well in the TP [21,22,23,24]. MOD16-ET exhibits a more accurate interannual trend but shows a relatively weaker correlation in annual variation, consistent with the findings of Meng et al. [21] and Li et al. [22]. MOD16-ET is derived from the Penman-Monteith (PM) equation, which effectively reflects long-term ET changes under climate-vegetation coupling. However, canopy-related parameters (e.g., LAI, FPAR) may have greater uncertainties at short time scales [46]. PML-ET displays a significant upward trend across the entire basin, markedly differing from the other datasets. This dataset introduces the dynamic coupling of photosynthesis and stomatal conductance into the PM equation. Although PML enables more accurate transpiration estimation, its parameterization scheme may be inadequately adapted to the low-pressure conditions in the YZB, and strong radiation may lead to systematic overestimation of stomatal conductance, thereby causing inaccurate estimates of dynamic changes [47]. GLASS-ET integrates five process-based algorithms for ET estimation, combining the strengths of different approaches in energy and vegetation processes. This may be the reason for its superior overall performance.
In contrast, the overall performance of Ma-ET, Chen-ET and Han-ET is slightly poorer. Chen-ET estimates ET by deriving the Bowen ratio from the balance of surface temperature and humidity. These variables are susceptible to local microclimate influences, making it difficult to capture their long-term trends and thereby leading to significant deviations. Ma-ET is based on the nonlinear complementary relationship, which assumes dynamic feedbacks between ET and PET under drought-wet transition conditions. This approach performs well in humid regions but is prone to failure in cold, high-altitude regions [48]. Han-ET is based on the surface energy balance system (SEBS) model, which is sensitive to surface temperature, surface roughness and radiation distribution. However, MODIS-derived surface temperature and roughness parameters exhibit systematic biases in high-altitude regions, leading to its overall poor performance in the YZB.
Jung-ET uses FLUXNET observations to establish a nonlinear mapping relationship, achieving higher accuracy in data-rich areas. However, due to the scarcity of flux towers in high-altitude regions, the representativeness of training samples is limited, thereby restricting its performance. Overall, it performs relatively well in the eastern YZB, but the deviations become significantly larger in the central and western regions, indicating that purely data-driven models still lack reliability in high-altitude regions.
In addition, the impact of differences in spatial resolution among the datasets is also reflected in the evaluation. Generally, high-resolution datasets perform better than low-resolution datasets, as they can capture finer spatial variation, whereas low-resolution datasets tend to overlook detailed spatial features during the estimation process. For example, both MERRA2-ET (0.5°) and Jung-ET (0.5°) exhibit poorer performance than GLASS-ET and MOD16-ET.
In summary, potential weaknesses exist in all ET datasets. The performance of ET datasets is highly dependent on the regional characteristics, algorithms and forcing data accuracy. In the future, it is recommended to develop fusion products that integrate the advantages of both reanalysis and remote sensing data, with particular emphasis on improving the parameterization of physical mechanisms for high-altitude cold river basins.
4.2. TWB-ET Estimation
(1) The sources and impacts of TWB-ET uncertainty
Uncertainties exist in the TWB-ET estimates in the YZB, which is a key limitation of this research. First, TWB-ET estimates are highly sensitive to precipitation, runoff and TWS. In high-altitude mountainous areas with sparse observational stations, input data uncertainty increases. Among all sources of uncertainty, precipitation uncertainty plays a dominant role. According to Miao et al. [41], observed precipitation may be significantly underestimated in high-altitude environments, potentially leading to ET underestimation. Secondly, the impact of glacier mass balance has not been explicitly accounted for in the TWB-ET estimates. Given the large glacier area in the YZB, glacier melt contributions to runoff result in systematic deviations in TWB calculations, which may lead to an underestimation of TWB-ET. According to relevant studies [49,50], glacier melt contributes approximately 5% of annual runoff on average in the YZB, mainly from June to September. Therefore, TWB-ET may be underestimated during these months. Taking this 5% contribution into account suggests that the overall underestimation of TWB-ET in areas with glacier cover is approximately 4.1%. Thirdly, TWB-ET estimates rely on the assumption of a closed watershed, neglecting potential lateral groundwater flow. This may increase the deviations of TWB-ET. Due to limited observation data in the YZB, quantifying this uncertainty remains challenging. Li et al. [22] similarly identified glacier effects, input data uncertainty, and unmonitored groundwater outflow as key contributors to uncertainty in TWB-ET estimates over the TP.
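The glacier-melt argument above can be illustrated with a minimal sketch. It assumes the standard terrestrial water balance form ET = P − R − ΔTWS and treats the glacier-melt share of runoff as a fixed fraction. The function names and all numeric inputs are hypothetical and are not taken from this study's data, so the printed value is not expected to reproduce the 4.1% quoted above.

```python
def twb_et(precip, runoff, dtws):
    """Terrestrial water balance ET; all terms in the same units (e.g. mm/yr)."""
    return precip - runoff - dtws


def twb_et_melt_adjusted(precip, runoff, dtws, melt_fraction=0.05):
    """TWB-ET after removing an assumed glacier-melt share from observed runoff."""
    if not 0.0 <= melt_fraction < 1.0:
        raise ValueError("melt_fraction must be in [0, 1)")
    # Runoff that originates from precipitation rather than glacier melt.
    precip_fed_runoff = runoff * (1.0 - melt_fraction)
    return twb_et(precip, precip_fed_runoff, dtws)


def relative_underestimation(precip, runoff, dtws, melt_fraction=0.05):
    """Fraction by which unadjusted TWB-ET falls below the melt-adjusted value."""
    raw = twb_et(precip, runoff, dtws)
    adjusted = twb_et_melt_adjusted(precip, runoff, dtws, melt_fraction)
    if adjusted <= 0:
        raise ValueError("adjusted ET must be positive")
    return (adjusted - raw) / adjusted


if __name__ == "__main__":
    # Hypothetical annual totals in mm/yr, chosen only for illustration.
    p, r, ds = 500.0, 250.0, 0.0
    print(twb_et(p, r, ds), twb_et_melt_adjusted(p, r, ds))
    print(f"{relative_underestimation(p, r, ds):.3f}")
```

In this simplified form, the size of the underestimation scales with the ratio of runoff to ET, which is one reason the effect would differ between sub-basins.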
(2) The causes of negative TWB-ET estimates
The direct reason for the negative TWB-ET in winter may be the significant underestimation of precipitation in this basin, especially of snowfall. Research by Miao et al. [41] indicates that in the TP, observed precipitation has been significantly underestimated due to measurement errors caused by strong winds and representativeness errors resulting from sparse stations. This leads to a lower precipitation input in the TWB equation, especially in sub-basins with abundant snowfall. In contrast, GRACE-based TWS changes can effectively capture total water storage changes, including snow accumulation. As a result, the TWB calculation yields negative ET in winter. The root cause is that solid (e.g., snow) and liquid (e.g., soil moisture) water storage changes are not effectively distinguished in the current TWB method. The uncertainty analysis of TWB-ET (Table 3) also indicates that precipitation exhibits greater uncertainty (3–7 mm/month) than GRACE-based TWS (0.12 mm/month) and is a dominant source of TWB-ET uncertainty. Moreover, in areas with more glacier and snow cover (e.g., B5, B7), the uncertainty is even greater, which is highly consistent with the seasons and regions where negative values occur. Li et al. [22] also observed negative values in TWB-ET estimates in the YZB and attributed them to the same reasons.
(3) The reliability of TWB-ET estimates
TWB-ET showed negative values in some sub-basins during the winter months, indicating that these values are unreliable. Therefore, the evaluation was conducted mainly on the annual scale. On the annual scale, the TWS changes approach zero, so the annual ET estimates are less affected by this issue. Additionally, this study mainly compared and evaluated the consistency of each ET dataset with TWB-ET, rather than comparing absolute quantities, thereby effectively reducing the impact of TWB-ET uncertainties on the evaluation conclusions. Meanwhile, the uncertainty analysis of TWB-ET also indicated that, although there were individual negative values, the relative uncertainty of TWB-ET (σET/TWB-ET) is mostly below 20%, exceeding this value only in B7 (21.9%) (Table 3), which is well within the acceptable range (<30%) defined by Li et al. [22]. Therefore, compared to ET datasets that involve more uncertainty sources, TWB-ET estimates remain sufficiently reliable for use as a baseline. For the data-scarce YZB, correcting the calculation may introduce new and greater uncertainties. Therefore, future research should combine atmospheric water balances and ground observations, and use multi-method integration to build a more reliable regional ET baseline.
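The relative-uncertainty check described above can be sketched as follows. The sketch assumes that the errors in precipitation, runoff, and TWS change are independent and combine in quadrature; this section does not state how the values in Table 3 were derived, so the propagation rule, the function names, and the example numbers are all assumptions made for illustration.

```python
import math


def sigma_twb_et(sigma_p, sigma_r, sigma_dtws):
    """Combined uncertainty of TWB-ET, assuming independent error terms."""
    return math.sqrt(sigma_p ** 2 + sigma_r ** 2 + sigma_dtws ** 2)


def relative_uncertainty(sigma_et, et):
    """Ratio of the ET uncertainty to the ET estimate itself."""
    if et <= 0:
        raise ValueError("relative uncertainty is undefined for non-positive ET")
    return sigma_et / et


def is_acceptable(rel_uncertainty, threshold=0.30):
    """Compare against the <30% acceptance range cited in the text."""
    return rel_uncertainty < threshold


if __name__ == "__main__":
    # Hypothetical monthly values in mm/month, not taken from Table 3.
    sigma = sigma_twb_et(sigma_p=3.0, sigma_r=4.0, sigma_dtws=0.12)
    rel = relative_uncertainty(sigma, et=30.0)
    print(f"sigma_ET = {sigma:.2f} mm/month, relative = {rel:.1%}, "
          f"acceptable = {is_acceptable(rel)}")
```

Because the terms add in quadrature, the largest single uncertainty dominates the total, which matches the observation above that precipitation uncertainty outweighs the much smaller GRACE-based TWS term.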
4.3. Limitations
Several limitations should also be acknowledged. First, uncertainties in TWB-ET estimates limit a more detailed evaluation of ET with respect to spatio-temporal variations. Future studies should incorporate ground flux observations (such as eddy covariance data) to improve the evaluation. Secondly, the lack of runoff observation data in sub-basins B8 and B9 makes it impossible to calculate TWB-ET in these regions, thereby restricting quantitative evaluation of ET products in the lower reaches of the basin. The comprehensive score is therefore restricted to the upper-middle reaches of the YZB. Additionally, the comprehensive score is calculated by assigning equal weights to the main aspects and lower weights to the remaining aspects. To identify datasets that perform better in specific aspects, the corresponding aspect scores can be directly examined. If different emphases are required, the weights can be adjusted accordingly, and the overall scores can be recomputed. Meanwhile, the comprehensive scores are based on relative performance across the 10 datasets, implying that even top-performing datasets still exhibit notable limitations. For example, GLASS-ET and GLEAM-ET, which rank among the top, still exhibit significant systematic deviations in ET estimation during winter in high-altitude and glacierized areas. GLEAM-ET also shows limited capability in capturing interannual variability. Therefore, practical applications should carefully consider specific regional characteristics, seasonal dynamics, and research objectives. Moreover, due to space limitations, certain datasets were not included in the analysis, such as data fusion products and land surface model outputs. Future studies should summarize and evaluate the applicability of a broader range of ET datasets.
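Recomputing the comprehensive score under different weights, as suggested above, might look like the following sketch. The aspect names, the weight values, and the normalized weighted-mean form are hypothetical; the actual aspects and weights used in this study are defined outside this section.

```python
def comprehensive_score(aspect_scores, weights):
    """Weighted mean of per-aspect scores, normalized by the total weight."""
    missing = set(weights) - set(aspect_scores)
    if missing:
        raise KeyError(f"no score for aspects: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(aspect_scores[name] * w for name, w in weights.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical per-aspect scores for one dataset, each on a 0-1 scale.
    scores = {"annual_variation": 0.8, "spatial_pattern": 0.6, "trend": 0.4}

    # Equal weights for assumed "main" aspects, a lower weight for the rest.
    default_weights = {"annual_variation": 1.0, "spatial_pattern": 1.0, "trend": 0.5}
    # An alternative emphasis, e.g. for a trend-focused application.
    trend_weights = {"annual_variation": 0.5, "spatial_pattern": 0.5, "trend": 1.0}

    print(round(comprehensive_score(scores, default_weights), 3))
    print(round(comprehensive_score(scores, trend_weights), 3))
```

Normalizing by the total weight keeps scores comparable across weighting schemes, so a change in ranking reflects the change in emphasis rather than a change in scale.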
5. Conclusions
This paper estimated the actual evapotranspiration (ET) in the YZB using the terrestrial water balance approach (TWB-ET), incorporating GRACE-based terrestrial water storage change data, precipitation, and observed runoff data. The TWB-ET was then used to compare and evaluate the performance of ten different ET datasets in the YZB. The main conclusions are as follows: (1) Across the entire basin, all ten ET datasets exhibit a good correlation with TWB-ET in annual variation, with r values ranging from 0.78 to 0.90. Six of these datasets achieve an r value exceeding 0.85, although biases exist. The correlations between multi-source ET datasets and TWB-ET vary across the sub-basins. Overall, GLEAM-ET demonstrates the highest correlation with TWB-ET (r = 0.88) and the smallest bias (RMSE = 14.24 mm/month, Rbias = 18.55%) in annual variation. (2) The spatial distribution of the ten ET datasets is generally similar, showing a decreasing trend from southeast to northwest. However, significant differences are observed in the primary variation range and in specific changes within sub-basins. Temporally, there are evident discrepancies among the multi-source ET datasets in the regions where trends occur and in the significance levels of those trends. (3) Comparison of the ten ET datasets with TWB-ET in terms of annual variation and spatio-temporal variation in the YZB shows that GLASS-ET and GLEAM-ET perform relatively well, whereas Han-ET and Chen-ET exhibit larger differences in these aspects.
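For readers who wish to reproduce this kind of comparison, the three statistics quoted above (r, RMSE, and Rbias) can be computed as in the sketch below. The Pearson and RMSE formulas are standard. The relative bias is written in one common form, the summed difference divided by the summed reference, which is an assumption because the exact definition used in this study is not given in this section. The sample series are hypothetical.

```python
import math


def _check(est, ref):
    """Validate that both series are non-empty and of equal length."""
    if len(est) != len(ref) or len(est) == 0:
        raise ValueError("series must be non-empty and of equal length")


def pearson_r(est, ref):
    """Pearson correlation coefficient between an ET product and the baseline."""
    _check(est, ref)
    mean_e = sum(est) / len(est)
    mean_r = sum(ref) / len(ref)
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(est, ref))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    if var_e == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_e * var_r)


def rmse(est, ref):
    """Root mean square error, in the units of the inputs (e.g. mm/month)."""
    _check(est, ref)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(est))


def relative_bias(est, ref):
    """Relative bias in percent (assumed form: sum of differences / sum of ref)."""
    _check(est, ref)
    total_ref = sum(ref)
    if total_ref == 0:
        raise ValueError("relative bias is undefined when the reference sums to 0")
    return 100.0 * sum(e - r for e, r in zip(est, ref)) / total_ref


if __name__ == "__main__":
    # Hypothetical monthly ET (mm/month): a product versus a TWB-ET baseline.
    product = [12.0, 25.0, 48.0, 70.0, 55.0, 20.0]
    baseline = [10.0, 22.0, 40.0, 62.0, 50.0, 18.0]
    print(f"r = {pearson_r(product, baseline):.2f}")
    print(f"RMSE = {rmse(product, baseline):.2f} mm/month")
    print(f"Rbias = {relative_bias(product, baseline):.2f}%")
```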