An Assessment of Global Precipitation and Evapotranspiration Products for Regional Applications

Abstract: Precipitation (P) and evapotranspiration (ET) are the key factors determining water availability for water resource management in river basins. While global P and ET data products have become more accessible, their performance in river basins with diverse climates and landscapes remains less discussed. This paper evaluated the performance of four representative global P products (CHIRPS P, GLDAS P, TRMM P and Persiann P) and four ET products (CSIRO ET, GLDAS ET, MOD ET and TerraClimate ET) against the reference data provided by the Australian Water Availability Project (AWAP) in the Murray Darling Basin (MDB) of Australia. The disparities among the data products, both over the period from 2001 to 2016 and across the 22 catchments of the MDB, were related to a set of catchment characteristics (climate, terrain, etc.) to explore possible contributors. The results show that the four global P products were highly consistent overall with AWAP P across the MDB catchments, except in southeastern catchments with abundant rainfall and large terrain variations. The Penman–Monteith-based MOD ET underestimated ET in the MDB, especially in the arid, sparsely vegetated catchments. In contrast, CSIRO ET, which is also estimated with the Penman–Monteith method, gave better estimates overall, which can be attributed to its better parameterization of the landscape in the simulation process. The hydrological model-based TerraClimate ET showed good overall consistency with AWAP ET except in the arid catchments, which might be attributed to the simplified water balance model it applies, which does not adequately reflect the intensive groundwater use in these catchments.
The findings indicated that basin and catchment characteristics affected the accuracy of the global products, and they therefore have important implications for potential users choosing an appropriate product and/or conducting field calibrations in large basins characterized by diverse rainfall, terrain variations and land use patterns.


Introduction
Precipitation (P) and evapotranspiration (ET) are the two basic components of the hydrological cycle and the most important variables in river basin management [1]. P is the major freshwater input, while ET returns approximately 70% of the P that falls on the Earth's surface to the atmosphere [1][2][3]. Accurate P and ET estimations are critical to river [...] study are expected to provide important implications for the selection and use of these global products for water resource management at the catchment level. The experience gained in this study from evaluating global P and ET products at the catchment scale is valuable for other large river basins with diverse rainfall, terrain, and land use and land cover patterns.

The Murray Darling Basin
The study was conducted within the Murray Darling Basin (MDB), located in southeastern Australia (Figure 1). The MDB covers an area of 1.06 × 10⁶ km², most of which is flat, low-lying land, with mountainous regions concentrated in the eastern part of the basin (Figure S1). The climate of the MDB is sub-tropical in the north, semi-arid in the west and mostly temperate in the south. Annual rainfall of up to 1500 mm/year is recorded on the eastern side of the MDB, while the western side is typically hot and dry, with annual rainfall generally below 300 mm/year (Figure S2). In addition, the MDB is characterised by high ET levels, which account for over 94% of the rainfall that falls in the basin (www.mdba.gov.au). Thus, water resources are the critical constraint for the development of agriculture and the conservation of natural environments in the MDB. The basin contains 22 catchments, which differ substantially in climate, terrain and level of human activity (Table 1). The diversity in climate, landscape characteristics and water use across the MDB makes it an ideal case for the proposed analysis.
Ideally, well-distributed instrumental data are a good reference for assessing the performance of global products at the regional level. Even in river basins where few instrumental data are available, this kind of exercise can at least help river basin managers understand the range of performance of the global products, their spatial distribution at the catchment level and their temporal distribution across hydrological months or years. Fortunately, for the MDB a dataset has been developed by the Australian Water Availability Project (AWAP), an operational data assimilation and modelling system that monitors the state and trend of the terrestrial water balance of the Australian continent at a spatial resolution of 5 km.
The system is relatively well calibrated and validated using independent datasets. Little bias is observed across the range from dry to wet catchments at both annual and monthly scales [20]. Therefore, this study adopted the P and ET datasets developed with the AWAP system as "truth data" to assess the global products. In the system, P (AWAP P) acts as a major meteorological forcing, for which a gridded daily rainfall dataset compiled by the Bureau of Meteorology and the Commonwealth Scientific and Industrial Research Organisation (CSIRO) was used. AWAP P has been used in a number of local studies since it provides a way to consistently characterize the variation of rainfall over space and time for large catchments across Australia [21]. Meanwhile, ET in the system (AWAP ET) is the sum of daily modelled transpiration and soil evaporation, integrated to a monthly step. A dynamic water balance model ("WaterDyn"), forced with P, downward solar irradiance and air temperature, was used to simulate changes in the shallow (thickness 0-0.7 m) and deep (0.2-1.5 m) soil layers and therefore the water fluxes across their boundaries, including ET. Previous studies have demonstrated its strength in spatial and temporal continuity [21][22][23].

Global P and ET Data Products
Four global P products are evaluated against AWAP P in this study (Table 2). The selected products include (1) the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS P), which incorporates 0.05° resolution satellite imagery with in situ station data to create quasi-global scale gridded rainfall time series [24]; (2) the simulated P from the Global Land Data Assimilation System (GLDAS P), which was forced with National Oceanic and Atmospheric Administration (NOAA)/Global Data Assimilation System (GDAS) atmospheric analysis fields, the disaggregated Global Precipitation Climatology Project (GPCP) precipitation fields, and the Air Force Weather Agency's AGRicultural METeorological modeling system (AGRMET) radiation fields [25]; (3) the P estimated by the Tropical Rainfall Measuring Mission (TRMM P) through algorithmically merging microwave data from multiple satellites [26,27]; and (4) the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (Persiann P), which integrates gridded satellite infrared data and P observations from the Global Precipitation Climatology Project [28].
ET estimations from four global ET products were also evaluated. The products are (1) the CSIRO ET datasets (CSIRO ET), estimated using an observation-driven Penman-Monteith-Leuning (PML) model; (2) the GLDAS simulated ET (GLDAS ET), which incorporates satellite and ground-based observations; (3) MODIS ET (MOD ET), widely known as the MOD16 data collection, which is based on the logic of the Penman-Monteith equation with model inputs primarily derived from satellite imagery [8,11]; and (4) TerraClimate ET, estimated using a modified Thornthwaite-Mather climatic water balance model and extractable soil water storage capacity data [29]. Details of the evaluated data products are listed in Table 2.
GLDAS ET: a land surface model simulation in which the estimation is primarily based on empirical upscaling of space- and ground-based observations. Inputs driving GLDAS include P, air temperature, downward shortwave and longwave radiation, humidity, surface pressure and wind speed [25].
MOD ET (500 m, global): terrestrial ET estimated using a remote sensing-based Penman-Monteith algorithm [8,11]. Inputs include the MODIS-derived land cover, LAI, fPAR and albedo products, as well as the meteorological reanalysis dataset from the Global Modelling and Assimilation Office of NASA (GMAO).
TerraClimate ET (2.5 arc-minutes, global; monthly from 1958; available at GEE): ET estimated using a one-dimensional modified Thornthwaite-Mather climatic water-balance model [13]. Inputs for the water balance calculation include precipitation and reference ET, as well as the plant extractable soil water capacity derived from satellite observations [29].

Data Processing
All collected data products were resampled to the same spatial resolution (1 km) and aggregated from the original records to the same temporal resolutions (annual and monthly) to make them comparable. Monthly P and ET reflect the dynamics of water input and consumption and thus provide vital information for multiple water management purposes (e.g., reallocation, irrigation), while the annual series provide additional information for long-term management activities, including land and water resource plans for sustainable development. The data series were also confined to the period from 2001 to 2016 (except for CSIRO ET, which stopped updating in 2013) to allow comparison between the products. Finally, the data were extracted for the 22 catchments within the MDB using catchment boundaries digitized from https://www.mdba.gov.au/discover-basin/catchments. Downloading and resampling of the global P and ET products (except CSIRO ET) were conducted in Google Earth Engine (GEE). AWAP P, AWAP ET and CSIRO ET were processed using the ArcGIS platform.
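The temporal aggregation and the restriction to the study window can be illustrated with a minimal Python sketch. It is not the GEE or ArcGIS workflow used in the study: the function name `annual_totals`, the `(year, month)` keying and the rule of skipping incomplete years are assumptions made for illustration, and the input values are synthetic.

```python
from collections import defaultdict


def annual_totals(monthly, start_year=2001, end_year=2016):
    """Sum monthly values (mm) into annual totals (mm/year).

    `monthly` maps (year, month) -> value in mm. Years outside the study
    window, or without all 12 months, are skipped (an assumption of this
    sketch, not a rule stated in the text).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (year, month), value in monthly.items():
        if start_year <= year <= end_year:
            sums[year] += value
            counts[year] += 1
    return {year: sums[year] for year in sorted(sums) if counts[year] == 12}


# Synthetic example: a constant 30 mm per month from 2000 to 2002.
monthly_p = {(y, m): 30.0 for y in range(2000, 2003) for m in range(1, 13)}
annual_p = annual_totals(monthly_p)  # 2000 falls outside the 2001-2016 window
```

With the synthetic input above, only 2001 and 2002 are retained, each with a total of 360 mm/year.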

Methods
The data analysis of the P and ET series includes three core components: (1) comparison between the global products and AWAP estimations at the basin scale to check overall consistency; (2) comparison at each catchment to identify the catchments where the global products show low, moderate or high levels of disparity against AWAP; and (3) correlation analysis to test the possible contributions of catchment characteristics to the identified disparities.

Temporal Disparities at Basin Scale
Both annual and monthly P (and ET) at the basin scale, aggregated from the global data products, were evaluated against AWAP P (and AWAP ET) estimations using a series of statistical metrics. The selected metrics include the coefficient of determination (R²), root mean square error (RMSE) and Nash-Sutcliffe Efficiency index (NSE). R² examines the overall consistency (e.g., temporal variation patterns) between two data products. RMSE measures the average magnitude of the estimation errors, with lower RMSE indicating greater central tendencies and smaller extreme errors. NSE varies from minus infinity to one, where negative values indicate poor quality of the estimated values and values closer to one indicate better matches between reference and estimated values. The metrics are recommended in previous literature and can be determined according to the following equations [30].
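In their commonly used forms, which are assumed here, the three metrics can be written as below. The index i over the n annual or monthly time steps and the overbars denoting series means are notation introduced for this sketch, and R² is taken to be the squared Pearson correlation.

```latex
\begin{align}
R^2 &= \left( \frac{\sum_{i=1}^{n} \left(V_{obs,i} - \bar{V}_{obs}\right)\left(V_{est,i} - \bar{V}_{est}\right)}
      {\sqrt{\sum_{i=1}^{n} \left(V_{obs,i} - \bar{V}_{obs}\right)^2}\,
       \sqrt{\sum_{i=1}^{n} \left(V_{est,i} - \bar{V}_{est}\right)^2}} \right)^{2} \\
RMSE &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(V_{est,i} - V_{obs,i}\right)^2} \\
NSE &= 1 - \frac{\sum_{i=1}^{n} \left(V_{obs,i} - V_{est,i}\right)^2}
               {\sum_{i=1}^{n} \left(V_{obs,i} - \bar{V}_{obs}\right)^2}
\end{align}
```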
where V_obs stands for the value derived from the AWAP data collection and V_est stands for the estimate derived from the studied global data products.
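As a minimal sketch of how these three metrics might be computed, the Python functions below follow the standard definitions of the squared Pearson correlation, RMSE and NSE. The study itself used R, so the function names and the sample values are illustrative assumptions, not the authors' code or data.

```python
import math


def r_squared(obs, est):
    """Squared Pearson correlation between reference and estimated series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_est = sum(est) / n
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(obs, est))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_est = sum((e - mean_est) ** 2 for e in est)
    return cov ** 2 / (var_obs * var_est)


def rmse(obs, est):
    """Root mean square error, in the units of the input series."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def nse(obs, est):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over reference variance."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - e) ** 2 for o, e in zip(obs, est))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


# Synthetic annual totals (mm/year); these are not values from the study.
v_obs = [400.0, 450.0, 500.0, 550.0]
v_est = [410.0, 440.0, 510.0, 540.0]
print(r_squared(v_obs, v_est), rmse(v_obs, v_est), nse(v_obs, v_est))
```

For this synthetic pair, every estimate is off by 10 mm, so the RMSE is 10 mm and the NSE stays close to one.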

Spatio-Temporal Disparities across the Catchments
Owing to the varied terrain and climatic conditions across the MDB, the level of temporal disparity is likely to differ among catchments. To reveal these differences, each dataset was aggregated to generate annual and monthly time series for the 22 catchments in the MDB. The data series were then evaluated against AWAP P and AWAP ET estimations by calculating the selected statistical metrics (R², RMSE and NSE) within each catchment.
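The per-catchment evaluation amounts to grouping paired values by catchment before computing each metric. The Python sketch below shows this for RMSE only; the record layout and the function name `rmse_by_catchment` are hypothetical, and the numbers are synthetic rather than taken from the study.

```python
import math
from collections import defaultdict


def rmse_by_catchment(records):
    """Return {catchment: RMSE} from (catchment, reference, estimate) tuples."""
    squared_errors = defaultdict(list)
    for catchment, reference, estimate in records:
        squared_errors[catchment].append((estimate - reference) ** 2)
    return {
        name: math.sqrt(sum(errors) / len(errors))
        for name, errors in squared_errors.items()
    }


# Synthetic annual P values (mm/year); the numbers are illustrative only.
records = [
    ("Ovens", 1000.0, 1030.0),
    ("Ovens", 1200.0, 1160.0),
    ("Paroo", 300.0, 297.0),
    ("Paroo", 250.0, 254.0),
]
result = rmse_by_catchment(records)
```

The same grouping would apply to R² and NSE; only the function applied within each group changes.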

Impacts of Catchment Characteristics
To interpret the different levels of disparity between the global products and AWAP estimations across the 22 MDB catchments, the RMSE calculated for each catchment was paired with the catchment characteristics (e.g., average P and ET levels, terrain variations and land use composition in each catchment, as listed in Table 1), and the relationships between RMSE and these characteristics were quantified using the Pearson correlation coefficient. A high correlation might indicate a possible contribution of the catchment characteristic to the disparities from the truth data. For example, a positive correlation between RMSE (for annual ET between a global product and AWAP ET) and DEM implies that large uncertainties are to be expected with the global ET product in high elevation areas.
The statistical metrics and the Pearson correlation coefficients were calculated in the R software package (R 3.5.1).
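The correlation step can be sketched as follows. The study performed it in R; this Python version is only an illustration of the Pearson coefficient, and the elevation and RMSE values are hypothetical rather than taken from Table 1.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical per-catchment values: mean elevation (m) and annual RMSE (mm).
mean_elevation = [150.0, 300.0, 600.0, 900.0]
annual_rmse = [20.0, 35.0, 60.0, 95.0]
r = pearson_r(mean_elevation, annual_rmse)
print(round(r, 3))
```

A coefficient close to +1, as in this made-up case, would be read as errors growing with elevation; it indicates association only, not a causal contribution.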

Precipitations
Overall, annual P for the entire MDB averaged 449.4 ± 127.2 mm/year, 448.9 ± 100.9 mm/year, 490.7 ± 135.3 mm/year, 487.2 ± 137.1 mm/year and 468.3 ± 137.9 mm/year as estimated with AWAP P, CHIRPS P, GLDAS P, Persiann P and TRMM P, respectively (Table 3). Similar annual and monthly patterns of change were captured by the different products, including the extremely dry years/months in 2002 and 2006 and the wet years/months in 2010, 2011 and 2016 (Figure 2). However, CHIRPS P tends to record a relatively narrow range of monthly P and is less sensitive to high and extreme P events across the studied period (Figure 2 and Figure S3). The selected statistical metrics, with NSE greater than 0.87, R² close to 1 and RMSE less than 10% of the annual mean in all cases (Table 3), indicate good consistency between the global products and AWAP P at the annual scale. At the monthly scale, the global products and AWAP P also showed overall good consistency, but with relatively larger estimation errors, as indicated by RMSE values of up to 19.4% (for CHIRPS P) of monthly mean P levels.

Evapotranspiration
Larger differences were observed among the ET products than in the P series comparisons above. Overall, the average annual ET levels within the MDB are 417.0 ± 75.5 mm/year, 404.0 ± 67.1 mm/year, 451.6 ± 84 mm/year, 278.5 ± 55.9 mm/year and 410.6 ± 98.0 mm/year, as recorded by the AWAP ET, CSIRO ET, GLDAS ET, MOD ET and TerraClimate ET products, respectively (Table 4); the MOD ET estimate is substantially lower than those of the other four products. Broadly similar annual patterns of change were observed with all products (Figure S4), which led to the high R² (> 0.84 for the four global products) between the global ET products and AWAP ET. Only negative (for MOD ET) to moderately positive NSE levels (for CSIRO ET, GLDAS ET and TerraClimate ET) were observed, owing to the substantial differences in absolute ET levels obtained with the different data products. The phenomenon was more apparent in the monthly profiles, where the temporal fluctuations differed among the data products, both in the magnitude of absolute monthly ET and in the timing of peak ET levels within the year (Figure 3), which contributed to the overall decreased R² and NSE but increased RMSE levels.
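The combination of high R² and negative NSE reported above can arise whenever a product reproduces the temporal pattern of the reference but carries a systematic offset. The Python sketch below illustrates this with synthetic data: the series and the constant 140 mm offset are assumptions chosen for illustration (loosely comparable in size to the gap between the MOD ET and AWAP ET basin means), not values from the study.

```python
def r_squared(obs, est):
    """Squared Pearson correlation between two series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_est = sum(est) / n
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(obs, est))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_est = sum((e - mean_est) ** 2 for e in est)
    return cov ** 2 / (var_obs * var_est)


def nse(obs, est):
    """Nash-Sutcliffe Efficiency of `est` against the reference `obs`."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - e) ** 2 for o, e in zip(obs, est))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


# Synthetic annual ET (mm/year): identical pattern, constant negative offset.
reference = [350.0, 400.0, 450.0, 500.0]
biased = [value - 140.0 for value in reference]

r2_value = r_squared(reference, biased)
nse_value = nse(reference, biased)
print(r2_value, nse_value)
```

Because the offset leaves the correlation untouched while inflating the squared errors, R² remains at one and NSE becomes strongly negative, which is why the two metrics are best read together.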

Precipitations
When it comes to each catchment within the MDB, the P products showed varied performances (Figures 4 and 5). From an annual perspective, the four products present overall high correlations with AWAP P, with R² higher than 0.9 in most cases, except for the relatively low R² values observed with GLDAS P in Gwydir (R² = 0.74) and with Persiann P in Gwydir (R² = 0.78) and Border Rivers (R² = 0.81). [...] well. TRMM P performs better, with lower RMSE levels than the other three products; a high RMSE with TRMM P was only observed in Ovens (341.15 mm) and Kiewa (133.69 mm). NSE further captured the variations within GLDAS P and Persiann P, especially in the southern catchments, where their NSE values are substantially lower than those for CHIRPS P and TRMM P. Specifically, NSE values for GLDAS P are lower than 0. [...]
Figure 5 presents the comparisons of the detailed monthly trends of the four P products with the monthly AWAP P series. Overall, the monthly comparisons yielded lower R² and NSE values than the annual results, which might indicate differing capabilities of the products in capturing seasonal P variations. Specifically, CHIRPS P and TRMM P showed high consistency with monthly AWAP P, with R² values higher than 0.9 in most of the comparisons, whereas for GLDAS P and Persiann P most of the R² values are lower than 0.9. RMSE results again showed relatively higher values in the southeastern catchments, especially for GLDAS P and Persiann P (both in Upper Murray, Ovens, Mitta Mitta and Kiewa). NSE values for CHIRPS P ranged from 0.75 to 1, with relatively little variation across the catchments. NSE for GLDAS P ranged from 0.37 to 0.92, which is much lower than the GLDAS P NSE values obtained from the annual results. The lowest GLDAS P NSE values were observed in Kiewa (0.37), Mitta Mitta (0.54) and Wimmera (0.59). Monthly Persiann P also received much lower NSE values, with the lowest values observed in Kiewa (0.37), Mitta Mitta (0.58) and Upper Murray (0.58). TRMM P showed higher performance from the NSE aspect, with NSE values greater than 0.86 in the 22 catchments.
Figure 6 shows the comparison statistics when decomposing the basin scale annual ET into the catchment scale.
Most comparisons between the four ET products and AWAP ET obtained R² values greater than 0.8, which indicates that the products presented broadly similar annual variation patterns. Mitta Mitta has the lowest R² values for CSIRO ET (0.39), GLDAS ET (0.52) and MOD ET (0.43). RMSE for the four products in the different catchments showed that MOD ET has the highest RMSE levels, especially in the northern catchments (including Moonie, Border Rivers, Warrego, Namoi, Paroo and Gwydir), where RMSEs are greater than 200 mm. GLDAS ET has the second largest RMSE levels, with relatively higher values in southern catchments (e.g., Wimmera, Mid-Murray and Campaspe). Similarly, NSE values for MOD ET are significantly lower than those of the other three products, with negative NSE levels observed in 16 of the 22 catchments. Even though fewer negative NSE values are observed with CSIRO ET (2), GLDAS ET (8) and TerraClimate ET (3), the NSE values for most of the comparisons are relatively low (e.g., lower than 0.5, although some of them have high R²), from which we can infer that large variations exist among the ET products in the catchments, which might be explained by their differing capabilities in capturing extremely high or low ET levels.
Monthly comparisons between the ET products provided further insights, as displayed in Figure 7. Overall, CSIRO ET and GLDAS ET presented better correlations with AWAP ET, as evidenced by higher R² values across all 22 catchments. Lower R² values with MOD ET and TerraClimate ET are typically observed in northern (e.g., Border Rivers, Moonie and Gwydir) and western (e.g., Barwon-Darling, Lower Darling and Lower Murray) catchments, where R² ranges from about 0.3 to 0.6. RMSEs associated with comparisons of CSIRO ET and GLDAS ET to AWAP ET are relatively uniform across the catchments (around 10 mm), while larger RMSEs are observed with MOD ET and TerraClimate ET, especially for northern catchments including Warrego, Condamine-Balonne, Moonie, Border Rivers, Gwydir and Namoi. Overall lower NSE values are also observed with the monthly ET profiles estimated with the four products, but significantly lower values are obtained by MOD ET and TerraClimate ET in most catchments, except several southeastern catchments (including Upper Murray, Mitta Mitta, Goulburn Broken, Ovens and Campaspe).

Impacts of Catchment Characteristics
Correlations of RMSE for each pair of global P products and AWAP P, at both annual and monthly scales, with catchment characteristics are summarized in Figure 8. The annual results indicate that higher RMSE values are generally associated with catchments with a high annual P level, and that RMSE is highly correlated with greater terrain fluctuations, characterized by a high DEM as well as larger DEM variations (indicated by average slope levels). This is especially true for GLDAS P and Persiann P, which indicates that these two data products are more sensitive to changes in elevation. While most land use types do not show high correlations with the statistics, "1 conservation and natural environments" and "5 Intensive uses" presented a consistent moderate positive correlation with the RMSEs associated with the four P products. These patterns are more apparent in the correlation analysis between the RMSE derived from the monthly P series and catchment characteristics. According to the correlation coefficients, catchments characterized by relatively high P levels, high altitude, large terrain variations and a greater share of land use types 1 and 5 tend to yield high RMSEs.
The situation revealed by the correlation analysis between the statistics of the ET products and catchment characteristics is more complex (Figure 9). From an annual perspective, RMSE levels for CSIRO ET, MOD ET and TerraClimate ET are positively correlated with the location of the catchments (latitude and longitude); that is, more eastern and northern catchments tend to have higher RMSE levels, while GLDAS ET presented the opposite trend. Additionally, RMSE values are positively correlated with P and ET levels for CSIRO ET and TerraClimate ET, which indicates that catchments with higher P, and thus higher ET, would yield higher uncertainties for the ET products. Catchments located at high altitude also tend to have higher uncertainties, as supported by the positive correlation coefficients between RMSE and DEM (and slope). The high positive correlation of MOD ET RMSE levels with latitude and longitude is well represented by the northeastern catchments, which show significantly higher RMSE values, as previously identified. Similar impacts of land use composition on CSIRO ET and TerraClimate ET are observed, where catchments with a greater share of water (land use type 6) tend to have lower RMSE levels. For GLDAS ET, RMSE in annual ET is negatively correlated with land use type 2 but positively correlated with land use types 3 and 4, whereas for MOD ET, RMSE in annual ET is negatively correlated with land use types 1, 4, 5 and 6. Similar correlations between RMSE and catchment characteristics are also observed with the monthly ET series.

Evaluation of Global P Products
Comparison of the four P products indicated broadly similar temporal variations and spatial distributions of rainfall within the MDB. CHIRPS P and TRMM P showed the best overall consistency with AWAP P, as indicated by higher R², lower RMSE and higher NSE for both the annual and monthly data series. Both products draw heavily on information from microwave P sensors, which seems to reproduce rainfall better across the MDB. Microwave sensors estimate rainfall from microwave radiation, which is recognized as a more robust way of estimating rainfall [31]. This might also contribute to the smaller spatial disparities observed for the two products across the catchments. Conversely, Persiann P, which is based on infrared sensor information, shows the most apparent differences (Figures 4 and 5). Infrared sensors relate surface P to the brightness and temperature of the cloud tops; however, translating cloud information into rainfall involves complex processes and high uncertainties, especially in regions with high cloudiness and abundant rainfall, which might cause these disparities. This is supported by the evidence that most catchments showing high RMSE and low NSE values for Persiann P are located in the southeastern part of the MDB, where the topography favours rainfall, elevation changes are greater and anomalies at the cloud tops are very likely. It is further supported by the overall high correlation between RMSE and terrain characteristics (DEM and slope, Figure 8), which implies that capturing orographic precipitation is a challenge for all products, with Persiann P presenting relatively higher correlations (i.e., being more susceptible to terrain variations).
Similar findings are reported in recent publications whose authors concluded that microwave-based precipitation estimations outperform infrared-based estimations [32]. GLDAS P, which incorporates P estimations derived from both microwave and infrared sensors (https://ldas.gsfc.nasa.gov/gldas/), performs at a level between that of CHIRPS P and TRMM P on the one hand and that of Persiann P on the other.
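For readers who wish to reproduce this kind of comparison, the three agreement statistics used above can be sketched as follows. The definitions are the conventional ones (R² as the squared Pearson correlation, NSE as one minus the ratio of the squared error to the variance of the reference) and are assumed rather than taken from this paper's methods. The sample series `ref` and `est` are hypothetical.

```python
import math


def r_squared(ref, est):
    """Squared Pearson correlation between reference and estimated series."""
    n = len(ref)
    mean_ref, mean_est = sum(ref) / n, sum(est) / n
    cov = sum((r - mean_ref) * (e - mean_est) for r, e in zip(ref, est))
    var_ref = sum((r - mean_ref) ** 2 for r in ref)
    var_est = sum((e - mean_est) ** 2 for e in est)
    return cov * cov / (var_ref * var_est)


def rmse(ref, est):
    """Root mean square error, in the units of the input series."""
    return math.sqrt(sum((e - r) ** 2 for r, e in zip(ref, est)) / len(ref))


def nse(ref, est):
    """Nash-Sutcliffe efficiency.

    1 is a perfect match, 0 means the estimate is no better than the
    mean of the reference, and negative values mean it is worse.
    """
    mean_ref = sum(ref) / len(ref)
    squared_error = sum((e - r) ** 2 for r, e in zip(ref, est))
    ref_variation = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - squared_error / ref_variation


# Hypothetical monthly totals in mm (not values from the paper).
ref = [10.0, 20.0, 30.0, 40.0]  # reference series, e.g. from AWAP
est = [12.0, 18.0, 33.0, 39.0]  # global product over the same months
print(f"R2={r_squared(ref, est):.3f} RMSE={rmse(ref, est):.3f} NSE={nse(ref, est):.3f}")
```

Because NSE penalizes bias as well as scatter, a product can correlate well with the reference (high R²) and still score a low or negative NSE if it consistently under- or overestimates.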
It is interesting to note that land use type 1 (conservation and natural environment) tends to affect the performance of the P products. A possible explanation is that these landscape patches, which are normally covered with moderate to dense forests distributed discontinuously across the catchments, affect the local climate. It is also worth mentioning that, in remote catchments in the western part of the MDB (low-lying, with less terrain variation and little vegetation cover, Table 1), all P products performed consistently well, which indicates the effectiveness of satellite-based P estimations in reproducing low surface P where few field observations are available.

Evaluation of ET Products
Relatively higher disparities were observed in the dry months of the study period. Consistency between products was relatively good in catchments with abundant P (e.g., Kiewa, Upper Murray and Mitta Mitta in Figure 7) but poor in less humid catchments (e.g., Warrego, Barwon-Darling, Condamine-Balonne and other catchments in the northern part of the MDB in Figure 7). These findings indicate that differences in the products' capabilities to capture ET become more apparent under arid conditions. Both annual and monthly MOD ET underestimated ET levels in almost every catchment. This is in agreement with several previous studies assessing the performance of MOD ET under various climatic conditions [19]. MOD ET gives priority to vegetation-covered landscapes. This might also partly explain the positive correlation between R² (and NSE) derived from monthly MOD ET and the percentage of natural vegetation coverage (land use type 1) within the catchments: MOD ET captured water losses better over surfaces that were well covered by vegetation.
As discussed above, the ET products chosen in this study use two different methods: the hydrological method (TerraClimate ET and AWAP ET) and the hydro-meteorological method (MOD ET and CSIRO ET). There are substantial disparities in ET estimations between these two groups, especially in the monthly series across the 22 catchments (Figure 7), which indicates the importance of appropriately parameterizing the processes involved. Apparent disparities also exist between products created using the same method. For instance, TerraClimate ET used a water balance method similar to that of AWAP for deriving water budget components. The difference is that TerraClimate ET used a simplified one-dimensional Thornthwaite-Mather climatic water-balance model [13], while AWAP employed a two-layer model to better represent the intensive exchange of surface and deep water within the MDB [14]. This could partly explain the higher disparities between TerraClimate ET and AWAP ET (indicated by higher RMSE and negative NSE values) in the northeastern catchments (Figure 7), where groundwater plays an important role in supporting local dryland agricultural activities (land use type 3 in Figure 1). For the two products based on the Penman-Monteith algorithm, CSIRO ET outperforms MOD ET in almost all catchments at both the annual and monthly scales (Figures 6 and 7). This might be attributed to the input datasets applied in CSIRO ET, which better reflect the characteristics of the Australian territory, especially the use of a unique value of maximum conductance for Australia, a key factor in parameterizing the land surface [9].
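For reference, the Penman-Monteith combination equation on which both MOD ET and CSIRO ET are based can be written in its standard conductance form, shown below. The notation is the conventional one and is not taken from this paper; how each product models the surface conductance is not detailed here.

```latex
\lambda E = \frac{\Delta \,(R_n - G) + \rho_a c_p \,(e_s - e_a)\, g_a}
                 {\Delta + \gamma \left(1 + \dfrac{g_a}{g_s}\right)}
% \lambda E : latent heat flux (ET expressed as energy)
% \Delta    : slope of the saturation vapour pressure curve
% R_n, G    : net radiation and ground heat flux
% \rho_a, c_p : air density and specific heat of air at constant pressure
% e_s - e_a : vapour pressure deficit of the air
% g_a, g_s  : aerodynamic and surface conductance
% \gamma    : psychrometric constant
```

A maximum conductance parameter, such as the one mentioned above for CSIRO ET, typically enters through the model used for the surface conductance g_s, which may help explain why a regionally tuned value affects the resulting ET estimates.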

Implications
Findings from the above evaluations have implications for both product developers and potential product users. For developers, the identified disparities against reference data and the factors influencing them point to problems associated with the algorithms and/or model inputs, and therefore indicate directions for improvement. For example, TerraClimate ET shows apparent disparities with AWAP ET in arid regions because it neglects intensive groundwater use; it would therefore be valuable to further test the efficiency of the simplified water balance model under similar conditions and to calibrate the model to better reflect deep groundwater exchanges. The model parameters of MOD ET in areas without vegetation cover also need to be calibrated to improve ET simulations for these regions.
The findings are of importance to river basin managers. P and ET jointly determine the water available for river basin management activities. In regions with no detailed ground observations or well-calibrated P and ET products, turning to readily available global products for regional management is an easy option. Our results indicate that adopting global products for regional applications without considering the algorithms behind the products, as well as the local climatic and terrain conditions, may lead to serious management failures. For long-term basin management activities towards sustainable development, which normally require annual-scale water availability assessments, it is relatively safe to use the studied P and ET products, since the evaluations at basin and catchment scales in this study, which covered wet and dry conditions, showed overall moderate to high accuracy against the reference P and ET levels. Data products generated with similar algorithms might also be considered, but an intercomparison with the studied products is suggested. When management activities such as irrigation and water reallocation require monthly or finer-scale data, large uncertainties are expected when adopting global products, especially for ET. For monthly P, calibration should be considered in high-altitude basins and in regions subject to high P levels, whereas global monthly ET products should be used with caution in quantifying water loss at regional scales. Penman-Monteith algorithm-based products can be a relatively good option when supported with localized input data (e.g., CSIRO ET). Water balance-based methods are easy to apply and require fewer data inputs, but the water balance analysis should be extended to deep-layer water exchanges, especially in areas with enormous groundwater abstractions, as revealed by the different performances of TerraClimate ET and AWAP ET.
Moreover, compared with the P products, the ET products behave in more complex ways, are related to more landscape characteristics (Figure 9), and each has its own most suitable geographic and climatic regions. Thus, using different ET products in different regions can be a good option.

Conclusions
Accurate estimates of P and ET are crucial for regional water resource management, yet the accuracy of globally produced data products in regional applications remains little discussed. This study provides new insights into the performance of four global P products (CHIRPS P, GLDAS P, TRMM P, Persiann P) and four global ET products (CSIRO ET, GLDAS ET, MOD ET, TerraClimate ET) in the MDB by comparing them with the AWAP dataset, and into the contributions of regional climatic and landscape settings to the uncertainties within the global products by relating the product disparities to catchment characteristics. The four P products show overall high consistency across the MDB catchments, with two qualifications: (1) the microwave sensor-based CHIRPS P and TRMM P presented better consistency with AWAP P than the infrared sensor-based Persiann P; and (2) the four products perform better in catchments with lower P and less terrain variation but poorly in high-altitude catchments with high P levels. The four ET products show larger disparities against AWAP ET, indicating greater uncertainties arising from the simulation methods and from the diverse climatic and landscape characteristics. The Penman-Monteith algorithm-based CSIRO ET, supported with localized land surface parameters, performs better across the catchments than the other three products. TerraClimate ET showed overall similar performance to AWAP ET, except for substantial disparities in arid catchments, which indicates the need to calibrate the model to reflect local hydrological processes. Overall, the findings highlight that regional climatic and terrain conditions affect the performance of global data; therefore, caution is needed when adopting global data products for regional use.