Using a Groundwater Adjusted Water Balance Approach and Copulas to Evaluate Spatial Patterns and Dependence Structures in Remote Sensing Derived Evapotranspiration Products

: This study aims to improve the standard water balance evapotranspiration (WB ET) estimate, which is typically used as benchmark data for catchment-scale ET estimation, by accounting for net intercatchment groundwater ﬂow in the ET calculation. Using the modiﬁed WB ET approach, we examine errors and shortcomings associated with the long-term annual mean (2002–2014) spatial patterns of three remote-sensing (RS) MODIS-based ET products from MODIS16, PML_V2, and TSEB algorithms at 1 km spatial resolution over Denmark, as a test case for small-scale, energy-limited regions. Our results indicate that the novel approach of adding groundwater net in water balance ET calculation results in a more trustworthy ET spatial pattern. This is especially relevant for smaller catchments where groundwater net can be a signiﬁcant component of the catchment water balance. Nevertheless, large discrepancies are observed both amongst RS ET datasets and compared to modiﬁed water balance ET spatial pattern at the national scale; however, catchment-scale analysis highlights that difference in RS ET and WB ET decreases with increasing catchment size and that 90%, 87%, and 93% of all catchments have ∆ ET < ± 150 mm/year for MODIS16, PML_V2, and TSEB, respectively. In addition, Copula approach captures a nonlinear structure of the joint relationship with multiple densities amongst the RS/WB ET products, showing a complex dependence structure (correlation); however, among the three RS ET datasets, MODIS16 ET shows a closer spatial pattern to the modiﬁed WB ET, as identiﬁed by a principal component analysis also. This study will help improve the water balance approach by the addition of groundwater net in the ET estimation and contribute to better understand the true correlations amongst RS/WB ET products especially over energy-limited environments.


Introduction
Evapotranspiration (ET) (or its energy equivalent term, latent heat flux LE) is the transfer of water from the Earth's surface to the atmosphere via soil and water surface evaporation and plant transpiration. Terrestrial ET amounts to approximately 70% of the total land surface's precipitation [1,2]. Thus, it constitutes a cornerstone variable in both the land surface energy and water budgets [3,4]. Evapotranspiration is difficult to observe and model over water-limited-and in particular, energy-limited environments; this is due to the complex interactions amongst the climate-land-surface variables and the plant-atmosphere system [5]. However, ET estimates are crucially important especially at larger scales for the assessment of water resources [6][7][8]. However, reliable and accurate the interaction of the modeled and observed hydrometeorological variables was properly described and captured either at upper-or lower density maxima using the bivariate empirical Copula. Both, however, highlighted the worth of use of Copula functions in hydrology and hydrometeorology as a valuable complement to traditional linear-based statistical analyses.
In the literature, ET datasets from the remote-sensing and surface water balance approaches have been compared and studied (e.g., [7,8,53,54]). For example, [17,20] compared water balance ET estimates to ET estimates based on hydrological modeling and RS methods over central Europe and Canada, respectively. Both studies concluded that evapotranspiration obtained by the water balance approach displayed the highest uncertainties. They also found that larger basins tend to have smaller errors. Similar results were also obtained by [5,55] over the contiguous United States. However, all of the previous studies mentioned above have investigated very large catchments at continental scales only, which hinders to address the spatial patterns of ET in more detail at regional and local scales. What is more, none of them have attempted to include the net intercatchment groundwater flows in the surface water balance evapotranspiration estimates.
Therefore, the present research aims at improving the WB ET estimation and evaluation of the RS ET products in high resolution at a regional scale with a focus on energy limited environments such as Denmark. For this, the standard WB ET estimate is modified using the net intercatchment groundwater flows (GW-net) and also the dependence structures of three remote sensing MODIS-based ET products from MODIS16 [40], PML_V2, [56] and the two-source energy balance model (TSEB) [57] all at 1 km spatial resolution as well as the surface water balance ET method are estimated, and all ET products are assessed for similarity ranking based on a principal component analysis (PCA). Further, RS ET spatial patterns are compared and evaluated using the surface modified WB ET over a 13-year period (2002-2014) at the national scale of Denmark, as a case study for the regions with mainly energy-limited environments. The analysis is conducted using 146 catchments with a varying catchment size from larger than 30 km 2 to smaller than 1000 km 2 . Therefore, the main objectives of this study include: (1) to improve the surface WB ET approach through add of GW-net data in the ET calculation; (2) to evaluate the spatial patterns in three MODIS-based RS ET products using the improved WB ET approach; (3) to estimate the dependence structures and similarity ranking amongst the ET products using Copula and PCA approaches.
Our analysis will also be extended to compare ET datasets at different spatial scales (full pixel resolution catchment-scale and aggregated-regional analyses), aiming at further exploring the performance-and ET uncertainties of remote sensing MODIS-based models. Finally, for the future applications, it is the aim to identify which of the satellite ET spatial patterns applied matches better with the reference WB ET approach over Denmark.

Study Area Characterization
This study was performed for the land phase of Denmark. Figure 1b shows the study area and 146 incorporated discharge stations which are differentiated into downstream and upstream sub-catchments. The topography is generally flat where the highest elevation is up to 171 m a.s.l, and agriculture with more than 70% coverage is the main land cover type together with patches of coniferous and deciduous forests across Denmark. The eastern part of the country is dominated by a high fraction of clay soil, that is, >20% clay fraction; however, the western part is mainly covered with sandy soil i.e., <5% clay fraction (Figure 2a). The land-use is predominantly agriculture and the hydro-climate is dominated by energy-limited conditions in winter and most of autumn and spring, while the summer months can be water-limited. While, the lowest mean leaf area index (LAI) values with less than 2 m 2 /m 2 are found in central Jutland [14] (Figure 2b

Data
3.1.1. Water Balance ET Approaches

Water Balance ET Approaches
From a total dataset of 305 discharge stations (catchment size > 10 km 2 ) including 126 downstream and 179 upstream sub-catchments in Denmark, 146 runoff gauges (Q-data) were considered in this study, which appropriately covers the entire country (Figure 1b). Observed runoff for a 13-year period of 2002-2014 was used in order to be in agreement with MODIS data. Our water balance ET analysis is also grouped according to the size of catchments (Figure 1c-e). A number of 68 out of 146 selected sub-catchments have <100 km 2 (Figure 1c), 48 catchments range from 101 to 200 km 2 (Figure 1d), and 30 are >200 km 2 (Figure 1e). Most of the large sub-catchments are found in central parts of Jutland (see Figure 1b). In addition, we also estimated the regional WB ET pattern through aggregating the sub-catchments to 10 regions. The criteria for aggregation of the subcatchments into regions were mainly based on the RS ET distributions to better understand similarities/dissimilarities between the WB/RS ET spatial patterns at a larger scale.
In terms of data availability, 73% (N = 107) of the catchments had a full length of Q-data (13 years). However, 9% (N = 13) and 18% (N = 26) had data-gaps ranging 2-6 years of records and 7-9 years of records, respectively. In addition, original Q values (m 3 s −1 ) were converted to water depth (mm) through dividing by the sub-catchment areas (m 2 ). Annual precipitation (P-data) for each sub-catchment was calculated using daily gridded data with a 10 km spatial resolution [58]. Due to the data comparability in water balance ET calculation, length of P-data corresponded to those of the runoff data for each year of records. The catchment-based P and Q data were used to calculate the standard surface WB ET by averaging over the years with available Q data. However, the standard WB ET method is modified by including the net intercatchment groundwater flow simulated by the national water resources model of Denmark (the DK-model) in the ET calculation. A detailed description on the DK-model is provided in Section 3.2.1.

Remote-Sensing ET Models
Three remote sensing ET estimates are used in this study, which are based on the moderate resolution MODIS data. With a spatial resolution of approx. 1 km and frequent data acquisition, the MODIS-based ET estimates can provide information at a relevant scale for Denmark and for optimization/evaluation of distributed hydrological models with a focus on spatial patterns of ET at the national scale for the future applications. Therefore, remote sensing-based ET models such as GLEAM [59] or the Meteosat-based LSA SAF ET [60] were not considered in this study as they do not provide the desired resolution. In addition, the focus of the current study is on the spatial patterns of ET, and therefore RS estimates that are not based on land surface models but confined to the available satellite observations and their spatial information content are preferred. The first of such estimate was obtained from the global MODIS16 ET product [61], which is originally based on the Penman-Monteith equation [62]; however, the model has been modified [40] to account for parameters not readily available from space [20]. The second one was obtained from the global PML_V2 ET product, which is a coupled diagnostic biophysical model [56]. The third was based on a two-source-energy-balance (TSEB) implementation for Denmark [14]. The TSEB ET product [57] is mainly driven by land surface temperature (LST), which is a key indicator of the evaporative state at the surface [63]. All datasets were resampled to 1 km grid cells, and monthly timescale over the period 2002-2014, covering the entire land phase of Denmark (see Figure 1b). We chose to neglect the short-term and inter-annual variability and instead investigate the spatial patterns of ET representing the average climatology, estimated as annual means. A summary of the remote-sensing ET products used in this study are given in Table 1.

Methods
The methodology employed to evaluate the spatial patterns and dependence structures in WB/RS ET products is presented as follows: 1.
Catchment water balance ET estimates from standard and modified approaches using groundwater net; 2.
Estimation of ET dependence structures using empirical Copula approach; and 3.
Assessment of ET similarity using a PCA analysis.

Catchment WB ET Estimates
We present two approaches for the ground-based ET calculation, i.e., standard and modified WB ET methods. In the standard WB ET approach, the ET rate for the years with P and Q data available for each sub-catchment was calculated. In this approach, both groundwater exchange (GW net ) and water storage changes (∆S) are ignored. On an annual to multi-annual timescale, ∆S can typically assumed to be negligible [17,22], however groundwater exchange flow can be considerable particularly for smaller catchments [25]. The standard WB ET (ET Q ) was defined as: In the modified WB ET approach, in addition to the discharge measurement, the DK-model was used to estimate the net intercatchment groundwater flow to be included in Equation (1). The modified WB ET (ET Q−GW ) was then defined as: where GW net [mm/year] is the net groundwater flow across the topographical catchment boundary, calculated as subtraction of the total groundwater boundary outflows from the total groundwater boundary inflows divided by the length of study period (2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014). For further descriptions, the MIKE-SHE graphical layouts representing the interrelationships between the water balance components, especially the intercatchment boundary flows applied in Equation (2) are provided in Appendix A.
To our best of knowledge, no study has included groundwater flows so far for estimating the surface WB ET. In the present study, we obtained the biweekly net groundwater estimates for both inflows and outflows of each catchment using the DK-model, which is operated by GEUS-Geological Survey of Denmark and Greenland [64][65][66]. The model is based on the MIKE-SHE modeling code [67]. MIKE-SHE is a physically-based distributed modeling system covering the entire land phase of the hydrological cycle [67,68], which describes a full 3D finite difference groundwater flow, 1D unsaturated flow and 2D over-land flow in separate modules that are coupled two-ways in every time step. For detailed description on the model code, it is referred to (e.g., [65,68,69]).
Groundwater abstraction plays a major role in the groundwater hydrology of Denmark, and a total number of approximately 40,000 abstractions wells for domestic/industry and irrigation are included in the DK-model setup, accounting for the entire groundwater abstraction in Denmark. Therefore, our modified WB ET approach covers both human (anthropogenic activities on irrigation and ground water extraction) and natural (net interbasin groundwater flows) impacts on the water balance closure. However, the groundwater boundary flows may largely be driven by the natural part, i.e., geology, in Denmark. The DK-model is a fully distributed grid-based flow model setup in a 500 m resolution for the entire surface area of Denmark (43,000 km 2 ) and run on a daily time step. The model structure is based on a comprehensive hydro-stratigraphic model for Denmark and parameterized through calibration against objective functions based on 305 discharge stations and 25,981 hydraulic head observation wells for the period 2000-2010 (for the current application run for the period 2002-2014). The main objective functions are Kling-Gupta efficiency (KGE) [70] and percent bias (Pbias) for streamflow, and continuous ranked probability score (CRPS) values for groundwater heads [53]. Details on the model data, construction, calibration, and validation can be found in [71], while calibration statistics will be presented in brevity in the Result section.

Empirical Copula-Based ET Dependence Structures Analysis
To capture the underlying dependence structures amongst the WB/RS ET products, the empirical Copula function was applied. The Copula approach [72] states that any multivariate distribution function can be decomposed into the marginal distributions and a Copula, which the dependence structure between variables is appropriately estimated [73]. The separated analysis of marginal distributions and dependence structure (i.e., the Copula) is one of the major benefits of using Copula functions to describe joint distributions between variables [74].
In this study, it was essential to obtain the joint probability distribution function between the ET products used. While marginal distributions of such datasets are commonly known, their joint distributions are typically too complex to appropriately derive from the marginal distributions [75]. Therefore, empirical Copula was employed herein to estimate their joint distributions. Moreover, the pairwise dependence between such variables has been traditionally described using classical families of bivariate distributions such as normal, gamma, and Gaussian distributions. The main limitation of this approach is that the individual behaviors of the two variables or transformations thereof need to be then characterized by the same parametric family of univariate distributions [76]. As a result, distributions of these statistical-based methods remain symmetrical, meaning that it is not clear whether correlations and dependencies are stronger/weaker at low, high, or middle-range of values. Copula functions, however, avoid this restriction. This implies that Copula-based distributions are asymmetrical, in such a way it indicates which part of the datasets are highly/weakly correlated i.e., at lower-or upper tails.
Here, a two-step approach was applied to estimate the Copula density C. First,F x , F y of the marginal distributions was obtained using the empirical distribution function as an estimator. Second, the pseudo-observations (û,v) = F x (x),F y (y) were defined [77]. The Copula density, which is purely based on the real data [78], was finally estimated as the joint density of (û,v). If {r 1 (1), . . . , r 1 (n )} and {r 2 (1), . . . , r 2 (n )} denote marginal distributions-based rank space values, the empirical Copula was then defined as: where u = F x (x), v = F y (y) and 1(. . .) is the indicator function that takes a value of 1 if the argument (. . .) is true) and n being the sample size. For detailed information about Copulas, it is referred to e.g., [73,75].

PCA-Based Similarity Assessment of ET Products
To assess the similarity and rank WB/RS ET products used, a principal component analysis (PCA) was performed. It is a linear algebra-based multivariate procedure that summarizes the information in a dataset. PCA rotates the axes so that the new coordinate system aligns with the orientation of maximum variability available in the dataset. The rotation is a linear transform and the new coordinate axes are then a linear combination of the original axes. This approach allows to discard components that contribute insignificantly to describe the variability in the data. Therefore, insignificant dimensions can be removed from the rotated space (i.e., PCA-transformed space) without affecting relationship of the remaining components. The space is yet linear, but dimensionality of the multivariate dataset is reduced [12,79]. We achieved this by transforming the input ET datasets into a new small set of datasets without losing the most important information in the original data. These new datasets correspond to a linear combination of the originals and are called principal components (PCs). The PCs are ranked in that way that PC1 explains the largest fraction of the variance in a dataset, PC2 the second largest, etc. For a full description on the procedure of PCA technique, it is referred to e.g., [80,81].

Remote-Sensing Derived Evapotranspiration
The grid-based mean annual ET values derived from RS models of MODIS16, PML_V2, and TSEB, together with their probability density estimations (histograms), are shown in Figure 3. It can be seen that in general, the RS ET patterns vary to a large extent. Low values of the MODIS16 ET less than 450 mm/year are spatially limited, and mostly found in central Jutland (see Figure 1b for location of the region), while the rest of the country indicates a consistent pattern with higher ET rates, for which maximum values (>700 mm/year) are observed in southwestern Jutland ( Figure 3a). Likewise, PML_V2 ET low values (<350 mm/year) are mainly observed in central Jutland, and the higher ET values (>450 mm/year) seen in the north and southern parts of Jutland ( Figure 3b). Therefore, MODIS16 and PML_V2 show a very similar ET spatial patterns; however, they vary in terms of absolute values. On the contrary, TSEB indicates a different ET spatial pattern across the country (compared to MODIS16 and PML_V2). The low values (<450 mm/year) are distributed in most regions except for the eastern Jutland, where higher values above 600 mm/year are found. Some regions in the northeast part of Zealand (see Figure 1b for location of the region) also show high values ( Figure 3c). The highest ET difference between the three RS estimates are found over the eastern/western Jutland with a value of approx. ±350 mm/year (not shown). Overall, MODIS16 and PML_V2 represent a relatively uniform spatial pattern with a narrower range of values (440-757 and 186-711 mm/year, respectively) compared to TSEB, which shows a sharp contrast between eastern and western parts of the country with a broader range of values (248-1289 mm/year). This is demonstrated by the underlying distributions of the RS ET histograms, as well ( Figure 3).

Catchment Water Balance ET
The spatial distributions of annual mean precipitation and surface streamflow are shown in Figure 4a,b, respectively. It can be seen that the measured discharge demonstrates a similar pattern as the observed spatial pattern of precipitation, in such a way that both are decreasing from western (P > 900 mm/year and Q > 500 mm/year) to eastern (P < 600 mm/year and Q < 200 mm/year) parts of Denmark. The Q pattern is also captured by the runoff-ratio map (see Appendix B). Typically, the runoff ratio, which is the proportion of P that is not taken up by ET and which eventually generates runoff at the catchment outlet, is controlled to some extent by land surface-based variables such as soil type and soil texture, so that soils containing large fraction of clay infiltrate less water than sandy soils and thus produce higher runoff ratios. However, this is not the case in our study area. In fact, runoff generation in Denmark insignificantly affected by clay fraction, but mainly follows the input precipitation pattern (see Figure 2a for distribution of the clay fraction).

Catchment Water Balance ET
The spatial distributions of annual mean precipitation and surface streamflow are shown in Figure 4a,b, respectively. It can be seen that the measured discharge demonstrates a similar pattern as the observed spatial pattern of precipitation, in such a way that both are decreasing from western (P > 900 mm/year and Q > 500 mm/year) to eastern (P<600 mm/year and Q < 200 mm/year) parts of Denmark. The Q pattern is also captured by the runoff-ratio map (see Appendix A2). Typically, the runoff ratio, which is the proportion of P that is not taken up by ET and which eventually generates runoff at the catchment outlet, is controlled to some extent by land surface-based variables such as soil type and soil texture, so that soils containing large fraction of clay infiltrate less water than sandy soils and thus produce higher runoff ratios. However, this is not the case in our study area. In fact, runoff generation in Denmark insignificantly affected by clay fraction, but mainly follows the input precipitation pattern (see Figure 2a for distribution of the clay fraction).
In addition to P and Q data as routine components for calculating surface WB ET, in this study, we also considered the groundwater boundary flows in order to modify the standard ET calculation (see Equations (1) and (2)). Figure 4c displays the spatial pattern of annual mean values for the groundwater net data simulated by the DK-model. Further, 35% of the 146 sub-catchments show a loosing condition with values ranging from −25 to −200 mm/year; however, only 5% indicate high negative values less than −150 mm/year. On the contrary, 10% (3% > 100 mm/year) of the catchments have a positive net groundwater exchange with values ranging from 25 to 165 mm/year. Overall, most of the subcatchments (55% of all) show a rather small range of values i.e., ±25 mm/year. Interestingly, the larger catchments (>200 km 2 ) generally display lower inflow/outflow values compared to the smaller catchments. This implies that larger basins have a tendency to less errors and uncertainties in terms of water balance closure using the standard WB ET assumption. Similar results were also reported by e.g., [5]. Further discussion about the effect of catchment size on the water balance ET estimates is provided in the follow-up section.  To further investigate and demonstrate the importance of adding the long-term mean GW-net data in the surface water balance ET calculation, we illustrate a two-week-interval timeseries (Figure 5a,b) and long-term monthly mean cumulative distribution of the water balance components (Figure 5c,d) for two selected catchments, A and B (see Figure 1b for the location). Two criteria were applied for the selection of these catchments: i) having full data coverage (13-year), and ii) being extreme cases for both groundwater (GW) inflow (positive) and outflow (negative) conditions. The measured P timeseries shows limited seasonality and inter-annual variability. The measured Q followed P peaks in wintertime for both catchment A and catchment B. In Denmark, high discharge typically occurs during wintertime as a consequence of low potential ET and above average precipitation. It can be seen that temporal (annual/inter-annual) variations in the simulated GW-net timeseries are very low, for both inflow/positive and outflow/negative conditions ( Figure  5a,b, respectively). However, despite low values especially for net inflow (Figure 5a), GWnet data closely follow the seasonality of Q. This implies that, peak discharge values are appropriately corresponded to the high values of GW-net inflow.
In addition, the long-term monthly mean (2002-2014) water balance components of P, Q, GW-net, ET, and ET plus GW-net data used in the modified WB ET calculation (see Equation (2)) for inflow and outflow conditions are shown in Figure 5c,d, respectively. The surface ET estimate, as shown in catchment A, is greatly increased (approx. 17% total) In addition to P and Q data as routine components for calculating surface WB ET, in this study, we also considered the groundwater boundary flows in order to modify the standard ET calculation (see Equations (1) and (2)). Figure 4c displays the spatial pattern of annual mean values for the groundwater net data simulated by the DK-model. Further, 35% of the 146 sub-catchments show a loosing condition with values ranging from −25 to −200 mm/year; however, only 5% indicate high negative values less than −150 mm/year. On the contrary, 10% (3% > 100 mm/year) of the catchments have a positive net groundwater exchange with values ranging from 25 to 165 mm/year. Overall, most of the sub-catchments (55% of all) show a rather small range of values i.e., ±25 mm/year. Interestingly, the larger catchments (>200 km 2 ) generally display lower inflow/outflow values compared to the smaller catchments. This implies that larger basins have a tendency to less errors and uncertainties in terms of water balance closure using the standard WB ET assumption. Similar results were also reported by e.g., [5]. Further discussion about the effect of catchment size on the water balance ET estimates is provided in the follow-up section.
To further investigate and demonstrate the importance of adding the long-term mean GW-net data in the surface water balance ET calculation, we illustrate a two-week-interval timeseries (Figure 5a,b) and long-term monthly mean cumulative distribution of the water balance components (Figure 5c,d) for two selected catchments, A and B (see Figure 1b for the location). Two criteria were applied for the selection of these catchments: (i) having full data coverage (13-year), and (ii) being extreme cases for both groundwater (GW) inflow (positive) and outflow (negative) conditions. The measured P timeseries shows limited seasonality and inter-annual variability. The measured Q followed P peaks in wintertime for both catchment A and catchment B. In Denmark, high discharge typically occurs during wintertime as a consequence of low potential ET and above average precipitation. It can be seen that temporal (annual/inter-annual) variations in the simulated GW-net timeseries are very low, for both inflow/positive and outflow/negative conditions (Figure 5a,b, respectively). However, despite low values especially for net inflow (Figure 5a), GW-net data closely follow the seasonality of Q. This implies that, peak discharge values are appropriately corresponded to the high values of GW-net inflow. To examine the accuracy and robustness of the DK-model, its performance was evaluated for the period 2000-2010 against observations of discharge (Figure 6a-c) and hydraulics head wells (Figure 6d) using multiple statistical metrics across the study area (see Figure 1 for more details). For discharge, most of the catchments indicate a high Nash-Sutcliffe efficiency (NSE > 0.6) ( Figure 6a) and low Pbias < 25% (Figure 6b). The high capability of the DK-model to reproduce discharge is also confirmed by a high values of KGE > 0.7 (Figure 6c). Nevertheless, the model shows some limitations to accurately simulate the surface streamflow especially in smaller catchments, where the lowest/highest efficiency/error is observed; however, these errors/shortcomings can be insignificant as limited to fewer catchments at the national scale of Denmark. With regard to the hydraulics head wells, as shown in Figure 6d, the model shows a reasonable performance also. The majority of the head wells are well reproduced by the model with an approx. 1 m error; however, the error is rather high in some wells (> ±5 m) especially in eastern part of the Jutland, mainly due to a more complex geology and higher elevation gradient (see Figure 1b for location of the region). Overall, the DK-model indicates a good performance to model the surface/subsurface flows, mainly because of an abundant and rich observational input data used in the model setup (see Sections 3.1.1 and 3.2.1 for further details). In addition, the long-term monthly mean (2002-2014) water balance components of P, Q, GW-net, ET, and ET plus GW-net data used in the modified WB ET calculation (see Equation (2)) for inflow and outflow conditions are shown in Figure 5c,d, respectively. The surface ET estimate, as shown in catchment A, is greatly increased (approx. 17% total) after adding GW-net flow to WB ET calculation in inflow condition (green shaded area in Figure 5c). In contrast, it is notably decreased (approx. 21% total) when GW-net is subtracted from the WB ET in catchment B in outflow condition (green shaded area Figure 5d). Therefore, based on data presented in Figure 5, it becomes clear that the WB approach to estimate ET cannot be applied at shorter timescales than the long-term annual mean, as water balance closure cannot be granted at shorter timescales. This, thus highlights that the long-term annual mean GW-net data could have a considerable contribution to accurately estimate water balance ET, especially in small-scale catchments with a complex geology. To see a graphical water balance condition for the catchment A and catchment B, it is referred to Appendix A.
To examine the accuracy and robustness of the DK-model, its performance was evaluated for the period 2000-2010 against observations of discharge (Figure 6a-c) and hydraulics head wells (Figure 6d) using multiple statistical metrics across the study area (see Figure 1 for more details). For discharge, most of the catchments indicate a high Nash-Sutcliffe efficiency (NSE > 0.6) (Figure 6a) and low Pbias < 25% (Figure 6b). The high capability of the DK-model to reproduce discharge is also confirmed by a high values of KGE > 0.7 (Figure 6c). Nevertheless, the model shows some limitations to accurately simulate the surface streamflow especially in smaller catchments, where the lowest/highest efficiency/error is observed; however, these errors/shortcomings can be insignificant as limited to fewer catchments at the national scale of Denmark. With regard to the hydraulics head wells, as shown in Figure 6d, the model shows a reasonable performance also. The majority of the head wells are well reproduced by the model with an approx. 1 m error; however, the error is rather high in some wells (>±5 m) especially in eastern part of the Jutland, mainly due to a more complex geology and higher elevation gradient (see Figure 1b Figure 7 presents the catchment-based annual mean spatial patterns together with the regional estimations of ET datasets from standard WB ET and modified WB ET approaches as well as RS ET models of MODIS16, PML_V2, and TSEB. The spatial distribution of the standard water balance evapotranspiration approach (ET-WBQ), unlike the Q pattern, does not show a clear ET pattern across the study area (Figure 7a). In other words, ET-WBQ indicates a large spatial discrepancy with regards to Q over Denmark (see Figures  2b and 4b). High ET values (>600 mm/year) are mainly distributed in a few small-scale catchments in the central parts of Jutland (see Figure 1b for the location). However, based on the spatial information presented in Figures 1b and 4a, these high values cannot be merely due to high precipitation and underlying topography. In order to shed light into those extreme ET values in some sub-catchments, we modified the standard ET calculation (Equation (1)) through including the net groundwater data (Equation (2)). The spatial pattern of the modified water balance evapotranspiration approach (ET-WBQ-GW) is illustrated in Figure 7b. The result verifies the fact that the ET pattern is smoothed, meaning  Figure 7 presents the catchment-based annual mean spatial patterns together with the regional estimations of ET datasets from standard WB ET and modified WB ET approaches as well as RS ET models of MODIS16, PML_V2, and TSEB. The spatial distribution of the standard water balance evapotranspiration approach (ET-WB Q ), unlike the Q pattern, does not show a clear ET pattern across the study area (Figure 7a). In other words, ET-WB Q indicates a large spatial discrepancy with regards to Q over Denmark (see Figures 2b and 4b).

Spatial Comparisons of WB/RS ET Products
High ET values (>600 mm/year) are mainly distributed in a few small-scale catchments in the central parts of Jutland (see Figure 1b for the location). However, based on the spatial information presented in Figures 1b and 4a, these high values cannot be merely due to high precipitation and underlying topography. In order to shed light into those extreme ET values in some sub-catchments, we modified the standard ET calculation (Equation (1)) through including the net groundwater data (Equation (2)). The spatial pattern of the modified water balance evapotranspiration approach (ET-WB Q-GW ) is illustrated in Figure 7b. The result verifies the fact that the ET pattern is smoothed, meaning that very high (unusual) ET values for some smaller catchments are now modified. However, the DK-model indicates limitation to fully eliminate discrepancies in the ET patterns for all sub-catchments with unusual high and low ET values. Nevertheless, the results (Figure 7a,b) indicate the importance of adding groundwater data for an accurate water balance condition with a focus on ET estimation. This was also highlighted by other researchers e.g., [25,27,28]. The remote sensing MODIS-based ET datasets derived from MODIS16, PML_V2, and TSEB algorithms were evaluated using the modified WB ET. With reference to ET-WBQ-GW, the results (Figure 7c-e) indicate that the three RS ET spatial patterns largely vary compared to the reference ET, and practically, there is almost no consistency and relationship between RS and WB ET maps. Notable differences are mainly observed in eastern Jutland (see Figure 1b for the location) for TSEB with high values above 560 mm/year, whereas MODIS16 and PML_V2 are spatially distributed with values ranging 530-650 mm/year and 250-450 mm/year, respectively across the country. Overall, such spatial discrepancies in the ET datasets reveal high levels of uncertainties, not only for the three RS models, but also for the reference WB approach. However, even though the spatial patterns of catchment-based ET products are largely dissimilar at the national scale, their specific estimates for particular regions are fairly consistent. Compared to ET-WBQ-GW, both MODIS16 and TSEB are in relatively good agreement for some regions in northern/southwestern Jutland and northern/central Zealand (see Figure 1b for the locations). Accordingly, the regional ET spatial patterns also indicate a more similar pattern (regardless of the absolute values) amongst ET datasets when individual catchments (N = 146) were aggregated into a few larger regions (N = 10); this, is more evident between ET-WBQ- GW and MODIS16 (small panels in Figure 7b,c).
Besides the analysis of ET spatial patterns, ET differences (∆ET) between ET-WBQ-GW The remote sensing MODIS-based ET datasets derived from MODIS16, PML_V2, and TSEB algorithms were evaluated using the modified WB ET. With reference to ET-WB Q-GW , the results (Figure 7c-e) indicate that the three RS ET spatial patterns largely vary compared to the reference ET, and practically, there is almost no consistency and relationship between RS and WB ET maps. Notable differences are mainly observed in eastern Jutland (see Figure 1b for the location) for TSEB with high values above 560 mm/year, whereas MODIS16 and PML_V2 are spatially distributed with values ranging 530-650 mm/year and 250-450 mm/year, respectively across the country. Overall, such spatial discrepancies in the ET datasets reveal high levels of uncertainties, not only for the three RS models, but also for the reference WB approach. However, even though the spatial patterns of catchment-based ET products are largely dissimilar at the national scale, their specific estimates for particular regions are fairly consistent. Compared to ET-WB Q-GW , both MODIS16 and TSEB are in relatively good agreement for some regions in northern/southwestern Jutland and northern/central Zealand (see Figure 1b for the locations). Accordingly, the regional ET spatial patterns also indicate a more similar pattern (regardless of the absolute values) amongst ET datasets when individual catchments (N = 146) were aggregated into a few larger regions (N = 10); this, is more evident between ET-WB Q-GW and MODIS16 (small panels in Figure 7b,c).
Besides the analysis of ET spatial patterns, ET differences (∆ET) between ET-WB Q-GW and RS ET models were also investigated. The results are shown in Figure 8. It is observed that spatial distributions of ∆ET are weakly consistent among the three MODISbased ET algorithms. However, number of sub-catchments with very high ∆ET values >−250 mm/year are limited to approx. In addition, 3% of the catchments are mostly observed in central parts of Jutland and Zealand (see Figure 1b for the location). A closer examination revealed that large negative ∆ET is mainly found in the regions with rather high P and Q values (see Figure 4), for which resulted in a lower WB ET rate over the corresponding catchments (see Figure 7b). In contrast, many catchments have rather small ∆ET values ranging between −150 and 150 mm/year (~90% of all). This suggests that, despite a large discrepancy among the ET spatial patterns at the national scale, MODIS-based ET models agree fairly well with ET-WB Q-GW for a majority of the catchments. revealed that large negative ∆ET is mainly found in the regions with rather high P and Q values (see Figure 4), for which resulted in a lower WB ET rate over the corresponding catchments (see Figure 7b). In contrast, many catchments have rather small ∆ET values ranging between −150 and 150 mm/year (~90% of all). This suggests that, despite a large discrepancy among the ET spatial patterns at the national scale, MODIS-based ET models agree fairly well with ET-WBQ-GW for a majority of the catchments. Seemingly, as shown in Figure 8, largest ∆ET either negative biases (RS ET overestimates) or positive biases (RS ET underestimates) are mainly found for smaller catchments. It was further analyzed and illustrated in Figure 9, which ranked catchments by their ∆ET for varying catchment sizes. Interestingly, it turned out that ∆ET magnitudes between RS and WB ET methods are clearly decreased from small catchments below 100 km 2 (>-300 and <250 mm/year) to larger catchments above 200 km 2 (>-150 and <180 mm/year). MODIS16 (Figure 9a), however, shows a clearer decreasing trend (with an overestimated rate) compared to PML_V2 (Figure 9b) and TSEB (Figure 9c). Further, the aggregated regional-scale ∆ET magnitudes are even further decreased (compared to catchment sizebased ∆ET), in particular for MODIS16. This analysis, therefore, highlights that RS MODIS-based ET spatial errors and uncertainties can be much higher for small-scale, energy-limited areas (like over Denmark applied in this study) compared to much larger regions such as over the United States [e.g., 5] or Canada [e.g., 20], which both reported a high reliability of the RS ET products for catchment-based ET estimation at larger catchments. Seemingly, as shown in Figure 8, largest ∆ET either negative biases (RS ET overestimates) or positive biases (RS ET underestimates) are mainly found for smaller catchments. It was further analyzed and illustrated in Figure 9, which ranked catchments by their ∆ET for varying catchment sizes. Interestingly, it turned out that ∆ET magnitudes between RS and WB ET methods are clearly decreased from small catchments below 100 km 2 (>−300 and <250 mm/year) to larger catchments above 200 km 2 (>−150 and <180 mm/year). MODIS16 (Figure 9a), however, shows a clearer decreasing trend (with an overestimated rate) compared to PML_V2 (Figure 9b) and TSEB (Figure 9c). Further, the aggregated regional-scale ∆ET magnitudes are even further decreased (compared to catchment size-based ∆ET), in particular for MODIS16. This analysis, therefore, highlights that RS MODIS-based ET spatial errors and uncertainties can be much higher for small-scale, energy-limited areas (like over Denmark applied in this study) compared to much larger regions such as over the United States e.g., [5] or Canada e.g., [20], which both reported a high reliability of the RS ET products for catchment-based ET estimation at larger catchments.
based ∆ET), in particular for MODIS16. This analysis, therefore, highlights that RS MODIS-based ET spatial errors and uncertainties can be much higher for small-scale, energy-limited areas (like over Denmark applied in this study) compared to much larger regions such as over the United States [e.g., 5] or Canada [e.g., 20], which both reported a high reliability of the RS ET products for catchment-based ET estimation at larger catchments.

Dependence Structures of ET Products: Copula-Based Analysis
The empirical Copula densities amongst the WB ET and RS ET products are illustrated in Figure 10. In general, the RS ET products from MODIS16 ( Figure 10a) PML_V2 ( Figure 10b) and TSEB (Figure 10c) algorithms do not show a significant dependence structure against the modified WB ET. Instead, a nonlinear structure of the joint relationship between the ET products is observed with strongly varying densities in the different percentiles. The Copula density of WB and versus MODIS16 and PML_V2 indicates a complex, but partially positive relationship with multiple densities; in contrast, it is clearly asymmetrical with the highest densities in lower-right tail between the WB and TSEB showing a strong negative correlation. However, unlike catchment-based WB/RS ET dependencies (Figure 10a-c), the grid-based dependence structures of three remote-sensing ET datasets are stronger and clearer (Figure 10d-f). Copula densities between RS ET products are asymmetrical. This implies that, the highest dependency is seen in the lower-left corner, where the low values are found between MODIS16/TSEB ( Figure 10d) and PML_V2/TSEB (Figure 10e). Thus, they are strongly concordant in the lower ranks of the distribution only. Further, a second maximum density can be observed between higher and lower (and upper) ranges of TSEB and MODIS16 (and PML_V2). However, PML_V2 and MODIS16 indicates a significant symmetric dependence structure with maximum densities in the lower-left and upper-right corners. This means that, highest correlation is found at both low and extreme values (Figure 10f).

Dependence Structures of ET Products: Copula-based Analysis
The empirical Copula densities amongst the WB ET and RS ET products are illustrated in Figure 10. In general, the RS ET products from MODIS16 ( Figure 10a) PML_V2 ( Figure 10b) and TSEB (Figure 10c) algorithms do not show a significant dependence structure against the modified WB ET. Instead, a nonlinear structure of the joint relationship between the ET products is observed with strongly varying densities in the different percentiles. The Copula density of WB and versus MODIS16 and PML_V2 indicates a complex, but partially positive relationship with multiple densities; in contrast, it is clearly asymmetrical with the highest densities in lower-right tail between the WB and TSEB showing a strong negative correlation. However, unlike catchment-based WB/RS ET dependencies (Figure 10a-c), the grid-based dependence structures of three remote-sensing ET datasets are stronger and clearer (Figure 10d-f). Copula densities between RS ET products are asymmetrical. This implies that, the highest dependency is seen in the lower-left corner, where the low values are found between MODIS16/TSEB ( Figure 10d) and PML_V2/TSEB (Figure 10e). Thus, they are strongly concordant in the lower ranks of the distribution only. Further, a second maximum density can be observed between higher and lower (and upper) ranges of TSEB and MODIS16 (and PML_V2). However, PML_V2 and MODIS16 indicates a significant symmetric dependence structure with maximum densities in the lower-left and upper-right corners. This means that, highest correlation is found at both low and extreme values (Figure 10f).  . Panels a, b and c indicate the dependence structure pattern of the water balance ET against the three derived remote sensing ET products estimated at catchmentscale (as in Fig. 7); while panels d, e and f indicate the dependence structures amongst the three remote sensing ET estimated at pixel-by-pixel for the entire domain (as in Fig. 3).
Therefore, a complex and nonlinear dependence structure pattern achieved by the Copula analysis amongst WB/RS ET products, may partially explain large discrepancies observed in the ET spatial patterns (Figure 7) applied at the national scale of Denmark. Additionally, Copula-based dependence structures between the ET estimates against Figure 10. The empirical Copula densities between the satellite RS ET products and surface WB ET estimate. The sample size is 146 (20,654) data tuples in (a-c) in (d-f) panels for the ET datasets. Panels (a-c) indicate the dependence structure pattern of the water balance ET against the three derived remote sensing ET products estimated at catchment-scale (as in Figure 7); while panels (d-f) indicate the dependence structures amongst the three remote sensing ET estimated at pixel-by-pixel for the entire domain (as in Figure 3). Therefore, a complex and nonlinear dependence structure pattern achieved by the Copula analysis amongst WB/RS ET products, may partially explain large discrepancies observed in the ET spatial patterns (Figure 7) applied at the national scale of Denmark. Additionally, Copula-based dependence structures between the ET estimates against landsurface/climate variables were estimated also. Overall, the WB/RS ET products indicated a stronger correlation and clearer dependence structure pattern to the land-surface variables (i.e., clay fraction, LST, and LAI) compared to the meteorological variables (i.e., precipitation, air temperature, solar radiation, and wind velocity) over Denmark with a predominantly energy-limited environment. A further detailed description is provided in Appendix B.

Similarity Assessment of ET Products: PCA-Based Analysis
To assess the similarity ranking amongst the ET estimates, we performed a PCA analysis for both grid-based RS ET (Figure 11a) and catchment-based WB/RS ET (Figure 11b) products. The three remote-sensing ET datasets from MODIS16, PML_V2, and TSEB algorithms are positively correlated to PC1. This implies that, in general, the RS ET products show some degrees of similarity. However, MODIS16 and PML_V2 indicates a closer spatial pattern compared to TSEB; this becomes more apparent in PC2 (Figure 11a). The catchment-based PCA results also show that the water balance and three MODIS-based ET products, as illustrated in Figure 11b, differ to a large extent. Nevertheless, WB ET and MODIS16 ET estimates, unlike PML_V2 and TSEB, shows a closer spatial pattern. In both PC1 and PC2, TSEB shows a strong negative correlation to WB, indicating a higher spatial discrepancy when compared to MODIS16, PML_V2, and WB ET spatial maps (see Figure 7). These results are in good agreement with those captured by the Copula analysis (see Figure 10). products. The three remote-sensing ET datasets from MODIS16, PML_V2, and TSEB algorithms are positively correlated to PC1. This implies that, in general, the RS ET products show some degrees of similarity. However, MODIS16 and PML_V2 indicates a closer spatial pattern compared to TSEB; this becomes more apparent in PC2 (Figure 11a). The catchment-based PCA results also show that the water balance and three MODIS-based ET products, as illustrated in Figure 11b, differ to a large extent. Nevertheless, WB ET and MODIS16 ET estimates, unlike PML_V2 and TSEB, shows a closer spatial pattern. In both PC1 and PC2, TSEB shows a strong negative correlation to WB, indicating a higher spatial discrepancy when compared to MODIS16, PML_V2, and WB ET spatial maps (see Figure  7). These results are in good agreement with those captured by the Copula analysis (see Figure 10).

ET Uncertainties
For calculation of the surface water balance ET, we used both standard (ET = P − Q) and modified (ET = P -Q + GWnet) approaches. Generally, the errors associated with precipitation P and runoff Q may have large effects on WB ET estimates [5]. In this study, gridded daily P data with a 10 km spatial resolution [58] was used. The daily station series was quality tested by the Danish Meteorological Institute (DMI), however, no testing for homogeneity has been performed on P daily observations [82]. Additionally, discharge stations used are part of the Danish river monitoring network. Although this network covers Denmark appropriately and its monthly runoff data are qualified, it is not without limitations. For example, the streambed profile of the streams at some stations in central Jutland (see Figure 1 for the location) are considered unstable due to a significant growth of vegetation in summertime [83]. Therefore, these, might be some sources of error in in situ observations.
The annual mean WB ET estimated based on the modified approach (ET-WBQ-GW) (Figure 7b) reduces the spatial variability, seemingly improving the realism of the spatial pattern compared to the standard procedure (ET-WBQ) (Figure 7a), due to the inclusion of GW-net in the ET calculation. This accounting for GW-net, separates this study from the

ET Uncertainties
For calculation of the surface water balance ET, we used both standard (ET = P − Q) and modified (ET = P − Q + GW net ) approaches. Generally, the errors associated with precipitation P and runoff Q may have large effects on WB ET estimates [5]. In this study, gridded daily P data with a 10 km spatial resolution [58] was used. The daily station series was quality tested by the Danish Meteorological Institute (DMI), however, no testing for homogeneity has been performed on P daily observations [82]. Additionally, discharge stations used are part of the Danish river monitoring network. Although this network covers Denmark appropriately and its monthly runoff data are qualified, it is not without limitations. For example, the streambed profile of the streams at some stations in central Jutland (see Figure 1 for the location) are considered unstable due to a significant growth of vegetation in summertime [83]. Therefore, these, might be some sources of error in in situ observations. The annual mean WB ET estimated based on the modified approach (ET-WB Q-GW ) (Figure 7b) reduces the spatial variability, seemingly improving the realism of the spatial pattern compared to the standard procedure (ET-WB Q ) (Figure 7a), due to the inclusion of GW-net in the ET calculation. This accounting for GW-net, separates this study from the previous studies applying WB ET calculation at the catchments scale while neglecting GWnet by treating it as a source of uncertainty e.g., [5,17,55]. However, despite being included herein, the modified WB ET still shows a heterogeneous spatial pattern across the study area with extreme variability among neighboring catchments. This indicates that even when accounting for GW-net estimates from a calibrated regional/national groundwater flow model, the uncertainties associated with the WB ET method for relatively small catchments is still significant. Additionally, estimates of GW-net from the DK-model simulations are subject to error and uncertainty and as a consequence, an unrealistic spatial pattern of ET-WB Q-GW remained (Figure 7b). Hence, we speculate that errors have not been fully eliminated in the ET-WB Q-GW approach and that there is a limitation to the WB ET approach that increases with decreasing catchment size.
The uncertainty associated with remote-sensing ET estimates is generally difficult to identify due to different input data [84] leading to large uncertainties in forcing and ancillary data [85], different algorithms and methods for ET estimation [35], different parameterizations [5], and different spatiotemporal resolution of the input dataset [86]. However, despite the fact that all the three RS MODIS16, PML_V2, and TSEB ET algorithms used in this study are based on MODIS data (see Table 1 for details), they largely differ not only in their absolute values, but also spatial patterns at the national scale of Denmark. As a result, a large spatial discrepancy was observed between the WB ET and RS ET models applied ( Figure 7); this was also highlighted by the Copula analysis, of which a different underlying dependence structure was captured between the ET products ( Figure 10). These findings separate this study from most previous studies of similar nature, where typically strong correlation coefficients (best fit) for RS/WB ET datasets were reported, most probably due to the large spatial variability in climate, elevation, aridity etc. resulting from continental scale applications.
In addition, the catchment-scale errors in MODIS16, PML_V2, and TSEB ET estimates with respect to ET-WB Q-GW was found to be on average 18.7% (ranged between 0.04% and 147%), 20.4% (ranged between 0.56% and 72%), and 12.8% (ranged between 0.27% and 142%), respectively. Moreover, catchment-size and regional analyses indicated that errors and uncertainties of WB ET against RS ET methods are clearly decreased for larger catchments and regions (Figure 9), which was also reported by other researchers e.g., [5,20]. Therefore, these indicate that application of remote-sensing ET products, even from similar algorithms, could be challenging over small-scale, high-latitude, and energy-limited regions like Denmark applied in this study. Thus, satellite RS data may need a bias correction to be performed prior to application over such regions. As indicated by [5], in dry regions, ET is generally controlled by water availability as most input precipitation on the surface is converted to ET and only a small portion generates runoff, thus ET models can capture the spatiotemporal variability more accurately; by contrast, in humid areas ET is mainly controlled by energy and vegetation; and thus spatiotemporal variability of ET is lower and more difficult to capture. Overall, the spatial pattern of MODIS16 ET, as captured by the PCA analysis (Figure 11b), is closer to the reference water balance ET pattern as compared to the PML_V2 and TSEB patterns over Denmark.

Conclusions
The main conclusions of this research can be summarized as follows: 1.
The ET-WB Q spatial patterns indicated a significant inconsistency over Denmark, with an energy-limited environment; however, it was improved especially for smaller catchments, when GW-net data was included in the ET-WB Q-GW estimate. Nevertheless, there is still challenges in applying the WB ET approach to small regions.

2.
The TSEB, MODIS16, and PML_V2 ET estimates varied largely compared to ET-WB Q-GW ; as a result, a large discrepancy was observed amongst the ET products at the national scale of Denmark. However, many catchments had ∆ET < ±150 mm/year. Regional analysis also indicated that RS ET uncertainties decrease with increasing catchment size. 3.
The Copula analysis captured a nonlinear structure amongst the ET products with multiple densities (either in lower or upper ranks), showing a complex relationship between not only the RS ET datasets, but also between ET-WB Q-GW and MODIS-based ET products. The ET-WB Q-GW and MODIS16 ET showed a closer spatial pattern as identified by PCA analysis, indicating a good potential for future applications over the study area.

4.
Finally, it is recommended that for WB ET calculation, future studies should incorporate GW-net from surrogate hydrological models if available. In addition, Copula approach can be considered in other catchments worldwide for estimating true relationships amongst water balance (or energy balance) components in addition to traditional statistical analyses.  for catchment A ( Figure A1a) and catchment B ( Figure A1b) in the extreme positive and negative boundary flows conditions, respectively; see Figure 1b for the location of the catchments. The interrelationships between the various water balance components, in particular, the intercatchment boundary flows in the unsaturated zone UZ and saturated zone SZ are clearly visualized ( Figure A1). This highlights the importance of add of net intercatchment groundwater flow (GW-net) in the long-term annual mean water balance ET calculation, unlike to UZ/SZ storage changes that can be negligible. The runoff ratio map, which was calculated by dividing discharge ( Figure A2b) to precipitation ( Figure A2a) is shown in Figure A2d. It can be seen that the catchment-based spatial patterns of the calculated runoff ratio and the measured discharge are very similar. This indicates that the surface streamflow is mainly driven by the input precipitation pattern ( Figure A2a), but not much by the clay fraction content (see Figure 2a) over Denmark. It is noted that, Figures A2a-c are again represented here for convenience's sake, when all directly compared with the runoff ratio map.

B1. The Grid-Based RS ET Dependence Structures
The empirical Copula densities of RS ET datasets and meteorological variables are illustrated in Figure A3. A nonlinear structure of the joint relationship for TSEB ( Figure  A3a), MODIS16 ( Figure A3b), and PML_V2 ( Figure A3c), in relation to Prec can be observed with multiple densities. In all cases, the highest densities are found neither in upper-tail nor lower-tail corners, but in different percentiles of ET distributions. This indicates a complex dependence structure exists between the MODIS-based RS ET datasets and Prec over Denmark with an energy-limited environment. Copula densities for MODIS16 against AirT are almost asymmetrical ( Figure A3e). This implies that the highest density function is seen in the upper-right (lower-left) corner, where the extreme (very The empirical Copula densities of RS ET datasets and meteorological variables are illustrated in Figure A3. A nonlinear structure of the joint relationship for TSEB ( Figure A3a), MODIS16 ( Figure A3b), and PML_V2 ( Figure A3c), in relation to Prec can be observed with multiple densities. In all cases, the highest densities are found neither in upper-tail nor lower-tail corners, but in different percentiles of ET distributions. This indicates a complex dependence structure exists between the MODIS-based RS ET datasets and Prec over Denmark with an energy-limited environment. Copula densities for MODIS16 against AirT are almost asymmetrical ( Figure A3e). This implies that the highest density function is seen in the upper-right (lower-left) corner, where the extreme (very low) values are found indicating a positive correlation. In contrast, TSEB and PML_V2 show a nonlinear, negative dependence structures against AirT ( Figure A3d,f). Additionally, Rs and WindS variables represent a very similar pattern against the three MODIS-based ET products. This means that, PML_V2 ( Figure A3i,l) and especially MODIS16 ( Figure A3h,k) indicate a positive correlation against Rs and WindS; however, the Copula densities between TSEB versus these variables are negative, in particular for WindS ( Figure A3j). As shown in Figure A4a-c, the empirical Copula density of MODIS-based ET against ClayF is asymmetrical (except for PML_V2) with the highest density in the lower-left corner (low values) for all three RS ET products. It indicates that they are strongly concordant in the lower ranks of the distribution; therefore, the three MODIS-based ET estimates (especially TSEB) are mainly affected by low values of ClayF over Denmark. The Copula density of TSEB ( Figure A4d), MODIS16 ( Figure A4e), and PML_V2 ( Figure A4f) against LAI, unlike to ClayF, indicates a significant (positive) symmetric dependence structure with maximum densities in the lower-left and upper-right corners. This means that, highest correlation is found at the very low and extreme values. Amongst the land-surface/meteo-forcing variables applied, LST represents the most significant symmetrical dependence structure with TSEB ( Figure A4g), meaning that the highest correlation is seen at lower-right and upper-left corners, indicating a strong negative correlation. However, the density function for MODIS16 and PML_V2 versus LST is mainly asymmetric with maximum dependency in the upper-left corner, as shown in Figure A4h,i. These dependencies Figure A3. Grid-based empirical Copula densities between the remote-sensing ET datasets (TSEB ET: left panels, MODIS16 ET: middle panels, and PML_V2 ET: right panels) and meteorological variables of precipitation Prec, air temperature AirT, global shortwave solar radiation Rs, and wind speed WindS. The sample size is 19,712 data tuples for all the variables. In all the panels (a-l), remote sensing ET (y-axis) are plotted against meteorological variables (x-axis).
As shown in Figure A4a-c, the empirical Copula density of MODIS-based ET against ClayF is asymmetrical (except for PML_V2) with the highest density in the lower-left corner (low values) for all three RS ET products. It indicates that they are strongly concordant in the lower ranks of the distribution; therefore, the three MODIS-based ET estimates (especially TSEB) are mainly affected by low values of ClayF over Denmark. The Copula density of TSEB ( Figure A4d), MODIS16 ( Figure A4e), and PML_V2 ( Figure A4f) against LAI, unlike to ClayF, indicates a significant (positive) symmetric dependence structure with maximum densities in the lower-left and upper-right corners. This means that, highest correlation is found at the very low and extreme values. Amongst the land-surface/meteoforcing variables applied, LST represents the most significant symmetrical dependence structure with TSEB ( Figure A4g), meaning that the highest correlation is seen at lowerright and upper-left corners, indicating a strong negative correlation. However, the density function for MODIS16 and PML_V2 versus LST is mainly asymmetric with maximum dependency in the upper-left corner, as shown in Figure A4h,i. These dependencies are in accordance with the RS ET models for MODIS16, PML_V2, and TSEB, which all are driven by LAI, but only TSEB includes LST.
Remote Sens. 2021, 13, x FOR PEER REVIEW 21 of 25 Figure A4. Same as in Figure A3, but for the land-surface variables of clay fraction ClayF (top panels), leaf area index LAI (middle panels), and land surface temperature LST (bottom panels). In all the panels (a-i), remote sensing ET (y-axis) are plotted against land-surface variables (x-axis).

B2. The Catchment-Based WB ET Dependence Structures
The empirical Copula densities of WB ET against land-surface and meteorological variables are illustrated in Figure A5. The density function for WB ET and Prec is negatively asymmetrical. This implies that the maximum dependency is seen in the lower-right tail, where the extreme (minimum) values of Prec (WB ET) are found; however, a second maximum can be also observed in the approx. upper-left corner ( Figure A5a). As shown in Figure A5b,c, a nonlinear structure of the joint relationship for both AirT/Rs against WB ET is observed, with strongly varying densities in the different percentiles. The highest densities are mainly found for WB ET lower/higher values in the lower-left corners, whereas the upper-tails do not show a significant dependence structure in both AirT and Rs. This means that the highest WB ET values are not-or insignificantly affected by these variables. The distribution of WB ET against WindS is asymmetrical, with the highest density in the upper-left tail. It conveys that higher ranks of WB ET distribution are strongly concordant with the lowest WindS values, indicating a strong negative correlation. However, a weaker concordance is also observed in the lower-and middle range of values ( Figure A5d).
The empirical Copula densities of WB ET against land-surface variables are also estimated. The density function of WB ET and ClayF is largely asymmetrical with the highest correlation in the upper-right tail, where the extreme values are found ( Figure A5e). The Copula density between WB ET and LAI shows a strong symmetric dependence structure. The highest density is found for the upper-right tail, and a second maximum in the lower-left corner ( Figure A5f). This conveys that they show the highest (positive) correlations for both the extreme-and low values, respectively. Nevertheless, an insignificant negative relationship can be also seen for the middle range of values. The dependence structure between WB ET versus LST indicates that the highest Copula densities are found in the maximum-and middle range of WB ET, which is corresponded to the middle Figure A4. Same as in Figure A3, but for the land-surface variables of clay fraction ClayF (top panels), leaf area index LAI (middle panels), and land surface temperature LST (bottom panels). In all the panels (a-i), remote sensing ET (y-axis) are plotted against land-surface variables (x-axis).

Appendix B.2. The Catchment-Based WB ET Dependence Structures
The empirical Copula densities of WB ET against land-surface and meteorological variables are illustrated in Figure A5. The density function for WB ET and Prec is negatively asymmetrical. This implies that the maximum dependency is seen in the lower-right tail, where the extreme (minimum) values of Prec (WB ET) are found; however, a second maximum can be also observed in the approx. upper-left corner ( Figure A5a). As shown in Figure A5b,c, a nonlinear structure of the joint relationship for both AirT/Rs against WB ET is observed, with strongly varying densities in the different percentiles. The highest densities are mainly found for WB ET lower/higher values in the lower-left corners, whereas the upper-tails do not show a significant dependence structure in both AirT and Rs. This means that the highest WB ET values are not-or insignificantly affected by these variables. The distribution of WB ET against WindS is asymmetrical, with the highest density in the upper-left tail. It conveys that higher ranks of WB ET distribution are strongly concordant with the lowest WindS values, indicating a strong negative correlation. However, a weaker concordance is also observed in the lower-and middle range of values ( Figure A5d).
The empirical Copula densities of WB ET against land-surface variables are also estimated. The density function of WB ET and ClayF is largely asymmetrical with the highest correlation in the upper-right tail, where the extreme values are found ( Figure A5e). The Copula density between WB ET and LAI shows a strong symmetric dependence structure. The highest density is found for the upper-right tail, and a second maximum in the lowerleft corner ( Figure A5f). This conveys that they show the highest (positive) correlations for both the extreme-and low values, respectively. Nevertheless, an insignificant negative relationship can be also seen for the middle range of values. The dependence structure between WB ET versus LST indicates that the highest Copula densities are found in the maximum-and middle range of WB ET, which is corresponded to the middle range-and minimum values of LST, respectively. However, a third maximum density can be also observed for LST, indicating a negative correlation with WB ET ( Figure A5g).
Remote Sens. 2021, 13, x FOR PEER REVIEW 22 of 25 Figure A5. Catchment-based empirical Copula densities between the surface water balance ET estimate and meteorological variables (left and middle panels) and land-surface variables (right panels). The datasets are the same as in Figures A3 and A4. The sample size is 146 data tuples for all the variables. In all the panels (a-g), water balance ET (y-axis) are plotted against meteorological and land-surface variables (x-axis). Figure A5. Catchment-based empirical Copula densities between the surface water balance ET estimate and meteorological variables (left and middle panels) and land-surface variables (right panels). The datasets are the same as in Figures A3 and A4. The sample size is 146 data tuples for all the variables. In all the panels (a-g), water balance ET (y-axis) are plotted against meteorological and land-surface variables (x-axis).