Application of Copula Functions for Rainfall Interception Modelling

Rainfall interception is an important process of the water cycle that can have significant influence on surface runoff and groundwater storage. Since rainfall interception measurements are rare and time consuming, rainfall interception estimation can be made indirectly using different meteorological variables. Experimental data of rainfall interception for birch and pine trees was measured at an experimental plot located in an urban area of Ljubljana, Slovenia in this study. A copula model was applied to predict the rainfall interception using meteorological variables, namely air temperature and vapour pressure deficit data. The copula model performance was compared to some other models such as decision trees, multiple linear regressions, and exponential functions. Using random sampling, we found that the copula model where Khoudraji-Liebscher copula functions were used yielded slightly smaller root mean square error (RMSE) and mean absolute error (MAE) values than other tested methods (i.e., RMSE and MAE results for birch trees were 24.2% and 18.2%, respectively and RMSE and MAE results for pine trees were 25.0% and 19.6%, respectively). The results demonstrate that the copula-based proposed method and other tested models could be used for the prediction of rainfall interception at the considered plot and in the wider surroundings. Furthermore, these models could also be applied for the prediction of rainfall interception for these two tree species in other locations under similar vegetation and meteorological conditions.


Introduction
Rainfall interception by vegetation is recognized as an important process of the hydrological cycle by researchers worldwide, influencing surface runoff in a great manner, irrespective of whether they are conducted in natural or urban forests (e.g., [1][2][3][4][5][6]).Rainfall partitioning can be divided into three main components, namely throughfall, stemflow, and interception loss.Throughfall is a portion of precipitation that either reaches the ground directly without touching the canopy or reaches the ground at a later time by dripping from the saturated canopy.Stemflow is a portion of intercepted precipitation that reaches the ground by flowing down the branches and stem.Interception loss is a portion of rainfall that is intercepted by the vegetation and evaporated back into the atmosphere, never reaching the ground.The process of rainfall partitioning is influenced by different meteorological (e.g., rainfall amount, intensity, raindrop size, raindrop velocity, air temperature, air humidity, wind speed) and vegetation variables (e.g., tree type, canopy characteristics, stem roughness, leaf area index) [4,[7][8][9][10][11][12][13][14].Rainfall amount, duration and intensity are often recognised as the most influential meteorological variables since they decrease rainfall interception [4,[14][15][16][17].However, throughfall and stemflow, both of which also decrease rainfall interception, continue after the end of the rainfall event [13,18].In this extended period, the evapotranspiration, highly influenced by air humidity and air temperature, plays a significant role in the rainfall partitioning process [4,15,19].
Rainfall interception measurements are quite complicated and time consuming as they take years of fieldwork [14,20,21].However, meteorological variables such as air temperature and relative air humidity measurements that are needed to calculate the vapour pressure deficit are usually available from national meteorological stations, or they can be measured using fairly simple and cheap equipment.Thus, from a practical point of view, a model that can be used to predict rainfall interception loss for specific vegetation and climatic conditions using only climatic data is needed.However, at least some experimental measurements are needed to construct the model.More specifically, all vegetation periods (e.g., at least one leafed and one leafless period) should be recorded in order to detect the seasonal variability in rainfall interception.Different statistical methods (e.g., [9,12,17,22]) and other model types (e.g., [4,[23][24][25][26]) have already been used and developed to evaluate the influence of different variables on rainfall interception, and also to predict it.Some models are quite complex, requiring a large number of parameters.For example, the Gash interception model [27] requires parameters such as canopy cover fraction, canopy storage capacity, rainfall necessary to saturate the canopy and others (e.g., [4,25,27]).Moreover, as input parameters, throughfall and stemflow data need to be considered (e.g., [4,25,27]).Furthermore, the Gash model [27] was developed based on the Rutter model [28] and requires less data than the initial model version [25].Similarly, the two-layer stochastic model requires estimation of several parameters and consideration of additional input variables [23,24].To simplify rainfall interception evaluation and modelling, we decided to try copula models, which were not yet applied to the rainfall partitioning data.
De Michele and Salvadori [29] were the first to introduce copula functions for applications in hydrology.Since then, the number of papers that apply copula functions to different hydrological problems has increased.For example, a Web of Science database search using the "copula" topic in the Water Resources category indicates that in 2016 and 2017, more than 70 papers per year were published including this keyword, while in 2010, only 29 papers were published.Salvadori and De Michele were first to carry out multivariate frequency analysis via copula functions [30] and different multivariate design strategies were first discussed by Salvadori et al. [31].Moreover, copula functions have been used several times for rainfall analysis and modelling.Some examples of this include: for rainfall frequency analysis in order to estimate reliable design rainfall (e.g., [32][33][34][35][36][37]), for disaggregation of rainfall data (e.g., [38,39]), to construct intensity-duration-frequency (IDF) curves for different purposes (e.g., [40][41][42]), for rainfall generation and modelling (e.g., [43][44][45]), for drought analyses and characterisation of drought properties (e.g., [46][47][48][49][50][51][52][53][54][55][56][57]), and for other rainfall related analyses (e.g., [58][59][60][61][62][63][64]).However, one should bear in mind that research dealing with copula functions is also very intense in the Statistics Probability category (e.g., more than 250 papers were published in 2017 in this category according to the Web of Science database).This means that the topic is also relevant from the statistical and mathematical perspective, and that consequently, with the development of the research area, we can expect even more applications of copulas in the future.
The main aim of this study was to compare the performance of the copula model for the estimation of the rainfall interception loss using air temperature (T) and vapour pressure deficit (VPD) data with some other estimation techniques, namely exponential function, multiple linear regression, and decision trees.The idea was not to use rainfall measurements in the model construction, but rather other meteorological variables because in some cases, rainfall data is not available and relative humidity and air temperature can also be measured using easily available digital sensors, which means that fieldwork is not needed.The performance of the above-mentioned models is evaluated using 175 rainfall events from an experimental plot in Slovenia where birch and pine trees are present.The specific aims of the study are as follows: (i) to fit the copula and some other models to the data; (ii) to compare the performance of tested methods and (iii) to compare the evaluation results with respect to two different tree species (birch and pine tree).

Rainfall Partitioning Measurements
Rainfall in the open, throughfall and stemflow were measured in a park in the city of Ljubljana, Slovenia (46.04 • N, 14.49 • E) [13,14,65].The typical climate for the area is temperate continental with well-defined seasons.The mean long-term (1986-2016) total annual rainfall is approximately 1380 mm and the average annual temperature is 11 • C, which varies between -3 • C in the winter and 24 • C in the summer [66].The study plot has an area of approximately 600 m 2 and is covered with grass (Figure 1).In the western part of the plot, a group of birch trees (Betula pendula Roth) and a group of pine trees (Pinus nigra Arnold) grow.There is no overlapping between the two groups of trees.Measurements of interception by two different tree species in our study were conducted in an urban area.Therefore, the basic research on rainfall interception was focused on the benefits of urban trees as part of green infrastructure [14,67].One of the benefits of such urban trees is the storm water reduction [67].Although trees have an important role in regulating urbanised hydrology systems, there is still little known about the best trees to choose.Since, in addition to meteorological variables, vegetation variables also have a large influence on the interception, each new research on additional tree species is an important contribution to the field.
The height of birch trees is on average 15.7 m (±1.0 m) with a total projected crown area of 42.3 m 2 and an average diameter at breast height of 17.9 cm.The pine trees are on average 12.6 m (±0.6 m) high, have a total projected crown area of 22.7 m 2 and an average diameter at breast height of 19.0 cm (±2.3 cm).The birch tree canopy is characterized by upwards branch inclination (51 • on average from the stem to the branch) and a leaf area index (LAI) of 0.8 in leafless and 2.3 in the leafed period (measured with a LAI-2200 plant canopy analyser).The pine tree branches are inclined downwards (on average 98 • from the stem to the branch) and the LAI values are on average equal to 3.8 (±0.7).Birch trees have smoother bark with a storage capacity of 0.7 mm, while the bark of pine trees is rougher and more absorptive, with a 3.5 mm bark storage capacity [14].
Water 2018, 10, x FOR PEER REVIEW 3 of 17

Rainfall Partitioning Measurements
Rainfall in the open, throughfall and stemflow were measured in a park in the city of Ljubljana, Slovenia (46.04°N, 14.49° E) [13,14,65].The typical climate for the area is temperate continental with well-defined seasons.The mean long-term (1986-2016) total annual rainfall is approximately 1380 mm and the average annual temperature is 11 °C, which varies between -3 °C in the winter and 24 °C in the summer [66].The study plot has an area of approximately 600 m 2 and is covered with grass (Figure 1).In the western part of the plot, a group of birch trees (Betula pendula Roth) and a group of pine trees (Pinus nigra Arnold) grow.There is no overlapping between the two groups of trees.Measurements of interception by two different tree species in our study were conducted in an urban area.Therefore, the basic research on rainfall interception was focused on the benefits of urban trees as part of green infrastructure [14,67].One of the benefits of such urban trees is the storm water reduction [67].Although trees have an important role in regulating urbanised hydrology systems, there is still little known about the best trees to choose.Since, in addition to meteorological variables, vegetation variables also have a large influence on the interception, each new research on additional tree species is an important contribution to the field.
The height of birch trees is on average 15.7 m (±1.0 m) with a total projected crown area of 42.3 m 2 and an average diameter at breast height of 17.9 cm.The pine trees are on average 12.6 m (±0.6 m) high, have a total projected crown area of 22.7 m 2 and an average diameter at breast height of 19.0 cm (±2.3 cm).The birch tree canopy is characterized by upwards branch inclination (51° on average from the stem to the branch) and a leaf area index (LAI) of 0.8 in leafless and 2.3 in the leafed period (measured with a LAI-2200 plant canopy analyser).The pine tree branches are inclined downwards (on average 98° from the stem to the branch) and the LAI values are on average equal to 3.8 (±0.7).Birch trees have smoother bark with a storage capacity of 0.7 mm, while the bark of pine trees is rougher and more absorptive, with a 3.5 mm bark storage capacity [14].The measurements were conducted from 1 January 2014 to 30 June 2017.The rainfall in the open was measured with a tipping bucket (0.2 mm/tip) rain gauge (Onset RG2-M) with an automatic data logger (Onset HOBO Event).The equipment was located in the clearing at the study plot, 10 m from the nearest tree.In the observed period, a total of 413 rainfall events were detected.After further analysis, 175 rainfall events were taken into account, as snow events and events with incomplete data were excluded.
Throughfall was measured with through-and funnel-type gauges in order to consider throughfall spatial variability (Figure 1).Under each group of trees, two through gauges (catch area of 7.5 m 2 ) were positioned from the tree stem towards the edge of the canopy.One was connected to a tipping bucket flow gauge (Unidata 6506G; 50 mL/tip) and an automatic data logger (Onset HOBO Event), while the other was connected to a manually-read polyethylene barrel (60 L capacity), collected after each event.Under each group of trees, ten manually-read funnel-type gauges (catch area of 78.5 cm 2 ) were also randomly placed and moved around during the three and half years of measurements [65].
Stemflow was measured for one representative tree in each group.Around the stem, a halved rubber hose was spirally wrapped, capturing the water flowing down the stem.In the case of pine trees, it was collected in manually-read 1.5 L container, while in the case of birch trees, the rubber hose was connected to a tipping bucket (Onset RG2-M, 0.2 mm/tip) with an automatic data logger (Onset HOBO Event).The contribution area of the tree canopy was used to convert stemflow volume to its depth [17,68,69].

Meteorological Variables
According to the rainfall and throughfall measurements with an automatic data logger at the study plot, rainfall amount, duration, and intensity were determined for each rainfall event.Events were separated by at least a 4 h dry period.Based on the defined beginning and end of the rainfall event, the minimum air temperature (T MIN ), maximum air temperature (T MAX ), average air temperature (T) and average relative humidity (RH) per event were also calculated.We have used the half-hour data (air temperature and relative humidity) measured at the location of the Ljubljana-Bežigrad meteorological station in the Ljubljana city area [66], located 3 km from the study plot.According to the Slovenian Environment Agency, its measurements are representative for the whole city and its neighborhood due to its special location in the Ljubljana basin [70].The FAO Penman-Monteith methodology was applied for the vapour pressure deficit (VPD) calculations [71].The following equations were used to derive the vapour pressure deficit (VPD), calculated as e s -e a (e s is saturation vapour pressure and e a is actual vapour pressure) using the minimum air temperature (T MIN ), maximum air temperature (T MAX ) and average relative humidity (RH) data derived using half-hourly data from the beginning to the end of each rainfall event: e a = e s × RH/100 (5) Figure 2 shows the relationship between air temperature, vapour pressure deficit and interception loss for birch (Ib) and pine (Ip) trees for the 175 events that were analysed in this study.Events measured in the leafless and leafed period are indicated with red and black colour, respectively.About 20% of all considered events were measured in the leafless period, and 80% in the leafed period.The correlation between Ib and T, and Ip and T, is similar for the leafless and leafed periods.On the other hand, the correlation between Ib and VPD, and Ip and VPD, is higher for events measured in the leafless period than for events measured in the leafed period (Figure 2).Furthermore, correlation between T and VPD was higher during the events that were measured in the leafed period compared to the events measured in the leafless period (Figure 2).
Water 2018, 10, x FOR PEER REVIEW 5 of 17 Events measured in the leafless and leafed period are indicated with red and black colour, respectively.About 20% of all considered events were measured in the leafless period, and 80% in the leafed period.The correlation between Ib and T, and Ip and T, is similar for the leafless and leafed periods.On the other hand, the correlation between Ib and VPD, and Ip and VPD, is higher for events measured in the leafless period than for events measured in the leafed period (Figure 2).Furthermore, correlation between T and VPD was higher during the events that were measured in the leafed period compared to the events measured in the leafless period (Figure 2).

Copulas
Before fitting the copula model, the independence of consecutive events was examined.The Ljung-Box test was used for this purpose, whereas the Box.test function that is implemented in the R software was used [72].Copulas are functions that connect joint multivariate probability distributions with univariate marginal distribution functions [73].A detailed description of copula functions in general and their main characteristics is available in [74][75][76][77].The definition of the Khoudraji-Liebscher copula functions applied in this study can be found in [78,79].These copulas were first introduced and discussed in hydrology by [80,81], and are applied in this study based on the conclusions made by Bezak et al. [82], who have applied Khoudraji-Liebscher copula functions for the estimation of suspended sediment loads.The symmetric Archimedean copulas C1 and C2 were used (Equation ( 6)) for the estimation of the interception loss based on the vapour pressure deficit (VPD) and air temperature (T) data for the birch and pine trees, respectively.The suitability of this copula function was tested using the procedure described below, and several tests results are

Copulas
Before fitting the copula model, the independence of consecutive events was examined.The Ljung-Box test was used for this purpose, whereas the Box.test function that is implemented in the R software was used [72].Copulas are functions that connect joint multivariate probability distributions with univariate marginal distribution functions [73].A detailed description of copula functions in general and their main characteristics is available in [74][75][76][77].The definition of the Khoudraji-Liebscher copula functions applied in this study can be found in [78,79].These copulas were first introduced and discussed in hydrology by [80,81], and are applied in this study based on the conclusions made by Bezak et al. [82], who have applied Khoudraji-Liebscher copula functions for the estimation of suspended sediment loads.The symmetric Archimedean copulas C 1 and C 2 were used (Equation ( 6)) for the estimation of the interception loss based on the vapour pressure deficit (VPD) and air temperature (T) data for the birch and pine trees, respectively.The suitability of this copula function was tested using the procedure described below, and several tests results are presented in Section 3.2.
For the three-dimensional case study that was analysed here, the cumulative distribution function is as follows [78,79]: where α 1 , α 2 , α 3 ∈ [0, 1] are shape parameters, C 1 and C 2 are the two copula functions, and θ 1 and θ 2 are the parameters of these two copula functions (u 1 = F VPD (VPD), u 2 = F I (I) and u 3 = F T (T)).
The parameters of the copula functions were estimated using the maximum pseudo-likelihood method [83,84].As C 1 and C 2 are symmetric, Joe, Frank, Clayton and Gumbel-Hougaard copula functions were used (Table 1).Thus, in total, 16 different combinations of Khoudraji-Liebscher copulas were tested.More information about the selected Archimedean copula functions can be found in [76].The exchangeability was tested using the exchTest function that is part of the R software copula package (symmetric copulas can be used when variables are exchangeable) [72,84].Moreover, the Cramér-von Mises test Sn was applied to test the adequacy of different copula models [85].Furthermore, the xvCopula function implemented in the R software copula package was used to select the most suitable copula for the estimation of the interception loss [84,86].The k-fold cross-validation was applied to estimate the xvCopula results, which means that samples were divided into k equal sized samples.k-1 sub-samples were used as training data and one sub-sample was used for validation.This procedure was repeated k times to obtain model selection criterion results (each of the k sub-samples was used once for validation).In order to deal with ties in the data, the so-called "first" method was applied in the process of data ordering (i.e., this method uses a permutation with increasing values at each index set of ties [72]).The non-parametric distribution function defined by Hutson [87] and Serinaldi [88] was applied as a marginal distribution function.The term "copula model" is used for the combination of the non-parametric distributions and Khoudraji-Liebscher copulas in the next part of the paper.
Table 1.Definitions of the selected symmetric trivariate Archimedean copulas that were used as C 1 and C 2 in Equation (6).

Other Models Used for Estimation of Rainfall Interception and Performance Criteria
The copula model described in Section 2.3 was compared to several other models.Similarly to Bezak et al. [82], we also tested the exponential model (EXP) with two parameters (only the vapour pressure deficit variable was used as input in this model) and the multiple regression model (MLR) (VPD and T were used) considering three parameters.The parameters were estimated using the least-square method.The EXP method was selected due to its simplicity compared to other tested methods.The R software usdm package and vifstep function were used to test for multicollinearity [72,89].Additionally, we also tested the decision tree model.Decision trees are one of the data mining techniques that can be used to construct a model for the prediction of target variables based on selected influential variables.We used continuous numeric values of interception loss as the target variable and air temperature and vapour pressure deficit as influential variables.Constructed decision trees are composed from nodes and leaves [90].Additional information about tree structure is defined within tree nodes (i.e., usually the mean and variance of instances related to one specific node and values that are used to divide different nodes).Moreover, each leaf is associated with the predicted value of the target variable.The tree depth was limited to five levels and we did not split subsets smaller than five.The final number of nodes and leaves was not the same for the two tree species (birch and pine), and the fitted models are presented in Section 3. Using these regression models and information about nodes, one can calculate rainfall interception based on VPD and T variables.The Orange software [91] was applied for the construction of the decision tree.
For the evaluation of the tested models (copula, exponential, multiple linear regression and decision tree), random sampling was used.The training set size was 75% and test set size was 25%.The random sampling procedure was repeated 20 times (n = 20).Root mean square error (RMSE) and mean absolute error (MAE) criteria were used to compare the model results.Additionally, to check the sensitivity of the RMSE and MAE results regarding the training and test set size, we also repeated the previously described procedure using 50% of events for training and 50% for validation.For the copula model, 10,000 possible Ib and Ip values were generated based on known pairs of variables T and VPD (for the test set).Similarly to Bezak et al. [82], we used the median value of 10,000 realizations as the estimated rainfall interception value for birch and pine trees.

Rainfall Interception and Influencing Variables
From three and a half years of measurements, we selected 175 rainfall events with complete data for further analysis.The total rainfall amount for these events was 2049 mm and average intensity was 2.0 mm/h (±2.4 mm/h).The largest event in terms of rainfall amount was detected between 14 and 15 June 2016, with a total rainfall amount of 93 mm and a duration of 21.5 h.Table 2 shows the descriptive statistics for the analyzed variables used in this study.On average, birch trees intercepted 40% (±27%) of rainfall per event (Table 2).The interception loss was equal to 100% for 12 events with less than 1.4 mm of rainfall in the open (Figure 3).This was expected, since for the events with a rainfall amount smaller than the canopy storage capacity (1.1-3.5 mm for birch trees), the total rainfall amount is intercepted [14].Furthermore, for three events, throughfall under birch trees exceeded the amount of rainfall in the open, which resulted in negative values of rainfall interception (Table 2).This phenomenon is usually attributed to so called "drip points", where rain drops concentrate at the edge of the canopy e.g., [4,65].Pine trees intercepted on average 68% (±25%) of rainfall per event (Table 2).The entire rainfall amount in the open was intercepted during 29 events with less than 2.6 mm of rainfall (canopy storage capacity for pine trees varied between 0.9 and 2.9 mm) (Figure 3).In total, birch trees intercepted 484.8 mm of rainfall (24%) and pine trees intercepted 926.5 mm of rainfall (45%).Compared to similar deciduous trees, the rainfall interception of birch trees was in the range of 20% and 29% of rainfall measured in oak and birch deciduous forest in the UK [92], 25% and 28% measured in mixed Mediterranean deciduous forest in Slovenia [4] and between 16% and 25% measured in a temperate deciduous forest in Maryland, USA [93].Interception losses by pine trees in the present study are similar to 41% of gross precipitation measured in mature coniferous forest in British Columbia, Canada [94], 52% rainfall intercepted by Douglas-fir and 61% intercepted by western red cedar in an urban area of North Vancouver, Canada [95].Air humidity and temperature were also recognized as two important influencing variables on all three components of rainfall partitioning in other studies [4,9,12,14].

Selection of the Most Suitable Copula Function
The main idea of the copula application was to estimate the rainfall interception loss using only meteorological variables, namely T and VPD data.Before fitting any model to the data autocorrelation, the marginal data was tested.Ljung-Box test results were 3.8 (p-value 0.07), 0.001 (p-value: 0.99), 2.1 (p-value: 0.15) and 0.25 (p-value: 0.62) for Ib, Ip, T and VPD, respectively.This means that there was no significant autocorrelation in the data, which indicates that data transformation is not needed.Table 3 shows the calculated Kendall's correlation coefficients values for the following pairs of variables: I-T, I-VPD and T-VPD for birch and pine trees, respectively.Using the Fisher r-to-z transformation, we also tested the significance of the difference between two correlation coefficients for birch and pine trees.The calculated test statistics and p-values were −0.75 (p-value 0.45), 1.08 (p-value 0.28) and 0.11 (p-value 0.90) for I-T, I-VPD and T-VPD, respectively.This means that the calculated correlation coefficients were not significantly different for all three pairs of variables with the selected significance level of 0.05.In the process of the copula model construction, we focused on the Khoudraji-Liebscher copula functions where we used symmetric Archimedean copulas C1 and C2 from Equation ( 6) [78][79][80][81][82].Moreover, exchangeability test results for birch trees were 0.09 (p-value 0.05), 0.07 (p-value 0.85) and 0.03 (p-value 0.98) for Ib-T, Ib-VPD and T-VPD, respectively.On the other hand, the exchangeability results for pine trees were 0.12 (p-value 0.84), 0.19 (p-value 0.77) and 0.03 (p-value 0.98) for Ip-T, Ip-VPD and T-VPD, respectively.This means that variables are exchangeable and the copulas shown in Table 1 can be used as C1 and C2 functions.Air humidity and temperature were also recognized as two important influencing variables on all three components of rainfall partitioning in other studies [4,9,12,14].

Selection of the Most Suitable Copula Function
The main idea of the copula application was to estimate the rainfall interception loss using only meteorological variables, namely T and VPD data.Before fitting any model to the data autocorrelation, the marginal data was tested.Ljung-Box test results were 3.8 (p-value 0.07), 0.001 (p-value: 0.99), 2.1 (p-value: 0.15) and 0.25 (p-value: 0.62) for Ib, Ip, T and VPD, respectively.This means that there was no significant autocorrelation in the data, which indicates that data transformation is not needed.Table 3 shows the calculated Kendall's correlation coefficients values for the following pairs of variables: I-T, I-VPD and T-VPD for birch and pine trees, respectively.Using the Fisher r-to-z transformation, we also tested the significance of the difference between two correlation coefficients for birch and pine trees.The calculated test statistics and p-values were −0.75 (p-value 0.45), 1.08 (p-value 0.28) and 0.11 (p-value 0.90) for I-T, I-VPD and T-VPD, respectively.This means that the calculated correlation coefficients were not significantly different for all three pairs of variables with the selected significance level of 0.05.In the process of the copula model construction, we focused on the Khoudraji-Liebscher copula functions where we used symmetric Archimedean copulas C 1 and C 2 from Equation ( 6) [78][79][80][81][82].Moreover, exchangeability test results for birch trees were 0.09 (p-value 0.05), 0.07 (p-value 0.85) and 0.03 (p-value 0.98) for Ib-T, Ib-VPD and T-VPD, respectively.On the other hand, the exchangeability results for pine trees were 0.12 (p-value 0.84), 0.19 (p-value 0.77) and 0.03 (p-value 0.98) for Ip-T, Ip-VPD and T-VPD, respectively.This means that variables are exchangeable and the copulas shown in Table 1 can be used as C 1 and C 2 functions.
Different combinations of Khoudraji-Liebscher copulas were tested where Joe, Clayton, Gumbel-Hougaard and Frank copulas were applied as C 1 and C 2 (Section 2.3).Thus, in total, 16 combinations were tested for both trees using the Cramér-von Mises test.For birch trees, the following combinations of C 1 -C 2 could not be rejected with the selected significance level of 0.05: Gumbel-Hougaard-Joe, Gumbel-Hougaard-Gumbel-Hougaard, Joe-Gumbel-Hougaard, Joe-Joe, Frank-Gumbel-Hougaard, Clayton-Gumbel-Hougaard and Clayton-Joe.For all these combinations, the model selection criteria that is based on k-fold cross-validation was used (Section 2.3), and the following results (i.e., the crossvalidated log likelihood defined by [84,86]) were 50.88, 52.95, 60.62, 51.17, 45.27, 59.54 and 37.81 for the Gumbel-Hougaard-Joe, Gumbel-Hougaard-Gumbel-Hougaard, Joe-Gumbel-Hougaard, Joe-Joe, Frank-Gumbel-Hougaard, Clayton-Gumbel-Hougaard and Clayton-Joe combinations, respectively.Based on the presented results, the Joe-Gumbel-Hougaard copula was selected as the most suitable for the birch tree data (the maximum model selection criterion result was obtained for this combination).For this copula function, the estimated copula parameters (θ 1 and θ 2 ) were 2.59 and 1.74, and the shape parameters (α 1 , α 2 , α 3 ) were 0.24, 0.92 and 0.07.For the complete copula model (non-parametric distribution function and Joe-Gumbel-Hougaard copulas as C 1 and C 2 ), we also visually checked the fit between the measured data and data generated using the previously mentioned model (Figure 4).Different combinations of Khoudraji-Liebscher copulas were tested where Joe, Clayton, Gumbel-Hougaard and Frank copulas were applied as C1 and C2 (Section 2.3).Thus, in total, 16 combinations were tested for both trees using the Cramér-von Mises test.For birch trees, the following combinations of C1-C2 could not be rejected with the selected significance level of 0.05: Gumbel-Hougaard-Joe, Gumbel-Hougaard-Gumbel-Hougaard, Joe-Gumbel-Hougaard, Joe-Joe, Frank-Gumbel-Hougaard, Clayton-Gumbel-Hougaard and Clayton-Joe.For all these combinations, the model selection criteria that is based on k-fold cross-validation was used (Section 2.3), and the following results (i.e., the cross-validated log likelihood defined by [84,86]) were 50.88, 52.95, 60.62, 51.17, 45.27, 59.54 and 37.81 for the Gumbel-Hougaard-Joe, Gumbel-Hougaard-Gumbel-Hougaard, Joe-Gumbel-Hougaard, Joe-Joe, Frank-Gumbel-Hougaard, Clayton-Gumbel-Hougaard and Clayton-Joe combinations, respectively.Based on the presented results, the Joe-Gumbel-Hougaard copula was selected as the most suitable for the birch tree data (the maximum model selection criterion result was obtained for this combination).For this copula function, the estimated copula parameters ( and ) were 2.59 and 1.74, and the shape parameters (α1, α2, α3) were 0.24, 0.92 and 0.07.For the complete copula model (non-parametric distribution function and Joe-Gumbel-Hougaard copulas as C1 and C2), we also visually checked the fit between the measured data and data generated using the previously mentioned model (Figure 4).Graphical fit for birch trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C1 and the Gumbel-Hougaard copula was selected as C2.
A similar procedure was also carried out for the pine tree data.Using the Cramér-von Mises test, the following combinations of Archimedean copulas as C1 and C2 could not be rejected with the selected significance level of 0.05: Gumbel-Hougaard-Joe, Gumbel-Hougaard-Gumbel-Hougaard, Joe-Gumbel-Hougaard, Joe-Joe, Frank-Joe, Clayton-Gumbel-Hougaard and Clayton-Joe.Furthermore, the most suitable copula was selected using the model selection criteria presented in Section 2.3.The calculated results (i.e., the cross-validated log likelihood defined by [84,86]  Graphical fit for birch trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C 1 and the Gumbel-Hougaard copula was selected as C 2 .
A similar procedure was also carried out for the pine tree data.Using the Cramér-von Mises test, the following combinations of Archimedean copulas as C 1 and C 2 could not be rejected with the selected significance level of 0.05: Gumbel-Hougaard-Joe, Gumbel-Hougaard-Gumbel-Hougaard, Joe-Gumbel-Hougaard, Joe-Joe, Frank-Joe, Clayton-Gumbel-Hougaard and Clayton-Joe.Furthermore, the most suitable copula was selected using the model selection criteria presented in Section 2.3.The calculated results (i.e., the cross-validated log likelihood defined by [84,86]) were 17.29, 17.84, 31.55,21.81, 13.35, 30.61 and 21.04 for the Gumbel-Hougaard-Joe, Gumbel-Hougaard-Gumbel-Hougaard, Joe-Gumbel-Hougaard, Joe-Joe, Frank-Joe, Clayton-Gumbel-Hougaard and Clayton-Joe combinations, respectively.Similarly, as for the birch trees, the same combination of Archimedean copula functions were used as C 1 and C 2 (Joe-Gumbel-Hougaard was used because the maximum model selection criterion result was obtained for this combination).The estimated copula parameters for θ 1 and θ 2 were 2.73 and 1.86, respectively.Furthermore, the estimated shape parameters were 0.21, 0.99 and 0.06 for the α 1 , α 2 and α 3 , respectively.Figure 5 shows the graphical comparison between generated data for pine trees using the fitted copula model and measured data.Archimedean copula functions were used as C1 and C2 (Joe-Gumbel-Hougaard was used because the maximum model selection criterion result was obtained for this combination).The estimated copula parameters for and were 2.73 and 1.86, respectively.Furthermore, the estimated shape parameters were 0.21, 0.99 and 0.06 for the α1, α2 and α3, respectively.Figure 5 shows the graphical comparison between generated data for pine trees using the fitted copula model and measured data.Graphical fit for pine trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C1 and the Gumbel-Hougaard copula was selected as C2.

Comparison of Copula Results with Other Models
In the next step of the study, we compared copula model performance with some other methods that are described in Section 2.4.Firstly, the parameters of the EXP and MLR methods were estimated.The EXP and MLR models of rainfall interception for birch trees (Ib) were Ib = exp(4.2) × VPD 0.32 and Ib = 31.34+ VPD × 41.62 − T × 0.15, respectively.Moreover, the EXP and MLR models of rainfall interception for pine trees (Ip) were Ip = exp(4.39)× VPD 0.10 and Ip = 67.47+ VPD × 24.92 − T × 0.41, respectively.The variance inflation factor (VIF) and the test for multicollinearity (implemented in the usdm package [89] that is part of the R software [72]) were used to check if there is any collinearity issues in the data.The results indicate that no input variable has a collinearity problem (VIF values were 1.05, 1.94 and 2.0 for Ip, T and VPD, respectively; VIF values were 1.17, 1.93 and 2.12 for Ib, T and VPD, respectively).A VIF value larger than 10 would indicate that a model has a collinearity issue [72,89].In the next step, a decision tree was fitted to the data.In total (at all five levels), 25 nodes and 13 leaves were determined for birch trees and 27 nodes and 14 leaves were determined for pine trees (Figure 6).According to the decision trees, the most influential variable on rainfall interception by both considered tree species was VPD as it was the top split variable in both cases (Figure 6).However, the splitting values differ significantly for pine and birch trees.More specifically, much lower splitting values of VPD were determined in the case of pine trees (0.08 kPa) than in the case of birch trees (0.40 kPa).These values were determined in the process of decision tree calibration and can be seen in Figure 6.Additionally, rainfall interception is, on average, lower with lower VPD values for both tree species.Graphical fit for pine trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C 1 and the Gumbel-Hougaard copula was selected as C 2 .

Comparison of Copula Results with Other Models
In the next step of the study, we compared copula model performance with some other methods that are described in Section 2.4.Firstly, the parameters of the EXP and MLR methods were estimated.The EXP and MLR models of rainfall interception for birch trees (Ib) were Ib = exp(4.2) × VPD 0.32 and Ib = 31.34+ VPD × 41.62 − T × 0.15, respectively.Moreover, the EXP and MLR models of rainfall interception for pine trees (Ip) were Ip = exp(4.39)× VPD 0.10 and Ip = 67.47+ VPD × 24.92 − T × 0.41, respectively.The variance inflation factor (VIF) and the test for multicollinearity (implemented in the usdm package [89] that is part of the R software [72]) were used to check if there is any collinearity issues in the data.The results indicate that no input variable has a collinearity problem (VIF values were 1.05, 1.94 and 2.0 for Ip, T and VPD, respectively; VIF values were 1.17, 1.93 and 2.12 for Ib, T and VPD, respectively).A VIF value larger than 10 would indicate that a model has a collinearity issue [72,89].In the next step, a decision tree was fitted to the data.In total (at all five levels), 25 nodes and 13 leaves were determined for birch trees and 27 nodes and 14 leaves were determined for pine trees (Figure 6).According to the decision trees, the most influential variable on rainfall interception by both considered tree species was VPD as it was the top split variable in both cases (Figure 6).However, the splitting values differ significantly for pine and birch trees.More specifically, much lower splitting values of VPD were determined in the case of pine trees (0.08 kPa) than in the case of birch trees (0.40 kPa).These values were determined in the process of decision tree calibration and can be seen in Figure 6.Additionally, rainfall interception is, on average, lower with lower VPD values for both tree species.According to all models, higher VPD values in general increase rainfall interception regardless of the tree species.This finding is in accordance with Staelens et al. [9], who demonstrated that vapour pressure deficit significantly increased rainfall interception by a single beech throughout the year.However, in the case of stemflow, a high negative correlation with VPD was observed for yellow poplar and American beech [12].According to the results of the MLR model, increasing air temperature decreased rainfall interception, whereas the results of the decision trees showed a different response of rainfall interception to changes in air temperature according to the VPD values.According to the decision trees, the rainfall interception for both considered tree species was negatively correlated with temperature for VPD higher than 0.32 kPa and 0.40 kPa for pine and birch trees, respectively.Previous analysis with the regression trees also showed that the interception by pine trees decreased with an increase in the temperature [14].However, larger stemflow amounts and negative rainfall interception at low temperatures were observed for a single beech [9], while warm wind and increasing air temperature in a Mediterranean deciduous forest decreased throughfall and therefore increased rainfall interception [4].Some differences in the observed rainfall interception in response to the air temperature may be also the consequence of a different number of the variables considered in the studies.
Table 4 shows results of the RMSE and MAE model evaluation criteria that were used to compare the tested models.The results indicate relatively similar behaviour of the tested models, where the differences among models are relatively small compared to the accuracy of the models.However, a slightly better performance according to the RMSE and MAE criteria was observed for the copula model.Better results could be obtained with the inclusion of additional variables in the models.However, additional variables would also increase the number of parameters used (e.g., one additional variable would increase the number of parameters for the MLR method from three to four, and for the copula method from five to six).The number of parameters for the decision tree is mainly constrained by the selected minimum tree depth and other parameters.On the other hand, in the exponential model, it is not possible to include additional variables.Furthermore, the exponential function also yielded similar results than the other tested models despite its simplicity (two parameters, one input variable).Interestingly, for the decision tree model, RMSE and MAE results were slightly higher than for the copula model despite a larger number of parameters being involved.Moreover, besides firstly applied random sampling (the training set size 75% and test set size 25%), which was used in the study, we also calculated RMSE and MAE results using 50% of events for calibration and 50% for validation.The calculated results were slightly worse than using procedure shown in Table 4.However, the maximum differences were up to 6% (i.e., 6% higher RMSE and MAE values were obtained).Moreover, we were also not able to detect any significant differences among the tested methods.This could indicate that the results presented in Table 4 are not sensitive to the relationship between the size of the training and test sets.According to all models, higher VPD values in general increase rainfall interception regardless of the tree species.This finding is in accordance with Staelens et al. [9], who demonstrated that vapour pressure deficit significantly increased rainfall interception by a single beech throughout the year.However, in the case of stemflow, a high negative correlation with VPD was observed for yellow poplar and American beech [12].According to the results of the MLR model, increasing air temperature decreased rainfall interception, whereas the results of the decision trees showed a different response of rainfall interception to changes in air temperature according to the VPD values.According to the decision trees, the rainfall interception for both considered tree species was negatively correlated with temperature for VPD higher than 0.32 kPa and 0.40 kPa for pine and birch trees, respectively.Previous analysis with the regression trees also showed that the interception by pine trees decreased with an increase in the temperature [14].However, larger stemflow amounts and negative rainfall interception at low temperatures were observed for a single beech [9], while warm wind and increasing air temperature in a Mediterranean deciduous forest decreased throughfall and therefore increased rainfall interception [4].Some differences in the observed rainfall interception in response to the air temperature may be also the consequence of a different number of the variables considered in the studies.
Table 4 shows results of the RMSE and MAE model evaluation criteria that were used to compare the tested models.The results indicate relatively similar behaviour of the tested models, where the differences among models are relatively small compared to the accuracy of the models.However, a slightly better performance according to the RMSE and MAE criteria was observed for the copula model.Better results could be obtained with the inclusion of additional variables in the models.However, additional variables would also increase the number of parameters used (e.g., one additional variable would increase the number of parameters for the MLR method from three to four, and for the copula method from five to six).The number of parameters for the decision tree is mainly constrained by the selected minimum tree depth and other parameters.On the other hand, in the exponential model, it is not possible to include additional variables.Furthermore, the exponential function also yielded similar results than the other tested models despite its simplicity (two parameters, one input variable).Interestingly, for the decision tree model, RMSE and MAE results were slightly higher than for the copula model despite a larger number of parameters being involved.Moreover, besides firstly applied random sampling (the training set size 75% and test set size 25%), which was used in the study, we also calculated RMSE and MAE results using 50% of events for calibration and 50% for validation.The calculated results were slightly worse than using procedure shown in Table 4.However, the maximum differences were up to 6% (i.e., 6% higher RMSE and MAE values were obtained).Moreover, we were also not able to detect any significant differences among the tested methods.This could indicate that the results presented in Table 4 are not sensitive to the relationship between the size of the training and test sets.

Conclusions
The presented paper shows the results of rainfall interception modelling using only meteorological variables usually available from national meteorological stations, namely air temperature and vapour pressure deficit data.The interception loss is an important process of the hydrological cycle and has an important influence on the surface runoff and groundwater storage, among others.Since rainfall interception measurements are complicated and very time consuming, a methodology that can be used to model rainfall interception for specific vegetation and climatic conditions using only climatic data is needed.However, at least some experimental measurements are needed to calibrate the model.More specifically, all vegetation periods (e.g., at least one leafed and one leafless period) should be recorded in order to detect the seasonal variability in rainfall interception.Based on the presented results, the following conclusions can be made: 1.
The Khoudraji-Liebscher copula model, which has previously been used in a relatively similar application by Bezak et al. [82], can be successfully applied for the estimation of rainfall interception based on air temperature and vapour pressure deficit data.

2.
The performance of the copula model is relatively similar to the performance of other tested methods.However, according to the RMSE and MAE criteria, slightly better results are obtained using copula functions compared to the other tested methods in this study.The performance of the models could be further improved with the inclusion of other additional variables in the models; however, this would generally increase the complexity of the models.For the EXP method, additional variables cannot be added.For the MLR and copula models, one additional variable would mean one additional parameter.For the decision tree method, the number of parameters is not only connected with the number of variables used, but also with other settings such as maximal tree depth.

3.
The copula method yielded similar performance for both birch and pine trees despite the fact that the correlation (Table 3) between I-T and I-VPD was smaller for pine trees compared to birch trees.However, correlations between I-T and I-VPD for birch and pine trees were not significantly different with a significance level of 0.05.The constructed models could also be applied for the prediction of rainfall interception under similar vegetation and meteorological conditions in other locations where these two tree species are present.In the case of a longer data series, a different model could be constructed for the leafless and leafed periods.

Figure 1 .
Figure 1.Location of Slovenia on the map of Europe, and a photo from the measuring plot showing the measuring equipment used.Figure 1. Location of Slovenia on the map of Europe, and a photo from the measuring plot showing the measuring equipment used.

Figure 1 .
Figure 1.Location of Slovenia on the map of Europe, and a photo from the measuring plot showing the measuring equipment used.Figure 1. Location of Slovenia on the map of Europe, and a photo from the measuring plot showing the measuring equipment used.

Figure 2 .
Figure 2. Graphical presentation of the relationship among interception loss for pine trees (Ip), vapour pressure deficit and air temperature (T) (upper panel) and interception loss for birch trees (Ib), vapour pressure deficit and air temperature (T) (lower panel).Events measured in the leafless and leafed period are indicated with red and black colour, respectively.

Figure 2 .
Figure 2. Graphical presentation of the relationship among interception loss for pine trees (Ip), vapour pressure deficit and air temperature (T) (upper panel) and interception loss for birch trees (Ib), vapour pressure deficit and air temperature (T) (lower panel).Events measured in the leafless and leafed period are indicated with red and black colour, respectively.

Figure 3 .
Figure 3. Rainfall interception by birch (Ib) and pine (Ip) trees according to the amount of rainfall in the open.

Figure 3 .
Figure 3. Rainfall interception by birch (Ib) and pine (Ip) trees according to the amount of rainfall in the open.

Figure 4 .
Figure 4. Graphical fit for birch trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C1 and the Gumbel-Hougaard copula was selected as C2.

Figure 4 .
Figure 4. Graphical fit for birch trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C 1 and the Gumbel-Hougaard copula was selected as C 2 .

Water 2018 ,
10, x FOR PEER REVIEW 10 of 17

Figure 5 .
Figure 5. Graphical fit for pine trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C1 and the Gumbel-Hougaard copula was selected as C2.

Figure 5 .
Figure 5. Graphical fit for pine trees between measured data (red crosses) and 10,000 random realizations (grey circles) using the copula model where the Joe copula was selected as C 1 and the Gumbel-Hougaard copula was selected as C 2 .

Figure 6 .
Figure 6.Decision trees for rainfall interception by birch (a) and pine (b) trees; only the first three levels are presented.Values shown in boxes indicate mean and variance of instances related to this specific node.Numbers written outside boxes indicate values that are used to divide different nodes.

Figure 6 .
Figure 6.Decision trees for rainfall interception by birch (a) and pine (b) trees; only the first three levels are presented.Values shown in boxes indicate mean and variance of instances related to this specific node.Numbers written outside boxes indicate values that are used to divide different nodes.

Table 2 .
Descriptive statistics (minimum, maximum, mean, median and coefficient of variation (CV)) for average air temperature (T), average relative humidity (RH), vapour pressure deficit (VPD) and rainfall interception by birch (Ib) and pine (Ip) trees.

Table 3 .
Kendall's correlation coefficients between pairs of the following variables: interception loss (I), air temperature (T) and vapour pressure deficit (VPD) for birch and pine trees calculated using 175 events.p-value is shown in brackets.

Table 3 .
Kendall's correlation coefficients between pairs of the following variables: interception loss (I), air temperature (T) and vapour pressure deficit (VPD) for birch and pine trees calculated using 175 events.p-value is shown in brackets.

Table 4 .
Performance criteria results (root mean square error (RMSE) and mean absolute error (MAE)) for the different tested models (random sampling (n = 20, training set 75%, test set 25%) was used to calculate the results for birch and pine trees).