Estimating Current and Future Rainfall Erosivity in Greece Using Regional Climate Models and Spatial Quantile Regression Forests

A future variation of precipitation characteristics, due to climate change, will affect the ability of rainfall to precipitate soil loss. In this paper, the monthly and annual values of rainfall erosivity (R) in Greece are calculated, for the historical period 1971–2000, using precipitation records that suffer from a significant volume of missing values. In order to overcome the data limitations, an intermediate step is applied using the calculation of monthly erosivity density, which is more robust to the presence of missing values. Spatial Quantile Regression Forests, a data driven algorithm that imitates kriging without the need of strict statistical assumptions, was utilized and validated, in order to create maps of R and its uncertainty using error propagation. The monthly average precipitation for the historical period 1971–2000 estimated by five (5) Global Circulation Models-Regional Climatic Models were validated against observed values and the one with the best performance was used to estimate projected changes of R in Greece for the future time period 2011–2100 and two different greenhouse gases concentration scenarios. The main findings of this study are: (a) the mean annual R in Greece is 1039 MJ·mm/ha/h/y, with a range between 405.1 and 3160.2 MJ·mm/ha/h/y. The highest values are calculated at the mountain range of Pindos and the lowest at central Greece; (b) the monthly R maps adhere to the spatiotemporal characteristics of precipitation depth and intensities over the country; (c) the projected R values, as an average over Greece, follow the projected changes of precipitation of climatic models, but not in a spatially homogenous way.


Introduction
Rainfall erosivity concerns the ability of rainfall to precipitate soil loss [1], as it supplies energy to the mechanical processes of soil erosion. Decertification has been identified as one of the most serious issues facing Mediterranean European countries, including Greece [2], and a possible increase in future rainfall, due to climate change, will aggravate this process, as soil erosion increases at a greater rate [3]. Unanticipatedly, a decrease in future rainfall and a possible decrease of biomass production may also lead to higher erosion rates [4].
Higher erosion rates in conjunction with unsustainable land management and increasing human pressure can lead to soil degradation [5], and consequently a disrupted ecological balance, a decreasing agricultural production and income [6] and even the reduction of effectiveness of adaptation options [7]. Several issues may arise due to accelerated soil losses on achieving of the Sustainable Development Goals of the United Nations [8], as these goals are dependent on a healthy biophysical environment in which the soil is the base [9]. In order to predict these soil erosion future changes it is necessary in Western Germany [34], Belgium [35] and in the Czech Republic [36]. Other studies in various parts of the world used GCMs in conjunction with empirical equations that predict R using annual precipitation [37,38], monthly [39,40] and daily rainfall indices [41,42]. A different approach estimated projected R changes, using a weather generator with spatial and temporal downscaled precipitation values coming from various GCMs [43].
Random Forests [44] is a data-driven algorithm in the area of supervised learning which tries to fit a model using a set of paired input variables and their associated output response and can be used in classification and regression problems. Quantile Regression Forests (QRF) [45], is an extension of RF that is able to compute prediction intervals of the output response in regression problems. RF has been used for spatial prediction in various domains [46][47][48][49][50] and recently, Hengl et al. [51] presented RF for a spatial predictions framework, that can make equally accurate predictions as kriging, without the need of statistical assumptions.
The aim of this work is to calculate the current and estimate the future changes of R values in Greece. The latest methodologies developed and presented with RUSLE2 are used, taking into account the presence of missing values in precipitation records. The first objective of the analysis is to create monthly precipitation and ED maps, as intermediate datasets, and to estimate the uncertainty of their predictions and their errors using cross-validation. Consequently, the current R values in Greece are computed from the interpolated surfaces and error propagation is used to estimate approximately their uncertainty. Finally, downscaled precipitation from GCMs-RCMs is validated and used, along with ED, to estimate the potential changes of R in Greece for the years 2040, 2070 and 2100 using two future greenhouse gases concentration trajectories/pathways (i.e., RCPs). This type of analysis is a novel approach and it has not been presented, until now, in the international literature.

Materials and Methods
The methodology that was applied in the study, is presented in the flowchart of Figure 1, where data flows from the left to the right. Blue symbolizes the data that were used as input (pointwise and raster), orange the intermediate raster datasets that were created and green the final results, also in raster format, that were computed using different sets of input and intermediate datasets.
Water 2020, 12,687 3 of 20 precipitation [37,38], monthly [39,40] and daily rainfall indices [41,42]. A different approach estimated projected R changes, using a weather generator with spatial and temporal downscaled precipitation values coming from various GCMs [43]. Random Forests [44] is a data-driven algorithm in the area of supervised learning which tries to fit a model using a set of paired input variables and their associated output response and can be used in classification and regression problems. Quantile Regression Forests (QRF) [45], is an extension of RF that is able to compute prediction intervals of the output response in regression problems. RF has been used for spatial prediction in various domains [46][47][48][49][50] and recently, Hengl et al. [51] presented RF for a spatial predictions framework, that can make equally accurate predictions as kriging, without the need of statistical assumptions.
The aim of this work is to calculate the current and estimate the future changes of R values in Greece. The latest methodologies developed and presented with RUSLE2 are used, taking into account the presence of missing values in precipitation records. The first objective of the analysis is to create monthly precipitation and ED maps, as intermediate datasets, and to estimate the uncertainty of their predictions and their errors using cross-validation. Consequently, the current R values in Greece are computed from the interpolated surfaces and error propagation is used to estimate approximately their uncertainty. Finally, downscaled precipitation from GCMs-RCMs is validated and used, along with ED, to estimate the potential changes of R in Greece for the years 2040, 2070 and 2100 using two future greenhouse gases concentration trajectories/pathways (i.e., RCPs). This type of analysis is a novel approach and it has not been presented, until now, in the international literature.

Materials and Methods
The methodology that was applied in the study, is presented in the flowchart of Figure 1, where data flows from the left to the right. Blue symbolizes the data that were used as input (pointwise and raster), orange the intermediate raster datasets that were created and green the final results, also in raster format, that were computed using different sets of input and intermediate datasets.

Data Acquisition and Processing
Point precipitation data from 237 meteorological stations across Greece (Figure 2), was used in the analysis. The data consisted of: • Pluviograph data for 87 meteorological stations that were taken from the Greek National Bank of Hydrological and Meteorological Information (NBHM) [52]. The time series

Data Acquisition and Processing
Point precipitation data from 237 meteorological stations across Greece (Figure 2), was used in the analysis. The data consisted of:

•
Pluviograph data for 87 meteorological stations that were taken from the Greek National Bank of Hydrological and Meteorological Information (NBHM) [52]. The time series comprised a total of 2273 years of 30-min-records and 394 years of five-min-records for the time period from 1953 to 1997, with a mean length of 30 years per station. The timeseries coverage was 62.8% on average. • Mean monthly precipitation data for 150 meteorological stations that were taken from the Hellenic National Meteorological Service (HNMS) [53]. These are homogenized data and available for the time period from 1971 to 2000. These data were used to overcome the limitations of precipitation data from NBHM.

•
Five different monthly GCM-RCM raster datasets were downloaded for a number of experiments and time periods (Table 1). These data were computed in the framework of the EURO-CORDEX [27,54], had a horizontal resolution 0.11 • × 0.11 • and were remapped using bilinear interpolation to a 30" × 30" resolution grid using the Climate Data Operator (CDO) software [55]. Table 1. Global Circulation Models (GCMs)-Regional Climate Models (RCMs) used in the analysis (i.e., data retrieved from EURO-CORDEX). comprised a total of 2273 years of 30-min-records and 394 years of five-min-records for the time period from 1953 to 1997, with a mean length of 30 years per station. The timeseries coverage was 62.8% on average. • Mean monthly precipitation data for 150 meteorological stations that were taken from the Hellenic National Meteorological Service (HNMS) [53]. These are homogenized data and available for the time period from 1971 to 2000. These data were used to overcome the limitations of precipitation data from NBHM. • Five different monthly GCM-RCM raster datasets were downloaded for a number of experiments and time periods (Table 1). These data were computed in the framework of the EURO-CORDEX [27,54], had a horizontal resolution 0.11° × 0.11° and were remapped using bilinear interpolation to a 30'' × 30'' resolution grid using the Climate Data Operator (CDO) software [55].

RCM Institution
The GCM-RCM monthly precipitation timeseries data were: • Historical, for the time period from 1971 to 2000 (like the ones coming from HNMS) as they were driven by the boundary conditions provided by the GCMs. • Future, for the forcing scenarios RCP4.5 (where greenhouse gases emissions peak around 2040 and then decline) and RCP8.5 (where greenhouse gases emissions continue to rise throughout the 21st century), using the control ensemble of the CORDEX climate projection experiments. The dataset period was from 2011 to 2100.
Digital elevation raster data were downloaded from the NASA Shuttle Radar Topography Mission (SRTM) [56], and aggregated to the same 30'' × 30'' resolution grid as the one used in the GCM-RCMs datasets.   The GCM-RCM monthly precipitation timeseries data were:

•
Historical, for the time period from 1971 to 2000 (like the ones coming from HNMS) as they were driven by the boundary conditions provided by the GCMs. • Future, for the forcing scenarios RCP4.5 (where greenhouse gases emissions peak around 2040 and then decline) and RCP8.5 (where greenhouse gases emissions continue to rise throughout the 21st century), using the control ensemble of the CORDEX climate projection experiments. The dataset period was from 2011 to 2100.
Digital elevation raster data were downloaded from the NASA Shuttle Radar Topography Mission (SRTM) [56], and aggregated to the same 30" × 30" resolution grid as the one used in the GCM-RCMs datasets.

Monthly Erosivity Density Calculation
The erosivity of a single erosive rainfall event, EI 30 (MJ·mm/ha/h), given the product of the kinetic energy of rainfall and its maximum 30 min intensity, was computed using the pluviograph records from NBHM [15]: where e r is the kinetic energy per unit of rainfall (MJ/ha/mm), v r the rainfall depth (mm) for the time interval r of the hyetograph, which has been divided into r = 1, 2, . . . , s time sub-intervals and I 30 is the maximum rainfall intensity for a 30 min duration during that rainfall.
The quantity e r was calculated for each time sub-interval, r, applying the kinetic energy equation that was used in RUSLE2 [57], which was recently evaluated in Italy, a nearby country to Greece, and had the best performance among alternative literature expressions [58]: where i r is the rainfall intensity (mm/h). An individual rainfall event was extracted from the continuous pluviograph data, if its cumulative depth for a duration of 6 h at a certain location was less than 1.27 mm. A rainfall event was considered to be erosive if it had a cumulative rainfall depth greater than 12.7 mm. Only the screened events with a return period of less than 50 years were used in the calculations.
On the grounds that the use of coarser fixed time intervals to a finer one can lead to an underestimation of the value of erosivity [59,60], monthly conversion factors c m were computed using the five-min-time-step timeseries: where n m is the number of storms at month m, (EI 30 ) m, ts=5 min is the erosivity of a storm using the five min time step and (EI 30 ) m, ts=30 min the erosivity of the same storm when the timeseries was aggregated using a 30 min time step. These conversion factors were applied to the values of erosivity that were estimated from 30-min pluviograph data. After the computation of EI 30 values, the average monthly rainfall erosivity density ED m (MJ/ha/h) per station was calculated: where st m is the number of storms during the month m, (EI 30 ) k the erosivity of storm k, P m the monthly precipitation height and n the number of years.

Spatial Quantile Regression Forests
Random Forests (RF) [44] is one of the most successful methods used in Machine Learning [61], among other reasons because of: (a) its robustness to outliers [62] and overfitting [63], (b) its ability to perform feature selection [64] and (c) the fact that its default parameters, as implemented in software, give satisfactory results [61,65]. Examples of RF open source software are the R's language packages randomForest [66] and its faster alternative, ranger [67].
In summary, RF consists of a number of decision trees [68]. For each tree, a random set of the dataset is created via bootstrapping [69] and in each node of the tree a random set of n input variables from the p variables of the dataset is considered to pick the best split [70]. The prediction of the output response in regression problems is the mean value of the estimations of these random decision trees. The estimate of the out-of-sample error is computed using the out-of-bag error [71], without the need of cross-validation.
Quantile Regression Forests (QRF) is an extension of RF that provides information about the full conditional distribution of the output response and not only about its mean [45], as is the case in plain RF. In this way, it is possible to provide prediction intervals and measures of uncertainty.
The use of QRF as a framework for the modeling of spatial variables was introduced by Hengl et al. [51], where the distances among observation locations are used as variables in QRF in order to incorporate geographical proximity effects. In this way, using these buffer distances, spatial QRF imitate kriging's spatial correlation. Spatial QRF (spQRF) produces comparable results, in terms of accuracy, compared to the state-of-the-art kriging methods, with the advantage of no prior assumptions about the distribution or stationarity of the response variable [51].
In this work, spQRF were used for the spatial prediction of monthly precipitation and ED: • The 12 monthly ED models (one model for each month of the year) were trained using as input variables the buffer distances for each one of the 87 stations of NBHM in a chained procedure [72].
In that procedure, at the first run, only the distances are used as input variables and at the nth both distances and the monthly results from the previous step, with the exception of the month that is the output response. This procedure stops when the out-of-sample error estimated by RFs in the form of out-of-the-bag error ceases to decrease.

•
The 12 monthly precipitation models were trained using as input variables the buffer distances for each one of the 150 stations of HNMS and elevation data from the SRTM.
As a measure of the out-of-sample error, the average Root Mean Squared Error (RMSE), the coefficient of determination R 2 and Lin's Concordance Correlation Coefficient (CCC) were computed using 10-fold cross validation: where n is the total number of cross-validation locations,ŷ (s i ) is the predicted value of y (s i ) at a cross-validation location s i (i.e., the coordinates longitude and latitude of location i), µŷ, µ y , σ 2 y , σ 2 y are the means and variances ofŷ (s i ) and y (s i ) , respectively, and σŷ y is the covariance ofŷ (s i ) and y (s i ) .
R 2 describes the ratio of variance that is explained by a model and may be negative, among other reasons, if an inappropriate model is used [73]. CCC combines measures of both precision and accuracy and examines how farŷ deviate from the line of perfect concordance (the line of 45 degrees on a square scatterplot) and ranges from 0 to ±1 [74,75].
The uncertainty of the predictions from the models, in the form of prediction error standard deviation, was computed using a dense level of all the quantiles per 1% of the output response and for every location [51]: whereŷ p (s i ) is the pth percentile of the distribution of the response variable at location s i and µŷ (s i ) the mean value ofŷ p (s i ) for all the percentiles.
The quantity z-score, which quantifies the error of prediction errors, was calculated at cross validation locations [76]: where z, ideally, should have a mean equal to zero and variance equal to one. On the contrary: • If variance(z) >> 1, the model underestimates the actual prediction uncertainty.

Regional Climate Models Historical Precipitation Validation
In order to validate and select one of the GCMs-RCMs for the projected erosivity calculations, at first, a multi-layer raster dataset was computed for each of the five models with the overall 30-year-mean-monthly precipitation values, using the historical time period from 1971 to 2000. Then, using the monthly precipitation spQRF models to create raster datasets with the same 30" × 30" resolution grid, RMSE, CCC and R 2 errors metrics were computed, setting in Equations (5)-(7) asŷ (s i ) the values estimated by GCMs-RCMs and as y (s i ) the values calculated by the spQRF models.

Current and Projected Erosivity Calculation
The estimation of current and future monthly erosivity was made under the assumption of future temporal stationarity of ED values, due to the fact that ED is related to seasonal rainfall intensity [15] and EURO-CORDEX GCMs-RCPs models does not project statistically significant trends in most areas of Greece [27]. Temporal stationarity of ED values already has been documented in Greece for the historical period [30].
The current monthly erosivity was calculated as the product of predictions of the trained spQRF models of monthly ED m and precipitation P m for each month m: The prediction error standard deviation of monthly R m was calculated using error propagation [77]: In order for Equation (11) to hold true,P m (s i ) andÊ D m (s i ) were considered as the true values on locations s i and that were independent of each other. More specifically, these values were considered, to a good approximation, as not correlated on the basis of the following remarks: •P m (s i ) came from monthly average data andÊ D m (s i ) from random proportions of pluviograph data due to missing values.

•
Using the approximation that kinetic energy in Equation (1) can be replaced with a constant monthly value [79], then as is proved in RUSLE2 [15], monthly ED is directly proportional to 30-min rainfall intensity without the effect of precipitation depth.
The annual rainfall erosivity and its error standard deviation were calculated, respectively: The projected values of monthly erosivity were computed using the ratio δ f ,m of the future 30-years-mean P where f is the future year in which long term average monthly values refer to and: The projected annual erosivity was estimated with: And the ratio of future to current values: With Equations (14)- (17), the projected annual erosivity values were calculated preserving the relative projected changes of precipitation without the direct use of simulated values coming from a GCM-RCM and the need to apply a bias-correction method. However, the need to use a statistical downscaling technique to apply bias correction to the GCM + RCM output will be tested.

Precipitation and ED Pointwise Values
The monthly conversion factors c m that were computed specifically for Greece using the five-min-time-step timeseries have a mean value 1.22, close to the one calculated for Europe [59], but had different seasonality with their maximum values in the period from May to October and minimum during March (Table 2). The first four central moments (mean, standard deviation, skew and kurtosis) and other statistical properties were used to describe the calculated monthly values of ED (87 stations from NBHM, Table 3) and the average monthly precipitation (150 stations from HNMS, Table 4). Monthly precipitation and ED had different monthly temporal patterns (Figure 3). The temporal pattern of precipitation is typical for a Mediterranean area with the minimum precipitation occurring during summer and the maximum precipitation observed in November and December (Figure 3a). This temporal pattern of monthly precipitation is typical for Greece [80]. The second had an almost bimodal shape with its two peaks at July and October. The values of ED were slightly different from the ones already reported in our previous work [30], due to the application of the seasonal monthly conversion factors c m to a constant one.

Spatial Models of Precipitation and ED
The training and cross validation of the spatial models was made using the implementation of QRF in the ranger [67] package of language R [82]. The number of trees was set to 1000 and the finetuning of the parameters of the model was made using the tuneRanger [83] package.
The cross validation error metrics of the models for monthly precipitation were satisfactory (Table 5), especially comparing them to recent precipitation models for Greece that showed weak correlation during the spring season [17] or equally good estimations [23]. As most months had a zscore variance smaller than one, these monthly models overestimated the actual prediction uncertainty (i.e., the error of prediction the error from the models in cross-validation locations). The same cross validation error metrics of the models for monthly ED did not show equally good performance (Table 6), as most months had moderate results with the exception of summer months that had poor results. These poor results were coming from the scarcity of the stations used and the predictions of high ED values that the models underestimated them (Figure 4b), affecting especially RMSE and R 2 metrics that are sensitive to outliers. Given the values of z-score's variance, in general, also, most of the monthly ED models overestimated the actual prediction uncertainty.

Spatial Models of Precipitation and ED
The training and cross validation of the spatial models was made using the implementation of QRF in the ranger [67] package of language R [82]. The number of trees was set to 1000 and the fine-tuning of the parameters of the model was made using the tuneRanger [83] package.
The cross validation error metrics of the models for monthly precipitation were satisfactory (Table 5), especially comparing them to recent precipitation models for Greece that showed weak correlation during the spring season [17] or equally good estimations [23]. As most months had a z-score variance smaller than one, these monthly models overestimated the actual prediction uncertainty (i.e., the error of prediction the error from the models in cross-validation locations). The same cross validation error metrics of the models for monthly ED did not show equally good performance (Table 6), as most months had moderate results with the exception of summer months that had poor results. These poor results were coming from the scarcity of the stations used and the predictions of high ED values that the models underestimated them (Figure 4b), affecting especially RMSE and R 2 metrics that are sensitive to outliers. Given the values of z-score's variance, in general, also, most of the monthly ED models overestimated the actual prediction uncertainty.

Rainfall Erosivity and Its Uncertenty
The produced maps (Figures 5 and 6) illustrate the spatiotemporal distribution of R in Greece. The eastern, dryer part of the country has lower values than the wetter western part for the period during autumn and winter. During summer the convective activity over norther Greece produces higher R values than southern Greece, with the largest values occurring in the area of Thrace. The monthly temporal patterns illustrated in Figure 5, are compatible to the three areas that were identified in our previous study [30].
The lowest annual values of R were computed at the central mainland area of Greece (west and central Macedonia, Thessaly and Attica). The uncertainty map of annual R (Figure 7) follows the variation of the predicted annual erosivity and the scarcity of stations. The mean annual R (Table 7, Figure 6) has a value of 1039.0 MJ·mm/ha/h/y, with a range between 405.1 and 3160.2 MJ·mm/ha/h/y. The annual mean prediction error standard deviation is 116.9 MJ·mm/ha/h/y and its range is from 46.7 to 353.4 MJ·mm/ha/h/y (Table 7, Figure 7). The ratio of the mean annual error to the mean annual R is 11.25%, a value that was probably overestimated, given the cross-validation results of z-scores (Tables 5,6). These errors are not considered very high, bearing in mind the spatially sparse data and the data-sets limitations. Although these errors may affect the reliability of the assessment of the potential impacts of climate change, their effect is rather small when compared with the large uncertainties in GCMs projections in monthly and daily precipitation and rainfall intensity [84].

Rainfall Erosivity and Its Uncertenty
The produced maps (Figures 5 and 6) illustrate the spatiotemporal distribution of R in Greece. The eastern, dryer part of the country has lower values than the wetter western part for the period during autumn and winter. During summer the convective activity over norther Greece produces higher R values than southern Greece, with the largest values occurring in the area of Thrace. The monthly temporal patterns illustrated in Figure 5, are compatible to the three areas that were identified in our previous study [30].
In the above cited, older, study [31], the spatial distribution and annual values of R were affected by the used interpolated precipitation dataset. As a result, the highest values were calculated at the northwest corner of Greece. However, in this study, the maximum observed precipitation from HNMS's stations were recorded at the mountain range of Pindos, which is, also, the area with the maximum annual R values in our study, indicating that the two variables are consistent.    (Figure 7) follows the variation of the predicted annual erosivity and the scarcity of stations. The mean annual R (Table 7, Figure 6) has a value of 1039.0 MJ·mm/ha/h/y, with a range between 405.1 and 3160.2 MJ·mm/ha/h/y. The annual mean prediction error standard deviation is 116.9 MJ·mm/ha/h/y and its range is from 46.7 to 353.4 MJ·mm/ha/h/y (Table 7, Figure 7). The ratio of the mean annual error to the mean annual R is 11.25%, a value that was probably overestimated, given the cross-validation results of z-scores (Tables 5  and 6). These errors are not considered very high, bearing in mind the spatially sparse data and the data-sets limitations. Although these errors may affect the reliability of the assessment of the potential impacts of climate change, their effect is rather small when compared with the large uncertainties in GCMs projections in monthly and daily precipitation and rainfall intensity [84].

Projected Rainfall Erosivity Changes
In order to cope with the uncertainties in climatic models, five (5) GCMs-RCMs were evaluated in terms of RMSE, R 2 and CCC using the monthly precipitation maps that were created in this study. The EC-EARTH-KNMI-RACMO22E GCM-RCM (Table 8) gave the best results for each one of months and for all the error metrics. Given its performance, this GCM-RCM seemed to represent better monthly precipitation in Greece, for the historical time period and consequently used to estimate futures changes for the RCP forcing scenarios RCP4.5 and RCP8.5 and the years 2040, 2070 and 2100. However, the large errors presented in Table 8 indicate that statistical downscaling techniques have to be applied to the outputs of GCMs-RCMs for bias correction and reduction of errors.  The largest monthly mean R value are observed in November with 182.5 MJ·mm/ha/h/y and in December with 151.5 MJ·mm/ha/h/y and the lowest ones during summer months which have values about 44 MJ·mm/ha/h/y. The computed R values, due to the fact that R is linearly underestimated as the missing values ratio increases [30], were larger than the values reported by Panagos et al. [31] (i.e., the mean annual R was underestimated by 28.8%, the mean minimum by 381% and the maximum by 12%). On the contrary, the range of R values in this study is smaller (405.1-3160.2 MJ·mm/ha/h/y) than the respective values reported in [31] (was 84.2-2825 MJ·mm/ha/h/y).
In the above cited, older, study [31], the spatial distribution and annual values of R were affected by the used interpolated precipitation dataset. As a result, the highest values were calculated at the northwest corner of Greece. However, in this study, the maximum observed precipitation from HNMS's stations were recorded at the mountain range of Pindos, which is, also, the area with the maximum annual R values in our study, indicating that the two variables are consistent.

Projected Rainfall Erosivity Changes
In order to cope with the uncertainties in climatic models, five (5) GCMs-RCMs were evaluated in terms of RMSE, R 2 and CCC using the monthly precipitation maps that were created in this study. The EC-EARTH-KNMI-RACMO22E GCM-RCM (Table 8) gave the best results for each one of months and for all the error metrics. Given its performance, this GCM-RCM seemed to represent better monthly precipitation in Greece, for the historical time period and consequently used to estimate futures changes for the RCP forcing scenarios RCP4.5 and RCP8.5 and the years 2040, 2070 and 2100. However, the large errors presented in Table 8 indicate that statistical downscaling techniques have to be applied to the outputs of GCMs-RCMs for bias correction and reduction of errors. Using the RCP4.5 scenario, the ratio of the mean projected R to current values were −3.7%, −4.3% and −0.7% for 2040, 2070 and 2100, respectively. The same values for the RCP8.5, scenario were −9.0%, +0.01% and −14.6%. Annual R followed the annual ratio δ  Using the RCP4.5 scenario, the ratio of the mean projected R to current values were −3.7%, −4.3% and −0.7% for 2040, 2070 and 2100, respectively. The same values for the RCP8.5, scenario were −9.0%, +0.01% and −14.6%. Annual R followed the annual ratio δ ( ) of projected to 30-years-mean historical precipitation, which fluctuated around zero for RCP4.5 (Figure 8a) and had a negative trend for the years 2040-2100 for RCP4.5. Despite the decrease of the projected R values, using RCP4.5, R, spatially, has a tendency to increase at the central parts of Macedonia and Thessaly that have the highest agricultural productivity in the country and the northern Thrace (Figure 9a). For RCP8.5 (Figure 9b) projected erosivity increases on 2070, having a hotspot at Thessaly and then decreases, following the precipitation trend (Figure 8b). In the previously reported study concerning Europe, that did not use information about rainfall intensities in the applied model, the projected change of R values for Greece on 2050 was +14.8% using the GCM HadGEM2 and RCP4.5 [10]. Despite the decrease of the projected R values, using RCP4.5, R, spatially, has a tendency to increase at the central parts of Macedonia and Thessaly that have the highest agricultural productivity in the country and the northern Thrace (Figure 9a). For RCP8.5 (Figure 9b) projected erosivity increases on 2070, having a hotspot at Thessaly and then decreases, following the precipitation trend (Figure 8b). In the previously reported study concerning Europe, that did not use information about rainfall intensities in the applied model, the projected change of R values for Greece on 2050 was +14.8% using the GCM HadGEM2 and RCP4.5 [10]. The results of projected changes of R in both RCP scenarios are coherent to the precipitation trends that are reported by EURO-CORDEX, in which precipitation decreases by a larger magnitude in RCP8.5 than RCP4.5, from 1971-2000 to 2071-2100 [27]. This change in precipitation dominates in projected R's calculations and is depicted in Figure 9b for the year 2100.

Conclusions
The estimation of mean annual and monthly R values over Greece using precipitation records that suffered from a significant volume of missing values was the main result of this paper, utilizing as an intermediate step the creation of monthly precipitation and ED QRF models. The models of monthly precipitation had better performance than the ones of monthly ED in terms of prediction accuracy, mostly due to the scarcity of stations with calculated ED values. Validating the error of uncertainty reported from the models on cross validation locations, showed that the models, on average, overestimate the actual prediction error (i.e., the error of error predictions). More specifically, the findings of the present study can be summarized as follows: 1. The mean annual R in Greece is 1039 MJ·mm/ha/h/y, with a range between 405.1 and 3160. 2 MJ·mm/ha/h/y, during the historical period 1971-2000. The highest values are calculated at the mountain range of Pindos and the lowest at central Greece's mainland. The results of projected changes of R in both RCP scenarios are coherent to the precipitation trends that are reported by EURO-CORDEX, in which precipitation decreases by a larger magnitude in RCP8.5 than RCP4.5, from 1971-2000 to 2071-2100 [27]. This change in precipitation dominates in projected R's calculations and is depicted in Figure 9b for the year 2100.

Conclusions
The estimation of mean annual and monthly R values over Greece using precipitation records that suffered from a significant volume of missing values was the main result of this paper, utilizing as an intermediate step the creation of monthly precipitation and ED QRF models. The models of monthly precipitation had better performance than the ones of monthly ED in terms of prediction accuracy, mostly due to the scarcity of stations with calculated ED values. Validating the error of uncertainty reported from the models on cross validation locations, showed that the models, on average, overestimate the actual prediction error (i.e., the error of error predictions). More specifically, the findings of the present study can be summarized as follows: 1.
The mean annual R in Greece is 1039 MJ·mm/ha/h/y, with a range between 405.1 and 3160.2 MJ·mm/ha/h/y, during the historical period 1971-2000. The highest values are calculated at the mountain range of Pindos and the lowest at central Greece's mainland.

2.
The calculated monthly mean R values follow the already documented spatiotemporal characteristics of precipitation depth and intensity over the country. 3.
The climatic model EC-EARTH-KNMI-RACMO22E from the Royal Netherlands Meteorological Institute better reproduces the monthly precipitation for the historical period 1971-2000 in Greece than the other four GCMs-RCMs used and tested in this study. 4.
The projected mean annual erosivity, R, as an average over Greece, follows, in general, the projected changes of precipitation from the selected GCM-RCM model but not in a spatially homogenous way.
The results about future values of R inherit a set of uncertainties that have to do with the limitations of climatic models in general and the assumptions about future temporal stationarity of ED that had been made in our study, based on the observed stationarity of ED for the historical time. This holds true, as well as for the calculated current ED values due to missing values from the utilized timeseries. Future research will eventually provide more robust climatic models, as computational power increases and research continues, and hopefully we will also have high quality, high density, observed precipitation data for longer durations and more stations to estimate more accurately current and projected rainfall erosivity in the country.